Sarthak Grover edited this page Nov 3, 2013 · 16 revisions

HTTP Tor Pluggable Transport (HTPT) Design

High Level Overview

This document describes the design of HTPT, a Pluggable Transport for Tor. As a Pluggable Transport, this system will send data between a Tor client and a Tor bridge. To evade detection, the transport will hide data within HTTP requests and responses with the following goals:

  • Secure - it should be difficult for the censor to block access to the transport without incurring sizable collateral damage or material costs

  • High performance - the transport should provide the maximum speed for the appropriate level of security

Protocol Overview

In the following section, the structure and mechanics of the Pluggable Transport are described. At a high level, the client will send traffic to the server which looks like HTTP traffic. However, this traffic actually contains Tor packets hidden within parts of the HTTP protocol. The graphic below illustrates this in more detail:

[Figure: flow between client and server]

As this diagram shows, the Tor client will send data as usual. Obfsproxy will take the Tor packet and hand it to our Pluggable Transport. After hiding the Tor packet in HTTP requests and data, HTPT will send the data using headless WebKit. On the server side, the data will be received by Apache and handed off to the Web Server Gateway Interface (WSGI). From there it is passed up to the Pluggable Transport if it carries transport data, or to the image gallery if it is not Pluggable Transport traffic.
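The server-side hand-off described above can be sketched as a small WSGI application. This is only an illustration: the design does not say how transport requests are told apart from ordinary gallery visits, so the cookie check below is an assumption, and `transport_app` and `gallery_app` are hypothetical stand-ins for the real handlers.

```python
def looks_like_transport(environ):
    # ASSUMPTION: transport requests carry a cookie named "sid". The design
    # does not specify how the two kinds of traffic are distinguished.
    return "sid=" in environ.get("HTTP_COOKIE", "")


def transport_app(environ, start_response):
    # Hypothetical stand-in for the Pluggable Transport handler.
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [b"transport"]


def gallery_app(environ, start_response):
    # Hypothetical stand-in for the cover image gallery.
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html>gallery</html>"]


def application(environ, start_response):
    """WSGI entry point: route each request to one of the two handlers."""
    if looks_like_transport(environ):
        return transport_app(environ, start_response)
    return gallery_app(environ, start_response)
```

Serving the gallery to every request that is not recognised as transport traffic means an ordinary visitor (or a probing censor) sees a plain website.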

Design Details

This section contains details about how HTPT will tunnel traffic through the HTTP protocol. Communication is broken down into the following steps:

Steps of the Client-Server Interaction

Step 1: Server IP and Password

Before initiating a connection with the server, the client must know the IP address and the password of the HTPT bridge. We assume that both are exchanged out of band (e.g., via BridgeDB).

Step 2: Authentication

Before the client and server can exchange traffic, the server should verify that the client is a valid user.

The client performs HTTP authentication with the password acquired out of band. The server verifies the password and sends the client a session ID.
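As a rough sketch of this step, the code below builds an authentication header on the client and checks it on the server. The design only says "HTTP Auth", so the use of the Basic scheme, the placeholder user name `htpt`, and the 16-byte random session ID are all assumptions made for illustration.

```python
import base64
import hmac
import secrets

SESSIONS = set()  # session IDs issued by this (hypothetical) server


def make_auth_header(password, user="htpt"):
    # ASSUMPTION: HTTP Basic auth with a fixed placeholder user name.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return "Basic " + token


def authenticate(auth_header, bridge_password):
    """Return a new session ID if the password matches, else None."""
    scheme, _, token = auth_header.partition(" ")
    if scheme != "Basic":
        return None
    try:
        decoded = base64.b64decode(token, validate=True).decode()
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    _, _, supplied = decoded.partition(":")
    # Constant-time comparison avoids leaking the password through timing.
    if not hmac.compare_digest(supplied.encode(), bridge_password.encode()):
        return None
    session_id = secrets.token_hex(16)
    SESSIONS.add(session_id)
    return session_id
```

A client would send `make_auth_header(password)` as its `Authorization` header and keep the returned session ID for later requests.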

Step 3: Client Uploads Data

Steps 3 and 4 are the heart of HTPT and rely upon several obfuscation protocols to hide data within the HTTP stream. We are currently considering three types of vectors for steganography: HTML content, URLs, and images. Steps 3 and 4 provide asynchronous data transfer, initiated by the client in both directions.

Structure of the Pluggable Transport

Though communication is always initiated by the client, sending and receiving traffic look similar for the client and the server. In both cases, a data packet is received from Tor through the obfsproxy interface. Upon receiving the Tor packet, HTPT decides which vector to encode the data with and breaks the packet into a series of frames. The length of these frames may differ, but each frame includes a header (modeled on IP's fragmentation scheme) which records which Tor packet the frame came from, the size of the frame, and the offset of the frame from the beginning of the packet.
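The framing described above might look roughly like the following. The design names the header fields but not their widths, so the layout here (4-byte packet ID, 2-byte offset, 2-byte length, 1-byte flags) is an assumption, as are the function names.

```python
import struct

# ASSUMPTION: field widths are illustrative; the design only names the fields.
# packet ID (4 bytes), offset in packet (2), payload length (2), flags (1)
HEADER = struct.Struct("!IHHB")


def fragment(packet_id, packet, max_payload):
    """Split one Tor packet into framed chunks of at most max_payload bytes."""
    frames = []
    for offset in range(0, len(packet), max_payload):
        chunk = packet[offset:offset + max_payload]
        frames.append(HEADER.pack(packet_id, offset, len(chunk), 0) + chunk)
    return frames


def reassemble(frames):
    """Rebuild a packet from its frames, which may arrive in any order.

    All frames are assumed to belong to the same packet.
    """
    parts = []
    for frame in frames:
        _pid, offset, length, _flags = HEADER.unpack_from(frame)
        parts.append((offset, frame[HEADER.size:HEADER.size + length]))
    return b"".join(chunk for _, chunk in sorted(parts))
```

Because every frame carries its own offset, the receiver does not depend on frames arriving in the order they were sent, which matters when different frames travel over different vectors.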

After breaking the packet into appropriately sized frames, the sender calls the appropriate method to encode and send each frame. Within the encoding protocol, multiple frames may be added to the same medium if there is space. In practice this matters only for images, where a single image may carry frames from multiple Tor packets.
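One simple way to fill a medium with several frames is a greedy pass over the send queue, sketched below. The function name and the byte-based `capacity` parameter are hypothetical; the design does not prescribe a packing strategy.

```python
def pack_frames(queue, capacity):
    """Take whole frames from the front of the queue while they fit.

    Returns (payload, remaining_queue). Frames stay in order and are never
    split; a frame that does not fit waits for the next medium.
    """
    used = 0
    taken = 0
    for frame in queue:
        if used + len(frame) > capacity:
            break
        used += len(frame)
        taken += 1
    return b"".join(queue[:taken]), queue[taken:]
```

Since each frame header records the frame size, the receiver can walk the concatenated payload and split it back into individual frames.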

Request Encoding

In this stego scheme, data will be hidden within an HTTP request. Because requests flow only from client to server, this is an upload-only vector. To hide data within a request, the URL and cookie fields will be manipulated. We will create a variety of different URL expressions, and the exact URL expression will be chosen by the encoder at runtime. For instance, an example expression is click.<domain>?qs=<80 char hash>, where domain is some domain name and 80 char hash is 80 characters of hex. In this case, <80 char hash> will be replaced with 40 bytes of data. This is a common format used for email personalization (detecting when a user has opened an advertising email), so it should not arouse suspicion.
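A minimal sketch of the example expression follows. The `http://` scheme, the `/` path, and the zero-padding of short payloads are assumptions added to make the example complete; only the `click.<domain>?qs=<80 hex characters>` shape comes from the design.

```python
from urllib.parse import parse_qs, urlsplit

PAYLOAD_BYTES = 40  # 40 bytes of data become 80 hex characters


def encode_url(data, domain):
    """Hide up to 40 bytes in the qs parameter of a click-tracking style URL."""
    if len(data) > PAYLOAD_BYTES:
        raise ValueError("payload does not fit in one URL")
    # ASSUMPTION: short payloads are zero-padded; the true length would be
    # recovered from the frame header carried inside the payload.
    padded = data.ljust(PAYLOAD_BYTES, b"\x00")
    return f"http://click.{domain}/?qs={padded.hex()}"


def decode_url(url):
    """Recover the 40 payload bytes from a URL produced by encode_url."""
    hex_value = parse_qs(urlsplit(url).query)["qs"][0]
    return bytes.fromhex(hex_value)
```

For example, `encode_url(b"hello", "example.com")` yields a URL whose `qs` value is always exactly 80 characters long, regardless of how much real data it carries.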

Image Encoding

This stego scheme is very simple and not really stego at all. For this scheme, we feed an image compression algorithm the frames we want to send. Though there will be little if any compression, this will create a valid image containing our data. We assume that it would be too costly for the censor to do anything beyond checking whether the image is valid. This scheme will not hold up to more stringent analysis.
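To make the idea concrete, the sketch below wraps arbitrary bytes in a valid 8-bit grayscale PNG using only the standard library. The choice of PNG, the fixed width of 64 pixels, and the 4-byte length prefix are assumptions; the design does not name an image format. The decoder handles only images produced by `encode_png`.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind, body):
    # A PNG chunk is: length, type, body, CRC-32 over type + body.
    crc = zlib.crc32(kind + body) & 0xFFFFFFFF
    return struct.pack("!I", len(body)) + kind + body + struct.pack("!I", crc)


def encode_png(data, width=64):
    """Store data as the pixel values of a grayscale PNG."""
    # ASSUMPTION: a 4-byte length prefix marks where the payload ends.
    blob = struct.pack("!I", len(data)) + data
    height = -(-len(blob) // width)  # ceiling division
    blob = blob.ljust(width * height, b"\x00")
    # Each scanline starts with filter type 0 (no filtering).
    rows = b"".join(
        b"\x00" + blob[r * width:(r + 1) * width] for r in range(height)
    )
    # Bit depth 8, color type 0 (grayscale), default compression/filter/interlace.
    ihdr = struct.pack("!IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(rows)) + _chunk(b"IEND", b""))


def decode_png(png):
    """Recover the payload from an image produced by encode_png."""
    pos = len(PNG_SIGNATURE)
    width = 0
    compressed = b""
    while pos < len(png):
        length, kind = struct.unpack_from("!I4s", png, pos)
        body = png[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width = struct.unpack_from("!I", body)[0]
        elif kind == b"IDAT":
            compressed += body
        pos += 12 + length  # length + type + body + CRC
    raw = zlib.decompress(compressed)
    stride = width + 1  # one filter byte per scanline
    blob = b"".join(raw[i + 1:i + stride] for i in range(0, len(raw), stride))
    size = struct.unpack_from("!I", blob)[0]
    return blob[4:4 + size]
```

The resulting image is structurally valid but looks like noise when rendered, which is why this approach only defeats a censor that checks validity and nothing more.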

Step 4: Client Downloads Data

As with client uploads, client downloads are asynchronous and initiated by the client. Every time the client makes a request, the server responds with either data or a padded message. If the server still has frames queued for transmission, it can indicate that it has more data to send by setting the MORE flag in the frame header. Since the client must initiate any communication with the server, the client will periodically poll the server to check whether there is data to be sent.
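The polling behaviour can be sketched with an in-memory stand-in for the bridge. The bit value of `MORE`, the separate `PAD` flag used to mark padded replies, and the `FakeServer` class are all assumptions for illustration; a real client would issue HTTP requests instead of method calls.

```python
from collections import deque

MORE = 0x01  # ASSUMPTION: bit position of the MORE flag is illustrative
PAD = 0x02   # ASSUMPTION: hypothetical flag marking a padded (empty) reply


class FakeServer:
    """In-memory stand-in for the bridge's download queue."""

    def __init__(self, frames):
        self.queue = deque(frames)

    def handle_request(self):
        # Every request gets a reply: queued data, or padding if idle.
        if not self.queue:
            return PAD, b"\x00" * 16
        payload = self.queue.popleft()
        flags = MORE if self.queue else 0
        return flags, payload


def poll_once(server):
    """Request repeatedly while the server sets MORE; return the payloads."""
    received = []
    while True:
        flags, payload = server.handle_request()
        if flags & PAD:
            break  # nothing queued; wait for the next periodic poll
        received.append(payload)
        if not flags & MORE:
            break  # queue drained
    return received
```

With this shape, the client only needs its slower periodic timer when the server is idle, and drains a backlog quickly by following the MORE flag.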

Details

Several steps are described in more detail in the following links:
