Merge remote-tracking branch 'mikeperry/padding_spec'

Nick Mathewson 1 year ago
parent
commit
2ffe30f0ef
4 changed files with 367 additions and 25 deletions
  1. dir-spec.txt                            +40 −0
  2. padding-spec.txt                         +291 −0
  3. proposals/254-padding-negotiation.txt    +16 −20
  4. tor-spec.txt                             +20 −5

+ 40 - 0
dir-spec.txt

@@ -1158,6 +1158,46 @@
         Pluggable transports are only relevant to bridges, but these entries
         can appear in non-bridge relays as well.
 
+    "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=val key=val ... NL
+        [At most once.]
+
+        YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+        interval of length NSEC seconds (86400 seconds by default). Counts
+        are reset to 0 at the end of this interval.
+
+        The keyword list is currently as follows:
+
+         bin-size
+           - The current rounding value for cell count fields (10000 by
+             default)
+         write-drop
+           - The number of RELAY_DROP cells this relay sent
+         write-pad
+           - The number of CELL_PADDING cells this relay sent
+         write-total
+           - The total number of cells this relay sent
+         read-drop
+           - The number of RELAY_DROP cells this relay received
+         read-pad
+           - The number of CELL_PADDING cells this relay received
+         read-total
+           - The total number of cells this relay received
+         enabled-read-pad
+           - The number of CELL_PADDING cells this relay received on
+             connections that support padding
+         enabled-read-total
+           - The total number of cells this relay received on connections
+             that support padding
+         enabled-write-pad
+           - The number of CELL_PADDING cells this relay sent on
+             connections that support padding
+         enabled-write-total
+           - The total number of cells sent by this relay on connections
+             that support padding
+         max-chanpad-timers
+           - The maximum number of timers that this relay scheduled for
+             padding in the previous NSEC interval
+
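As an illustration, a "padding-counts" line of the shape described above could be parsed as follows. This is a sketch, not tor code; the example line and its field values are invented for demonstration:

```python
import re

def parse_padding_counts(line):
    """Parse a dir-spec "padding-counts" extra-info line.

    Returns (timestamp string, interval length in seconds,
    dict of key=val counters).
    """
    m = re.match(
        r'padding-counts (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
        r'\((\d+) s\)((?: \S+=\S+)*)',
        line)
    if m is None:
        raise ValueError("not a padding-counts line")
    timestamp, nsec, rest = m.group(1), int(m.group(2)), m.group(3)
    counts = {}
    for kv in rest.split():
        key, _, val = kv.partition('=')
        counts[key] = int(val)
    return timestamp, nsec, counts

# Hypothetical example line (values invented for illustration):
ts, nsec, counts = parse_padding_counts(
    "padding-counts 2017-05-17 18:02:45 (86400 s) "
    "bin-size=10000 write-drop=0 write-pad=10000 write-total=10000")
```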
     "router-sig-ed25519"
         [As in router descriptors]
 

+ 291 - 0
padding-spec.txt

@@ -0,0 +1,291 @@
+                         Tor Padding Specification
+
+                                Mike Perry
+
+Note: This is an attempt to specify Tor as currently implemented.  Future
+versions of Tor will implement improved algorithms.
+
+This document tries to cover how Tor chooses to use cover traffic to obscure
+various traffic patterns from external and internal observers. Other
+implementations MAY take other approaches, but implementors should be aware of
+the anonymity and load-balancing implications of their choices.
+
+                          THIS SPEC ISN'T DONE YET.
+
+      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
+      "OPTIONAL" in this document are to be interpreted as described in
+      RFC 2119.
+
+
+1. Overview
+
+  Tor supports two classes of cover traffic: connection-level padding, and
+  circuit-level padding.
+
+  Connection-level padding uses the CELL_PADDING cell command for cover
+  traffic, whereas circuit-level padding uses the RELAY_COMMAND_DROP relay
+  command. CELL_PADDING is single-hop only and can be differentiated from
+  normal traffic by Tor relays ("internal" observers), but not by entities
+  monitoring Tor OR connections ("external" observers).
+
+  RELAY_COMMAND_DROP is multi-hop, and is not visible to intermediate Tor
+  relays, because the relay command field is covered by circuit layer
+  encryption. Moreover, Tor's 'recognized' field allows RELAY_COMMAND_DROP
+  padding to be sent to any intermediate node in a circuit (as per Section
+  6.1 of tor-spec.txt).
+
+  Currently, only single-hop CELL_PADDING is used by Tor. It is described in
+  Section 2. At a later date, further sections will be added to this document
+  to describe various uses of multi-hop circuit-level padding.
+
+
+2. Connection-level padding
+
+2.1. Background
+
+  Tor clients and relays make use of CELL_PADDING to reduce the resolution of
+  connection-level metadata retention by ISPs and surveillance infrastructure.
+
+  Such metadata retention is implemented by Internet routers in the form of
+  Netflow, jFlow, Netstream, or IPFIX records.  These records are emitted by
+  gateway routers in a raw form and then exported (often over plaintext) to a
+  "collector" that either records them verbatim, or reduces their granularity
+  further[1].
+
+  Netflow records and the associated data collection and retention tools are
+  very configurable, and have many modes of operation, especially when
+  configured to handle high throughput. However, at ISP scale, per-flow records
+  are very likely to be employed, since they are the default, and also provide
+  very high resolution in terms of endpoint activity, second only to full packet
+  and/or header capture.
+
+  Per-flow records capture the endpoint connection 5-tuple, as well as the
+  total number of bytes sent and received by that 5-tuple during a particular
+  time period. They can store additional fields as well, but it is primarily
+  timing and bytecount information that concerns us.
+
+  When configured to provide per-flow data, routers emit these raw flow
+  records periodically for all active connections passing through them
+  based on two parameters: the "active flow timeout" and the "inactive
+  flow timeout".
+
+  The "active flow timeout" causes the router to emit a new record
+  periodically for every active TCP session that continuously sends data. The
+  default active flow timeout for most routers is 30 minutes, meaning that a
+  new record is created for every TCP session at least every 30 minutes, no
+  matter what. This value can be configured from 1 minute to 60 minutes on
+  major routers.
+
+  The "inactive flow timeout" is used by routers to create a new record if a
+  TCP session is inactive for some number of seconds. It allows routers to
+  avoid the need to track a large number of idle connections in memory, and
+  instead emit a separate record only when there is activity. This value
+  ranges from 10 seconds to 600 seconds on common routers. It appears as
+  though no routers support a value lower than 10 seconds.
+
+  For reference, here are default values and ranges (in parentheses when
+  known) for common routers, along with citations to their manuals.
+
+  Some routers speak other collection protocols than Netflow, and in the
+  case of Juniper, use different timeouts for these protocols. Where this
+  is known to happen, it has been noted.
+
+                            Inactive Timeout              Active Timeout
+    Cisco IOS[3]              15s (10-600s)               30min (1-60min)
+    Cisco Catalyst[4]         5min                        32min
+    Juniper (jFlow)[5]        15s (10-600s)               30min (1-60min)
+    Juniper (Netflow)[6,7]    60s (10-600s)               30min (1-30min)
+    H3C (Netstream)[8]        60s (60-600s)               30min (1-60min)
+    Fortinet[9]               15s                         30min
+    MicroTik[10]              15s                         30min
+    nProbe[14]                30s                         120s
+    Alcatel-Lucent[2]         15s (10-600s)               30min (1-600min)
+
+  The combination of the active and inactive netflow record timeouts allows us
+  to devise a low-cost padding defense that causes what would otherwise be
+  split records to "collapse" at the router even before they are exported to
+  the collector for storage. So long as a connection transmits data before the
+  "inactive flow timeout" expires, the router will continue to count the
+  total bytes on that flow before finally emitting a record at the "active
+  flow timeout".
+
+  This means that for a minimal amount of padding that prevents the "inactive
+  flow timeout" from expiring, it is possible to reduce the resolution of raw
+  per-flow netflow data to the total amount of bytes sent and received in a 30
+  minute window. This is a vast reduction in resolution for HTTP, IRC, XMPP,
+  SSH, and other intermittent interactive traffic, especially when all
+  user traffic in that time period is multiplexed over a single connection
+  (as it is with Tor).
+
+2.2. Implementation
+
+  Tor clients currently maintain one TLS connection to their Guard node to
+  carry actual application traffic, and make up to 3 additional connections to
+  other nodes to retrieve directory information.
+
+  We pad only the client's connection to the Guard node, and not any other
+  connection. We treat Bridge node connections to the Tor network as client
+  connections, and pad them, but otherwise do not pad between normal relays.
+
+  Both clients and Guards will maintain a timer for all application (i.e.,
+  non-directory) TLS connections. Every time a non-padding packet is sent or
+  received by either end, that endpoint will sample a timeout value from
+  between 1.5 seconds and 9.5 seconds using the max(X,X) distribution
+  described in Section 2.3. The time range is subject to consensus
+  parameters as specified in Section 2.6.
+
+  If the connection becomes active for any reason before this timer
+  expires, the timer is reset to a new random value between 1.5 and 9.5
+  seconds. If the connection remains inactive until the timer expires, a
+  single CELL_PADDING cell will be sent on that connection.
+
+  In this way, the connection will only be padded in the event that it is
+  idle, and will always transmit a packet before the minimum 10 second inactive
+  timeout.
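The timer behavior described above can be sketched as follows. This is a minimal illustration, not the actual tor implementation; the class and function names are invented:

```python
import random

NF_ITO_LOW_MS = 1500    # consensus nf_ito_low default (Section 2.6)
NF_ITO_HIGH_MS = 9500   # consensus nf_ito_high default (Section 2.6)

def sample_padding_timeout(low_ms=NF_ITO_LOW_MS, high_ms=NF_ITO_HIGH_MS):
    """Sample a timeout from the max(X,X) distribution over
    [low_ms, high_ms), as described in Section 2.3."""
    r = high_ms - low_ms
    return low_ms + max(random.randrange(r), random.randrange(r))

class PaddingTimer:
    """Per-connection padding logic: any non-padding cell re-arms the
    timer; if the timer expires, send one CELL_PADDING cell and re-arm."""
    def __init__(self):
        self.deadline_ms = sample_padding_timeout()
        self.padding_sent = 0

    def on_traffic(self):
        # A non-padding cell was sent or received: reset the timer.
        self.deadline_ms = sample_padding_timeout()

    def on_timer_expired(self):
        # Connection stayed idle: emit a single CELL_PADDING cell,
        # then arm a fresh timer.
        self.padding_sent += 1
        self.deadline_ms = sample_padding_timeout()
```

Because the sampled timeout never exceeds 9.5 seconds, an idle but padded connection always transmits something before the minimum 10-second inactive flow timeout can fire.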
+
+2.3. Padding Cell Timeout Distribution Statistics
+
+  It turns out that because the padding is bidirectional, and because both
+  endpoints are maintaining timers, this creates the situation where the time
+  before sending a padding packet in either direction is actually
+  min(client_timeout, server_timeout).
+
+  If client_timeout and server_timeout are uniformly sampled, then the
+  distribution of min(client_timeout,server_timeout) is no longer uniform, and
+  the resulting average timeout (Exp[min(X,X)]) is much lower than the
+  midpoint of the timeout range.
+
+  To compensate for this, instead of sampling each endpoint timeout uniformly,
+  we instead sample it from max(X,X), where X is uniformly distributed.
+
+  If X is a random variable uniform from 0..R-1 (where R=high-low), then the
+  random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).
+
+  Then, when both sides apply timeouts sampled from Y, the resulting
+  bidirectional padding packet rate is now a third random variable:
+  Z = min(Y,Y).
+
+  The distribution of Z is slightly bell-shaped, but mostly flat around the
+  mean. It also turns out that Exp[Z] ~= Exp[X]. Here's a table of average
+  values for each random variable:
+
+     R       Exp[X]    Exp[Z]    Exp[min(X,X)]   Exp[Y=max(X,X)]
+     2000     999.5    1066        666.2           1332.8
+     3000    1499.5    1599.5      999.5           1999.5
+     5000    2499.5    2666       1666.2           3332.8
+     6000    2999.5    3199.5     1999.5           3999.5
+     7000    3499.5    3732.8     2332.8           4666.2
+     8000    3999.5    4266.2     2666.2           5332.8
+     10000   4999.5    5328       3332.8           6666.2
+     15000   7499.5    7995       4999.5           9999.5
+     20000   9999.5    10661      6666.2           13332.8
+
+  In this way, we maintain the property that the midpoint of the timeout range
+  is the expected mean time before a padding packet is sent in either
+  direction.
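The expectations in the table above can be reproduced analytically from the distributions just defined. A sketch (the helper names are our own; for X uniform on 0..R-1, Y = max(X,X) has the stated mass function, and Z = min(Y,Y) has survival function P(Z > i) = (1 - ((i+1)/R)^2)^2):

```python
def exp_X(R):
    # X uniform on 0..R-1
    return (R - 1) / 2.0

def exp_Y(R):
    # Y = max(X,X): Prob(Y == i) = (2*i + 1) / R**2
    return sum(i * (2 * i + 1) for i in range(R)) / float(R * R)

def exp_minXX(R):
    # E[min(X,X)] + E[max(X,X)] = 2*E[X]
    return 2 * exp_X(R) - exp_Y(R)

def exp_Z(R):
    # Z = min(Y1, Y2): P(Z > i) = (1 - ((i+1)/R)**2)**2,
    # and E[Z] is the sum of the survival function.
    return sum((1.0 - ((i + 1) / float(R)) ** 2) ** 2 for i in range(R))
```

For R = 2000 these give roughly Exp[X] = 999.5, Exp[Y] = 1332.8, Exp[min(X,X)] = 666.2, and Exp[Z] ~ 1066, matching the first row of the table; note Exp[Z] is close to, but not exactly, Exp[X].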
+
+2.4. Maximum overhead bounds
+
+  With the default parameters and the above distribution, we expect a
+  padded connection to send one padding cell every 5.5 seconds. This
+  averages to 103 bytes per second full duplex (~52 bytes/sec in each
+  direction), assuming a 512 byte cell and 55 bytes of TLS+TCP+IP headers.
+  For a client connection that remains otherwise idle for its expected
+  ~50 minute lifespan (governed by the circuit available timeout plus a
+  small additional connection timeout), this is about 154.5KB of overhead
+  in each direction (309KB total).
+
+  With 2.5M completely idle clients connected simultaneously, 52 bytes per
+  second amounts to 130MB/second in each direction network-wide, which is
+  roughly the current amount of Tor directory traffic[11]. Of course, our
+  2.5M daily users will neither be connected simultaneously, nor entirely
+  idle, so we expect the actual overhead to be much lower than this.
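The arithmetic above can be checked directly (a worked sketch; the constant names are illustrative, and 55 bytes of TLS+TCP+IP overhead is the estimate from the text):

```python
CELL_BYTES = 512           # fixed-size cell
HEADER_BYTES = 55          # TLS+TCP+IP overhead estimate from the text
WIRE_BYTES = CELL_BYTES + HEADER_BYTES   # 567 bytes on the wire per cell

MEAN_PAD_INTERVAL_S = 5.5  # Exp[Z]: midpoint of the 1.5s-9.5s range

# One padding cell every 5.5s in either direction:
full_duplex_rate = WIRE_BYTES / MEAN_PAD_INTERVAL_S   # ~103 bytes/sec
per_direction_rate = full_duplex_rate / 2             # ~52 bytes/sec

IDLE_LIFESPAN_S = 50 * 60  # ~50 minute expected connection lifespan
per_direction_total = per_direction_rate * IDLE_LIFESPAN_S  # ~154.5 KB

CLIENTS = 2.5e6
network_rate = CLIENTS * per_direction_rate  # ~130 MB/sec each direction
```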
+
+2.5. Reducing or Disabling Padding via Negotiation
+
+  To allow mobile clients to either disable or reduce their padding overhead,
+  the CELL_PADDING_NEGOTIATE cell (tor-spec.txt section 7.2) may be sent from
+  clients to relays. This cell is used to instruct relays to cease sending
+  padding.
+
+  If the client has opted to use reduced padding, it continues to send
+  padding cells sampled from the range [9000,14000] milliseconds (subject to
+  consensus parameter alteration as per Section 2.6), still using the
+  Y=max(X,X) distribution. Since the padding is now unidirectional, the
+  expected frequency of padding cells is now governed by the Y distribution
+  above as opposed to Z. For a range of 5000ms, we can see that we expect to
+  send a padding packet every 9000+3332.8 = 12332.8ms.  We also halve the
+  circuit available timeout from ~50min down to ~25min, which causes the
+  client's OR connections to be closed shortly thereafter when idle,
+  thus reducing overhead.
+
+  These two changes cause the padding overhead to go from 309KB per one-time-use
+  Tor connection down to 69KB per one-time-use Tor connection. For continual
+  usage, the maximum overhead goes from 103 bytes/sec down to 46 bytes/sec.
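The reduced-padding figures quoted above follow from the Y = max(X,X) distribution of Section 2.3 applied to the [9000,14000]ms range. A small worked check (names are our own):

```python
def exp_max_uniform(R):
    # Exp[Y] for Y = max(X,X), X uniform on 0..R-1 (Section 2.3)
    return sum(i * (2 * i + 1) for i in range(R)) / float(R * R)

LOW_MS, HIGH_MS = 9000, 14000   # nf_ito_low_reduced / nf_ito_high_reduced
R = HIGH_MS - LOW_MS            # range of 5000ms

# Unidirectional padding, so the interval is governed by Y, not Z:
expected_interval_ms = LOW_MS + exp_max_uniform(R)   # ~12332.8 ms

WIRE_BYTES = 512 + 55   # cell plus TLS+TCP+IP header estimate
rate = WIRE_BYTES / (expected_interval_ms / 1000.0)  # ~46 bytes/sec
```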
+
+  If a client opts to completely disable padding, it sends a
+  CELL_PADDING_NEGOTIATE to instruct the relay not to pad, and then does not
+  send any further padding itself.
+
+2.6. Consensus Parameters Governing Behavior
+
+  Connection-level padding is controlled by the following consensus parameters:
+
+    * nf_ito_low
+      - The low end of the range to send padding when inactive, in ms.
+      - Default: 1500
+
+    * nf_ito_high
+      - The high end of the range to send padding, in ms.
+      - Default: 9500
+      - If nf_ito_low == nf_ito_high == 0, padding will be disabled.
+
+    * nf_ito_low_reduced
+      - For reduced padding clients: the low end of the range to send padding
+        when inactive, in ms.
+      - Default: 9000
+
+    * nf_ito_high_reduced
+      - For reduced padding clients: the high end of the range to send padding,
+        in ms.
+      - Default: 14000
+
+    * nf_conntimeout_clients
+      - The number of seconds to keep circuits opened and available for
+        clients to use. Note that the actual client timeout is randomized
+        uniformly from this value to twice this value. This governs client
+        OR conn lifespan. Reduced padding clients use half the consensus
+        value.
+      - Default: 1800
+
+    * nf_pad_before_usage
+      - If set to 1, OR connections are padded before the client uses them
+        for any application traffic. If 0, OR connections are not padded
+        until application data begins.
+      - Default: 1
+
+    * nf_pad_relays
+      - If set to 1, we also pad inactive relay-to-relay connections
+      - Default: 0
+
+    * nf_conntimeout_relays
+      - The number of seconds that idle relay-to-relay connections are kept
+        open.
+      - Default: 3600
+
+
+1. https://en.wikipedia.org/wiki/NetFlow
+2. http://infodoc.alcatel-lucent.com/html/0_add-h-f/93-0073-10-01/7750_SR_OS_Router_Configuration_Guide/Cflowd-CLI.html
+3. http://www.cisco.com/en/US/docs/ios/12_3t/netflow/command/reference/nfl_a1gt_ps5207_TSD_Products_Command_Reference_Chapter.html#wp1185203
+4. http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/70974-netflow-catalyst6500.html#opconf
+5. https://www.juniper.net/techpubs/software/erx/junose60/swconfig-routing-vol1/html/ip-jflow-stats-config4.html#560916
+6. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html
+7. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html
+8. http://www.h3c.com/portal/Technical_Support___Documents/Technical_Documents/Switches/H3C_S9500_Series_Switches/Command/Command/H3C_S9500_CM-Release1648%5Bv1.24%5D-System_Volume/200901/624854_1285_0.htm#_Toc217704193
+9. http://docs-legacy.fortinet.com/fgt/handbook/cli52_html/FortiOS%205.2%20CLI/config_system.23.046.html
+10. http://wiki.mikrotik.com/wiki/Manual:IP/Traffic_Flow
+11. https://metrics.torproject.org/dirbytes.html
+12. http://freehaven.net/anonbib/cache/murdoch-pet2007.pdf
+13. https://gitweb.torproject.org/torspec.git/tree/proposals/188-bridge-guards.txt
+14. http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf

+ 16 - 20
proposals/254-padding-negotiation.txt

@@ -41,31 +41,27 @@ Proposal 251[1]).
 
 In that scenario, the primary negotiation mechanism we need is a way for
 mobile clients to tell their Guards to stop padding, or to pad less
-often. The following Trunnel payloads should cover the needed
+often. The following Trunnel payload should cover the needed
 parameters:
 
-    const CELL_PADDING_COMMAND_STOP = 1;
-    const CELL_PADDING_COMMAND_START = 2;
+  const CHANNELPADDING_COMMAND_STOP = 1;
+  const CHANNELPADDING_COMMAND_START = 2;
 
-    /* This command tells the relay to stop sending any periodic
-       CELL_PADDING cells. */
-    struct cell_padding_stop {
-      u8 command IN [CELL_PADDING_COMMAND_STOP];
-    };
-
-    /* This command tells the relay to alter its min and max netflow
-       timeout range values, and send padding at that rate (resuming
-       if stopped). */
-    struct cell_padding_start {
-      u8 command IN [CELL_PADDING_COMMAND_START];
+  /* The start command tells the relay to alter its min and max netflow
+     timeout range values, and send padding at that rate (resuming
+     if stopped). The stop command tells the relay to stop sending
+     link-level padding. */
+  struct channelpadding_negotiate {
+    u8 version IN [0];
+    u8 command IN [CHANNELPADDING_COMMAND_START, CHANNELPADDING_COMMAND_STOP];
 
-      /* Min must not be lower than the current consensus parameter
-         nf_ito_low. */
-      u16 ito_low_ms;
+    /* Min must not be lower than the current consensus parameter
+       nf_ito_low. Ignored if command is stop. */
+    u16 ito_low_ms;
 
-      /* Max must not be lower than ito_low_ms */
-      u16 ito_high_ms;
-    };
+    /* Max must not be lower than ito_low_ms. Ignored if command is stop. */
+    u16 ito_high_ms;
+  };
 
 More complicated forms of link-level padding can still be specified
 using the primitives in Section 3, by using "leaky pipe" topology to

+ 20 - 5
tor-spec.txt

@@ -428,6 +428,7 @@ see tor-design.pdf.
          9 -- RELAY_EARLY (End-to-end data; limited)(See Sec 5.6)
          10 -- CREATE2    (Extended CREATE cell)    (See Sec 5.1)
          11 -- CREATED2   (Extended CREATED cell)    (See Sec 5.1)
+         12 -- PADDING_NEGOTIATE   (Padding negotiation)    (See Sec 7.2)
 
     Variable-length command values are:
          7 -- VERSIONS    (Negotiate proto version) (See Sec 4)
@@ -539,6 +540,7 @@ see tor-design.pdf.
           variable-length cells.
      3 -- Uses the in-protocol handshake.
      4 -- Increases circuit ID width to 4 bytes.
+     5 -- Adds support for link padding and negotiation (padding-spec.txt).
 
 
 4.2. CERTS cells
@@ -1535,11 +1537,24 @@ see tor-design.pdf.
    long-range padding.  The contents of a PADDING, VPADDING, or DROP
    cell SHOULD be chosen randomly, and MUST be ignored.
 
-   Currently nodes are not required to do any sort of link padding or
-   dummy traffic. Because strong attacks exist even with link padding,
-   and because link padding greatly increases the bandwidth requirements
-   for running a node, we plan to leave out link padding until this
-   tradeoff is better understood.
+   If the link protocol is version 5 or higher, link level padding is
+   enabled as per padding-spec.txt. On these connections, clients may
+   negotiate the use of padding with a CELL_PADDING_NEGOTIATE command
+   whose format is as follows:
+         Version           [1 byte]
+         Command           [1 byte]
+         ito_low_ms        [2 bytes]
+         ito_high_ms       [2 bytes]
+
+   Currently, only version 0 of this cell is defined. In it, the command
+   field is either 1 (stop padding) or 2 (start padding). For the start
+   padding command, a pair of timeout values giving the low and high
+   bounds of the range for randomized padding timeouts may be specified as
+   unsigned integer values in milliseconds. The ito_low_ms field must not be
+   lower than the current consensus parameter value for nf_ito_low (default:
+   1500). ito_high_ms must be greater than ito_low_ms.
+
+   For more details on padding behavior, see padding-spec.txt.
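A minimal sketch of encoding and decoding the 6-byte payload laid out above, assuming network (big-endian) byte order as elsewhere in tor-spec; the function names are illustrative, not from the tor source:

```python
import struct

CELL_PADDING_NEGOTIATE_VERSION = 0
COMMAND_STOP, COMMAND_START = 1, 2

def encode_padding_negotiate(command, ito_low_ms=0, ito_high_ms=0):
    """Pack Version, Command, ito_low_ms, ito_high_ms into 6 bytes."""
    return struct.pack(">BBHH", CELL_PADDING_NEGOTIATE_VERSION,
                       command, ito_low_ms, ito_high_ms)

def decode_padding_negotiate(payload):
    """Unpack the payload; reject unknown versions."""
    version, command, low, high = struct.unpack(">BBHH", payload[:6])
    if version != 0:
        raise ValueError("unknown padding negotiation version")
    return command, low, high

# Start padding with the default consensus range:
payload = encode_padding_negotiate(COMMAND_START, 1500, 9500)
```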
 
 7.3. Circuit-level flow control