Deploying FlashFlow

Generating Keys

FlashFlow coordinators and measurers all maintain TLS identity keys. Use scripts/gen-cert.sh to generate them. Each measurer needs a unique organizationName in its certificate's Subject; the script sets the organizationName to the argument you provide to it.

Coordinator

$ ./scripts/gen-cert.sh coord
Generating a RSA private key
............................+++++
..+++++
writing new private key to 'coord.pem'
-----
$ cat coord.pem
-----BEGIN PRIVATE KEY-----
[... base64 stuff ...]
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
[... base64 stuff ...]
-----END CERTIFICATE-----

As you can see, both the private key and the certificate are in the same file. This is how Python’s SSL library likes it (devs: they can be separate, but this is easier).

Put this file in your key directory. By default, the key directory is the keys subdirectory of your data directory, so it is data-coord/keys/.

You need the certificates of all the measurers you trust. When measurer operators run this script, they should give you the bottom half of their output file: just the certificate part. Put each measurer’s certificate in its own file in your keys directory, with a file name ending in .pem.

$ ls data-coord/keys
coord.pem  # By default this is the file read for our own key/cert.
           # It contains our own private key and cert.
measurer1.pem           # Contains measurer 1's cert
measurer2.pem           # Contains measurer 2's cert
measurer3.pem.disabled  # Not read
notes.txt               # Not read

Running FlashFlow as a coordinator with the above keys directory loads two measurer certs. The third measurer’s cert file was skipped because the file name doesn’t end with .pem. Measurer 3, were it to try to connect, would not be allowed to complete the TLS handshake with us.
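The file selection rule described above can be sketched in a few lines of Python. This is an illustration only, not FlashFlow's actual loading code: the function name measurer_cert_files and its own_key_name parameter are hypothetical, and the sketch assumes the coordinator's own key file is excluded by name.

```python
from pathlib import Path


def measurer_cert_files(keydir, own_key_name="coord.pem"):
    """Return the measurer cert files in keydir, sorted by name.

    Hypothetical helper: a file counts as a measurer cert if its name
    ends with ".pem" and it is not our own key/cert file.
    """
    return sorted(
        p for p in Path(keydir).iterdir()
        if p.is_file() and p.name.endswith(".pem") and p.name != own_key_name
    )
```

With the directory listing above, measurer3.pem.disabled and notes.txt are ignored because their names do not end with .pem.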

Measurer

Run the same gen-cert.sh script.

$ ./scripts/gen-cert.sh measurer1
Generating a RSA private key
............................+++++
..+++++
writing new private key to 'measurer1.pem'
-----
$ cat measurer1.pem
-----BEGIN PRIVATE KEY-----
[... base64 stuff ...]
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
[... base64 stuff ...]
-----END CERTIFICATE-----

You need to hold on to the entire file because it contains your private key. The coordinator only needs your certificate. Copy the certificate part of the file into a new file and send it to the coordinator. That means just the lines from BEGIN CERTIFICATE through END CERTIFICATE, inclusive.
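If you would rather not copy the certificate by hand, the following minimal Python sketch pulls it out of the combined file. The function name extract_certificate is hypothetical and not part of FlashFlow; the sketch assumes the file contains exactly the PEM markers shown above.

```python
BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def extract_certificate(pem_text):
    """Return only the certificate block, including its BEGIN/END lines.

    Raises ValueError if either marker is missing.
    """
    start = pem_text.index(BEGIN)
    stop = pem_text.index(END, start) + len(END)
    return pem_text[start:stop] + "\n"
```

For example, writing extract_certificate(open("measurer1.pem").read()) to a new file gives you something safe to send to the coordinator, since the private key block is left out.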

Disk Usage Management

FlashFlow can use a significant amount of disk space if you let it. TODO: how much? For per-second result storage, you can address this with logrotate. TODO: what about logs? What about v3bw files?

Per-second Results

Note: The info in this section is only partially true until pastly/flashflow#4 is implemented. If the issue is closed and this message still exists, this section needs updating such that it definitely matches the actual implemented reality.

On Debian 10 (Buster), logrotate should already be installed and running daily. If not, install its package and ensure it’s running daily. You can check for its timer like this:

$ sudo systemctl list-timers
NEXT                         LEFT       LAST                         PASSED       UNIT                         ACTIVATES
[...]
Sat 2020-06-27 00:00:00 EDT  12h left   Fri 2020-06-26 00:00:27 EDT  11h ago      logrotate.timer              logrotate.service
[...]

You can run it manually like this (--debug performs a dry run and lets you see what would happen):

$ sudo /usr/sbin/logrotate --debug /etc/logrotate.conf

Or like this, which will run the logrotate.service file, probably located at /lib/systemd/system/logrotate.service:

$ sudo systemctl start logrotate

Here is an example logrotate configuration file. Copy this into, for example, /etc/logrotate.d/flashflow:

# The filename to which flashflow writes its per-second results
/home/matt/work/flashflow/data-coord/results/results.log {
   # keep 30 historic logs (if daily rotation, then 30 days)
   rotate 30
   # rotate daily
   daily
   # but don't bother rotating if the file is empty
   notifempty
   # gz compress when rotating
   compress
   # if log is missing, that's not an error, just skip
   missingok
}

And that’s it. Logrotate will see the new configuration file and use it next time it runs. See logrotate’s man page for possible options; for example, you can rotate based on file size instead of time.

When generating v3bw files, FlashFlow reads the most recent per-second results files until it has gone far enough back into history. TODO: how far back? It’s expected to be no more than a few days. Recency is determined by the files’ modification times, not by the lexicographic sort order of the filenames. FlashFlow does this so that it doesn’t have to care what your rotation naming scheme is: configuring logrotate to append integers (e.g. results.log.1.gz) results in newer files sorting sooner, while configuring logrotate to append a date (e.g. results.log.20200630.gz) results in newer files sorting later. Using modification times means FlashFlow doesn’t depend on your preference.

The one thing FlashFlow does care about is that simply appending a * after your configured results file path will correctly glob all results files. Don’t configure logrotate to move them somewhere else.
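The glob-then-sort-by-modification-time approach can be sketched as follows. This is an illustration of the idea, not FlashFlow's actual code; the function name results_files_newest_first is hypothetical.

```python
import glob
import os


def results_files_newest_first(results_log):
    """Glob results_log + '*' and sort the matches by mtime, newest first.

    The configured path is escaped so that only the trailing '*' acts
    as a wildcard.
    """
    paths = glob.glob(glob.escape(results_log) + "*")
    return sorted(paths, key=os.path.getmtime, reverse=True)
```

Because the sort key is the modification time, the result is the same whether logrotate appends integers or dates to rotated files.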

With this config:

[coord]
datadir = data-coord
resultsdir = ${datadir}/results
results_log = ${resultsdir}/results.log

Then ls data-coord/results/results.log* will find all results files:

$ ls -l data-coord/results/results.log*
-rw-r--r-- 1 matt matt    0 Jun 26 11:34 data-coord/results/results.log
-rw-r--r-- 1 matt matt  634 Jun 26 11:34 data-coord/results/results.log.1.gz
-rw-r--r-- 1 matt matt 8619 Jun 26 03:50 data-coord/results/results.log.2.gz
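The ${...} references in the config above look like the syntax of ExtendedInterpolation in Python's configparser module. Assuming that is how the file is parsed (an assumption, not something confirmed here), this sketch shows how results_log resolves and which glob pattern follows from it:

```python
from configparser import ConfigParser, ExtendedInterpolation

CONFIG_TEXT = """
[coord]
datadir = data-coord
resultsdir = ${datadir}/results
results_log = ${resultsdir}/results.log
"""

# ExtendedInterpolation expands ${key} references, recursively.
conf = ConfigParser(interpolation=ExtendedInterpolation())
conf.read_string(CONFIG_TEXT)

results_log = conf["coord"]["results_log"]
results_glob = results_log + "*"  # the pattern that must match all results files
```

Here results_log resolves to data-coord/results/results.log, so the pattern is data-coord/results/results.log*.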

Estimating Necessary Capacity

Bandwidth per Day

Assume the Tor network has 500 Gbit/s of capacity [0]. Assume the measurement period is 1 day. If each measurement lasts 30 seconds, then 500 * 30 == 15000 Gbit/day will be consumed, or 15000 / 8 == 1875 Gbyte/day (not GiB).

This number is total across an entire deployment. If a deployment has two measurers, each will see roughly half of the above number.
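The arithmetic above can be reproduced with a small helper if you want to plug in your own numbers. The function name gbytes_per_day is hypothetical; it assumes, as above, that every relay is measured once per one-day period and that load is split evenly between measurers.

```python
def gbytes_per_day(network_gbit_per_s, meas_duration_s, num_measurers=1):
    """Gbyte/day consumed per measurer, assuming one measurement per relay per day."""
    total_gbit = network_gbit_per_s * meas_duration_s  # Gbit consumed per day
    return total_gbit / 8 / num_measurers              # 8 bits per byte
```

gbytes_per_day(500, 30) gives 1875.0, and gbytes_per_day(500, 30, num_measurers=2) gives 937.5 for each of two measurers.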

Maximum Burst Capacity Necessary

FlashFlow needs to be able to consume the full capacity of the fastest relay in the network for 30 seconds. If the fastest relay is capable of 1 Gbit/s, then FlashFlow needs to be capable of at least that. The paper further chose a multiplier of 2.25 [1], which for a 1 Gbit/s relay means 2.25 Gbit/s of measurer capacity; for example, a deployment of 3 machines each capable of 1 Gbit/s would serve fine.

Being unable to fully consume the fastest relay in the network will artificially cap its weight, which may be acceptable. A deployment of a single 1 Gbit/s measurer may be fine, only suffering accuracy in the small number of very high capacity relays.
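As a rough sizing aid, the multiplier can be turned into a machine count. This is a sketch of the arithmetic only; the function names are hypothetical, and it assumes identical machines whose capacities simply add up.

```python
import math


def required_burst_gbit(fastest_relay_gbit, multiplier=2.25):
    """Aggregate measurer capacity needed to fully measure the fastest relay."""
    return fastest_relay_gbit * multiplier


def machines_needed(fastest_relay_gbit, per_machine_gbit, multiplier=2.25):
    """Number of identical machines needed, rounded up to a whole machine."""
    needed = required_burst_gbit(fastest_relay_gbit, multiplier)
    return math.ceil(needed / per_machine_gbit)
```

For a 1 Gbit/s relay, required_burst_gbit(1.0) is 2.25, and machines_needed(1.0, 1.0) is 3.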

[0]: https://metrics.torproject.org/bandwidth-flags.html (the amount of advertised bandwidth across the entire network)

[1]: https://arxiv.org/pdf/2004.09583.pdf Appendix E.2

Adding a Measurer

  1. Generate a TLS key for the measurer as described in that section.

  2. Add the measurer’s ID to the comma-separated list of measurer IDs in your coordinator’s [coord] section. For example, if you’re adding a new measurer with ID “london”:

    [coord]
    ....
    measurers = 'newyork,london'
    
  3. Create a new section in the config for this measurer’s information. Use the [measr_default] section as a template. The section must be named [measr_FOO] where FOO is the measurer’s ID (london, in the running example). For example, to tell the coordinator to assume the measurer has a capacity of 75 megabytes/second (75000000 bytes/second):

    [measr_london]
    measr_bw = 75000000
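The way the default and per-measurer sections combine can be sketched with Python's configparser. This is an illustration under assumptions, not FlashFlow's actual code: the helper measurer_bw is hypothetical, and the measurers list is written without quotes here because configparser keeps quote characters as part of the value.

```python
from configparser import ConfigParser

CONFIG_TEXT = """
[coord]
measurers = newyork,london

[measr_default]
measr_bw = 125000000

[measr_london]
measr_bw = 75000000
"""


def measurer_bw(conf, measurer_id):
    """Return measr_bw for a measurer, falling back to [measr_default]."""
    section = "measr_" + measurer_id
    if conf.has_option(section, "measr_bw"):
        return conf.getint(section, "measr_bw")
    return conf.getint("measr_default", "measr_bw")


conf = ConfigParser()
conf.read_string(CONFIG_TEXT)
ids = [m.strip() for m in conf["coord"]["measurers"].split(",") if m.strip()]
bws = {m: measurer_bw(conf, m) for m in ids}
```

In this sketch newyork has no section of its own and gets the default of 125000000 bytes/second, while london gets 75000000.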
    

Configuration Options

Here’s the default configuration file, with all possible configuration options documented within.

[coord]
datadir = data-coord
# Where FlashFlow TLS keys are stored. One of them is our key, any others are
# assumed to be certs for measurers.
keydir = ${datadir}/keys
# Our key and certificate (in one file). All other .pem files in keydir are
# treated as certs for measurers.
key = ${keydir}/coord.pem
# Where we store measurement results and v3bw files
resultsdir = ${datadir}/results
# Where we write in every per-second result that we get. You should configure
# logrotate, as this will grow to be quite large over time.
results_log = ${resultsdir}/results.log
# It's unfortunate this goes here, but it's so the [tor] section can not care
# about whether it's a part of a coord config file or a measurer config file.
tor_datadir = ${datadir}/tor
# Store long-term state in gzipped CBOR format
state = ${datadir}/state.cbor.gz
# hostname:port or ip:port on which to listen for connections from measurers
listen_addr = localhost:12934
# hostname:port or ip:port on which to listen for `flashflow ctrl` commands
ctrl_addr = localhost:12935
# duration of a measurement period, in seconds. The intention is to measure
# each relay once per period. 86400 seconds is one day.
meas_period = 86400
# comma-separated list of measurer identities that we expect our measurers to
# use. Measurer IDs are the organizationName strings in the measurer's client
# TLS certificates.
measurers =
# relay filter list. File containing fingerprints of relays we should measure
relay_filter_list = ${datadir}/relays.txt

# Defaults for coordinator's measr_* sections
#
# If you have a measurer with ID "foo" that you want to customize, you should
# create a new section for it. For example:
#
#     [measr_foo]
#     measr_bw = 75000000
#
# The coordinator will apply this measr_default section first, then the
# specific section, if it exists
[measr_default]
# The amount of bandwidth, in BYTES/second, that this measurer is capable of
measr_bw = 125000000

[v3bw]
# Maximum number of seconds into the past that a measurement can have started
# in order for us to consider its results for inclusion in a v3bw file. 86,400
# seconds in a day; 604,800 seconds in a week.
max_results_age = 604800
# Path at which to write each new v3bw file. This will actually be a symlink
# to the latest v3bw file, and the actual file will be this path with a date
# suffix.
v3bw = ${coord:resultsdir}/v3bw

[meas_params]
# The duration, in seconds, that the relay will be actively measured. AKA how
# long echo traffic will be sent back/forth with it. Since bw reports are
# per-second, this is *also* the number of reports we expect to get from each
# party.
meas_duration = 30
# Relays are limiting the amount of background traffic they carry during a
# measurement to some fraction of total traffic. They get this from the
# consensus parameter, their torrc, or the default value in tor. This is
# the percentage, as a fraction of 1, that we assume all relays are using.
bg_percent = 0.25
# The number of circuits the measurers, in aggregate, should open with the
# target relay
num_circs = 160

[measurer]
datadir = data-meas
# Where FlashFlow TLS keys are stored. One of them is our key, and the other is
# the cert for the coordinator.
keydir = ${datadir}/keys
# Our key and certificate (in one file).
key = ${keydir}/measurer.pem
# The coordinator's certificate
coord_cert = ${keydir}/coord.pem
# It's unfortunate this goes here, but it's so the [tor] section can not care
# about whether it's a part of a coord config file or a measurer config file.
tor_datadir = ${datadir}/tor
# The hostname:port of the coordinator. We will connect to them.
coord_addr = localhost:12934

[ctrl]
coord_addr = localhost:12935

[tor]
# Either something that looks like an executable that should be searched for in
# your $PATH (e.g. simply 'tor') or the path to your desired tor executable
# (e.g. '/usr/bin/tor' or '../tor/src/app/tor')
tor_bin = tor
torrc_extra_lines =
    # Put extra lines here
    # in YOUR config.ini
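Two details of the [v3bw] section can be illustrated with a short sketch: the cross-section reference ${coord:resultsdir}, and the max_results_age cutoff. It assumes the file is parsed with configparser's ExtendedInterpolation, and the helper is_recent_enough is hypothetical; neither is confirmed to match FlashFlow's implementation.

```python
from configparser import ConfigParser, ExtendedInterpolation

CONFIG_TEXT = """
[coord]
datadir = data-coord
resultsdir = ${datadir}/results

[v3bw]
max_results_age = 604800
v3bw = ${coord:resultsdir}/v3bw
"""

conf = ConfigParser(interpolation=ExtendedInterpolation())
conf.read_string(CONFIG_TEXT)

# ${coord:resultsdir} pulls the value of resultsdir from the [coord] section.
v3bw_path = conf["v3bw"]["v3bw"]
max_age = conf.getint("v3bw", "max_results_age")


def is_recent_enough(start_time, now, max_results_age):
    """True if a measurement started no more than max_results_age seconds ago."""
    return now - start_time <= max_results_age
```

With the defaults shown, v3bw_path resolves to data-coord/results/v3bw, and a measurement that started more than one week (604800 seconds) ago would be left out.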