''' Helper functions for writing per-second measurement results to a file that
might rotate, as well as classes for reading those results from files later.

**Note: The information here is only partially true until pastly/flashflow#4 is
implemented and this message is removed.**

Results are "logged" via :mod:`logging` at level ``INFO``. It is important that
the user does not edit the way these messages are logged.
If the user would like to rotate the output file, e.g. with `logrotate
<https://linux.die.net/man/8/logrotate>`_, they can do that because by default
(*and this should not be changed lightly*) these "log messages" get "logged"
via a :class:`logging.handlers.WatchedFileHandler`, which handles this case.

Call :meth:`write_begin` once at the beginning of the active measurement phase.
As measurement results come in every second from measurers, call
:meth:`write_meas` for each. Likewise for per-second background traffic reports
and :meth:`write_bg`. As soon as active measurement is over, call
:meth:`write_end`.

Output is line based. Multiple measurements can take place simultaneously, in
which case per-second results from measurements of different relays can be
interleaved.

A **BEGIN** line signals the start of data for the measurement of a relay. An
**END** line signals the end. Between these lines there are zero or more result
lines for the measurement of this relay, each with a per-second result from
either a measurer measuring that relay or that relay itself reporting the
amount of background traffic it saw that second.
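
Because lines from simultaneous measurements can be mixed together, a reader
typically has to separate them by measurement ID first. The following is a
minimal sketch of that step, assuming only that every line starts with an
integer measurement ID. The helper name ``group_by_meas_id``, the second
measurement ID, and its all-``A`` fingerprint are made up for illustration and
are not part of this module.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_meas_id(lines: List[str]) -> Dict[int, List[str]]:
    # Hypothetical helper: bucket interleaved output lines by their leading
    # measurement ID, keeping each measurement's lines in file order.
    groups: Dict[int, List[str]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comment lines
        words = line.split()
        try:
            meas_id = int(words[0])
        except ValueError:
            continue  # first word is not an integer: not a measurement line
        groups[meas_id].append(line)
    return dict(groups)


# Two measurements whose lines are interleaved in the same output file.
# Measurement 58235 and its fingerprint are invented for this example.
interleaved = [
    '58234 1591979504 BEGIN B0430D21D6609459D141078C0D7758B5CA753B6F',
    '58235 1591979504 BEGIN AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA',
    '58234 1591979505 MEASR GIVEN=5059082 TRUSTED=5059082',
    '58235 1591979505 BG GIVEN=671858 TRUSTED=50960',
    '58234 1591979534 END',
    '58235 1591979534 END',
]
groups = group_by_meas_id(interleaved)
```

Each value in ``groups`` then holds one measurement's BEGIN line, its result
lines, and its END line, in the order they appeared.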

    <meas_id> <time> BEGIN <fp>

- ``meas_id``: the measurement ID for this measurement.
- ``time``: the integer unix timestamp at which active measurement began.
- ``fp``: the fingerprint of the relay this BEGIN message is for.

    58234 1591979504 BEGIN B0430D21D6609459D141078C0D7758B5CA753B6F

    <meas_id> <time> END

- ``meas_id``: the measurement ID for this measurement.
- ``time``: the integer unix timestamp at which active measurement ended.

    58234 1591979534 END

    <meas_id> <time> <is_bg> GIVEN=<given> TRUSTED=<trusted>

- ``meas_id``: the measurement ID for this measurement.
- ``time``: the integer unix timestamp at which this result was received.
- ``is_bg``: 'BG' if this result is a report from the relay on the number of
  background bytes it saw in the last second, or 'MEASR' if this is a result
  from a measurer.
- ``given``: the number of bytes reported.
- ``trusted``: if a bg report from the relay, the maximum ``given`` is trusted
  to be; or if a measurer result, then the same as ``given``.

Both ``given`` and ``trusted`` are in bytes. Yes, for measurer lines it is
redundant to specify both.

Background traffic reports from the relay include the raw actual reported value
in ``given``; if the relay is malicious and claims 8 TiB of background traffic
in the last second, you will see that here. ``trusted`` is the **max** that
``given`` can be. When reading results from this file, use ``min(given,
trusted)`` as the trusted number of background bytes this second.

    # bg report from relay, use GIVEN b/c less than TRUSTED
    58234 1591979083 BG GIVEN=744904 TRUSTED=1659029
    # bg report from relay, use TRUSTED b/c less than GIVEN
    58234 1591979042 BG GIVEN=671858 TRUSTED=50960
    # result from measurer, always trusted
    58234 1591979083 MEASR GIVEN=5059082 TRUSTED=5059082
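
The ``min(given, trusted)`` rule can be sketched as follows. This is only an
illustration: it assumes the documented ``GIVEN=``/``TRUSTED=`` key=value form
shown in the examples above, whereas the ``__str__`` methods in this module
currently emit bare integers. The helper name ``trusted_bytes`` is
hypothetical and not part of this module.

```python
from typing import Optional


def trusted_bytes(line: str) -> Optional[int]:
    # Hypothetical helper, assuming the documented key=value line format:
    #   <meas_id> <time> <BG|MEASR> GIVEN=<given> TRUSTED=<trusted>
    words = line.split()
    if len(words) != 5 or words[2] not in ('BG', 'MEASR'):
        return None
    fields = dict(w.split('=', 1) for w in words[3:] if '=' in w)
    try:
        given = int(fields['GIVEN'])
        trusted = int(fields['TRUSTED'])
    except (KeyError, ValueError):
        return None
    # Never count more than the trusted maximum, whatever the relay claims.
    return min(given, trusted)


under_limit = trusted_bytes('58234 1591979083 BG GIVEN=744904 TRUSTED=1659029')
over_limit = trusted_bytes('58234 1591979042 BG GIVEN=671858 TRUSTED=50960')
measurer = trusted_bytes('58234 1591979083 MEASR GIVEN=5059082 TRUSTED=5059082')
```

For measurer lines the two values are equal, so the same rule can be applied
to every result line without special-casing.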
'''
import logging
from statistics import median
from typing import Optional, List


log = logging.getLogger(__name__)


def _try_parse_int(s: str) -> Optional[int]:
    ''' Try to parse an integer from the given string. If impossible, return
    ``None``. '''
    try:
        return int(s)
    except (ValueError, TypeError):
        return None


def _ensure_len(lst: List[int], min_len: int):
    ''' Ensure that the given list is at least ``min_len`` items long. If it
    isn't, append zeros to the right until it is. '''
    if len(lst) < min_len:
        lst += [0] * (min_len - len(lst))


class Meas:
    ''' Accumulate ``MeasLine*`` objects into a single measurement summary.

    The first measurement line you should see is a :class:`MeasLineBegin`;
    create a :class:`Meas` object with it. Then pass each :class:`MeasLineData`
    that you encounter to either :meth:`Meas.add_measr` or :meth:`Meas.add_bg`
    based on where it came from. Finally pass the :class:`MeasLineEnd` to tell
    the object it has all the data.
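
As a rough illustration of the accumulation described above, the following
standalone sketch sums per-second byte counts into buckets indexed by seconds
since the start of the measurement and then takes the median. The function
name ``summarize`` and the plain ``(ts, nbytes)`` tuples are hypothetical
stand-ins for this class and its :class:`MeasLineData` inputs, and the numbers
are invented.

```python
from statistics import median
from typing import List, Tuple


def summarize(start_ts: int, samples: List[Tuple[int, int]]) -> float:
    # Hypothetical stand-in for the accumulation: one bucket per second,
    # indexed by how many seconds after start_ts the sample arrived.
    per_second: List[int] = []
    for ts, nbytes in samples:
        idx = ts - start_ts
        if len(per_second) < idx + 1:
            # Pad with zeros so that seconds with no data count as 0 bytes.
            per_second += [0] * (idx + 1 - len(per_second))
        per_second[idx] += nbytes
    return median(per_second)


# Two samples in second 0, none in second 1, one in second 2.
result = summarize(100, [(100, 10), (100, 5), (102, 30)])
```

Here the buckets are ``[15, 0, 30]``, so the median is 15. A second that
received no data still contributes a zero to the median.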

    Not much is done to ensure you're using this data storage class correctly.
    For example:

    - You can add more :class:`MeasLineData` after marking the end.
    - You can pass untrusted :class:`MeasLineData` from the relay to the
      :meth:`Meas.add_measr` function where they will be treated as
      trusted.
    - You can get the :meth:`Meas.result` before all data lines have been
      given.
    - You can provide data from different measurements for different
      relays.

    **You shouldn't do these things**, but you can. It's up to you to use your
    tools as prescribed.
    '''
    _begin: 'MeasLineBegin'
    _end: Optional['MeasLineEnd']
    _data: List[int]

    def __init__(self, begin: 'MeasLineBegin'):
        self._begin = begin
        self._end = None
        self._data = []

    @property
    def relay_fp(self) -> str:
        ''' The relay measured, as given in the initial :class:`MeasLineBegin`.
        '''
        return self._begin.relay_fp

    @property
    def meas_id(self) -> int:
        ''' The measurement ID, as given in the initial :class:`MeasLineBegin`.
        '''
        return self._begin.meas_id

    @property
    def start_ts(self) -> int:
        ''' The integer timestamp for when the measurement started, as given in
        the initial :class:`MeasLineBegin`. '''
        return self._begin.ts

    def _ensure_len(self, data_len: int):
        ''' Ensure we can store at least ``data_len`` items, expanding our data
        list to the right with zeros as necessary. '''
        if len(self._data) < data_len:
            self._data += [0] * (data_len - len(self._data))

    def add_measr(self, data: 'MeasLineData'):
        ''' Add a :class:`MeasLineData` to our results that came from a
        measurer.

        As it came from a measurer, we trust it entirely (and there's no
        ``trusted_bw`` member) and simply add it to the appropriate second.
        '''
        idx = data.ts - self.start_ts
        _ensure_len(self._data, idx + 1)
        self._data[idx] += data.given_bw

    def add_bg(self, data: 'MeasLineData'):
        ''' Add a :class:`MeasLineData` to our results that came from the relay
        and is regarding the amount of background traffic.

        As it came from the relay, we do not trust a ``given_bw`` that is
        greater than ``trusted_bw``. Thus we add the minimum of the two to the
        appropriate second.
        '''
        idx = data.ts - self.start_ts
        _ensure_len(self._data, idx + 1)
        assert data.trusted_bw is not None  # for mypy, bg will have this
        self._data[idx] += min(data.given_bw, data.trusted_bw)

    def set_end(self, end: 'MeasLineEnd'):
        ''' Indicate that there is no more data to be loaded into this
        :class:`Meas`. '''
        self._end = end

    def have_all_data(self) -> bool:
        ''' Check whether we have been given all the data, i.e. whether
        :meth:`Meas.set_end` has been called. '''
        return self._end is not None

    def result(self) -> float:
        ''' Calculate and return the result of this measurement: the median of
        the per-second byte totals. '''
        return median(self._data)


class MeasLine:
    ''' Parent class for other ``MeasLine*`` types. You should only ever need
    to interact with this class directly via its :meth:`MeasLine.parse` method.
    '''
    def __init__(self, meas_id: int, ts: int):
        self.meas_id = meas_id
        self.ts = ts

    def __str__(self):
        return '%d %d' % (
            self.meas_id,
            self.ts)

    @staticmethod
    def parse(s: str) -> Optional['MeasLine']:
        ''' Try to parse a MeasLine subclass from the given line ``s``. If
        impossible, return ``None``. '''
        s = s.strip()
        # ignore comment lines
        if s.startswith('#'):
            return None
        words = s.split()
        # minimum line length, in words, is 3: end lines have 3 words
        # maximum line length, in words, is 5: bg data lines have 5
        MIN_WORD_LEN = 3
        MAX_WORD_LEN = 5
        if len(words) < MIN_WORD_LEN or len(words) > MAX_WORD_LEN:
            return None
        # split off the prefix words (words common to all measurement data
        # lines).
        prefix, words = words[:2], words[2:]
        # try to convert each one, bail if unable
        meas_id = _try_parse_int(prefix[0])
        ts = _try_parse_int(prefix[1])
        if meas_id is None or ts is None:
            return None
        # now act differently based on what type of line we seem to have
        if words[0] == 'BEGIN':
            # BEGIN <fp>
            if len(words) != 2:
                return None
            fp = words[1]
            return MeasLineBegin(fp, meas_id, ts)
        elif words[0] == 'END':
            # END
            return MeasLineEnd(meas_id, ts)
        elif words[0] == 'MEASR':
            # MEASR <given>, a bare integer as written by MeasLineData
            if len(words) != 2 or _try_parse_int(words[1]) is None:
                return None
            res = _try_parse_int(words[1])
            assert isinstance(res, int)  # for mypy
            return MeasLineData(res, None, meas_id, ts)
        elif words[0] == 'BG':
            # BG <given> <trusted>, bare integers as written by MeasLineData
            if len(words) != 3 or \
                    _try_parse_int(words[1]) is None or \
                    _try_parse_int(words[2]) is None:
                return None
            given = _try_parse_int(words[1])
            trusted = _try_parse_int(words[2])
            assert isinstance(given, int)  # for mypy
            assert isinstance(trusted, int)  # for mypy
            return MeasLineData(given, trusted, meas_id, ts)
        return None


class MeasLineBegin(MeasLine):
    def __init__(self, fp: str, *a, **kw):
        super().__init__(*a, **kw)
        self.relay_fp = fp

    def __str__(self):
        prefix = super().__str__()
        return prefix + ' BEGIN ' + self.relay_fp


class MeasLineEnd(MeasLine):
    def __init__(self, *a, **kw):
        super().__init__(*a, **kw)

    def __str__(self):
        prefix = super().__str__()
        return prefix + ' END'


class MeasLineData(MeasLine):
    def __init__(self, given_bw: int, trusted_bw: Optional[int], *a, **kw):
        super().__init__(*a, **kw)
        self.given_bw = given_bw
        self.trusted_bw = trusted_bw

    def is_bg(self) -> bool:
        return self.trusted_bw is not None

    def __str__(self):
        prefix = super().__str__()
        if self.trusted_bw is None:
            # result from a measurer
            return prefix + ' MEASR %d' % (self.given_bw,)
        # result from relay
        return prefix + ' BG %d %d' % (self.given_bw, self.trusted_bw)


def write_begin(fp: str, meas_id: int, ts: int):
    ''' Write a log line indicating the start of the given relay's measurement.

    :param fp: the fingerprint of the relay
    :param meas_id: the measurement ID
    :param ts: the unix timestamp at which the measurement began
    '''
    log.info(MeasLineBegin(fp, meas_id, ts))


def write_end(meas_id: int, ts: int):
    ''' Write a log line indicating the end of the given relay's measurement.

    :param meas_id: the measurement ID
    :param ts: the unix timestamp at which the measurement ended
    '''
    log.info(MeasLineEnd(meas_id, ts))


def write_meas(meas_id: int, ts: int, res: int):
    ''' Write a single per-second result from a measurer to our results.

    :param meas_id: the measurement ID
    :param ts: the unix timestamp at which the result came in
    :param res: the number of measured bytes
    '''
    log.info(MeasLineData(res, None, meas_id, ts))


def write_bg(meas_id: int, ts: int, given: int, trusted: int):
    ''' Write a single per-second report of bg traffic from the relay to our
    results.

    :param meas_id: the measurement ID
    :param ts: the unix timestamp at which the result came in
    :param given: the number of reported bg bytes
    :param trusted: the maximum given should be (from our perspective in this
        logging code, it's fine if given is bigger than trusted)
    '''
    log.info(MeasLineData(given, trusted, meas_id, ts))
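
The write functions above only call ``log.info``, so where the lines end up
depends on how logging is configured elsewhere. The sketch below shows one way
a :class:`logging.handlers.WatchedFileHandler` could be attached so that the
output survives rotation. The logger name, the temporary output path, and the
bare ``%(message)s`` format are assumptions made for this example, not this
project's actual configuration, and the rename stands in for what a tool like
logrotate would do. ``WatchedFileHandler`` is intended for Unix-like systems.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical logger name and output path, chosen for this example only.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, 'results.log')

results_log = logging.getLogger('example.results_logger')
results_log.setLevel(logging.INFO)
results_log.propagate = False  # keep result lines out of other handlers

handler = logging.handlers.WatchedFileHandler(out_path)
# Emit only the message so each line is exactly the measurement line.
handler.setFormatter(logging.Formatter('%(message)s'))
results_log.addHandler(handler)

results_log.info(
    '58234 1591979504 BEGIN B0430D21D6609459D141078C0D7758B5CA753B6F')

# Simulate rotation: move the file away while the handler still has it open.
os.rename(out_path, out_path + '.1')

# On the next emit the handler notices the file changed and reopens out_path.
results_log.info('58234 1591979534 END')
handler.close()
results_log.removeHandler(handler)

with open(out_path + '.1') as fd:
    rotated = fd.read().splitlines()
with open(out_path) as fd:
    current = fd.read().splitlines()
```

After the rename, the BEGIN line sits in the rotated file and the END line in
a freshly created file at the original path, which is why a reader may need to
look at more than one file to reassemble a measurement.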