Network Working Group R. Barnes
Internet-Draft
Intended status: Informational B. Schneier
Expires: November 29, 2015
C. Jennings
T. Hardie
B. Trammell
C. Huitema
D. Borkmann
May 28, 2015
Confidentiality in the Face of Pervasive Surveillance: A Threat Model
and Problem Statement
draft-iab-privsec-confidentiality-threat-07
Abstract
Since the initial revelations of pervasive surveillance in 2013,
several classes of attacks on Internet communications have been
discovered. In this document we develop a threat model that
describes these attacks on Internet confidentiality. We assume an
attacker that is interested in undetected, indiscriminate
eavesdropping. The threat model is based on published, verified
attacks.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 29, 2015.
Barnes, et al. Expires November 29, 2015 [Page 1]
Internet-Draft Confidentiality Threat Model May 2015
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Terminology
3. An Idealized Passive Pervasive Attacker
3.1. Information subject to direct observation
3.2. Information useful for inference
3.3. An illustration of an ideal passive pervasive attack
3.3.1. Analysis of IP headers
3.3.2. Correlation of IP addresses to user identities
3.3.3. Monitoring messaging clients for IP address correlation
3.3.4. Retrieving IP addresses from mail headers
3.3.5. Tracking address usage with web cookies
3.3.6. Graph-based approaches to address correlation
3.3.7. Tracking of Link Layer Identifiers
4. Reported Instances of Large-Scale Attacks
5. Threat Model
5.1. Attacker Capabilities
5.2. Attacker Costs
6. Security Considerations
7. IANA Considerations
8. IAB Members at the Time of Approval
9. Acknowledgements
10. References
10.1. Normative References
10.2. Informative References
Authors' Addresses
1. Introduction
Starting in June 2013, documents released to the press by Edward
Snowden have revealed several operations undertaken by intelligence
agencies to exploit Internet communications for intelligence
purposes. These attacks were largely based on protocol
vulnerabilities that were already known to exist. The attacks were
nonetheless striking in their pervasive nature, both in terms of the
volume of Internet traffic targeted, and in terms of the diversity of
attack techniques employed.
To ensure that the Internet can be trusted by users, it is necessary
for the Internet technical community to address the vulnerabilities
exploited in these attacks [RFC7258]. The goal of this document is
to describe more precisely the threats posed by these pervasive
attacks, and based on those threats, lay out the problems that need
to be solved in order to secure the Internet in the face of those
threats.
The remainder of this document is structured as follows. In
Section 3, we describe an idealized passive pervasive attacker, one
which could completely undetectably compromise communications at
Internet scale. In Section 4, we provide a brief summary of some
attacks that have been disclosed, and use these to expand the assumed
capabilities of our idealized attacker. Note that we do not attempt
to describe all possible attacks, but focus on those which result in
undetected eavesdropping. Section 5 describes a threat model based
on these attacks, focusing on classes of attack that have not been a
focus of Internet engineering to date.
2. Terminology
This document makes extensive use of standard security and privacy
terminology; see [RFC4949] and [RFC6973]. Terms used from [RFC6973]
include Eavesdropper, Observer, Initiator, Intermediary, Recipient,
Attack (in a privacy context), Correlation, Fingerprint, Traffic
Analysis, and Identifiability (and related terms). In addition, we
use a few terms that are specific to the attacks discussed in this
document. Note especially that "passive" and "active" below do not
refer to the effort used to mount the attack; a "passive attack" is
any attack that accesses a flow but does not modify it, while an
"active attack" is any attack that modifies a flow. Some passive
attacks involve active interception and modifications of devices,
rather than simple access to the medium. The introduced terms are:
Pervasive Attack: An attack on Internet communications that makes
use of access at a large number of points in the network, or
otherwise provides the attacker with access to a large amount of
Internet traffic; see [RFC7258].
Passive Pervasive Attack: An eavesdropping attack undertaken by a
pervasive attacker, in which the packets in a traffic stream
between two endpoints are intercepted, but in which the attacker
does not modify the packets in the traffic stream between two
endpoints, modify the treatment of packets in the traffic stream
(e.g. delay, routing), or add or remove packets in the traffic
stream. Passive pervasive attacks are undetectable from the
endpoints. Equivalent to passive wiretapping as defined in
[RFC4949]; we use an alternate term here since the methods
employed are wider than those implied by the word "wiretapping",
including the active compromise of intermediate systems.
Active Pervasive Attack: An attack undertaken by a pervasive
attacker, which in addition to the elements of a passive pervasive
attack, also includes modification, addition, or removal of
packets in a traffic stream, or modification of treatment of
packets in the traffic stream. Active pervasive attacks provide
more capabilities to the attacker at the risk of possible
detection at the endpoints. Equivalent to active wiretapping as
defined in [RFC4949].
Observation: Information collected directly from communications by
an eavesdropper or observer. For example, the knowledge that
<[email protected]> sent a message to <[email protected]> via SMTP
taken from the headers of an observed SMTP message would be an
observation.
Inference: Information derived from analysis of information
collected directly from communications by an eavesdropper or
observer. For example, the knowledge that a given web page was
accessed by a given IP address, by comparing the size in octets of
measured network flow records to fingerprints derived from known
sizes of linked resources on the web servers involved, would be an
inference.
Collaborator: An entity that is a legitimate participant in a
communication, and provides information about that communication
to an attacker. Collaborators may either deliberately or
unwittingly cooperate with the attacker, in the latter case
because the attacker has subverted the collaborator through
technical, social, or other means.
Key Exfiltration: The transmission of cryptographic keying material
for an encrypted communication from a collaborator, deliberately
or unwittingly, to an attacker.
Content Exfiltration: The transmission of the content of a
communication from a collaborator, deliberately or unwittingly, to
an attacker.
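
The "Inference" example in the list above (comparing flow sizes to fingerprints of known resources) can be sketched as follows. This is a minimal illustration, not a technique taken from this document: the function names, the 5% tolerance, and the greedy matching strategy are all assumptions made for the example.

```python
def matches_fingerprint(observed_sizes, fingerprint, tolerance=0.05):
    """True if each resource size in the fingerprint is matched by a
    distinct observed flow size (greedy matching; assumed tolerance)."""
    remaining = sorted(observed_sizes)
    for expected in sorted(fingerprint):
        hit = next(
            (s for s in remaining if abs(s - expected) <= tolerance * expected),
            None,
        )
        if hit is None:
            return False
        remaining.remove(hit)  # each observed flow may be used only once
    return True


def infer_pages(observed_sizes, fingerprints, tolerance=0.05):
    """Return the pages whose fingerprint is consistent with the flows."""
    return [
        page
        for page, fingerprint in fingerprints.items()
        if matches_fingerprint(observed_sizes, fingerprint, tolerance)
    ]


# Hypothetical fingerprints: sizes in octets of the resources each page links.
fingerprints = {
    "/index.html": [5200, 48000, 130500],
    "/contact.html": [5200, 9100],
}
# Hypothetical flow sizes measured for one client address.
observed = [5310, 47100, 131000, 700]
pages = infer_pages(observed, fingerprints)
```

With these made-up numbers only "/index.html" is consistent with the observed flows. No payload is inspected at any point, so the same comparison would apply to encrypted transfers.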
3. An Idealized Passive Pervasive Attacker
In considering the threat posed by pervasive surveillance, we begin
by defining an idealized passive pervasive attacker. While this
attacker is less capable than those which we now know, from press
reports, to have compromised the Internet (as elaborated in
Section 4), it does set a lower bound on the capabilities of an
attacker that is interested in indiscriminate passive surveillance
and wants to remain undetectable. We note that, prior to the
Snowden revelations in 2013, the assumptions of attacker capability
presented here would be considered on the border of paranoia outside
the network security community.
Our idealized attacker is an indiscriminate eavesdropper on an
Internet-attached computer network that:
o can observe every packet of all communications at any hop in any
network path between an initiator and a recipient;
o can observe data at rest in any intermediate system between the
endpoints controlled by the initiator and recipient; and
o can share information with other such attackers; but
o takes no other action with respect to these communications (i.e.,
blocking, modification, injection, etc.).
The techniques available to our ideal attacker are direct
observation and inference. Direct observation involves taking
information directly from eavesdropped communications, such as
URLs identifying content or email addresses identifying
individuals from application-layer headers. Inference, on the
other hand, involves analyzing observed information to derive new
information, such as searching for application or behavioral
fingerprints in observed traffic to derive information about the
observed individual. The use of encryption is generally
sufficient to provide confidentiality by preventing direct
observation of content, assuming, of course, uncompromised
encryption implementations and cryptographic keying material.
However, encryption provides less complete protection against
inference, especially inferences based only on plaintext portions
of communications, such as IP and TCP headers for TLS-protected
traffic [RFC5246].
3.1. Information subject to direct observation
Protocols which do not encrypt their payload make the entire content
of the communication available to the idealized attacker along their
path. Following the advice in [RFC3365], most such protocols have a
secure variant which encrypts payload for confidentiality, and these
secure variants are seeing ever-wider deployment. A noteworthy
exception is DNS [RFC1035], as DNSSEC [RFC4033] does not have
confidentiality as a requirement.
This implies that, in the absence of changes to the protocol as
presently under development in the IETF's DNS Private Exchange
(DPRIVE) working group [I-D.ietf-dprive-problem-statement], all DNS
queries and answers generated by the activities of any protocol are
available to the attacker.
When store-and-forward protocols are used (e.g., SMTP [RFC5321]),
intermediaries leave message data subject to observation by an
attacker that has compromised these intermediaries, unless the data
is encrypted end-to-end by the application-layer protocol, or the
implementation uses an encrypted store for this data.
3.2. Information useful for inference
Inference is information extracted from later analysis of an observed
or eavesdropped communication, and/or correlation of observed or
eavesdropped information with information available from other
sources. Indeed, most useful inference performed by the attacker
falls under the rubric of correlation. The simplest example of this
is the observation of DNS queries and answers from and to a source
and correlating those with IP addresses with which that source
communicates. This can give access to information otherwise not
available from encrypted application payloads (e.g., the Host:
HTTP/1.1 request header when HTTP is used with TLS).
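
The DNS correlation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layouts, the function name, and the 300-second freshness window are hypothetical, and all addresses are documentation addresses.

```python
from collections import defaultdict


def label_flows_with_names(dns_answers, flows, max_age=300.0):
    """Attach the most recently resolved name to each observed flow.

    dns_answers: iterable of (time, client_ip, queried_name, answer_ip)
    flows:       iterable of (time, client_ip, dst_ip)
    """
    # (client, answer_ip) -> list of (time, name), in order of observation
    seen = defaultdict(list)
    for t, client, name, answer_ip in sorted(dns_answers):
        seen[(client, answer_ip)].append((t, name))

    labelled = []
    for t, client, dst in sorted(flows):
        name = None
        for t_dns, candidate in seen.get((client, dst), []):
            # Use only answers seen before the flow and within max_age seconds.
            if t_dns <= t and t - t_dns <= max_age:
                name = candidate
        labelled.append((t, client, dst, name))
    return labelled


dns_answers = [
    (10.0, "198.51.100.7", "www.example.org", "192.0.2.10"),
    (12.0, "198.51.100.7", "mail.example.net", "192.0.2.25"),
]
flows = [
    (10.5, "198.51.100.7", "192.0.2.10"),
    (13.0, "198.51.100.7", "192.0.2.25"),
    (900.0, "198.51.100.7", "192.0.2.10"),
]
labelled = label_flows_with_names(dns_answers, flows)
```

The first two flows are labelled with the names the client had just resolved, even though the flows themselves could be fully encrypted. The third flow falls outside the assumed freshness window and stays unlabelled.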
Protocols which encrypt their payload using an application- or
transport-layer encryption scheme (e.g. TLS) still expose all the
information in their network and transport layer headers to the
attacker, including source and destination addresses and ports.
IPsec ESP [RFC4303] further encrypts the transport-layer headers, but
still leaves IP address information unencrypted; in tunnel mode,
these addresses correspond to the tunnel endpoints. Features of the
security protocols themselves, e.g. the TLS session identifier, may
leak information that can be used for correlation and inference.
While this information is much less semantically rich than the
application payload, it can still be useful for inferring an
individual's activities.
Inference can also leverage information obtained from sources other
than direct traffic observation. Geolocation databases, for example,
have been developed that map IP addresses to a location, in order to
provide location-aware services such as targeted advertising. This
location information is often of sufficient resolution that it can be
used to draw further inferences toward identifying or profiling an
individual.
Social media provide another source of more or less publicly
accessible information. This information can be extremely
semantically rich, including information about an individual's
location, associations with other individuals and groups, and
activities. Further, this information is generally contributed and
curated voluntarily by the individuals themselves: it represents
information which the individuals are not necessarily interested in
protecting for privacy reasons. However, correlation of this social
networking data with information available from direct observation of
network traffic allows the creation of a much richer picture of an
individual's activities than either alone.
We note with some alarm that there is little that can be done at
protocol design time to limit such correlation by the attacker, and
that the existence of such data sources in many cases greatly
complicates the problem of protecting privacy by hardening protocols
alone.
3.3. An illustration of an ideal passive pervasive attack
To illustrate how capable the idealized attacker is even given its
limitations, we explore the non-anonymity of encrypted IP traffic in
this section. We examine in detail some inference techniques
for associating a set of addresses with an individual, in order to
illustrate the difficulty of defending communications against our
idealized attacker. The basic problem is that information radiated
even from protocols which have no obvious connection with personal
data can be correlated with other information to paint a very rich
behavioral picture, and it takes only one unprotected link in the
chain to associate that picture with an identity.
3.3.1. Analysis of IP headers
Internet traffic can be monitored by tapping Internet links, or by
installing monitoring tools in Internet routers. Of course, a single
link or a single router only provides access to a fraction of the
global Internet traffic. However, monitoring a number of high
capacity links or a set of routers placed at strategic locations
provides access to a good sampling of Internet traffic.
Tools like IPFIX [RFC7011] allow administrators to acquire statistics
about sequences of packets with some common properties that pass
through a network device. The most common set of properties used in
flow measurement is the "five-tuple" of source and destination
addresses, protocol type, and source and destination ports. These
statistics are commonly used for network engineering, but could
certainly be used for other purposes.
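
As a rough sketch of the flow measurement described above, packets can be grouped by five-tuple and counted. This is not an IPFIX implementation: the input layout, the FiveTuple type, and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


def aggregate_flows(packets):
    """packets: iterable of (src, dst, proto, sport, dport, length_in_octets)."""
    stats = defaultdict(lambda: {"packets": 0, "octets": 0})
    for src, dst, proto, sport, dport, length in packets:
        key = FiveTuple(src, dst, proto, sport, dport)
        stats[key]["packets"] += 1
        stats[key]["octets"] += length
    return dict(stats)


# Hypothetical packet observations (protocol 6 is TCP, 17 is UDP).
packets = [
    ("198.51.100.7", "192.0.2.10", 6, 51000, 443, 1500),
    ("198.51.100.7", "192.0.2.10", 6, 51000, 443, 400),
    ("198.51.100.7", "192.0.2.53", 17, 40000, 53, 80),
]
flow_stats = aggregate_flows(packets)
```

Only header fields and packet lengths are used, so the resulting statistics are available regardless of whether the payload is encrypted.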
Let's assume for a moment that IP addresses can be correlated to
specific services or specific users. Analysis of the sequences of
packets will quickly reveal which users use what services, and also
which users engage in peer-to-peer connections with other users.
Analysis of traffic variations over time can be used to detect
increased activity by particular users, or in the case of peer-to-
peer connections increased activity within groups of users.
3.3.2. Correlation of IP addresses to user identities
The correlation of IP addresses with specific users can be done in
various ways. For example, tools like reverse DNS lookup can be used
to retrieve the DNS names of servers. Since the addresses of servers
tend to be quite stable and since servers are relatively less
numerous than users, an attacker could easily maintain its own copy
of the DNS for well-known or popular servers, to accelerate such
lookups.
On the other hand, the reverse lookup of IP addresses of users is
generally less informative. For example, a lookup of the address
currently used by one author's home network returns a name of the
form "c-192-000-002-033.hsd1.wa.comcast.net". This particular type
of reverse DNS lookup generally reveals only coarse-grained location
or provider information, equivalent to that available from
geolocation databases.
In many jurisdictions, Internet Service Providers (ISPs) are required
to provide identification on a case by case basis of the "owner" of a
specific IP address for law enforcement purposes. This is a
reasonably expedient process for targeted investigations, but
pervasive surveillance requires something more efficient. This
provides an incentive for the attacker to secure the cooperation of
the ISP in order to automate this correlation.
3.3.3. Monitoring messaging clients for IP address correlation
Even if the ISP does not cooperate, user identity can often be
obtained via inference. POP3 [RFC1939] and IMAP [RFC3501] are used
to retrieve mail from mail servers, while a variant of SMTP is used
to submit messages through mail servers. IMAP connections originate
from the client, and typically start with an authentication exchange
in which the client proves its identity by answering a password
challenge. The same holds for SIP [RFC3261] and for many
instant messaging services operating over the Internet using
proprietary protocols.
The username is directly observable if any of these protocols operate
in cleartext; the username can then be directly associated with the
source address.
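
A minimal sketch of this association, assuming the attacker already has the client-to-server lines of cleartext sessions paired with their source addresses. It covers only the plain IMAP LOGIN and POP3 USER commands; the function name and input layout are hypothetical, and other authentication mechanisms are not handled.

```python
import re

# IMAP: "<tag> LOGIN <user> <password>"; POP3: "USER <name>".
IMAP_LOGIN = re.compile(r'^\S+\s+LOGIN\s+"?([^"\s]+)"?\s', re.IGNORECASE)
POP3_USER = re.compile(r'^USER\s+(\S+)\s*$', re.IGNORECASE)


def usernames_by_source(observed_lines):
    """observed_lines: iterable of (source_ip, client_line) pairs."""
    mapping = {}
    for source_ip, line in observed_lines:
        match = IMAP_LOGIN.match(line) or POP3_USER.match(line)
        if match:
            mapping.setdefault(source_ip, set()).add(match.group(1))
    return mapping


# Hypothetical observations from two cleartext sessions.
observed_lines = [
    ("198.51.100.7", "a001 LOGIN alice s3cret"),
    ("198.51.100.7", "a002 SELECT INBOX"),
    ("203.0.113.9", "USER bob"),
]
usernames = usernames_by_source(observed_lines)
```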
3.3.4. Retrieving IP addresses from mail headers
SMTP [RFC5321] requires that each successive SMTP relay adds a
"Received" header to the mail headers. The purpose of these headers
is to enable audit of mail transmission, and perhaps to distinguish
between regular mail and spam. Here is an extract from the headers
of a message recently received from the "perpass" mailing list:
"Received: from 192-000-002-044.zone13.example.org (HELO
?192.168.1.100?) (xxx.xxx.xxx.xxx) by lvps192-000-002-219.example.net
with ESMTPSA (DHE-RSA-AES256-SHA encrypted, authenticated); 27 Oct
2013 21:47:14 +0100 Message-ID: <[email protected]> Date:
Sun, 27 Oct 2013 20:47:14 +0000 From: Some One <[email protected]>
"
This is the first "Received" header attached to the message by the
first SMTP relay; for privacy reasons, the field values have been
anonymized. We learn here that the message was submitted by "Some
One" on October 27, from a host behind a NAT (192.168.1.100)
[RFC1918] that used the IP address 192.0.2.44. The information
remained in the message, and is accessible by all recipients of the
"perpass" mailing list, or indeed by any attacker that sees at least
one copy of the message.
An attacker that can observe sufficient email traffic can regularly
update the mapping between public IP addresses and individual email
identities. Even if the SMTP traffic was encrypted on submission and
relaying, the attacker can still receive a copy of public mailing
lists like "perpass".
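
The extraction described in this section can be sketched with Python's standard email package. The message below is a hypothetical stand-in modeled on the extract above, with made-up host names and sender address; the function name and the simple IPv4 pattern are also assumptions.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Loose pattern for dotted-quad IPv4 literals; it does not validate octets.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sender_addresses(raw_message):
    """Return (sender address, IPv4 literals in the first relay's header)."""
    msg = message_from_string(raw_message)
    _, sender = parseaddr(msg.get("From", ""))
    received = msg.get_all("Received") or []
    # Relays prepend their header, so the last one listed is the first hop.
    first_hop = received[-1] if received else ""
    return sender, IPV4.findall(first_hop)


raw = (
    "Received: from client.example.org (HELO ?192.168.1.100?) (192.0.2.44)\n"
    " by relay.example.net with ESMTPSA; 27 Oct 2013 21:47:14 +0100\n"
    "From: Some One <some.one@example.org>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
sender, addresses = sender_addresses(raw)
```

Repeating this over many messages yields the regularly updated mapping between email identities and public IP addresses described above.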
3.3.5. Tracking address usage with web cookies
Many web sites only encrypt a small fraction of their transactions.
A popular pattern is to use HTTPS for the login information, and then
use a "cookie" to associate following clear-text transactions with
the user's identity. Cookies are also used by various advertisement
services to quickly identify the users and serve them with
"personalized" advertisements. Such cookies are particularly useful
if the advertisement services want to keep tracking the user across
multiple sessions that may use different IP addresses.
As cookies are sent in clear text, an attacker can build a database
that associates cookies with IP addresses for non-HTTPS traffic. If
the IP address is already identified, the cookie can be linked to the
user's identity. After that, if the same cookie appears on a new IP
address, the new IP address can be immediately associated with the
previously determined identity.
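
The cookie database described above can be sketched as two dictionaries updated in a single pass over the observations. The function name, input layout, and cookie values are hypothetical, and the sketch ignores complications such as several users sharing one address behind a NAT.

```python
def track_cookies(observations, known_identities):
    """observations: iterable of (ip, cookie) pairs in order of observation.
    known_identities: {ip: identity} for addresses already identified."""
    ip_identity = dict(known_identities)
    cookie_identity = {}
    for ip, cookie in observations:
        if cookie in cookie_identity:
            # Cookie already linked: a new address inherits the identity.
            ip_identity.setdefault(ip, cookie_identity[cookie])
        elif ip in ip_identity:
            # Address already identified: link the cookie to that identity.
            cookie_identity[cookie] = ip_identity[ip]
    return ip_identity, cookie_identity


observations = [
    ("192.0.2.44", "sid=abc"),    # identified address presents a cookie
    ("203.0.113.5", "sid=abc"),   # same cookie from a new address
    ("203.0.113.5", "ad=xyz"),    # second cookie from that address
    ("198.51.100.9", "ad=xyz"),   # second cookie from a third address
]
ip_identity, cookie_identity = track_cookies(
    observations, {"192.0.2.44": "Some One"}
)
```

In this made-up trace, a single identified address is enough to attribute all three addresses and both cookies to the same individual.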
3.3.6. Graph-based approaches to address correlation
An attacker can track traffic from an IP address not yet associated
with an individual to various public services (e.g. websites, mail
servers, game servers), and exploit patterns in the observed traffic
to correlate this address with other addresses that show similar
patterns. For example, any two addresses that show connections to
the same IMAP or webmail services, the same set of favorite websites,
and game servers at similar times of day may be associated with the
same individual. Correlated addresses can then be tied to an
individual through one of the techniques above, walking the "network
graph" to expand the set of attributable traffic.
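
One simple way to score "similar patterns" is the Jaccard similarity between the sets of (service, hour of day) pairs seen for each address. The sketch below assumes that choice; the similarity measure, the 0.6 threshold, the function names, and the sample data are all illustrative assumptions and are not taken from this document.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlate_addresses(services_by_address, threshold=0.6):
    """services_by_address: {address: set of (service, hour_of_day) pairs}."""
    pairs = []
    for addr_a, addr_b in combinations(sorted(services_by_address), 2):
        score = jaccard(services_by_address[addr_a], services_by_address[addr_b])
        if score >= threshold:
            pairs.append((addr_a, addr_b, score))
    return pairs


services_by_address = {
    "192.0.2.44": {
        ("imap.example.net", 8), ("news.example.org", 8), ("game.example.com", 21),
    },
    "203.0.113.5": {
        ("imap.example.net", 8), ("news.example.org", 8), ("game.example.com", 21),
        ("cdn.example.com", 12),
    },
    "198.51.100.9": {("video.example.com", 20)},
}
candidates = correlate_addresses(services_by_address)
```

Pairs above the threshold become edges in the "network graph"; once any address in a connected group is attributed, the rest of the group can be attributed with it.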
3.3.7. Tracking of Link Layer Identifiers
Moving back down the stack, technologies like Ethernet or Wi-Fi use
MAC Addresses to identify link-level destinations. MAC Addresses
assigned according to IEEE-802 standards are globally-unique
identifiers for the device. If the link is publicly accessible, an
attacker can eavesdrop and perform tracking. For example, the
attacker can track the wireless traffic at publicly accessible Wi-Fi
networks. Simple devices can monitor the traffic, and reveal which
MAC Addresses are present. Also, devices do not need to be connected
to a network to expose link-layer identifiers. Active service
discovery always discloses the MAC address of the user, and sometimes
the SSIDs of previously visited networks. For instance, certain
techniques such as the use of "hidden SSIDs" require the mobile
device to broadcast the network identifier together with the device
identifier. This combination can further expose the user to
inference attacks, as more information can be derived from the
combination of MAC address, SSID being probed, time and current
location. For example, a device actively probing for a semi-unique
SSID on a flight out of a certain city implies that its user is no
longer at the physical location of the corresponding AP. Given that
large-scale databases of the MAC addresses of wireless access points
for geolocation purposes have been known to exist for some time, the
attacker could easily build a database linking link-layer
identifiers, time and device or user identities, and use it to track
the movement of devices and of their owners. On the other hand, if
the network does not use some form of Wi-Fi encryption, or if the
attacker can access the decrypted traffic, the analysis will also
provide the correlation between link-layer identifiers such as MAC
Addresses and IP addresses. Additional monitoring using techniques
described in the previous sections will reveal the correlation between
MAC addresses, IP addresses, and user identity. For instance,
similarly to the use of web cookies, MAC addresses provide identity
information that can be used to associate a user to different IP
addresses.
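
The movement tracking described in this section can be sketched as grouping sightings by MAC address. The sketch assumes each device keeps a stable MAC address; the function name, the record layout, the location labels, and the sample addresses are hypothetical.

```python
from collections import defaultdict


def movement_tracks(sightings):
    """sightings: iterable of (time, mac, location) tuples.
    Returns {mac: [(time, location), ...]}, one entry per change of location."""
    tracks = defaultdict(list)
    for t, mac, location in sorted(sightings):
        # Record a point only when the device appears somewhere different.
        if not tracks[mac] or tracks[mac][-1][1] != location:
            tracks[mac].append((t, location))
    return dict(tracks)


# Hypothetical sightings collected at publicly accessible Wi-Fi networks.
sightings = [
    (100, "00:00:5e:00:53:01", "cafe"),
    (160, "00:00:5e:00:53:01", "cafe"),
    (900, "00:00:5e:00:53:01", "airport"),
    (120, "00:00:5e:00:53:02", "cafe"),
]
tracks = movement_tracks(sightings)
```

Joining such tracks with any of the address-to-identity mappings from the previous sections turns a device track into a track of its owner.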
4. Reported Instances of Large-Scale Attacks
The situation in reality is more bleak than that suggested by an
analysis of our idealized attacker. Through revelations of sensitive
documents in several media outlets, the Internet community has been
made aware of several intelligence activities conducted by US and UK
national intelligence agencies, particularly the US National Security
Agency (NSA) and the UK Government Communications Headquarters
(GCHQ). These documents have revealed methods that these agencies
use to attack Internet applications and obtain sensitive user
information. There is little reason to suppose that only the US or
UK governments are involved in these sorts of activities; the
examples are just ones that were disclosed. We note that these
reports are primarily useful as an illustration of the types of
capabilities fielded by pervasive attackers as of the date of the
Snowden leaks in 2013.
First, they confirm the deployment of large-scale passive collection
of Internet traffic, and thus the existence of pervasive
passive attackers with at least the capabilities of our idealized
attacker. For example [pass1][pass2][pass3][pass4]:
o NSA's XKEYSCORE system accesses data from multiple access points
and searches for "selectors" such as email addresses, at the scale
of tens of terabytes of data per day.
o GCHQ's Tempora system appears to have access to around 1,500 major
cables passing through the UK.
o NSA's MUSCULAR program has tapped cables between data centers
belonging to major service providers.
o Several programs appear to perform wide-scale collection of
cookies in web traffic and location data from location-aware
portable devices such as smartphones.
However, the capabilities described by these reports go beyond those
of our idealized attacker. They include the compromise of
cryptographic protocols, including decryption of TLS-protected
Internet sessions [dec1][dec2][dec3]. For example, the NSA BULLRUN
project worked to undermine encryption through multiple approaches,
including covert modifications to cryptographic software on end
systems.
Reported capabilities include the direct compromise of intermediate
systems and arrangements with service providers for bulk data and
metadata access [dir1][dir2][dir3], bypassing the need to capture
traffic on the wire. For example, the NSA PRISM program provides the
agency with access to many types of user data (e.g., email, chat,
VoIP).
The reported capabilities also include elements of active pervasive
attack, including:
o Insertion of devices as a man-in-the-middle of Internet
transactions [TOR1][TOR2]. For example, NSA's QUANTUM system
appears to use several different techniques to hijack HTTP
connections, ranging from DNS response injection to HTTP 302
redirects.
o Use of implants on end systems to undermine security and anonymity
features [dec2][TOR1][TOR2]. For example, QUANTUM is used to
direct users to a FOXACID server, which in turn delivers an
implant to compromise browsers of Tor users.
o Use of implants on network elements from many major equipment
providers, including Cisco, Juniper, Huawei, Dell, and HP, as
provided by the NSA's Advanced Network Technology group
[spiegel1].
o Use of botnet-scale collections of compromised hosts [spiegel3].
The scale of the compromise extends beyond the network to include
subversion of the technical standards process itself. For example,
there is suspicion that NSA modifications to the DUAL_EC_DRBG random
number generator were made to ensure that keys generated using that
generator could be predicted by NSA. This RNG was made part of
NIST's SP 800-90A, for which NIST acknowledges NSA's assistance.
There have also been reports that the NSA paid RSA Security for a
related contract with the result that the generator became the
default in the RSA BSAFE product line.
We use the term "pervasive attack" [RFC7258] to collectively describe
these operations. The term "pervasive" is used because the attacks
are designed to indiscriminately gather as much data as possible and
to apply selective analysis on targets after the fact. This means
that all, or nearly all, Internet communications are targets for
these attacks. To achieve this scale, the attacks are physically
pervasive; they affect a large number of Internet communications.
They are pervasive in content, consuming and exploiting any
information revealed by the protocol. And they are pervasive in
technology, exploiting many different vulnerabilities in many
different protocols.
Again, it's important to note that, although the attacks mentioned
above were executed by NSA and GCHQ, there are many other
organizations that can mount pervasive surveillance attacks. Because
of the resources required to achieve pervasive scale, these attacks
are most commonly undertaken by nation-state actors. For example,
the Chinese Internet filtering system known as the "Great Firewall of
China" uses several techniques that are similar to the QUANTUM
program, and which have a high degree of pervasiveness with regard to
the Internet in China. Therefore, legal restrictions in any one
jurisdiction on pervasive monitoring activities cannot eliminate the
risk of pervasive attack to the Internet as a whole.
5. Threat Model
Given these disclosures, we must consider a broader threat model.
Pervasive surveillance aims to collect information across a large
number of Internet communications, analyzing the collected
communications to identify information of interest within individual
communications, or inferring information from correlated
communications. This analysis sometimes benefits from decryption of
encrypted communications and deanonymization of anonymized
communications. As a result, these attackers desire both access to
the bulk of Internet traffic and to the keying material required to
decrypt any traffic that has been encrypted. Even if keys are not
available, the presence of a communication and the fact that it is
encrypted may both be inputs to an analysis.
The attacks listed above highlight new avenues both for access to
traffic and for access to relevant encryption keys. They further
indicate that the scale of surveillance is sufficient to provide a
general capability to cross-correlate communications, a threat not
previously thought to be relevant at the scale of the Internet.
5.1. Attacker Capabilities
+--------------------------+-------------------------------------+
| Attack Class             | Capability                          |
+--------------------------+-------------------------------------+
| Passive observation      | Directly capture data in transit    |
|                          |                                     |
| Passive inference        | Infer from reduced/encrypted data   |
|                          |                                     |
| Active                   | Manipulate / inject data in transit |
|                          |                                     |
| Static key exfiltration  | Obtain key material once / rarely   |
|                          |                                     |
| Dynamic key exfiltration | Obtain per-session key material     |
|                          |                                     |
| Content exfiltration     | Access data at rest                 |
+--------------------------+-------------------------------------+
Security analyses of Internet protocols commonly consider two classes
of attacker: Passive pervasive attackers, who can simply listen in on
communications as they transit the network, and active pervasive
attackers, who can modify or delete packets in addition to simply
collecting them.
In the context of pervasive passive surveillance, these attacks take
on an even greater significance. In the past, these attackers were
often assumed to operate near the edge of the network, where attacks
can be simpler. For example, in some LANs, it is simple for any node
to engage in passive listening to other nodes' traffic or inject
packets to accomplish active pervasive attacks. However, as we now
know, both passive and active pervasive attacks are undertaken by
pervasive attackers closer to the core of the network, greatly
expanding the scope and capability of the attacker.
Eavesdropping and observation at a larger scale make passive
inference attacks easier to carry out: a passive pervasive attacker
with access to a large portion of the Internet can analyze collected
traffic to create a much more detailed view of individual behavior
than an attacker that collects at a single point. Even the usual
claim that encryption defeats passive pervasive attackers is
weakened, since a pervasive flow access attacker can infer
relationships from correlations over large numbers of sessions, e.g.,
pairing encrypted sessions with unencrypted sessions from the same
host, or performing traffic fingerprinting between known and unknown
encrypted sessions. Reports on the NSA XKEYSCORE system indicate
that it is an example of such an attacker.
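
As a rough illustration of the correlation described above, the
following Python sketch pairs encrypted sessions with identifiers
seen in unencrypted sessions from the same host. The flow-record
layout, the addresses, and the function name correlate() are
assumptions made for this example; they are not drawn from any
reported system.

```python
from collections import defaultdict

# Hypothetical flow records as a passive observer might log them:
# (source address, destination, encrypted?, cleartext identifier or None)
flows = [
    ("192.0.2.10", "mail.example.net", False, "alice@example.net"),
    ("192.0.2.10", "vpn.example.org", True, None),
    ("192.0.2.77", "chat.example.com", True, None),
]


def correlate(flows):
    """Attach identifiers seen in cleartext to encrypted flows from the same host."""
    identifiers = defaultdict(set)
    for src, _dst, encrypted, ident in flows:
        if not encrypted and ident is not None:
            identifiers[src].add(ident)

    inferred = []
    for src, dst, encrypted, _ident in flows:
        if encrypted:
            # The payload stays opaque, but the host's other traffic labels it.
            inferred.append((dst, sorted(identifiers.get(src, ()))))
    return inferred


result = correlate(flows)
```

In this toy data set, the encrypted session to vpn.example.org is
attributed to an identifier without any decryption, while the
session from a host that sent nothing in the clear remains
unlabeled.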
An active pervasive attacker likewise has capabilities beyond those
of a localized active attacker. Flow modification attacks are often
limited by network topology, for example by a requirement that the
attacker be able to see a targeted session as well as inject packets
into it. A pervasive flow modification attacker with access at
multiple points within the core of the Internet is able to overcome
these topological limitations and perform attacks over a much broader
scope. Being positioned in the core of the network rather than the
edge can also enable an active pervasive attacker to reroute targeted
traffic, amplifying the ability to perform both eavesdropping and
traffic injection. Active pervasive attackers can also benefit from
passive pervasive collection to identify vulnerable hosts.
While not directly related to pervasiveness, attackers that are in a
position to mount an active pervasive attack are also often in a
position to subvert authentication, a traditional protection against
such attacks. Authentication in the Internet is often achieved via
trusted third party authorities such as the Certificate Authorities
(CAs) that provide web sites with authentication credentials. An
attacker with sufficient resources may also be able to induce an
authority to grant credentials for an identity of the attacker's
choosing. If the parties to a communication will trust multiple
authorities to certify a specific identity, this attack may be
mounted by suborning any one of the authorities (the proverbial
"weakest link"). Subversion of authorities in this way can allow an
active attack to succeed in spite of an authentication check.
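
The "weakest link" property can be sketched as follows. This is a
deliberately simplified model, not a description of any real
certificate validation library: the authority names, the accepts()
check, and the assumption that authorities are suborned
independently with equal probability are all hypothetical.

```python
# Hypothetical trust store: the authorities a relying party accepts.
TRUSTED_AUTHORITIES = frozenset({"CA-1", "CA-2", "CA-3"})


def accepts(subject, issuer, expected_subject, trusted=TRUSTED_AUTHORITIES):
    """Toy check: the name matches and ANY trusted authority issued it."""
    return subject == expected_subject and issuer in trusted


def attack_succeeds(suborned, target="www.example.com"):
    """True if at least one suborned authority is in the trust store."""
    return any(accepts(target, ca, target) for ca in suborned)


def prob_any_suborned(p, n):
    """P(at least one of n authorities is suborned), assuming each is
    suborned independently with probability p (a simplifying assumption)."""
    return 1.0 - (1.0 - p) ** n
```

Under these assumptions, a trust store of 100 authorities that are
each suborned with probability 0.01 yields a better-than-even chance
that at least one usable authority exists, which is why enlarging
the set of trusted authorities weakens the authentication check.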
Beyond these three classes (observation, inference, and active),
reports on the BULLRUN effort to defeat encryption and the PRISM
effort to obtain data from service providers suggest three more
classes of attack:
o Static key exfiltration
o Dynamic key exfiltration
o Content exfiltration
These attacks all rely on a collaborator providing the attacker with
some information, either keys or data. These attacks have not
traditionally been considered in scope for the Security
Considerations sections of IETF protocols, as they occur outside the
protocol.
The term "key exfiltration" refers to the transfer of keying material
for an encrypted communication from the collaborator to the attacker.
By "static", we mean that the transfer of keys happens once, or
rarely, typically of a long-lived key. For example, this case would
cover a web site operator that provides the private key corresponding
to its HTTPS certificate to an intelligence agency.
"Dynamic" key exfiltration, by contrast, refers to attacks in which
the collaborator delivers keying material to the attacker frequently,
e.g., on a per-session basis. This does not necessarily imply
frequent communications with the attacker; the transfer of keying
material may be virtual. For example, if an endpoint were modified
in such a way that the attacker could predict the state of its
pseudorandom number generator, then the attacker would be able to
derive per-session keys even without per-session communications.
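
A minimal sketch of this "virtual" transfer is shown below. It
assumes a toy endpoint that derives session keys from Python's
random.Random, which is not a cryptographic generator and stands in
here for a subverted one. The seed value and the session_keys()
helper are hypothetical.

```python
import hashlib
import random


def session_keys(prng_seed, count):
    """Toy endpoint: derive `count` 16-byte session keys from a seeded PRNG.

    random.Random is NOT a cryptographic generator; it stands in for a
    subverted generator whose state the attacker can predict.
    """
    prng = random.Random(prng_seed)
    keys = []
    for _ in range(count):
        raw = prng.getrandbits(128).to_bytes(16, "big")
        keys.append(hashlib.sha256(raw).digest()[:16])
    return keys


SEED = 20150529  # hypothetical state known to both endpoint and attacker
endpoint_keys = session_keys(SEED, 3)  # generated during normal operation
attacker_keys = session_keys(SEED, 3)  # recomputed offline, no messages exchanged
```

The two lists are identical, so the attacker holds every per-session
key although the collaborator never sent one.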
Finally, content exfiltration is the attack in which the collaborator
simply provides the attacker with the desired data or metadata.
Unlike the key exfiltration cases, this attack does not require the
attacker to capture the desired data as it flows through the network.
The exfiltration is of data at rest, rather than data in transit.
This increases the scope of data that the attacker can obtain, since
the attacker can access historical data - the attacker does not have
to be listening at the time the communication happens.
Exfiltration attacks can be accomplished via attacks against one of
the parties to a communication, i.e., by the attacker stealing the
keys or content rather than the party providing them willingly. In
these cases, the party may not be aware that they are collaborating,
at least at a human level. Rather, the subverted technical assets
are "collaborating" with the attacker (by providing keys/content)
without their owner's knowledge or consent.
Any party that has access to encryption keys or unencrypted data can
be a collaborator. While collaborators are typically the endpoints
of a communication (with encryption securing the links),
intermediaries in an unencrypted communication can also facilitate
content exfiltration attacks as collaborators by providing the
attacker access to those communications. For example, documents
describing the NSA PRISM program claim that NSA is able to access
user data directly from servers, where it is stored unencrypted. In
these cases, the operator of the server would be a collaborator, if
an unwitting one. By contrast, in the NSA MUSCULAR program, a set of
collaborators enabled attackers to access the cables connecting data
centers used by service providers such as Google and Yahoo. Because
communications among these data centers were not encrypted, the
collaboration by an intermediate entity allowed NSA to collect
unencrypted user data.
5.2. Attacker Costs
+--------------------------+-----------------------------------+
| Attack Class             | Cost / Risk to Attacker           |
+--------------------------+-----------------------------------+
| Passive observation      | Passive data access               |
|                          |                                   |
| Passive inference        | Passive data access + processing  |
|                          |                                   |
| Active                   | Active data access + processing   |
|                          |                                   |
| Static key exfiltration  | One-time interaction              |
|                          |                                   |
| Dynamic key exfiltration | Ongoing interaction / code change |
|                          |                                   |
| Content exfiltration     | Ongoing, bulk interaction         |
+--------------------------+-----------------------------------+
Each of the attack types discussed in the previous section entails
certain costs and risks. These costs differ by attack, and can be
helpful in guiding response to pervasive attack.
Depending on the attack, the attacker may be exposed to several types
of risk, ranging from simply losing access to arrest or prosecution.
In order for any of these negative consequences to occur, however,
the attacker must first be discovered and identified. So the primary
risk we focus on here is the risk of discovery and attribution.
A passive pervasive attack is the simplest to mount in some ways.
The base requirement is that the attacker obtain physical access to a
communications medium and extract communications from it. For
example, the attacker might tap a fiber-optic cable, acquire a mirror
port on a switch, or listen to a wireless signal. The need for these
taps to have physical access or proximity to a link exposes the
attacker to the risk that the taps will be discovered. For example,
a fiber tap or mirror port might be discovered by network operators
noticing increased attenuation in the fiber or a change in switch
configuration. Of course, passive pervasive attacks may be
accomplished with the cooperation of the network operator, in which
case there is a risk that the attacker's interactions with the
network operator will be exposed.
In many ways, the costs and risks for an active pervasive attack are
similar to those for a passive pervasive attack, with a few
additions. An active attacker requires more robust network access
than a passive attacker, since for example they will often need to
transmit data as well as receive it. In the wireless example above,
the attacker would need to act as a transmitter as well as a receiver,
greatly increasing the probability the attacker will be discovered
(e.g., using direction-finding technology). Active attacks are also
much more observable at higher layers of the network. For example,
an active attacker that attempts to use a mis-issued certificate
could be detected via Certificate Transparency [RFC6962].
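
The kind of check that makes a mis-issued certificate observable can
be sketched as below. This does not implement the log protocol of
[RFC6962]; the entry format, the issuer names, and the
unexpected_entries() helper are hypothetical stand-ins for a domain
owner comparing logged certificates against the issuers it actually
uses.

```python
# Hypothetical expectations held by a domain owner.
EXPECTED_ISSUERS = {"www.example.com": {"Example Trusted CA"}}


def unexpected_entries(log_entries, expected=EXPECTED_ISSUERS):
    """Return logged certificates for monitored names from unexpected issuers."""
    return [
        entry for entry in log_entries
        if entry["name"] in expected
        and entry["issuer"] not in expected[entry["name"]]
    ]


# Simplified stand-ins for entries read from a public log.
log_entries = [
    {"name": "www.example.com", "issuer": "Example Trusted CA"},
    {"name": "www.example.com", "issuer": "Suborned CA"},
    {"name": "other.example.org", "issuer": "Suborned CA"},
]
flagged = unexpected_entries(log_entries)
```

Only the second entry is flagged: it names a monitored domain but
comes from an issuer the owner never asked for.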
In terms of raw implementation complexity, passive pervasive attacks
require only enough processing to extract information from the
network and store it. Active pervasive attacks, by contrast, often
depend on winning race conditions to inject packets into active
connections. So active pervasive attacks in the core of the network
require processing hardware that can operate at line speed
(roughly 100Gbps to 1Tbps in the core) to identify opportunities for
attack and insert attack traffic into a high-volume traffic stream. Key
exfiltration attacks rely on passive pervasive attack for access to
encrypted data, with the collaborator providing keys to decrypt the
data. So the attacker undertakes the cost and risk of a passive
pervasive attack, as well as additional risk of discovery via the
interactions that the attacker has with the collaborator.
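
The line-speed requirement mentioned above for active attacks in the
core can be made concrete with a small calculation. The sketch
assumes Ethernet framing with minimum-size 64-byte frames and 20
bytes of per-frame wire overhead; the constant and function names
are ours, and real traffic mixes would give a larger per-packet
budget.

```python
# Assumption: Ethernet framing. Each frame carries 20 extra bytes on the wire
# (8 bytes of preamble and start-of-frame delimiter, 12 bytes of inter-frame gap).
WIRE_OVERHEAD_BYTES = 20
MIN_FRAME_BYTES = 64


def per_packet_budget_ns(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Nanoseconds available per frame if the link is saturated."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_bps * 1e9


budget_100g = per_packet_budget_ns(100e9)  # about 6.72 ns
budget_1t = per_packet_budget_ns(1e12)     # about 0.672 ns
```

In the worst case, then, an attacker racing legitimate traffic on a
100Gbps link has only a few nanoseconds per packet in which to decide
whether to inject.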
Some active attacks are more expensive than others. For example,
active man-in-the-middle (MITM) attacks require access to one or more
points on a communication's network path that allow visibility of the
entire session and the ability to modify or drop legitimate packets
in favor of the attacker's packets. A similar but weaker form of
attack, called an active man-on-the-side (MOTS), requires access to
only part of the session. In an active MOTS attack, the attacker
need only be able to inject or modify traffic on the network element
the attacker has access to. While this may not allow for full
control of a communication session (as in an MITM attack), the
attacker can perform a number of powerful attacks, including but not
limited to: injecting packets that could terminate the session (e.g.,
TCP RST packets), sending a fake DNS reply to redirect ensuing TCP
connections to an address of the attacker's choice (i.e., winning a
"DNS response race"), and mounting an HTTP Redirect attack by
observing a TCP/HTTP connection to a target address and injecting a
TCP data packet containing an HTTP redirect. For example, the system
dubbed by researchers as China's "Great Cannon" [great-cannon] can
operate in full MITM mode to accomplish very complex attacks that can
modify content in transit, while the well-known Great Firewall of
China is a MOTS system that focuses on blocking access to certain
kinds of traffic and destinations via TCP RST packet injection.
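
The "DNS response race" used by a MOTS attacker can be modeled in a
few lines. This is a toy model under stated assumptions: the
resolver accepts the earliest response whose query ID matches, the
attacker has observed the query and can therefore copy its ID, and
the arrival times and addresses are invented for illustration.

```python
def accepted_answer(responses, query_id):
    """Toy stub resolver: accept the earliest response that matches the query ID."""
    matching = [r for r in responses if r["id"] == query_id]
    if not matching:
        return None
    return min(matching, key=lambda r: r["arrival_ms"])["addr"]


QUERY_ID = 0x1A2B  # observed on the wire by the attacker, so it can be copied

legitimate = {"id": QUERY_ID, "arrival_ms": 40.0, "addr": "198.51.100.5"}
forged = {"id": QUERY_ID, "arrival_ms": 5.0, "addr": "203.0.113.66"}

winner = accepted_answer([legitimate, forged], QUERY_ID)
```

Here the forged reply arrives first and is accepted. The attacker
never has to block the legitimate reply, which is what distinguishes
this from a full MITM position.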
In this sense, static exfiltration has a lower risk profile than