1

Structure Preserving Anonymization of Router Configuration Data

David A. Maltz, Jibin Zhan, Geoffrey Xie, Hui Zhang
Carnegie Mellon University

Gisli Hjalmtysson, Albert Greenberg, Jennifer Rexford
AT&T Labs Research

2

Why Configuration Files are Valuable

Configuration file = program loaded on each router
• Controls operation of router
• Controls interactions between routers

Configuration files allow researchers to study the details of real networks
• The problem is getting access to them
• We have developed a technique for anonymizing configuration files
• We have a proposal for how configs could be made accessible to the research community

3

Why Configuration Files are Valuable - 2

The set of configurations defines the network
• Captures many of the network’s properties
– Topology (node degree, interconnectivity)
– Policies (CoS, QoS, packet filters, reachability)
– Routing (neighbors, OSPF weights, BGP policies)
– Security (vulnerabilities, mitigations)

Only source of insight for enterprise networks
• 10K+ networks that are currently a mystery
• Interesting! 10 – 1200 routers, global scale
• Configs are the only way to look at them
– Networks firewalled, external probes dropped

4

Topology

Router 1 Config Router 2 Config

Internet

interface Serial2/1.5

ip address 1.1.1.2/30

interface Serial1/0.5

ip address 1.1.1.1/30
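The slide's point is that two interfaces whose addresses fall in the same subnet imply a link between their routers. A minimal Python sketch of that inference follows. The function name `find_links`, the router names, and the assignment of each interface to a particular router are assumptions made for illustration; the slide does not say which router owns which interface.

```python
import ipaddress
from collections import defaultdict

def find_links(router_interfaces):
    """Group interfaces by subnet; a subnet seen on two routers implies a link.

    router_interfaces: {router_name: [(interface_name, "addr/prefixlen"), ...]}
    """
    by_subnet = defaultdict(list)
    for router, interfaces in router_interfaces.items():
        for name, cidr in interfaces:
            iface = ipaddress.ip_interface(cidr)
            by_subnet[iface.network].append((router, name))
    # Keep only subnets that appear on more than one interface.
    return {net: ends for net, ends in by_subnet.items() if len(ends) > 1}

# Hypothetical assignment of the slide's two interfaces to two routers.
configs = {
    "router1": [("Serial1/0.5", "1.1.1.1/30")],
    "router2": [("Serial2/1.5", "1.1.1.2/30")],
}
links = find_links(configs)
```

Both addresses fall in 1.1.1.0/30, so the sketch reports one link between the two serial interfaces.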

5

Quality of Service

class-map GoodCustomer  

match access-group 136

policy-map GoldService

class GoodCustomer 

bandwidth 2000

queue-limit 40

class class-default  

fair-queue 16

queue-limit 20

interface Serial0/0

service-policy output GoldService

Callouts: CB-WFQ policy name, CB-WFQ parameters, class definition

6

Routing

router bgp 65501

neighbor EdgeSwitch peer-group

neighbor EdgeSwitch remote-as 64740

neighbor EdgeSwitch distribute-list 11 in

neighbor EdgeSwitch route-map exportRoutes out

neighbor 192.168.96.8 peer-group EdgeSwitch

neighbor 192.168.96.9 peer-group EdgeSwitch

neighbor 10.217.248.14 remote-as 65500

neighbor 10.217.248.14 ebgp-multihop 5

AS Numbers

Policies

Peers

7

Security Issues

access-list 143 deny 53 any any
access-list 143 deny 55 any any
access-list 143 deny 77 any any
access-list 143 permit ip any any

interface Serial0.2 multipoint
 ip access-group 143 in
 ip address 66.248.162.13 255.255.255.224
interface Ethernet0
 ip address 144.201.41.59 255.255.255.0

Access list 143: drops packets that can attack Cisco interfaces

Serial0.2 applies access list 143 inbound: this interface is safe

Ethernet0 has no access list: this interface is not
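The check on this slide can be automated once the configs are available. The sketch below is one possible approach, not the authors' tool: the function name `unfiltered_interfaces` is hypothetical, and it assumes IOS-style layout where interface sub-commands are indented under an `interface` line.

```python
import re

def unfiltered_interfaces(config_text):
    """Return interfaces that have an IP address but no inbound access-group."""
    blocks = {}      # interface name -> list of its sub-commands
    current = None
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("interface "):
            current = stripped.split()[1]
            blocks[current] = []
        elif line.startswith(" ") and current is not None:
            blocks[current].append(stripped)   # indented line belongs to the interface
        else:
            current = None                     # any other top-level line ends the block
    return [
        name for name, cmds in blocks.items()
        if any(c.startswith("ip address ") for c in cmds)
        and not any(re.match(r"ip access-group \S+ in$", c) for c in cmds)
    ]

SAMPLE = """\
access-list 143 permit ip any any
interface Serial0.2 multipoint
 ip access-group 143 in
 ip address 66.248.162.13 255.255.255.224
interface Ethernet0
 ip address 144.201.41.59 255.255.255.0
"""
exposed = unfiltered_interfaces(SAMPLE)
```

The sketch only tests whether some inbound list is applied; judging whether the list actually blocks the dangerous protocols would need a further step.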

8

How to Get Configuration Files?

Considered proprietary secrets of network owners
• Discloses business strategy
• Discloses vulnerabilities

Anonymization breaks tie between data and owner
• Anonymized configs will show some network is vulnerable, but which/where to attack?

We developed a method for anonymizing configuration files
• Approach convinced some customers of AT&T to disclose their configs to CMU researchers

9

Anonymization Challenges

We don’t know the intended use of the data
• Must anonymize entire configuration file
• A customized data set is easier to anonymize

Must preserve structure of information in files
• Relationships of identifiers inside/between files
• IP address subnet relationships

Traditional parsing tools are of no use
• No published grammar for Cisco IOS
• 200+ different versions seen in 31 networks

10

Anonymize Non-numeric Tokens

Created “pass list” of words by string-scraping Cisco’s web pages
• Contains most IOS commands
• Other words are generic networking terms (“IETF”)

All tokens not in pass list are hashed with salted SHA1

Before:
router bgp 64780
 redistribute ospf 64 match route-map NYOffice
 neighbor 1.2.3.4 remote-as 701
route-map NYOffice deny 10
 match ip address 4

After:
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor 66.253.160.68 remote-as 701
route-map 8aTzlvBrbaW deny 10
 match ip address 4
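A minimal Python sketch of the pass-list idea follows. The pass list here is a tiny hand-written subset, and the salt, the hash truncation, and the function names are all assumptions for illustration; the real pass list was scraped from Cisco's web pages, and the slide does not say how the digest is encoded.

```python
import hashlib
import re

# Tiny illustrative subset; the real pass list is far larger.
PASS_LIST = {"router", "bgp", "redistribute", "ospf", "match", "route-map",
             "neighbor", "remote-as", "deny", "ip", "address"}
SALT = b"per-network-secret"   # hypothetical salt, kept private by the data owner

def hash_token(token, salt=SALT):
    """Salted SHA1 of a token, truncated for readability (truncation is an assumption)."""
    digest = hashlib.sha1(salt + token.encode("utf-8")).hexdigest()
    return "h" + digest[:10]   # leading letter keeps the result non-numeric

def anonymize_tokens(line, pass_list=PASS_LIST):
    """Hash every token that contains a letter and is not on the pass list."""
    def replace(match):
        token = match.group(0)
        if token.lower() in pass_list or not re.search(r"[A-Za-z]", token):
            return token       # known-safe word, or a number / IP address
        return hash_token(token)
    return re.sub(r"\S+", replace, line)

first = anonymize_tokens(" redistribute ospf 64 match route-map NYOffice")
second = anonymize_tokens("route-map NYOffice deny 10")
```

Because the hash is deterministic for a given salt, `NYOffice` receives the same replacement in both lines, which is what keeps the reference from the `redistribute` statement to the route-map definition intact.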

11

Anonymize Specific Numbers

Most numbers are harmless, some reveal identity
• Public AS numbers
• Phone numbers (NOCs, backup modems)

26 rules used to find and anonymize context-dependent items
• "neighbor\\s+$ipAddrPatt\\s+remote-as"
• " neighbor\s+\w+\s+remote-as "

Before:
router bgp 64780
 redistribute ospf 64 match route-map NYOffice
 neighbor 1.2.3.4 remote-as 701
route-map NYOffice deny 10
 match ip address 4

After:
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor 66.253.160.68 remote-as 1237
route-map 8aTzlvBrbaW deny 10
 match ip address 4
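The two rules quoted above can be sketched in Python as follows. Only those two patterns come from the slide; the `AsnMapper` class, the keyed-hash mapping, the key, and the restriction to 16-bit AS numbers are assumptions chosen to keep the example short.

```python
import hashlib
import hmac
import re

PRIVATE_ASNS = range(64512, 65536)   # private 16-bit AS numbers are left unchanged
KEY = b"per-dataset-secret"          # hypothetical key held by the data owner

class AsnMapper:
    """Maps each public AS number to a distinct public AS number (sketch)."""
    def __init__(self, key=KEY):
        self.key = key
        self.forward = {}
        self.used = set()

    def map(self, asn):
        if asn in PRIVATE_ASNS:
            return asn
        if asn not in self.forward:
            digest = hmac.new(self.key, str(asn).encode(), hashlib.sha1).digest()
            candidate = int.from_bytes(digest[:4], "big") % 64511 + 1   # 1..64511
            while candidate in self.used:          # linear probing on collision
                candidate = candidate % 64511 + 1
            self.used.add(candidate)
            self.forward[asn] = candidate
        return self.forward[asn]

IP_ADDR = r"\d{1,3}(?:\.\d{1,3}){3}"
RULES = [
    re.compile(rf"(neighbor\s+{IP_ADDR}\s+remote-as\s+)(\d+)"),
    re.compile(r"(neighbor\s+\w+\s+remote-as\s+)(\d+)"),
]

def anonymize_asns(line, mapper):
    """Apply the first rule that matches and rewrite the AS number it captures."""
    for rule in RULES:
        new_line, count = rule.subn(
            lambda m: m.group(1) + str(mapper.map(int(m.group(2)))), line)
        if count:
            return new_line
    return line

mapper = AsnMapper()
out = anonymize_asns(" neighbor 1.2.3.4 remote-as 701", mapper)
```

Anchoring each rule on its surrounding keywords is what lets the anonymizer rewrite `701` here while leaving harmless numbers such as `deny 10` alone.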

12

Limits of Anonymization

Anonymization is a lossy process
• Comments & meaningful identifiers removed
• (Were they right anyway?)

Anonymizer preserves relationships it knows about
• Doesn’t know about IP addr <-> ASN mapping
• A packet filter, based on IP address, and route policy, based on ASN, could target same AS
• Post-anonymization: both mechanisms preserved, but won’t show them targeting same AS
• (Router didn’t have that external information either)

13

Potential Vulnerabilities: Textual Attacks

Identifying information left in configs

Heuristics used as double-check
• Rules that anonymize public AS numbers record the public AS numbers they find
• Search post-anonymization file for any remaining occurrences
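The double-check described above amounts to a search of the anonymized output for every AS number the rules recorded. A minimal Python sketch follows; `find_leaks` is a hypothetical name, and the standalone-number matching is an assumption about what counts as an occurrence.

```python
import re

def find_leaks(anonymized_text, recorded_asns):
    """Return {asn: [line numbers]} for recorded public AS numbers that still
    appear as standalone numbers in the anonymized text."""
    leaks = {}
    lines = anonymized_text.splitlines()
    for asn in sorted(recorded_asns):
        # Not preceded or followed by a digit or dot, so 1701 and 10.701.x do not match.
        pattern = re.compile(rf"(?<![\d.]){asn}(?![\d.])")
        hits = [n for n, line in enumerate(lines, 1) if pattern.search(line)]
        if hits:
            leaks[asn] = hits
    return leaks

text = (
    "router bgp 64780\n"
    " neighbor 66.253.160.68 remote-as 1237\n"
    " bgp confederation peers 701\n"
    "access-list 1701 permit ip any any\n"
)
leaks = find_leaks(text, {701})
```

A hit is only a candidate leak: the same digits can occur as a harmless sequence number, so flagged lines still need a manual look.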

14

Potential Vulnerabilities: Fingerprinting Attacks

Network characteristics (fingerprint) extracted from anonymized configs matched against public data

Potential fingerprints
• BGP community strings
• Number of POPs, number of BGP peers
• Structure of address space utilization
• Others…

Evaluation still in progress
• Seems like backbone networks are identifiable
• Seems like enterprise networks are not

15

A Clearinghouse for Configuration Data

Website enforcing single-blind methodology

[Diagram: workflow between network owners, the clearinghouse site, and researchers]

Network owners: Retrieve anonymizer → Anonymize & test configs → Upload configs

Researchers: Register with site → Retrieve configs → Analyze data

Run tools on site: scalable, pictures

Questions and results are exchanged via blinded email

Boot-strap with configs from academic/research institutions?

16

Questions?

17

Fingerprinting Attacks

1. For each anonymized network, compute fingerprint from anonymized config files
• Will be 100% accurate

2. Experimentally measure real networks

[Plot: BGP peers per POP vs. POPs (sorted by peers/POP). Data from networks in repository of anonymized configs]
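The plotted fingerprint is the number of BGP peers at each POP, sorted by size. A minimal Python sketch of computing it is below. How routers are grouped into POPs is not stated on the slide, so the `router_pop` mapping, the session list, and the function name are assumed inputs used only for illustration.

```python
from collections import Counter

def peers_per_pop_fingerprint(router_pop, bgp_sessions):
    """Sorted BGP-peer counts per POP.

    router_pop:   {router_name: pop_name}            (assumed to be known)
    bgp_sessions: iterable of (router_name, peer_id) (one entry per external peer)
    """
    counts = Counter({pop: 0 for pop in set(router_pop.values())})
    for router, _peer in bgp_sessions:
        counts[router_pop[router]] += 1
    return sorted(counts.values(), reverse=True)

# Hypothetical data: three POPs, four peerings.
router_pop = {"r1": "A", "r2": "A", "r3": "B", "r4": "C"}
sessions = [("r1", "p1"), ("r2", "p2"), ("r2", "p3"), ("r3", "p4")]
fingerprint = peers_per_pop_fingerprint(router_pop, sessions)
```

Because the vector uses only counts, anonymizing names and addresses leaves it unchanged, which is why it can be compared against externally measured networks.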

18

Fingerprinting Attacks

Evaluation still in progress
• Seems like backbone networks are identifiable
• Seems like enterprise networks are not

[Plot: BGP peers per POP vs. POPs (sorted by peers/POP). Measured network characteristics]

19

Anonymize Regular Expressions

Some AS numbers appear in regular expressions
• Expressions w/ only private AS numbers → no change
• Expressions w/ public AS numbers → expand and anonymize

Public AS numbers:
ip as-path access-list 101 permit _70[1-3]_
expands to 701, 702, 703
anonymized to 1234, 543, 21
ip as-path access-list 101 permit _(1234|543|21)_

Private AS numbers:
ip as-path access-list 99 permit _6451[2-9]_
expands to 64512, 64513, … 64519
ip as-path access-list 99 permit _6451[2-9]_
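The expand-and-anonymize step can be sketched by brute force: test every 16-bit AS number against the expression, then rebuild it as an alternation of mapped values. The Python sketch below makes several assumptions: it handles only expressions of the form `_<fragment>_`, it treats the fragment as a Python regular expression (Cisco's regex dialect differs in places), and the function names and the mapping table are illustrative.

```python
import re

def expand_asn_regex(fragment):
    """List the 16-bit AS numbers that a fragment such as '70[1-3]' matches."""
    pattern = re.compile(fragment)
    return [asn for asn in range(1, 65536) if pattern.fullmatch(str(asn))]

def is_private(asn):
    return 64512 <= asn <= 65535

def anonymize_asn_regex(expr, asn_map):
    """Rewrite '_<fragment>_' using asn_map (public ASN -> anonymized ASN)."""
    asns = expand_asn_regex(expr.strip("_"))
    if all(is_private(a) for a in asns):
        return expr                                   # only private ASNs: no change
    mapped = [a if is_private(a) else asn_map[a] for a in asns]
    return "_(" + "|".join(str(a) for a in mapped) + ")_"

# The mapping from the slide's example.
asn_map = {701: 1234, 702: 543, 703: 21}
public_case = anonymize_asn_regex("_70[1-3]_", asn_map)
private_case = anonymize_asn_regex("_6451[2-9]_", asn_map)
```

Expansion is needed because the anonymized AS numbers no longer share digits, so a character class such as `[1-3]` cannot describe them.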

20

Anonymize IP Addresses

Extended Minshall’s prefix-preserving algorithm

Made it class preserving
• Class A to Class A, etc.
– RIP and older protocols are classful

Made it “subnet address” preserving
• Assume 128.2.0.0/16 is subnet
• We want 128.2.0.0 → 150.7.0.0
• Before extension, 128.2.0.0 → 150.7.43.66
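To make the three properties concrete, here is a minimal Python sketch. It is not Minshall's algorithm or the authors' extension: it substitutes a simple keyed per-bit flip, where each output bit depends only on the input bits before it, which is enough to give prefix preservation. The key, the function names, and the requirement that the caller supply the prefix length for subnet addresses are all assumptions.

```python
import hashlib
import hmac
import ipaddress

KEY = b"per-dataset-secret"   # hypothetical key held by the data owner

def _flip_bit(prefix_bits):
    """Keyed pseudo-random bit that depends only on the preceding prefix bits."""
    digest = hmac.new(KEY, prefix_bits.encode("ascii"), hashlib.sha1).digest()
    return digest[0] & 1

def _class_bits(bits):
    """Leading bits that fix the classful class (A=1, B=2, C=3, D/E=4)."""
    leading_ones = len(bits) - len(bits.lstrip("1"))
    return min(leading_ones + 1, 4)

def anonymize_ip(addr):
    """Prefix-preserving and class-preserving mapping of one IPv4 address."""
    bits = format(int(ipaddress.IPv4Address(addr)), "032b")
    keep = _class_bits(bits)
    out = []
    for i, bit in enumerate(bits):
        if i < keep:
            out.append(bit)                                   # class bits stay as-is
        else:
            out.append(str(int(bit) ^ _flip_bit(bits[:i])))   # flip decided by the prefix
    return str(ipaddress.IPv4Address(int("".join(out), 2)))

def anonymize_subnet(cidr):
    """Map a subnet address so that its host bits remain zero."""
    net = ipaddress.ip_network(cidr)
    mapped = int(ipaddress.IPv4Address(anonymize_ip(str(net.network_address))))
    return f"{ipaddress.IPv4Address(mapped & int(net.netmask))}/{net.prefixlen}"

host_a = anonymize_ip("128.2.1.5")
host_b = anonymize_ip("128.2.200.9")
subnet = anonymize_subnet("128.2.0.0/16")
```

Two hosts in 128.2.0.0/16 map to addresses that share the same first 16 bits, the class B leading bits survive, and the subnet address maps to an address ending in `.0.0`, mirroring the 128.2.0.0 → 150.7.0.0 behavior the slide asks for.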

21

Anonymize IP Addresses - 2

Made it “special address” preserving
• Multicast, private address space
• Must fix collisions in mapping function

[Flowchart: IP Addr → Special? (Y/N) → Anonymize → Special? (Y/N)]

22

Anonymization Overview

Minimize dependence on context
• If in doubt, hash it out

1. Remove all comments

2. Find all IP addresses and hash using specialized prefix-preserving anonymization

3. Hash all non-numeric tokens not known to be safe

4. Anonymize specific numeric tokens using regular expressions

5. Anonymize regular expressions appearing in configs