
Mapping Internet Sensors with Probe Response Attacks By John Bethencourt, Jason Franklin, and Mary Vernon Computer Sciences Department University of Wisconsin, Madison Published in the Proceedings of the 14th USENIX Security Symposium Presented by: Peter Matthews


Slide 1

Mapping Internet Sensors with Probe Response Attacks
By John Bethencourt, Jason Franklin, and Mary Vernon
Computer Sciences Department
University of Wisconsin, Madison

Published in the Proceedings of the 14th USENIX Security Symposium

Presented by: Peter Matthews

Outline
- Internet Sensor Networks
- Probe Response Attack
- Case Study: SANS Internet Storm Center
- Simulation Results
- Generalizing the Attack
- Countermeasures
- Related Work
- Conclusion

Internet Sensor Networks
A collection of systems which monitor portions of the Internet and produce statistics related to Internet traffic patterns and anomalies.
- Log collection and analysis centers
- Collaborative intrusion detection systems
- Honeypots / honeynets
- Network telescopes / darknets

Internet Sensor Networks
Internet sensors are useful for distributed intrusion detection and monitoring such as:
- Quickly detecting worm outbreaks and new attacks before a large number of vulnerable systems are compromised
- Providing useful aggregation of the occurrence of relatively rare events
- Determining the prevalence of malicious activity like port scans, DoS attacks, and botnets
- Blacklisting hosts controlled by malicious users

Sensor Privacy is Critical
The integrity of an Internet sensor network is based upon the assumption that the IP addresses of systems that serve as sensors are secret.
If the addresses were known, information obtained by the network could no longer be considered an accurate picture of Internet activity because attackers could:
- Avoid the addresses in malicious activity like port scanning and worm spreading, allowing it to go undetected
- Skew sensor statistics to hide malicious activity
- Flood the addresses with extraneous activity, causing valid information to be lost in noise
Damage is permanent
- Organizations cannot easily change the IP addresses available to them
- Internet sensor networks cannot arbitrarily pick who will participate

General Attack Idea
- Probe an IP address with activity that will be reported if the address is monitored
- Wait for next report to be published, check for the activity, and decide whether the address was monitored
- Repeat for every IP address

Making the Problem Tractable
There are simply too many addresses to check sequentially
- ~2.1 billion valid, routable IP addresses
- Most logs only submitted to the ISC hourly
So, check many in parallel
- Very small fraction of IP addresses are monitored, so send same probe to many addresses
  - If no activity is reported then can rule out all of them
  - Else, report provides the number of monitored addresses
- Since activity reported by port, send probes with different ports to run many independent tests at the same time
Divide and conquer
- Partition the IP space into search intervals to manage this parallelism
- Only one TCP packet necessary for each probe
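The parallel test described above can be sketched as a small simulation. This is a minimal illustration, not the authors' code: the names `split_interval`, `simulated_port_report`, and `probe_round` are hypothetical, addresses are plain integers, and the published port report is replaced by a stand-in function that counts one report per monitored address probed.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) range of integer addresses


def split_interval(start: int, end: int, parts: int) -> List[Interval]:
    """Split [start, end) into `parts` near-equal half-open intervals."""
    size = end - start
    bounds = [start + (size * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


def simulated_port_report(
    intervals: List[Interval], ports: List[int], monitored: Set[int]
) -> Dict[int, int]:
    """Stand-in for the published report (assumption: one probe packet per
    address, no background noise), so each port's count equals the number of
    monitored addresses inside the interval probed on that port."""
    return {
        port: sum(1 for addr in monitored if lo <= addr < hi)
        for port, (lo, hi) in zip(ports, intervals)
    }


def probe_round(
    space: Interval, ports: List[int], monitored: Set[int]
) -> List[Tuple[Interval, int]]:
    """One parallel round: one port per interval; keep only intervals whose
    port shows activity, together with the reported sensor count."""
    intervals = split_interval(space[0], space[1], len(ports))
    report = simulated_port_report(intervals, ports, monitored)
    return [(iv, report[p]) for iv, p in zip(intervals, ports) if report[p] > 0]


# Toy address space of 100 addresses with three hidden sensors.
survivors = probe_round((0, 100), [1001, 1002, 1003, 1004], {3, 70, 75})
print(survivors)
```

In this toy run the interval probed on the second port reports nothing and is ruled out in a single step, which is the saving the slide refers to.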

Probe Response Algorithm
The basic probe response algorithm operates in two stages.
- Stage I: Probe the entire routable IP space to count the number of sensors in each search interval, Si. Drop empty search intervals.
- Stage II: Iteratively probe each remaining interval, Ri, discarding empty intervals until all individual sensors are located.

First Stage
- Begin with list of 2.1 billion valid IP addresses and n low utilization ports
- Divide IP range into n search intervals S1, S2, ..., Sn
- Send TCP SYN packet on port Pi to each address in Si
- Wait time interval and retrieve port report
- Rule out intervals corresponding to ports with no activity
[Figure: packets on ports P1, P2, P3, ..., Pn sent to search intervals S1, S2, S3, ..., Sn]

Second Stage
- Distribute ports among k remaining intervals R1, R2, ..., Rk, assigning n/k ports to each
- For each Ri
  - Divide into n/k + 1 subintervals
  - Send a probe on port Pj to each address in the jth subinterval
  - No need to probe last subinterval, as can infer number of monitored addresses it contains from total for parent interval
- Repeat second stage with non-empty subintervals until all addresses are marked as monitored or unmonitored
[Figure: subintervals probed on ports P1, P2, P3, P4]

Sample Run
[Figure: sample run]

Noise
What if other activity is present in port reports?
- For a large number of ports, there is a very low average level of activity
- Use only ports that consistently have less than k reports per time interval
- Send k packets in each probe
- Divide the number of reports by k and round down

Ports     Reports of Activity
561       5
19,364    10
41,357    15
51,959    20
56,305    25
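The second stage and the noise correction can be sketched together. This is a simplified, sequential illustration under stated assumptions: the real attack probes all subintervals of a round in parallel on different ports, whereas this sketch recurses depth-first, and `sensors_from_report`, `make_oracle`, and `locate_sensors` are hypothetical names. The oracle stands in for "send k packets per address, wait, read the port report".

```python
from typing import Callable, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) range of integer addresses


def sensors_from_report(report_count: int, k: int) -> int:
    """Noise correction from the slide: with k packets per probe and fewer
    than k background reports on the port, floor division by k recovers the
    number of monitored addresses."""
    return report_count // k


def make_oracle(monitored: Set[int], k: int, noise: int) -> Callable[[Interval], int]:
    """Simulated probe-and-wait step (assumption: constant noise < k)."""
    def count_sensors(sub: Interval) -> int:
        lo, hi = sub
        hits = sum(1 for addr in monitored if lo <= addr < hi)
        return sensors_from_report(k * hits + noise, k)
    return count_sensors


def locate_sensors(
    interval: Interval,
    total: int,
    count_sensors: Callable[[Interval], int],
    fanout: int,
) -> List[int]:
    """Subdivide an interval known to contain `total` sensors until every
    address is classified. `fanout` must be at least 2."""
    lo, hi = interval
    if total == 0:
        return []                      # empty interval: discard
    if hi - lo == total:
        return list(range(lo, hi))     # every address here is monitored
    parts = min(fanout, hi - lo)
    bounds = [lo + ((hi - lo) * i) // parts for i in range(parts + 1)]
    found: List[int] = []
    remaining = total
    for i in range(parts):
        sub = (bounds[i], bounds[i + 1])
        if i < parts - 1:
            n = count_sensors(sub)     # would use one port in a real round
            remaining -= n
        else:
            n = remaining              # last subinterval is inferred, not probed
        found += locate_sensors(sub, n, count_sensors, fanout)
    return found


oracle = make_oracle({5, 6, 40, 99}, k=5, noise=3)
print(locate_sensors((0, 100), 4, oracle, fanout=4))
```

The `else` branch is the "no need to probe last subinterval" step: its count is the parent total minus what the probed siblings reported.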

Case Study
To evaluate the threat of probe response attacks, paper analyzes the feasibility of mapping the SANS Internet Storm Center
- One of the most important examples of systems which collect, analyze, and report data from Internet sensors
- A challenging network to map
  - Over 680,000 IP addresses monitored
  - These are broadly scattered across the IP address space

ISC Sensors
- Currently, the ISC collects packet filter logs
- Logs primarily contain failed connection attempts
- More than 2,000 organizations and individuals participate
- Logs are typically uploaded on an hourly basis

ISC Reports

- The ISC publishes a number of reports and statistics
- Attack uses port reports
  - Lists the amount of activity on each destination port

Simulation Scenarios
- T1 attacker: 1.544 Mbps of upload bandwidth
- Fractional T3 attacker: 38.4 Mbps of upload bandwidth
- OC6 attacker: 384 Mbps of upload bandwidth
- Monitored locations based on accurate, if obfuscated, representation of the IP addresses monitored by the ISC

Results

Bandwidth       Set of Addresses                 Data Sent   Time to Map
OC6             ISC                              1300 GB     2 days, 22 hours
Fractional T3   ISC                              687 GB      4 days, 16 hours
T1              ISC                              440 GB      33 days, 17 hours
Fractional T3   Average cluster size 10          ~600 GB     ~2 days
Fractional T3   Totally random (no clustering)   ~860 GB     ~9 days

Supersets and Subsets
Superset of the monitored addresses
- E.g., only interested in simply avoiding detection
- If a T1 attacker allows .001 false negative rate
  - Runtime reduced from 33 days and 17 hrs to 15 days and 18 hrs
  - Misses 26 percent of the sensors
Subset of the monitored addresses
- E.g., only interested in flooding the monitored addresses with noise
- If a T3 attacker allows a .94 false positive rate
  - Runtime reduced from 112 to 78 hrs
  - Returns 3.5 million false positives

Key Simulation Results
- Probe response attacks are feasible and thus pose a real threat
- Both a real set of monitored IP addresses and various synthetic sets can be mapped in reasonable time
- Mapping is possible even with very limited resources
  - An attacker with only a DSL line could do it in ~4 months
- Time to complete only depends on upload bandwidth
  - Does not require significant state or complete TCP handshakes
  - Thus, botnet utilization would not pose a problem
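As a rough cross-check of the results table for the ISC address set, the time spent purely on sending can be estimated from the data volume and the upload rate. The sketch below assumes 1 GB = 10^9 bytes and that the link is fully used; the helper `transmit_hours` and the `SCENARIOS` list are illustrative names, and the slides do not give this breakdown themselves.

```python
# (name, upload rate in Mbps, data sent in GB, reported time to map in hours)
SCENARIOS = [
    ("OC6", 384.0, 1300.0, 2 * 24 + 22),
    ("Fractional T3", 38.4, 687.0, 4 * 24 + 16),
    ("T1", 1.544, 440.0, 33 * 24 + 17),
]


def transmit_hours(data_gb: float, mbps: float) -> float:
    """Hours needed just to push `data_gb` gigabytes through an `mbps` link.
    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s)."""
    bits = data_gb * 1e9 * 8
    return bits / (mbps * 1e6) / 3600


for name, mbps, gb, reported in SCENARIOS:
    sending = transmit_hours(gb, mbps)
    print(f"{name}: about {sending:.1f} h sending, {reported} h reported in total")
```

Under these assumptions, sending accounts for only a small part of the OC6 attacker's reported time but most of the T1 attacker's, which is consistent with slower links being limited by upload bandwidth. Attributing the remainder to waiting for report publication is an interpretation, not something the slides state.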

Generalizing the Attack
In our attack, an attacker gains information by:
- Sending probes with different destination ports to different ranges of IP addresses
- Noting for which ports activity is reported
- Using this information to determine the set of IP addresses that could have received probes
The destination port appearing in the probe sent out and later in the port reports is used by the attacker as a covert channel in a message to themselves

Covert Channels
- Many possible fields typically appearing in ISN reports are suitable for use by attackers as covert channels in a message to themselves
- Using one or more of these covert channels, an attacker can encode information about a destination IP address in a packet
Candidate fields:
- Time / date
- Source IP
- Source network
- Source port
- Destination network
- Destination port
- Protocol (TCP or UDP)
- Packet length
- Captured payload signature

Countermeasures
Existing approaches to hide sensitive report data
- Hashing, Encryption, and Permutations
  - Hash report fields such as source address to hide data but allow test for matching
  - Prefix-preserving permutations obscure source / destination host addresses while still allowing useful analysis of relationships between hosts
- Bloom Filters
  - Data structure allows for space-efficient set membership tests with a configurable false positive rate
  - Vulnerable to iterative probe response attacks
- These do not prevent probe attacks, as sufficient covert channels remain

IPv6
- Increases IP addresses from 32 bits to 128 bits
  - 2^128 ≈ 3.4 × 10^38
- Makes TCP/UDP scanning impractical
- Effective countermeasure if widely adopted
- However, accounts for a tiny fraction of the used addresses and the traffic in the public Internet
- Like egress filtering, widespread adoption is difficult to achieve

Limiting Information
Restrict the information provided in public reports in some way
- Keep reports private
  - Eliminate all public access to reports
  - Limits the usefulness of the sensor network
- Only publish most significant events
  - Still provides some useful information, but not a complete depiction of Internet conditions
  - Attackers may be able to avoid detection by keeping their levels of activity below thresholds
- Query limiting
  - Require payment, computational tasks, or CAPTCHA to perform query
  - Only slows mapping attacks

Delayed Reporting
- Retain the reports for a specified period of time before their release
  - For example, release last week's data
- Forces attacker to either wait a long period between iterations of attack or use a non-adaptive algorithm
- Delaying reports greatly reduces the usefulness of the sensor network in providing real-time notification of new attacks and other phenomena

Random Input Sampling
- Use a random sample of the logs received in a time interval in order to generate reports
- E.g., suppose an analysis center has an 80% likelihood of discarding a log it receives
- To reduce the false negative rate, the attacker would have to send multiple probes
- To reduce the false negative rate from 80% to 1%, an attacker would need a twenty-fold increase in bandwidth

Conclusion: Paper Strengths
- Presents a simple, clear, and feasible attack
- Presents a good survey of existing and proposed countermeasures to attacks on sensor anonymity
- Well written
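Returning to the random input sampling countermeasure, the twenty-fold figure can be reproduced with a short calculation. The sketch assumes each probe's log is discarded independently with the same probability, so the chance that all of m repeated probes are missed is that probability raised to the power m; `probes_needed` is a hypothetical helper name.

```python
import math


def probes_needed(discard_prob: float, target_miss: float) -> int:
    """Smallest number of repeated probes m such that discard_prob ** m is at
    most target_miss, assuming independent discards."""
    return math.ceil(math.log(target_miss) / math.log(discard_prob))


m = probes_needed(0.8, 0.01)
print(m, 0.8 ** m)    # probes required and the resulting miss probability
print(0.8 ** 20)      # miss probability with exactly twenty probes
```

With an 80% discard probability, twenty probes leave a miss probability just above 1% and twenty-one bring it just below, so the slide's twenty-fold increase is the rounded figure.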

Conclusion: Paper Weaknesses
- Countermeasures are not explored to same depth as attack
  - No simulation-based examination of countermeasure effectiveness
- Simulation may be overly simplistic
  - Unfortunate real-world effects like packet loss might have a fairly significant effect given that each step of iterative algorithm builds on results of the last

Future Work
- Non-adaptive algorithms
  - Mentioned in paper, do not require an iterative approach
- Effective countermeasures
  - Combining random sampling, query limiting, and minimal level of randomly delayed log reporting?
  - Is there simply some inevitable trade-off between utility and anonymity?
    - How to quantify
- Public-private approaches: significant events made public, the rest requiring privileged access
- An opposite, if idealistic, approach: make sensors so pervasive that anonymity is no longer required