The RIPE NCC Internet Measurement Data Repository

3 April 2010 The RIPE NCC Internet Measurement Data Repository Shane Alcock

Transcript of The RIPE NCC Internet Measurement Data Repository

Page 1: The RIPE NCC Internet Measurement Data Repository

3 April 2010

The RIPE NCC Internet Measurement Data Repository

Shane Alcock

Page 2: The RIPE NCC Internet Measurement Data Repository

© THE UNIVERSITY OF WAIKATO • TE WHARE WANANGA O WAIKATO 2

Introductions

• Research Programmer with WAND

• NOT affiliated with RIPE NCC, just speaking on their behalf

• Passive measurement

• Organise packet trace captures

• Maintainer of the WITS website

• Experienced in dealing with measurement data sets

Page 3: The RIPE NCC Internet Measurement Data Repository

Outline

• Sharing Internet datasets

• Challenges

• Case studies

• The RIPE NCC repository

• Available datasets

• Other RIPE datasets that may be added

Page 4: The RIPE NCC Internet Measurement Data Repository

Sharing Measurement Data

• Internet measurement research requires data

• Often it is difficult to collect suitable data

• Privacy

• Security

• Cost of infrastructure

• Selecting appropriate times and locations

Page 5: The RIPE NCC Internet Measurement Data Repository

Sharing Measurement Data

• Sharing data with the community is an awesome idea

• Saves time and effort

• Promotes collaboration

• Enables validation of previous results

• Encourages others to share their data as well

Page 6: The RIPE NCC Internet Measurement Data Repository

Sharing Measurement Data

• WITS – Waikato Internet Traffic Storage

• http://www.wand.net.nz/wits

• CAIDA

• http://www.caida.org/data/

• PREDICT

• https://www.predict.org/

• CRAWDAD

• http://crawdad.cs.dartmouth.edu/data.php

• NLANR

• No longer exists :(

Page 7: The RIPE NCC Internet Measurement Data Repository

Challenges

• Community awareness

• Datasets are scattered amongst multiple hosts

• Lack of publicity and detailed information about datasets

• Meta-data

• DatCat (CAIDA)

• http://www.datcat.org

• Catalogue of publicly available datasets

• Not an actual repository – data is hosted externally

• Not a comprehensive resource

Page 8: The RIPE NCC Internet Measurement Data Repository

Challenges

• Repositories often maintained by research groups

• Limited funding, therefore limited resources

• People

• Expertise

• Disk space

• Bandwidth

Page 9: The RIPE NCC Internet Measurement Data Repository

Case Study: WITS

• Maintenance is intermittent

• Maintainer has many other responsibilities

• Disk space is a huge limitation

• No room on the FTP server to put new data sets

• Adding new disks costs both money and time

• Sanitizing datasets requires even more space as we must retain the original version as well

• Bandwidth

• Cost of commercial bandwidth hinders availability of data

• Access is enabled only via KAREN (the NZ national research network)

• Fortunately, KAREN peers with many international NRENs

Page 10: The RIPE NCC Internet Measurement Data Repository

Challenges

• Permanence

• Research groups typically depend on competitive funding

• Funding runs out – repository vanishes

• Loss of data is a major issue

• No longer able to replicate and validate previous studies

Page 11: The RIPE NCC Internet Measurement Data Repository

Case Study: NLANR

• Large public archive of measurement data

• Auckland, Abilene traces (PMA)

• AMP

• US government ceased funding

• Repository no longer maintained

• Domain eventually expired

• CAIDA and WAND salvaged the data

• Traces now available on WITS

• Without intervention, the data could easily have been lost permanently

Page 12: The RIPE NCC Internet Measurement Data Repository

Challenges

• Avoiding inappropriate disclosure

• Anonymisation of sensitive information, e.g. IP addresses

• Developing policy to cover user access and agreements

• Many datasets have unique restrictions or policies

• A policy that is appropriate for one dataset may not be appropriate for another

• Personal contact information

• IP addresses

• User payload in packet traces
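
The IP address anonymisation mentioned above can be sketched with a keyed hash. This is only an illustration under stated assumptions, not the method used by WITS or RIPE NCC: the function name `pseudonymise_ipv4` and the key are hypothetical, and the mapping does not preserve network prefixes.

```python
import hashlib
import hmac
import ipaddress


def pseudonymise_ipv4(addr: str, key: bytes) -> str:
    """Map an IPv4 address to a pseudonymous IPv4 address.

    Assumption: HMAC-SHA256 keyed with a secret and truncated to 32 bits.
    The same address and key always give the same output, but prefixes
    are not preserved and distinct inputs may occasionally collide.
    """
    packed = ipaddress.IPv4Address(addr).packed        # 4 raw bytes
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))      # first 4 bytes as an address


if __name__ == "__main__":
    secret = b"replace-with-a-private-key"             # hypothetical key
    print(pseudonymise_ipv4("192.0.2.1", secret))
```

Because the mapping is deterministic for a given key, flows between the same hosts remain linkable within one dataset. Studies that depend on subnet structure would need a prefix-preserving scheme instead.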

Page 13: The RIPE NCC Internet Measurement Data Repository

Challenges

• Communication with users

• Data sharing is often not top priority for collectors

• Collection designed to suit their purposes

• Small changes to the collection process can often make the data more useful to a wider audience

• Encourage users to engage with collectors

Page 14: The RIPE NCC Internet Measurement Data Repository

Challenges

• Support

• Measurement data is complicated to deal with

• Steep learning curve

• Formats, e.g. PCAP vs ERF vs legacy DAG formats for traces

• Tools / Processing libraries

• Timezones

• Documentation of shared datasets is often poor

• User support is intermittent, again due to lack of resources
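
As an illustration of the format problem listed above, a first step when handed an unknown trace file is often to check for the libpcap magic number. The sketch below covers classic PCAP only. The function name `sniff_trace_format` is hypothetical, and it assumes that ERF and legacy DAG traces carry no file-level magic number, so they simply fall through as "unknown".

```python
import struct

# Magic numbers of the classic libpcap file format.
PCAP_MAGIC_USEC = 0xA1B2C3D4   # microsecond-resolution timestamps
PCAP_MAGIC_NSEC = 0xA1B23C4D   # nanosecond-resolution timestamps


def sniff_trace_format(header: bytes) -> str:
    """Guess the trace format from the first four bytes of a file."""
    if len(header) < 4:
        return "unknown"
    (big,) = struct.unpack(">I", header[:4])      # read as big-endian
    (little,) = struct.unpack("<I", header[:4])   # read as little-endian
    for value, order in ((big, "big-endian"), (little, "little-endian")):
        if value == PCAP_MAGIC_USEC:
            return f"pcap, {order}, microsecond timestamps"
        if value == PCAP_MAGIC_NSEC:
            return f"pcap, {order}, nanosecond timestamps"
    # Assumption: anything else (ERF, legacy DAG, ...) cannot be told apart here.
    return "unknown"
```

In practice a trace-processing library that reads several formats through one API hides this step, which is one reason links to tools matter as much as the data itself.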

Page 15: The RIPE NCC Internet Measurement Data Repository

Challenges

• Size

• Internet measurement datasets are huge

• Push modern storage technologies to the limit

• Server hosting and maintenance

Page 16: The RIPE NCC Internet Measurement Data Repository

The RIPE NCC Repository

• RIPE NCC collects a lot of measurement data already

• They want to share this data with the community

• Most is already available through various repositories

• Develop a single common and consistent platform

• Hosting

• Browsing

• Accessing and downloading data

• Open to other collectors who wish to share data

• Still under development

Page 17: The RIPE NCC Internet Measurement Data Repository

Hardware

• 2 servers – Master and back-up

• Size: 9U

• Disk: 48x 2TB on 2 controllers – 2 cold spares

• CPU: 2x Quad core Xeon L5420 2.5GHz

• Memory: 32GB

• Chassis: Chenbro RM91250
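
For a rough sense of scale, the raw capacity of one server follows directly from the disk count above. The sketch assumes decimal terabytes as quoted by disk vendors. Usable space will be lower and depends on the RAID layout and on whether the cold spares are counted among the 48 disks, neither of which the slides state.

```python
DISKS = 48          # disks per server, from the slide
DISK_TB = 2         # capacity per disk in decimal terabytes (10**12 bytes)


def raw_capacity(disks: int = DISKS, disk_tb: int = DISK_TB):
    """Return raw capacity as (decimal TB, binary TiB)."""
    total_bytes = disks * disk_tb * 10**12
    return total_bytes / 10**12, total_bytes / 2**40


if __name__ == "__main__":
    tb, tib = raw_capacity()
    print(f"{tb:.0f} TB raw, about {tib:.1f} TiB")
```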

Page 18: The RIPE NCC Internet Measurement Data Repository

Hardware

Page 19: The RIPE NCC Internet Measurement Data Repository

Features of the RIPE NCC Repository

• Longevity

• RIPE NCC does not depend on competitive research funding

• Generating and keeping Internet measurement data for ~20 years

• Long time-series data

• Much less likely that the repository will disappear

• Emphasis on mirroring rather than replacing other repositories

• Host anonymized versions of data

Page 20: The RIPE NCC Internet Measurement Data Repository

Features of the RIPE NCC Repository

• Resources

• RIPE NCC manages servers, infrastructure

• Larger repository can justify a dedicated support staff

• Experience and expertise are important

• Diversity

• Variety of datasets from different collectors

• Increased awareness of new datasets

• One user account can access many different datasets

• Self sign-up for “basic access”

Page 21: The RIPE NCC Internet Measurement Data Repository

Features of the RIPE NCC Repository

• Communication

• Bridge the gap between data collectors and users

• Raise awareness of existing data

• Gather feedback from the user community

• Develop relationships with other data collectors

• Links to useful tools and libraries for processing data

• Share expertise as well as data

Page 22: The RIPE NCC Internet Measurement Data Repository

Available Datasets

• Data collected by RIPE NCC

• RIS routing database

• Reverse DNS delegations made by RIRs

• Data from external sources

• WITS

• Ex-NLANR data

Page 23: The RIPE NCC Internet Measurement Data Repository

Routing Information Server (RIS)

• 16 route collectors peering with 600 BGP routers

• Mostly within the RIPE region

• ~100 peers provide complete routing tables

• Routes are collected and published in MRT format

• Updates every 5 minutes

• Full table dump every 8 hours

• All data collected since 2000 has been retained
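
To make the MRT format and the publication schedule above more concrete, the sketch below decodes the 12-byte MRT common header defined in RFC 6396 and floors a record's timestamp to the 5-minute update interval or the 8-hour dump interval. The function names are hypothetical, and the alignment of intervals to UTC midnight is an assumption rather than something stated on the slide.

```python
import struct
from datetime import datetime, timezone

# A few message types from the MRT specification (RFC 6396).
MRT_TYPES = {12: "TABLE_DUMP", 13: "TABLE_DUMP_V2", 16: "BGP4MP", 17: "BGP4MP_ET"}

UPDATE_INTERVAL = 5 * 60        # updates published every 5 minutes
DUMP_INTERVAL = 8 * 60 * 60     # full table dump every 8 hours


def parse_mrt_header(buf: bytes):
    """Decode the MRT common header: timestamp, type, subtype, length."""
    if len(buf) < 12:
        raise ValueError("MRT common header needs 12 bytes")
    timestamp, mrt_type, subtype, length = struct.unpack("!IHHI", buf[:12])
    return timestamp, MRT_TYPES.get(mrt_type, f"UNKNOWN({mrt_type})"), subtype, length


def slot_start(timestamp: int, interval: int) -> datetime:
    """Floor a Unix timestamp to the start of its interval (assumed UTC-aligned)."""
    return datetime.fromtimestamp(timestamp - timestamp % interval, tz=timezone.utc)


if __name__ == "__main__":
    ts = int(datetime(2010, 4, 3, 13, 47, 10, tzinfo=timezone.utc).timestamp())
    header = struct.pack("!IHHI", ts, 16, 1, 0)   # synthetic BGP4MP header, empty body
    print(parse_mrt_header(header))
    print(slot_start(ts, UPDATE_INTERVAL), slot_start(ts, DUMP_INTERVAL))
```

At these rates each collector produces 288 update files and 3 full dumps per day, which helps explain why a decade of retained data needs dedicated storage.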

Page 24: The RIPE NCC Internet Measurement Data Repository

Routing Information Server (RIS)

• Other methods of access

• Last 3 months of data exported to MySQL database

• Weekly statistical reports

• Looking Glass queries

• Tools to query and visualise RIS data

Page 25: The RIPE NCC Internet Measurement Data Repository

Reverse DNS Zones

• (Partial) Reverse DNS delegations made by RIRs

• Generated using RIPE DB reverse DNS objects

• ~410,000 reverse DNS objects
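
For readers unfamiliar with reverse DNS naming, the sketch below shows how an address maps to its reverse name, and how an octet-aligned IPv4 prefix maps to the zone that would be delegated for it. It uses only the Python standard library. The helper names are hypothetical, the addresses are documentation examples, and nothing here is taken from the RIPE DB itself.

```python
import ipaddress


def reverse_name(addr: str) -> str:
    """Return the in-addr.arpa or ip6.arpa name for a single address."""
    return ipaddress.ip_address(addr).reverse_pointer


def reverse_zone(prefix: str) -> str:
    """Return the reverse zone for an IPv4 prefix of length /8, /16 or /24."""
    net = ipaddress.IPv4Network(prefix)
    if net.prefixlen not in (8, 16, 24):
        raise ValueError("only octet-aligned prefixes are handled in this sketch")
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa"


if __name__ == "__main__":
    print(reverse_name("192.0.2.1"))
    print(reverse_zone("192.0.2.0/24"))
```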

Page 26: The RIPE NCC Internet Measurement Data Repository

Auckland

• Passive traces taken at the University of Auckland

• Auckland II – VII were previously available through NLANR

• Frequently feature in measurement literature

• Currently available from WITS archive

Page 27: The RIPE NCC Internet Measurement Data Repository

Waikato

• Passive traces taken at the University of Waikato

• Long duration continuous traces

• Waikato I is available

• Other Waikato sets will be included at a later date

Page 28: The RIPE NCC Internet Measurement Data Repository

NLANR

• Other NLANR datasets that were preserved by WAND

• IPLS (also known as Abilene)

• Leipzig

• Active Measurement Project (AMP)

• Much of this data is also currently available from WITS

Page 29: The RIPE NCC Internet Measurement Data Repository

Other Datasets

• Collected by RIPE NCC

• Not currently in the repository but may be added later

• K-root and reverse DNS server statistics and traces

• Hostcount

• TTM

• DNSMON

• AS112

• Other parts of RIPE DB

• These are covered in more detail in the paper

Page 30: The RIPE NCC Internet Measurement Data Repository

K-root

• Internet root name service operated by RIPE NCC

• PCAP traces of incoming port 53 traffic (DNS queries)

• 50 hours of traces included in CAIDA's DITL project

• DNS Statistics Collector (DSC)

• Summarises DNS traffic into 1 minute bins

• Generate graphs shown on the K-root website

• Raw data exported to DNS-OARC

• SNMP statistics

• Originate from RIPE NCC in Amsterdam

• Summarised and exported to an RRD
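
The 1-minute binning described above can be sketched as follows. This is not DSC's implementation, only an illustration of the summarisation step. It assumes query arrival times are available as Unix timestamps, and the name `bin_queries` is hypothetical.

```python
from collections import Counter

BIN_SECONDS = 60   # 1-minute bins, as on the slide


def bin_queries(timestamps):
    """Count queries per bin, keyed by the bin's start time in Unix seconds."""
    counts = Counter()
    for ts in timestamps:
        bin_start = int(ts // BIN_SECONDS) * BIN_SECONDS
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(bin_queries([0.5, 59.9, 60.0, 125]))
```

Storing only per-bin counts keeps the summary small and free of query contents, at the cost of losing per-query detail.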

Page 31: The RIPE NCC Internet Measurement Data Repository

Reverse DNS

• 4 reverse DNS servers operated by RIPE NCC

• 50,000 queries per second (3x load of K-root)

• High query rate means regular trace collection is infeasible

• DSC used on each of the rDNS servers

• Raw data and graphs only available within RIPE NCC

• Could be made available if there was a need
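
A back-of-the-envelope calculation shows why continuous trace collection is impractical at this query rate. The 50,000 queries per second and the 3x ratio come from the slide. The 100 bytes stored per captured query is purely an assumption for illustration.

```python
QPS = 50_000                 # combined query rate, from the slide
SECONDS_PER_DAY = 86_400
BYTES_PER_QUERY = 100        # assumption: average stored size of one captured query


def daily_queries(qps: int = QPS) -> int:
    """Number of queries received in one day at a constant rate."""
    return qps * SECONDS_PER_DAY


def daily_capture_gb(qps: int = QPS, bytes_per_query: int = BYTES_PER_QUERY) -> float:
    """Approximate capture volume per day in decimal gigabytes."""
    return daily_queries(qps) * bytes_per_query / 10**9


if __name__ == "__main__":
    print(daily_queries(), "queries per day")
    print(daily_capture_gb(), "GB per day under the assumed query size")
    print(round(QPS / 3), "queries per second implied for K-root")
```

Under that assumption, incoming queries alone would amount to roughly 432 GB per day, before counting any responses.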

Page 32: The RIPE NCC Internet Measurement Data Repository

AS112

• AS number used to serve reverse DNS for RFC 1918 private address space

• http://public.as112.net/

• Dynamic DNS update and rDNS server for AS112

• Hosted by RIPE NCC

• Goal is to measure and analyse DNS updates for invalid addresses

• PCAP trace collected annually and contributed to DITL

• More frequent captures could be scheduled if needed

• DSC data also collected

• Graphs publicly available from RIPE NCC AS112 site

Page 33: The RIPE NCC Internet Measurement Data Repository

Hostcount

• Monthly DNS scan of ~100 TLDs within the RIPE region

• Count A and PTR records for both forward and reverse IPv4

• Also count forward AAAA for IPv6 addresses

• Not exhaustive, due to public zone transfers being disabled

• Statistics published via Hostcount website

• Raw data from 1990-2007 is archived off-line

• Current policy is to discard raw data after statistic extraction

• But this could be reversed if there is a need

Page 34: The RIPE NCC Internet Measurement Data Repository

Test Traffic Measurements (TTM)

• Active measurement system of ~100 probes

• Most probes located at ISPs and universities within Europe

• Not all are included in public measurements

• Regular series of active tests

• UDP one-way delay, traceroute, DNSMON, IPv6 PMTU

• Also supports ad-hoc measurements by authorised users

• Ping, HTTP page fetch

• Can also develop and run arbitrary tests

• Results not released outside of RIPE NCC

Page 35: The RIPE NCC Internet Measurement Data Repository

Test Traffic Measurements (TTM)

• Bulk data published using CERN ROOT

• Performance graphs on the TTM website

Page 36: The RIPE NCC Internet Measurement Data Repository

DNSMON

• Measures the reachability and latency of DNS

• Collected using 60 TTM probes

• Root domain, .com, .net, .org, e164.arpa, 24 CC-TLDs measured

• IPv4 and IPv6 performance measured

• Summary statistics and graphs are publicly available

• Only paying subscribers can access most recent graphs

• Raw data also available upon request

Page 37: The RIPE NCC Internet Measurement Data Repository

RIPE DB

• Internet number registration objects for the RIPE region

• IP addresses and AS numbers

• Reverse DNS objects

• Used to create zone files for the reverse DNS service

• Route registry objects

• Used to provide an Internet Routing Registry

• Conforms to RPSL and RFC 2650
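
RPSL objects are plain text made of `attribute: value` lines, so bulk data of this kind can be read with very little code. The parser below is a minimal sketch, not a complete RPSL implementation. The sample object uses documentation address space and a made-up maintainer name, and treating lines that start with `#` or `%` as comments is an assumption about the dump format.

```python
def parse_rpsl_object(text: str):
    """Parse one RPSL-style object into a list of (attribute, value) pairs."""
    attrs = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(("#", "%")):
            continue                                  # blank or comment line
        if line[0] in " \t+" and attrs:
            # Continuation line: append to the previous attribute's value.
            name, value = attrs[-1]
            attrs[-1] = (name, (value + " " + line[1:].strip()).strip())
            continue
        name, sep, value = line.partition(":")
        if sep:
            attrs.append((name.strip().lower(), value.strip()))
    return attrs


SAMPLE = """\
route:   192.0.2.0/24
descr:   Example route
         for documentation
origin:  AS64496
mnt-by:  EXAMPLE-MNT
source:  RIPE
"""

if __name__ == "__main__":
    for attribute, value in parse_rpsl_object(SAMPLE):
        print(attribute, "=", value)
```

A list of pairs is used rather than a dictionary because attributes such as `mnt-by` may legitimately appear more than once in an object.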

Page 38: The RIPE NCC Internet Measurement Data Repository

RIPE DB

• Public queries supported via command-line and web

• Daily limit imposed on queries that include personal info

• Bulk data is available via FTP

• Personal details are not included

• Can subscribe to a near real-time mirror of the database

• Restrictions on personal data are very broad

• Can result in inappropriate limitations

• Better access policies and mechanisms should resolve this

Page 39: The RIPE NCC Internet Measurement Data Repository

Links

RIS: http://www.ripe.net/ris

RIPE DB: http://www.ripe.net/db

K-root: http://k.root-servers.org

TTM: http://www.ripe.net/ttm

Hostcount: http://www.ripe.net/is/hostcount/stats

DNSMON: http://dnsmon.ripe.net/dns-servmon

AS112: http://www.ripe.net/as112

WITS: http://www.wand.net.nz/wits

Page 40: The RIPE NCC Internet Measurement Data Repository

Conclusion

• Repository is in 'beta'

• Server exists and some datasets are available for download

• Interested users can be given access

• Looking for feedback and ideas

• Development of policy, particularly for access

• Data collection

• Improving the RIPE datasets to be more useful to researchers

• Acquiring more external datasets

• Contributions of data, analysis tools

Page 41: The RIPE NCC Internet Measurement Data Repository

Contact

http://data-repository.ripe.net

[email protected]

[email protected]