Learn how StubHub blocks bad bot activities like price scrapers, competitive data mining, brute...
About StubHub
○ Largest secondary ticket marketplace in the world
○ An eBay company
○ Processes nearly 500 transactions per second
StubHub is an online marketplace which provides services for buyers and sellers of tickets for sports, concerts, theater and other live entertainment events.
StubHub Bot Challenges
Bot Challenges
○ Bots were used for brute-force account takeovers
○ Competitors tried to game the system, scraping prices and monitoring inventory and customer behavior
○ Random spikes in bot traffic were causing increased utilization of resources
○ Tested multiple competitor solutions, but they were difficult to configure and in some cases broke our website
StubHub Bot Selection Criteria
Bot Detection and Mitigation Solution Requirements
○ Block web scrapers without impacting human visitors
○ Accurately identify good bots vs. bad bots
○ Cannot rely solely on a rule-based system; must include automated learning to “self-tune” for defending against emerging and unknown threats
○ Needs to include the Distil community to improve accuracy of bot detection
○ Must seamlessly co-exist with existing solutions (SIEM, CDN, WAF, etc.)
StubHub Results with Distil Networks
Reduced competitive data mining and fraud
Drastically reduced competitive data mining, increased SEO rankings, and protected our marketplace ecosystem
Distil is a key piece of our fraud detection and prevention suite of tools
StubHub Results with Distil Networks
Improved traffic quality and enriched analytic data
Cut pageviews in half, without impacting human users or ad deliveries
Quality of traffic has greatly improved by stopping unwanted bots and limiting site access for trusted bots
Negative Security Model - Blocking Bad Bots
Positive Security Model - Whitelisting Trusted Sources
The Importance of No False Positives / Negative Impact on Humans
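The contrast between the two models above can be sketched in a few lines. This is an illustrative sketch only; the signature and whitelist entries are assumptions for the example, not StubHub's or Distil's actual rules.

```python
# Negative model: block only requests matching known-bad signatures.
# Positive model: admit only explicitly whitelisted (trusted) sources.
# Both lists here are illustrative assumptions, not real rule sets.
BAD_BOT_SIGNATURES = {"python-requests", "scrapy", "curl"}
TRUSTED_AGENTS = {"googlebot", "bingbot", "pingdom"}

def negative_model_allows(user_agent: str) -> bool:
    """Negative security model: allow everything except known-bad signatures."""
    ua = user_agent.lower()
    return not any(sig in ua for sig in BAD_BOT_SIGNATURES)

def positive_model_allows(user_agent: str) -> bool:
    """Positive security model: allow only explicitly trusted sources."""
    ua = user_agent.lower()
    return any(agent in ua for agent in TRUSTED_AGENTS)
```

The trade-off: the negative model misses novel bad bots (false negatives), while the positive model alone would block ordinary human browsers, which is why real deployments combine both.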
Good bots make up over 35% of all traffic to the average website
○ Search engines: Google, Bing, Baidu, etc.
○ Alexa Crawler
○ Pingdom, Keynote, etc.
Effective solutions block bad bots but leave good bots unhindered
The Importance of Accurately Identifying Good vs Bad Bots
Source: Distil Networks, 2015 Bad Bot Landscape Report
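One standard way to separate genuine search-engine crawlers from impostors is the forward-confirmed reverse DNS check that Google and Bing document for their crawlers. The sketch below assumes injectable resolver functions so the logic can be shown without live DNS; the domain suffix list is a partial example.

```python
import socket

# Partial, illustrative list of crawler hostname suffixes.
GOOD_BOT_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")

def is_verified_good_bot(ip,
                         reverse_lookup=lambda ip: socket.gethostbyaddr(ip)[0],
                         forward_lookup=socket.gethostbyname):
    """True only if the IP reverse-resolves to a known crawler domain AND
    that hostname forward-resolves back to the same IP (anti-spoofing)."""
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False
    if not host.endswith(GOOD_BOT_SUFFIXES):
        return False
    try:
        return forward_lookup(host) == ip
    except OSError:
        return False
```

A bot that merely copies Googlebot's User-Agent string fails this check, because the attacker's IP does not reverse-resolve into Google's domains.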
Bot detection should never rely on static signatures or manual rule creation
Automation and machine learning must be performed in real-time
Effective bot mitigation solutions:
○ Dynamically classify users by correlating dozens of data points as well as behavior patterns
○ Constantly “self-tune” to evolve alongside the morphing threats they encounter and protect against
The Importance of Machine Learning and Self Tuning
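The two bullets above can be made concrete with a toy scorer. Everything here is a hypothetical sketch: the feature names, weights, and percentile-style threshold drift are assumptions for illustration, not Distil's actual model, which correlates far more signals.

```python
from collections import deque
import statistics

class SelfTuningBotScorer:
    """Toy classifier: correlate several signals into one score, and let the
    blocking threshold drift with recent traffic ("self-tuning")."""

    def __init__(self, window=1000, initial_threshold=0.5):
        self.threshold = initial_threshold
        self.recent_scores = deque(maxlen=window)

    def score(self, request):
        # Correlate multiple data points and behavior patterns (illustrative).
        signals = [
            0.4 if request.get("requests_per_minute", 0) > 120 else 0.0,
            0.3 if not request.get("executes_javascript", True) else 0.0,
            0.2 if "headless" in request.get("user_agent", "").lower() else 0.0,
            0.1 if not request.get("accepts_cookies", True) else 0.0,
        ]
        return sum(signals)

    def classify(self, request):
        s = self.score(request)
        self.recent_scores.append(s)
        # "Self-tune": once enough traffic is seen, drift the threshold toward
        # the recent mean plus one standard deviation, so the cutoff tracks
        # changing behavior instead of staying a static rule.
        if len(self.recent_scores) >= 30:
            mean = statistics.mean(self.recent_scores)
            stdev = statistics.pstdev(self.recent_scores)
            self.threshold = 0.9 * self.threshold + 0.1 * (mean + stdev)
        return "bot" if s >= self.threshold else "human"
```

The key property is that no single signal decides the outcome, and the decision boundary itself is updated automatically rather than by hand-written rules.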
○ Real-time updates from a centralized violators database help protect all sites and improve accuracy
○ Data from attacks detected anywhere on the network should be centralized, correlated, and analyzed by a big data analysis platform
○ Signatures are then constantly updated to drastically reduce false positives (blocking humans) and false negatives (missing bad bots)
The Importance of Community Supported Centralized Threat Database
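A minimal sketch of the community idea: every protected site reports confirmed violators to a shared record, and all sites consult it before serving. The TTL and minimum-report count are invented parameters for the example; they illustrate how recency and corroboration keep stale entries from blocking humans.

```python
import time

class CentralViolatorDB:
    """Hypothetical shared threat feed: attacks seen by one site help all sites."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._violators = {}  # ip -> (last_seen_timestamp, report_count)

    def report(self, ip, now=None):
        now = now if now is not None else time.time()
        _, count = self._violators.get(ip, (now, 0))
        self._violators[ip] = (now, count + 1)

    def is_flagged(self, ip, min_reports=2, now=None):
        """Flag an IP only if it was reported recently enough and often
        enough, reducing false positives from stale or one-off reports."""
        now = now if now is not None else time.time()
        entry = self._violators.get(ip)
        if entry is None:
            return False
        last_seen, count = entry
        return (now - last_seen) <= self.ttl and count >= min_reports
```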
Many organizations have complex web environments which may include a multitude of different solutions, including:
○ Content Delivery Networks (CDNs)
○ WAFs, firewalls, IPS
○ SIEMs
○ Load balancers
○ and more
Bot mitigation must be able to be deployed seamlessly alongside these technologies without impacting their performance or usage
The Importance of Seamless Compatibility
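One way such a layer can coexist with an existing stack is as a thin in-line filter that passes clean traffic through untouched. The sketch below uses plain WSGI middleware as a stand-in for any deployment point; the `check_request` callable is a hypothetical hook for whatever detection backend is in use.

```python
class BotMitigationMiddleware:
    """Sketch: sits in front of an existing app without changing it.
    check_request is any callable returning True for bad-bot requests."""

    def __init__(self, app, check_request):
        self.app = app
        self.check_request = check_request

    def __call__(self, environ, start_response):
        if self.check_request(environ):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Automated traffic blocked"]
        # Pass everything else through untouched, so downstream layers
        # (WAF rules, SIEM logging, load balancing) behave exactly as before.
        return self.app(environ, start_response)
```

Because legitimate requests reach the wrapped application byte-for-byte unchanged, CDNs, WAFs, and analytics downstream see exactly the traffic they saw before the layer was added.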