Wringing Performance out of Perl

Posted on 15-Nov-2014

I gave this lightning talk at YAPC 2011. My company uses Perl in a variety of products, some of which have serious performance implications. Here I give a quick overview of some of the tricks we use to squeeze extra performance out of Perl.

Transcript of Wringing Performance out of Perl

Wringing Performance out of Perl

Grant Street Group

• Began as a financial advisory group

Grant Street Group

• Discovered the Internet in 1997

Grant Street Group

• Online auctions of property tax liens
• Web-based billing system for tax collectors
• Conversion of legacy tax-collector databases
• Online license / vehicle tag renewals
• Online payment processing
• Auctions of all types of bonds
• And lots, lots more!

Tax Lien Auctions

Tax Lien Auctions

• Absolute feeding frenzy
– Our bidders threatened to exhaust TIN numbers
– 20 million bidders in 2011
– More than 30 billion bids altogether
– Average was a 500,000-way tie
– About 2,000 auctions closing simultaneously

Tax Lien Auctions

• How do we award auctions performantly?
– Random tie-breaking with Crypt::Random
– Random row-ID plus MySQL = S L O W
– Turns out we can do it much faster in Perl
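
The slides don't include the code, but here is a minimal sketch of what in-memory cryptographic tie-breaking could look like, assuming Crypt::Random's makerandom_itv and illustrative bid IDs; it is a sketch, not the production implementation:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Crypt::Random qw(makerandom_itv);

    # Break an N-way tie entirely in memory: Fisher-Yates shuffle of the
    # tied bid IDs using cryptographic randomness; the first ID wins.
    sub break_tie {
        my @tied = @_;    # IDs of the tied bids
        for my $i ( reverse 1 .. $#tied ) {
            # Assumes the documented half-open interval [Lower, Upper),
            # i.e. an index in 0..$i; stringify in case a bigint object
            # comes back rather than a plain scalar.
            my $j = '' . makerandom_itv( Lower => 0, Upper => $i + 1 );
            @tied[ $i, $j ] = @tied[ $j, $i ];
        }
        return @tied;     # a uniformly random ranking of the tie
    }

    my @ranking = break_tie( 101, 102, 103, 104 );
    print "Winner: $ranking[0]\n";

Everything happens inside the Perl process, which is where the win over pulling random row IDs back and forth through MySQL comes from.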

Tax Lien Auctions

• Net result: auction closing takes 20 seconds
– Breaking 2,000 ties, each 500,000-way
– Stress-testing indicates we can scale by 4x
– The IRS definitely cannot scale by 4x

Property Tax Online Payments

Property Tax Online Payments

• Florida residents can pay their property taxes online
• Hosted, customized sites per county
• Largest counties have ~1,000,000 parcels
• Users are typical Florida residents

Property Tax Online Payments

• Backend is MySQL and Sphinx
• Lightning-fast searches with Perl
– Mapping IDs to table, column, PK
– Parsing SHOW STATUS LIKE 'sphinx%'
• Lots of useful metadata!
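
The slides don't show the query layer. Here is a rough sketch, assuming the SphinxSE plugin (which is what makes SHOW STATUS LIKE 'sphinx%' meaningful) and made-up connection details and table name; the status-variable names are the usual SphinxSE ones but worth checking against your Sphinx version:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection details and table name; real sites differ.
    my $dbh = DBI->connect( 'dbi:mysql:database=taxroll;host=localhost',
                            'user', 'password', { RaiseError => 1 } );

    # Query a SphinxSE table; the string in the query column is handed to
    # searchd along with search options.
    my $rows = $dbh->selectall_arrayref(
        q{SELECT id, weight FROM parcel_search WHERE query = ?},
        { Slice => {} },
        'SMITH;mode=extended;limit=20',
    );
    printf "parcel %s (weight %s)\n", $_->{id}, $_->{weight} for @{$rows};

    # The per-query metadata shows up as session status variables.
    my %meta = map { @{$_} }
        @{ $dbh->selectall_arrayref(q{SHOW STATUS LIKE 'sphinx%'}) };
    printf "matched %s parcels in %ss\n",
        $meta{sphinx_total_found} // '?', $meta{sphinx_time} // '?';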

Property Tax Online Payments

• Net results:
– Sub-second turnaround times
– 9-minute average time on site for payers
– 4-minute average time on site overall

Customer Data Conversion

Customer Data Conversion

• Largest county in FL is a customer
– Population ~2.4M people
– Tax roll of ~900K parcels
– History of ~5.6M bills across 6 years

• Full database is large (by our standards)
– Data files are ~30-50GB
– Full conversion is ~160 hours, using Perl
– Might be ~8 hours using pure SQL

Customer Data Conversion

• Problem is we can't use pure SQL
– Ridiculous amounts of business logic
– Utterly different data models

• We’re a Perl shop; Perl is our hammer

Customer Data Conversion

• Hugely parallel data conversion
– Subdivide conversion into smaller steps
– Build hash of dependencies between steps
– Construct DAG of work units in MongoDB
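
A minimal sketch of what such a work-unit collection could look like; the step names, collection name, and document shape below are assumptions for illustration, not the production schema:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use MongoDB;

    # Each conversion step becomes one work-unit document carrying the
    # list of prerequisites that have not yet finished.
    my %deps = (
        load_parcels => [],
        load_owners  => [],
        load_bills   => [ 'load_parcels', 'load_owners' ],
        build_index  => [ 'load_bills' ],
    );

    my $units = MongoDB->connect('mongodb://localhost')
                       ->ns('conversion.work_units');

    for my $step ( keys %deps ) {
        $units->insert_one( {
            _id    => $step,
            deps   => $deps{$step},   # prerequisites not yet finished
            status => 'pending',
        } );
    }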

• Distribute the actual work
– Run lots of Perl worker processes
– Workers grab ready work units
– Perform the work unit sequentially
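
And a matching worker loop, again only a sketch: it assumes the document shape above and leans on find_one_and_update so that two workers cannot claim the same ready unit:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use MongoDB;

    my $units = MongoDB->connect('mongodb://localhost')
                       ->ns('conversion.work_units');

    while (1) {
        # Atomically claim one unit whose prerequisites are all done.
        my $unit = $units->find_one_and_update(
            { status => 'pending', deps => { '$size' => 0 } },
            { '$set' => { status => 'running', worker => $$ } },
        );
        last unless $unit;    # nothing ready; in practice, sleep and retry

        run_step( $unit->{_id} );    # hypothetical: do this conversion step

        $units->update_one( { _id => $unit->{_id} },
                            { '$set' => { status => 'done' } } );
        # Dropping the finished step from every deps list is what makes
        # dependent units become ready.
        $units->update_many( {}, { '$pull' => { deps => $unit->{_id} } } );
    }

    sub run_step { my ($id) = @_; print "running $id\n" }

Resuming an incomplete load then amounts to restarting the workers against whatever units are not yet marked done.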

Customer Data Conversion

• The end result
– Total conversion time ~3 hours with 80 workers
– Nightly reloads now very practical
– Able to resume incomplete loads

We’re Hiring Telecommuters