Posted on 26-Jun-2015
Why choose a hosted solution
THE EDGE OF USING A HOSTED SOLUTION OVER DIY TOOLS
before you begin
EVALUATE YOUR REQUIREMENT
think about
LARGE-SCALE CRAWLS = 100+ WEBSITES
SMALL-SCALE CRAWLS = 5 OR FEWER WEBSITES
Data Requirement
Recurring: Large-scale, Small-scale
One-time: Large-scale, Small-scale
Support Required
Recurring: Large-scale, Small-scale
One-time: Large-scale, Small-scale
Scraping on a tool
Convenient since you don’t have to explain needs to a DaaS provider
Works best when sources are simple & few
Easy to use: you simply indicate the fields you want
A CSV file appears with the data!
This is neat! But…
…problems appear when…
you add more websites and/or more fields at one time
you submit the request after having laboriously selected all fields from across websites!
scrapes run to 99% and then fail!
Will re-running solve this problem?
Support Centers reply:
“Site has blocked the bots.”
Did it really solve your data requirement?
Scraping via a hosted solution
Up-time
Provider has machines running 24x7
We do!
Scraping tools invariably fail when not enough servers are available to perform crawls
Hosted solution gives you continuous data feeds! All the time. Every time!
Scalability
Providers scale platforms to meet client numbers & sources
Scaling remains smooth as long as design decisions remain constant
Tools get bogged down as scale increases
We had clients who tried running a scraping tool for a complete day to extract data from a huge site. THEIR LAPTOPS DIED.
#TRUESTORY
Monitoring
DIY solutions rarely support monitoring
Example:
Your tool extracts data every week
The site changes structure every month!
Hosted solutions have alerts in place to catch such changes and limit their impact
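To make the monitoring point concrete, here is a minimal sketch of the kind of check an alert could be built on: after each run, measure how often each expected field comes back empty, and flag the fields that suddenly go blank (a common symptom of a site changing its structure). The function names, the field names and the 50% threshold are assumptions for illustration only, not a description of how any particular provider does it.

```python
from typing import Any, Dict, Iterable, List


def empty_field_rates(records: List[Dict[str, Any]],
                      expected_fields: Iterable[str]) -> Dict[str, float]:
    """Per expected field, the fraction of records where it is missing or blank."""
    fields = list(expected_fields)
    if not records:
        # A run that returns nothing is itself suspicious:
        # treat every expected field as fully missing.
        return {field: 1.0 for field in fields}
    rates = {}
    for field in fields:
        empty = 0
        for record in records:
            value = record.get(field)
            if value is None or not str(value).strip():
                empty += 1
        rates[field] = empty / len(records)
    return rates


def fields_to_alert_on(records: List[Dict[str, Any]],
                       expected_fields: Iterable[str],
                       threshold: float = 0.5) -> List[str]:
    """Fields whose empty rate is at or above the (hypothetical) threshold."""
    rates = empty_field_rates(records, expected_fields)
    return sorted(field for field, rate in rates.items() if rate >= threshold)


# Illustrative data: this week's run still finds titles, but prices went blank.
this_week = [
    {"title": "Item A", "price": None},
    {"title": "Item B", "price": ""},
]
print(fields_to_alert_on(this_week, ["title", "price"]))  # ['price']
```

A weekly DIY run would quietly write those blank prices to the CSV; a check like this turns the same symptom into a signal someone can act on before the next delivery.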
Fail-over & Support
There’s support for everything
Basically, life is easy.
The headache is the provider’s. Trust us, we know.
With DIY Tools, you’re at the mercy of the Support Center. IF your calls get through at all!
Not convinced yet?
SOME REAL QUERIES WE RECEIVED WHEN DIY SCRAPING TOOLS FAILED...
“Is it possible to harvest content according to our specifications…We are using X & we are finding very difficult to get the entire core content from a page…”
X IS A PLATFORM AS A SERVICE WHERE YOU CAN WRITE PLUG-INS TO SET UP YOUR CRAWLERS, I.E. MORE THAN JUST SOFTWARE.
“We are currently using Y for crawling & would be interested to understand the advantages you can provide. Is there any way you could frame a work flow & harvest content according to our needs…Y has only been helpful to a limit.”
Y IS DESKTOP SOFTWARE FOR CRAWLING WEB PAGES.
We solved these problems. CLICK TO SOLVE YOURS.
or e-mail sales@promptcloud.com