Flexible crawler/spider/scraper

We need a crawler that can access a large number of web sites and extract detailed information on a daily basis.

We'd prefer to host the crawler ourselves, but are open to other options. If it is hosted elsewhere, we'd need an SLA.

We want our employees to be able to configure the crawler for different sites, but are also interested in knowing whether you, as the bidder, can help with that as well. Ease of use in this process is important.

This evaluation is a medium-sized project, but our budget for any subsequent step will be higher, and we're interested in a long-term relationship.

We will evaluate based on these criteria:

- Cost, in dollars and in minutes, to add one more site

- Ease of use for the person adding the site

- Our ability to further develop or change the crawler ourselves, or to request changes from you

- Must be able to deal with any site and extract any type of information

- Easy integration with other systems, for both status/administration and data output

We'd like to know:

- Your previous experience in this area, including volumes and complexity of data

- Roughly how it works: platform, language, etc.

- A sample of how one site (you pick one) is crawled, showing what the configuration or code looks like

- Costs

- On what terms we'd be allowed to test the system before deciding to proceed

Skills: Data Processing, Java, Perl, Python, XML

About the Employer:
(1 review) Stockholm, Sweden

Project ID: #84851