Need a Windows environment that will allow my data scraper to crawl a domain without detection

Completed Posted: Jul 14, 2014 Paid on delivery

I currently have a Windows-based data crawler and scraper that uses the Windows API, AutoIt, and the destination domain's API. The scraper collects data in 3 ways:

1. On the first pass, the scraper crawls a series of pages to collect some data and the URLs that will be crawled in the second pass. There can be up to 200 pages with 25 URLs on each page.

2. On the second pass, the URLs from the first pass are crawled and more data is mined.

3. Another component of the data scraper captures dynamic AJAX data, which changes every second, from 25 or more objects.

For several months the data scraper worked fine, until the company started taking measures to prevent my data scraping. I hope to continue using the same data scraper, as it was the most accurate. I am not sure how they are detecting my presence (IP monitoring, cookies, sessions), but I would like to learn how I can prevent detection or at least minimize it.

Clues:

1. The domain has blocked several IPs, but later unblocked them.

2. I set up a VPS with 5 IPs and rotated the IPs every hour so that traffic was coming from a new IP. They blocked all of these IPs within 24 hours.

3. They have blocked IPs coming from other countries.

Deliverable(s):

1. I'd like a freelancer to review the domain and provide feedback on how they might be detecting data mining. This may require you to try different ways of getting detected.

2. I'd like to have a Windows-based VPS environment (I can provide one) that will avoid detection (proxies, switching proxies, new sessions, cookie removal, or any other way) while still being able to use the current AutoIt data scraper.

3. I will need a way to implement these protection measures easily on any 64-bit Windows environment.

Provided:

I can provide a Windows Server 2008 R2 VPS environment.

I can provide my current AutoIt scraper for testing. Remember, they will block the IP of the scraper within 24 hours of scraping.

I am considering using this service if it can work within 600 HTTPS requests or fewer per hour: [login to view URL]
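To put the 600-requests-per-hour ceiling in perspective, here is a rough back-of-the-envelope sketch in Python using the figures from this posting (up to 200 pages with 25 URLs each). It assumes one request per listing page and one per collected URL, and, for the AJAX part, one separate request per object per poll; those assumptions and the function names are illustrative only and are not taken from the actual scraper.

```python
def crawl_budget(pages=200, urls_per_page=25, requests_per_hour=600):
    """Estimate how long both crawl passes take under an hourly request cap.

    Assumption: one request per first-pass page and one per second-pass URL.
    """
    first_pass = pages                       # listing pages
    second_pass = pages * urls_per_page      # URLs collected on the first pass
    total = first_pass + second_pass
    return {
        "total_requests": total,
        "hours_needed": total / requests_per_hour,
        "seconds_between_requests": 3600 / requests_per_hour,
    }


def ajax_polls_per_hour(objects=25, interval_seconds=1.0):
    """Requests per hour if every object is polled separately (an assumption)."""
    return int(objects * 3600 / interval_seconds)


if __name__ == "__main__":
    budget = crawl_budget()
    print(budget["total_requests"])              # 5200
    print(round(budget["hours_needed"], 2))      # 8.67
    print(budget["seconds_between_requests"])    # 6.0
    print(ajax_polls_per_hour())                 # 90000
```

Under these assumptions, the two crawl passes alone would take most of a working day at 600 requests per hour, and per-second polling of 25 objects would not fit within that cap at all, so the AJAX component would need either its own allowance or a much longer polling interval.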

Please message me for the domain I wish to scrape.

Data Mining, Internet Security, Network Administration, Test Automation, Windows Server

Project ID: #6188071

About the project

3 bids Remote project Active Jul 17, 2014

Awarded to:

TexasTel

I'm curious... I had to take this English exam just to ask this lol.. The application you are using, which executes the spider/crawling functionality, is this program an AutoIt compiled script? Meaning, was it develop…

200 [currency and delivery period not rendered]
(2 reviews)
3.4

3 freelancers are bidding an average of $365 for this job

mhmhz

Hi. What about using private proxies? How many requests will you be making per day? Thanks

$631 USD in 4 days
(3 reviews)
3.3