We are seeking a competent programmer to create a web-driving script that submits a valid UK postcode to a web form and spiders the results displayed by the website into a table.
A postcode list will be provided (some 2 million postcodes) to drive the queries to the web form. Given the size of the list, the program will need to be highly efficient in order to produce results quickly enough, e.g. by running multiple threads of the same script in parallel.
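To illustrate the kind of multi-threading we have in mind, here is a minimal Python sketch. It is only an assumption about how the work could be split, not a required design; the names `lookup_postcode` and `run_batch` are hypothetical, and the actual form submission is left as a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def lookup_postcode(postcode):
    """Hypothetical placeholder: submit one postcode and return the parsed result."""
    # A real implementation would send the form request and parse the response here.
    return {"postcode": postcode, "rows": []}


def run_batch(postcodes, fetch=lookup_postcode, num_threads=8):
    """Apply `fetch` to every postcode using a fixed pool of worker threads."""
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Map each future back to its postcode so failures can be retried later.
        futures = {pool.submit(fetch, pc): pc for pc in postcodes}
        for future in as_completed(futures):
            postcode = futures[future]
            try:
                results.append(future.result())
            except Exception as exc:
                failures.append((postcode, repr(exc)))
    return results, failures
```

With a list of this size, the postcodes would presumably be fed to `run_batch` in chunks (say, a few thousand at a time) rather than all at once, so that memory use stays bounded and progress can be saved between chunks.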
Also, because the website(s) being spidered may restrict repeated access from the same IP address, it may be necessary to set up the threads to work through a number of valid proxies.
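As a rough sketch of how threads might share a set of proxies, the following hands out proxies in round-robin order under a lock. The class name `ProxyPool` and the proxy addresses in the usage note are hypothetical; a real solution might also need to retire proxies that stop working.

```python
import itertools
import threading


class ProxyPool:
    """Thread-safe round-robin rotation over a fixed list of proxies."""

    def __init__(self, proxies):
        proxies = list(proxies)
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._lock = threading.Lock()

    def next_proxy(self):
        # The lock ensures two threads never advance the iterator at the same time.
        with self._lock:
            return next(self._cycle)
```

Each worker thread would call `next_proxy()` before a request, for example on a pool built from placeholder addresses such as `ProxyPool(["http://proxy-a.example:8080", "http://proxy-b.example:8080"])`.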
The whole project will need to be run by the programmer; we will provide only the postcode data and a server.
Please note, whilst we don't mind which languages are used to achieve the result, the software will need to run on a Windows machine.
Furthermore, we will need a user interface giving access to some basic functions, such as setting the number of threads, and some reporting, such as how many records were collected in a 24-hour period, performance statistics by proxy, etc.
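The reporting figures mentioned above could be backed by something like the following Python sketch. It is an illustration only: the class `CrawlStats` and its method names are hypothetical, and it assumes events are recorded in roughly chronological order.

```python
import threading
import time
from collections import defaultdict, deque


class CrawlStats:
    """Tracks records collected in a rolling window and simple per-proxy counters."""

    def __init__(self, window_seconds=24 * 60 * 60):
        self._window = window_seconds
        self._events = deque()  # (timestamp, record_count), oldest first
        self._by_proxy = defaultdict(lambda: {"ok": 0, "failed": 0, "records": 0})
        self._lock = threading.Lock()

    def record(self, proxy, records, ok=True, now=None):
        """Log one request made through `proxy` that yielded `records` rows."""
        now = time.time() if now is None else now
        with self._lock:
            stats = self._by_proxy[proxy]
            stats["ok" if ok else "failed"] += 1
            stats["records"] += records
            if records:
                self._events.append((now, records))

    def records_in_window(self, now=None):
        """Return how many records were collected within the rolling window."""
        now = time.time() if now is None else now
        cutoff = now - self._window
        with self._lock:
            # Discard events that have aged out of the window.
            while self._events and self._events[0][0] < cutoff:
                self._events.popleft()
            return sum(count for _, count in self._events)

    def per_proxy(self):
        """Return a snapshot of request and record counts for each proxy."""
        with self._lock:
            return {proxy: dict(stats) for proxy, stats in self._by_proxy.items()}
```

The user interface would then only need to display `records_in_window()` and `per_proxy()`, and pass the chosen thread count through to the workers.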
We would be grateful if applicants had previous experience in this area, as spidering properly and efficiently can be very difficult.