I need a Windows application (MSVS C++ or C#) that will check a list of URLs, using the TOR API [url removed, login to view] to make HTTP requests.
The application should support the Tor proxy network (on Windows it can connect to Vidalia, for example).
The application will use multiple threads to speed up the results.
The number of threads should be a configurable option.
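To illustrate the threading requirement, here is a minimal C++ sketch of a fixed-size worker pool. It is only one possible approach. The names CheckResult and check_all are hypothetical, and the check callback stands in for the real HTTP request sent through Tor, which is not shown. The callback must be safe to call from several threads at once.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical result record: one entry per checked URL.
struct CheckResult {
    std::string url;
    int status = 0;  // e.g. an HTTP status code; 0 means "not checked"
};

// Runs `check` over every URL using at most `thread_count` worker threads.
// `check` is a placeholder for the real request routed through Tor.
std::vector<CheckResult> check_all(
    const std::vector<std::string>& urls,
    std::size_t thread_count,
    const std::function<int(const std::string&)>& check) {
    assert(thread_count > 0);
    std::vector<CheckResult> results(urls.size());
    std::atomic<std::size_t> next{0};  // index of the next unclaimed URL

    // Each worker claims the next free index until the list is exhausted.
    // Every index is written by exactly one thread, so no lock is needed.
    auto worker = [&]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= urls.size()) return;
            results[i].url = urls[i];
            results[i].status = check(urls[i]);
        }
    };

    std::vector<std::thread> pool;
    const std::size_t n = std::min(thread_count, urls.size());
    for (std::size_t t = 0; t < n; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return results;
}
```

The thread count set by the user would be passed as thread_count. Results keep the same order as the input list, which makes a later export straightforward.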
The user will run the TOR client and enter its connection information in the application.
The user starts a new project by selecting which of the above should be checked. The user also selects one mode of running:
1) Directory crawl - the user inputs a start URL. The application scans the page for any links within the same domain that sit below the start link in the URL structure.
For example: if the base URL is [url removed, login to view] then [url removed, login to view] will be followed but [url removed, login to view] will not.
The first two are then opened and recursively scanned. Any links on a page that are external (outside this domain) are checked.
2) Site crawl - same as above, but the application checks the pages it crawls instead of checking all external links on them.
3) Import sitemap - allows the user to specify the URL of a Google-compatible XML sitemap file; the application will check all URLs in the file.
4) Import URLs - the user can either upload a list of URLs (one per line) or type them into a text area.
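The scope rule for the directory crawl in mode 1 could be sketched in C++ roughly as follows. This is a simplified illustration, not a full URL parser. It assumes absolute URLs, compares host names exactly (case-sensitive, port included), and treats "below the start link" as "path begins with the start URL's directory". The names UrlParts, split_url, is_below_start and is_external are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Host and path of an absolute URL; `ok` is false if parsing failed.
struct UrlParts {
    std::string host;
    std::string path;
    bool ok = false;
};

// Splits "scheme://host/path?query#fragment" into host and path.
// The query and fragment are dropped; a missing path becomes "/".
UrlParts split_url(const std::string& url) {
    UrlParts parts;
    const std::size_t scheme_end = url.find("://");
    if (scheme_end == std::string::npos) return parts;
    const std::size_t host_start = scheme_end + 3;
    const std::size_t host_end = url.find_first_of("/?#", host_start);
    parts.host = url.substr(host_start, host_end == std::string::npos
                                            ? std::string::npos
                                            : host_end - host_start);
    parts.path = "/";
    if (host_end != std::string::npos && url[host_end] == '/') {
        parts.path = url.substr(host_end);
        const std::size_t cut = parts.path.find_first_of("?#");
        if (cut != std::string::npos) parts.path.erase(cut);
    }
    parts.ok = !parts.host.empty();
    return parts;
}

// True if `candidate` is on the same host and inside the directory
// that contains the start URL, so the crawler should follow it.
bool is_below_start(const std::string& start_url, const std::string& candidate) {
    const UrlParts s = split_url(start_url);
    const UrlParts c = split_url(candidate);
    if (!s.ok || !c.ok || s.host != c.host) return false;
    // Directory of the start URL: its path up to and including the last '/'.
    const std::string dir = s.path.substr(0, s.path.rfind('/') + 1);
    return c.path.compare(0, dir.size(), dir) == 0;
}

// True if `candidate` points to a different host than the start URL,
// so it should be checked but not crawled.
bool is_external(const std::string& start_url, const std::string& candidate) {
    const UrlParts s = split_url(start_url);
    const UrlParts c = split_url(candidate);
    return s.ok && c.ok && s.host != c.host;
}
```

With the made-up start URL http://example.com/docs/index.html, a link to http://example.com/docs/a/b.html would be followed, http://example.com/other.html would not, and http://other.org/x would count as external.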
The user can also set a maximum number of links to be checked.
The user can also input a regular expression for URL exclusion: any URL that matches the regex will be skipped.
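The exclusion regex and the link limit could be applied together before URLs are queued for checking. The C++ sketch below makes two assumptions: "matches" means the pattern may match anywhere in the URL (std::regex_search with the default ECMAScript grammar), and an empty pattern means nothing is excluded. The function name select_urls is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns at most `max_links` URLs from `candidates`, skipping every URL
// in which `exclude_pattern` matches. An empty pattern excludes nothing.
// Throws std::regex_error if the pattern is not a valid regular expression.
std::vector<std::string> select_urls(const std::vector<std::string>& candidates,
                                     const std::string& exclude_pattern,
                                     std::size_t max_links) {
    const bool has_pattern = !exclude_pattern.empty();
    std::regex exclude;
    if (has_pattern) exclude.assign(exclude_pattern);

    std::vector<std::string> selected;
    for (const auto& url : candidates) {
        if (selected.size() >= max_links) break;  // limit reached
        if (has_pattern && std::regex_search(url, exclude)) continue;
        selected.push_back(url);
    }
    return selected;
}
```

Skipped URLs do not count toward the limit in this sketch; whether they should is a design decision for the final application.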
The user then starts the project and can at any time pause, stop, restart, save the current project, or load a new one. The user can also export the current data as CSV.
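For the CSV export, the main pitfall is that URLs can contain commas and quotes. The C++ sketch below quotes such fields in the usual CSV way (field wrapped in double quotes, embedded quotes doubled). The column set (URL and status) and the names LinkRow, csv_field and to_csv are assumptions for illustration only; the real export would include whatever data the application collects.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Quotes a field if it contains a comma, a double quote, or a line break.
// Embedded double quotes are doubled.
std::string csv_field(const std::string& value) {
    if (value.find_first_of(",\"\r\n") == std::string::npos) return value;
    std::string quoted = "\"";
    for (char c : value) {
        if (c == '"') quoted += '"';
        quoted += c;
    }
    quoted += '"';
    return quoted;
}

// Hypothetical export row: the checked URL and its result code.
struct LinkRow {
    std::string url;
    int status;
};

// Builds the CSV text with a header line and one line per row.
std::string to_csv(const std::vector<LinkRow>& rows) {
    std::ostringstream out;
    out << "url,status\r\n";
    for (const auto& row : rows) {
        out << csv_field(row.url) << ',' << row.status << "\r\n";
    }
    return out.str();
}
```

The returned string can then be written to a file chosen by the user.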
The application can also be started by typing in a URL; it will then crawl the site, checking all external URLs on each page and crawling deeper pages on that site.
To bid please state which programming language you will code the application in. Please also confirm you can finish the project within two weeks. Only bids with this information will be considered.