I need a bot built that will go to the website allofcraigs com, scrape certain data, and store it in a .csv file that can be downloaded/exported.
Inputs: (These are inputs or options that can be set by the user of the bot before scraping begins)
Search Term - (When the scraper arrives at allofcraigs com, it will search for the term that the user entered into the bot)
Unwanted Search Terms
Search In Titles Only
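As a rough illustration of how these inputs could be represented, here is a minimal Python sketch. The names `SearchOptions`, `parse_unwanted_terms`, and `build_options` are hypothetical and not part of any existing tool. The optional `category` field is an assumption based on the category option described in step 5 below.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchOptions:
    """Settings the user chooses before scraping begins (hypothetical names)."""
    search_term: str                                          # required
    unwanted_terms: List[str] = field(default_factory=list)   # optional
    titles_only: bool = False                                 # optional
    category: Optional[str] = None                            # optional, see step 5


def parse_unwanted_terms(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty terms."""
    return [term.strip() for term in raw.split(",") if term.strip()]


def build_options(search_term: str, unwanted_raw: str = "",
                  titles_only: bool = False,
                  category: Optional[str] = None) -> SearchOptions:
    """Validate the raw form values and bundle them into SearchOptions."""
    search_term = search_term.strip()
    if not search_term:
        raise ValueError("A search term is required")
    return SearchOptions(search_term, parse_unwanted_terms(unwanted_raw),
                         titles_only, category)
```

For example, `build_options("bike", "broken, parts")` would produce options with the unwanted terms `broken` and `parts`, while an empty search term would be rejected.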
How The Scraper Should Work:
1. The user will open the scraper (application)
2. The user will type a search term into a text input box on the application. (input required)
3. The user will type "unwanted search terms" into a text input box on the application, with each term separated by a comma. (input optional)
4. The user will have the option to check a "Search in titles only" box on the application if they would like the application to select that option on the website. (input optional)
5. The user will have the option to select the category they want the application to search in. (input optional)
6. The application will open allofcraigs com website
7. The application will type the input "search term" into the search box on the website.
8. The application will hit the search button on the website.
9. The application will enter the "unwanted search terms" into the advanced search section of the website if the user has entered any unwanted search terms.
10. The application will check/select the "Search in titles only" option in the advanced search section of the website if "search in titles only" is required by the user.
11. The application will select the "category" to search in on the website if the user selects a category on the application.
*Note: The only categories that need to be built into the application are the categories from the For Sale section.
12. The application will perform a new search if any of the optional search filters are selected by the user (unwanted search terms, search in titles only, category)
13. The application will scrape all of the ad titles on the first page of search results and store them in a database.
14. The application will open each ad link on the first page of search results to scrape the ad body, date posted, time posted, ad id, and any images attached to the ad. The ad body, date posted, time posted, and ad id of each ad should be stored in a database. If images are scraped from the ad, the images should be stored in a folder.
15. The application will perform steps 13 and 14 for every page of search results
16. The application will be complete when all pages of ads have been scraped.
17. The user will be able to go to the application's folder to download a .csv file with all scraped information and download all images that have been scraped.
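To make steps 13 through 17 more concrete, here is a minimal Python sketch of the page loop, the database storage, and the .csv export. It is only an outline under stated assumptions: `fetch_results_page` and `fetch_ad` are hypothetical callables that the developer would implement against the real website, an empty results page is assumed to mean there are no more pages, SQLite is assumed as the database, and image downloading is left out.

```python
import csv
import sqlite3
from typing import Callable, Dict, List, Optional, Tuple

AD_FIELDS = ["ad_id", "title", "body", "date_posted", "time_posted"]


def scrape_all_pages(
    fetch_results_page: Callable[[int], List[Tuple[str, str]]],
    fetch_ad: Callable[[str], Optional[Dict[str, str]]],
    db: sqlite3.Connection,
) -> int:
    """Walk every results page and store each ad; returns the number of rows written.

    fetch_results_page(n) -> list of (title, url) pairs for page n (hypothetical).
    fetch_ad(url) -> dict of ad details, or None if the ad is gone (hypothetical).
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS ads ("
        "ad_id TEXT PRIMARY KEY, title TEXT, body TEXT, "
        "date_posted TEXT, time_posted TEXT)"
    )
    written = 0
    page = 0
    while True:
        results = fetch_results_page(page)
        if not results:  # assumption: an empty page means no more results
            break
        for title, url in results:
            ad = fetch_ad(url)
            if ad is None:  # ad no longer available, move on to the next one
                continue
            db.execute(
                "INSERT OR REPLACE INTO ads VALUES (?, ?, ?, ?, ?)",
                (ad["ad_id"], title, ad["body"],
                 ad["date_posted"], ad["time_posted"]),
            )
            written += 1
        db.commit()  # save progress after each page
        page += 1
    return written


def export_csv(db: sqlite3.Connection, path: str) -> None:
    """Write every stored ad to a .csv file with a header row."""
    rows = db.execute(
        "SELECT ad_id, title, body, date_posted, time_posted "
        "FROM ads ORDER BY ad_id"
    )
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(AD_FIELDS)
        writer.writerows(rows)
```

Keeping the website-specific fetching behind two callables means the loop and export can be tested without touching the live site, and the fetching code can be swapped if the site layout changes.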
-There should be a start and pause button on the application.
-If the application opens an ad that has already been removed by craigslist, it should skip over to the next advertisement.
-If this would be easier for you to create in php, please ask me about this option.
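For the start/pause button and the removed-ad requirement above, one possible approach is sketched below in Python. It assumes the scraping runs on a background thread while the buttons call `start()` and `pause()`. `PauseControl` and `process_ads` are hypothetical names, and `fetch_ad` is a placeholder that is assumed to return `None` when an ad has been removed.

```python
import threading
from typing import Callable, Dict, Iterable, List, Optional


class PauseControl:
    """Backs the start and pause buttons (hypothetical helper)."""

    def __init__(self) -> None:
        self._running = threading.Event()  # not set means paused

    def start(self) -> None:
        self._running.set()

    def pause(self) -> None:
        self._running.clear()

    def wait_until_running(self, timeout: Optional[float] = None) -> bool:
        """Block while paused; returns False if the timeout expires first."""
        return self._running.wait(timeout)


def process_ads(
    urls: Iterable[str],
    fetch_ad: Callable[[str], Optional[Dict[str, str]]],
    control: PauseControl,
) -> List[Dict[str, str]]:
    """Scrape each ad, waiting while paused and skipping removed ads."""
    scraped = []
    for url in urls:
        control.wait_until_running()  # waits here until start() is called
        ad = fetch_ad(url)
        if ad is None:  # assumption: None signals a removed ad
            continue
        scraped.append(ad)
    return scraped
```

Because the pause check happens between ads, pressing pause lets the current ad finish and then holds before the next one.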