For academic research I need a small script that automates grabbing information from a website (Yahoo's "Keyword Selector Tool") and writing it into a text file. The script should run on the web server (a shared web hosting service) that hosts my website. The server has Perl [url removed, login to view] and a MySQL database installed.
The script will be used in an academic research project toward a PhD. It might lead to additional projects for the bidder, either for my research or for a business project inside a company.
The script needs to function as follows:
- First I go to a simple website (to be programmed) which contains only a text input field. In this field I input a list of keywords, one keyword (or phrase) per line.
- After clicking on the send button, the keywords should be saved in a text file (input file) on the server and the actual script starts working.
- The script takes the first keyword from the input file, goes to the website
[url removed, login to view]
enters the keyword into the search field and submits the search.
- Then it waits for the search result list, finds the keyword in the list, grabs the corresponding number of searches (column "Count") and saves the information in an output file on the server.
- The script continues with the second keyword from the input list and performs the same steps as described, adding the new information to the same output file. It continues doing so until all keywords from the input list have been processed.
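The steps above could be sketched roughly as follows. This is only an illustration in Python, not a required implementation (the server offers Perl, and the same structure carries over). The function `fetch_results` is hypothetical: it stands for whatever code submits one keyword to the tool and returns the result list as (keyword, count) pairs. The tab-separated output format is also an assumption.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# One row of the result list: (keyword as displayed, value of the "Count" column).
Row = Tuple[str, int]


def read_keywords(path: str) -> List[str]:
    """Read one keyword or phrase per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def find_count(rows: Iterable[Row], keyword: str) -> Optional[int]:
    """Return the count of the row that matches the keyword, or None if absent."""
    wanted = keyword.strip().lower()
    for row_keyword, count in rows:
        if row_keyword.strip().lower() == wanted:
            return count
    return None


def process_keywords(
    input_path: str,
    output_path: str,
    fetch_results: Callable[[str], Iterable[Row]],  # hypothetical scraper function
) -> None:
    """Look up every keyword in the input file and append the result to the output file."""
    for keyword in read_keywords(input_path):
        count = find_count(fetch_results(keyword), keyword)
        # Append after each keyword so that partial results survive an interruption.
        with open(output_path, "a", encoding="utf-8") as out:
            out.write(f"{keyword}\t{'' if count is None else count}\n")
```

Writing each line as soon as it is available, instead of collecting everything in memory, keeps the results of a long run safe if the job stops halfway.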
The list of keywords that I enter into the text field could contain up to a few hundred keywords. It may take the script up to a few hours to complete the job. It must be programmed to run independently and reliably, and it must be able to handle slowly responding Yahoo servers.
The above website is the English equivalent (shown only for explanation); the script will actually be used on the Japanese version of the Keyword Selector Tool:
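One possible way to cope with slow or temporarily failing servers is to retry each request with a growing pause between attempts. The sketch below is in Python and is only an illustration; the name `with_retries`, the number of attempts and the delays are assumptions, and `action` stands for a hypothetical function that performs one request (with its own timeout) for one keyword.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action() until it succeeds, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # Network errors and timeouts in the Python standard library
            # (for example urllib.error.URLError) are subclasses of OSError.
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)  # 10 s, 20 s, 40 s, ...
    raise AssertionError("unreachable")
```

A keyword that still fails after the last attempt raises the error to the caller, which can then log it and move on to the next keyword instead of stopping the whole run.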
[url removed, login to view]
Therefore the script must be able to handle Japanese characters in both the input file and the output file.
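As a hedged illustration of what handling Japanese characters involves, the Python sketch below percent-encodes a keyword for use in a request and decodes the returned page. Which character encoding the Japanese tool expects is not known here, so it is passed in as the parameter `site_encoding` (an assumption, defaulting to UTF-8); both function names are hypothetical.

```python
from urllib.parse import quote


def encode_keyword(keyword: str, site_encoding: str = "utf-8") -> str:
    """Percent-encode a keyword in the encoding the target site expects."""
    return quote(keyword, safe="", encoding=site_encoding)


def decode_page(body: bytes, site_encoding: str = "utf-8") -> str:
    """Decode the raw response bytes; undecodable bytes become replacement characters."""
    return body.decode(site_encoding, errors="replace")
```

The input and output files themselves can be kept in a single encoding such as UTF-8, so that only these two functions need to know about the encoding used by the site.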
Optional (not necessary, would pay extra):
- The first version only needs to grab the number of searches (column "Count"). But a second version could grab the complete list of search results and save it in a second output file (in addition to the first one).
- I would download the output file manually from the shared server, but it would be a bonus if the script sent the output file to a predefined email address after finishing.
- The most time-consuming part is waiting for the response from the Yahoo servers. It would be sufficient if the script works on one keyword at a time. But if it can easily be programmed to work in multiple instances, on more than one keyword at a time, it would speed up the process significantly.
- It is fine for me if I have to wait until the script has finished one job before I submit my next search. But it would be a bonus if I could input additional lists right after each other. The tool could save each list in a separate input file and work on them one at a time or simultaneously, saving the corresponding output files separately.
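For the optional point about working on more than one keyword at a time, one possible approach is a small pool of worker threads, since most of the time is spent waiting for responses. The Python sketch below is only an illustration; `lookup_count` is a hypothetical function that returns the "Count" value for one keyword, and the default of four workers is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence, Tuple


def lookup_all(
    keywords: Sequence[str],
    lookup_count: Callable[[str], Optional[int]],  # hypothetical single-keyword lookup
    max_workers: int = 4,
) -> List[Tuple[str, Optional[int]]]:
    """Look up several keywords in parallel and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keywords.
        counts = list(pool.map(lookup_count, keywords))
    return list(zip(keywords, counts))
```

Keeping the number of workers small avoids putting too much load on the remote servers and on the shared hosting account.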
The script should be finished within two weeks (not including the optional features). Please feel free to contact me if you have any questions.