Provide a script (shell, Perl, or similar) that runs from the Linux command line and extracts email addresses for a particular domain. For example: get_emails [url removed, login to view], which will produce [url removed, login to view] email addresses.
Convert search engine result URLs to a text file, one URL per line. This intermediate list file is a requirement so that results can be compared, re-evaluated, and re-run at a later date.
The program should fetch URLs in parallel (with a configurable number of threads), traverse them, and parse them for email addresses, which are written to individual text files by domain. Each run against a domain will produce one domain temp file and one domain email file.
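As an illustration of the parsing step, here is a minimal Python sketch that pulls addresses for one domain out of fetched page text. The function name `extract_emails` and the regular expression are assumptions for illustration, not part of the requirement; the pattern is deliberately simple and does not cover every address form allowed by the email standards.

```python
import re

def extract_emails(text, domain):
    """Return sorted, de-duplicated addresses at `domain` found in `text`.

    Hypothetical helper: matching is case-insensitive and results are
    lower-cased so the same address is only reported once.
    """
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@" + re.escape(domain)
        # Reject longer domains such as example.community or example.com.au,
        # but still accept an address followed by sentence punctuation.
        + r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    return sorted({m.group(0).lower() for m in pattern.finditer(text)})
```

Writing the returned list to the domain email file, one address per line, would then give the "nothing but parsed email addresses" output described in this brief.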
Both URL fetching and parsing are expected to run in parallel. The coding language is flexible, but I must approve it. The code must be an uncompiled script of some sort that I can read and modify later. Send your proposed language with your bid.
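One possible way to get a configurable level of parallelism, sketched in Python with only the standard library. The names `fetch` and `fetch_all` and the `fetcher` parameter are assumptions for illustration; a real submission may use a different language or concurrency model.

```python
import http.client
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url, timeout=10):
    """Download one page and return its text, or "" if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # Network errors, malformed URLs, and broken HTTP responses
        # are treated as an empty page so one bad URL does not stop the run.
        return ""

def fetch_all(urls, threads=4, fetcher=fetch):
    """Fetch every URL using `threads` worker threads; return {url: text}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results line up with urls.
        return dict(zip(urls, pool.map(fetcher, urls)))
```

The `fetcher` argument is there so the download step can be swapped out, for example with a stub during testing or with a function that also parses each page inside the worker thread.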
At a minimum, the Linux command-line options must cover: which search engine to use, which base email domain to search for (example: [url removed, login to view]), what to call the core output files, how many parallel threads to run, how deep to traverse a given URL, and how many search results to include (by number of results, not by page count).
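The options listed above could be exposed roughly as follows. This is only a sketch using Python's `argparse`; every flag name and default value here is an assumption, since the brief does not prescribe them.

```python
import argparse

def build_parser():
    """Build a command-line parser; all flag names are hypothetical."""
    p = argparse.ArgumentParser(
        prog="get_emails",
        description="Collect email addresses for a domain from search results.",
    )
    p.add_argument("domain", help="base email domain to search for")
    p.add_argument("--engine", choices=["bing", "google"], default="bing",
                   help="search engine to query")
    p.add_argument("--output", default=None,
                   help="base name for the core output files")
    p.add_argument("--threads", type=int, default=4,
                   help="number of parallel threads")
    p.add_argument("--depth", type=int, default=1,
                   help="how deep to traverse each result URL")
    p.add_argument("--results", type=int, default=50,
                   help="number of search results to include (a count, not pages)")
    return p
```

A call such as `get_emails example.com --engine google --threads 8 --depth 2 --results 100` would then cover each option named in the requirement.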
Core output files are expected to be [url removed, login to view] (ex: [url removed, login to view]), [url removed, login to view] (for each URL), and similarly named files for any additional temp files.
The program must be able to reload [url removed, login to view] for later processing.
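The intermediate URL list and its reload could be handled along these lines. This is a sketch only: the helper names `save_urls` and `load_urls` are assumptions, and skipping lines that start with `#` is an added convenience that the brief does not ask for.

```python
from pathlib import Path

def save_urls(path, urls):
    """Write URLs one per line, dropping blanks and duplicates but keeping order."""
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    Path(path).write_text("".join(u + "\n" for u in unique), encoding="utf-8")

def load_urls(path):
    """Read a one-URL-per-line file back in, ignoring blank and '#' lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.lstrip().startswith("#")]
```

Because the file is plain text, it can be edited or compared between runs by hand and then fed back in without repeating the search.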
The program must not use any search engine API that requires registration or a key of any sort. Anyone should be able to copy the program and run it from anywhere against, at a minimum, Bing and Google.
Sample output exercise: go to [url removed, login to view] and enter the search string "email [url removed, login to view]", which returns a page of URLs. The file [url removed, login to view] would now contain the returned links. Further execution traverses those links and produces [url removed, login to view], which contains nothing but parsed email addresses.
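For the link-collection step in this exercise, a minimal sketch using Python's standard `html.parser` is shown below. The names `LinkCollector` and `extract_links` are assumptions, and the sketch simply gathers every `<a href>` on a page; real search result pages would need extra filtering to separate result links from navigation and advertising links, which is not attempted here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from <a href="..."> tags, in page order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page they were found on.
        absolute = urljoin(self.base_url, href)
        # Skip mailto:, javascript:, and similar non-web schemes, plus repeats.
        if urlparse(absolute).scheme in ("http", "https") and absolute not in self.links:
            self.links.append(absolute)

def extract_links(html, base_url):
    """Return the de-duplicated http(s) links found in `html`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

The same function could serve the traversal step, applied to each fetched page until the configured depth is reached.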
Lump sum payable upon acceptance.