1. We urgently need a PHP script to run on our Apache server that can automatically take a client's domain name submitted through an online form & crawl through that site in order to find all outbound links to external sites/pages.
2. For each outbound link on the (crawled) domain we then want to spider up to 300 pages on the "external" domain to find a reciprocal link back to the client domain. The spider should stop as soon as it finds a link back to the client domain & then continue with the next external domain.
3. Finally, we want a summary stating that the originally submitted domain has x outbound links to external pages/sites and y incoming (reciprocal) links.
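As a rough illustration of step 1, the sketch below pulls the outbound links out of a single page's HTML. The function names `normaliseHost` and `extractOutboundLinks` are hypothetical, not part of any existing script. It assumes that "external" means any host other than the client's (ignoring a leading `www.`) and it only considers absolute `http`/`https` links, so relative and protocol-relative links are treated as internal and skipped.

```php
<?php
// Hypothetical helper: lowercase the host and drop a leading "www."
// so that www.client.com and client.com compare as the same site.
function normaliseHost(string $host): string
{
    $host = strtolower($host);
    return strncmp($host, 'www.', 4) === 0 ? substr($host, 4) : $host;
}

// Hypothetical helper: return the unique absolute http(s) links in $html
// whose host differs from $clientHost.
function extractOutboundLinks(string $html, string $clientHost): array
{
    if (trim($html) === '') {
        return []; // DOMDocument rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so parser warnings are suppressed.
    @$doc->loadHTML($html);

    $client = normaliseHost($clientHost);
    $outbound = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        $parts = parse_url($href);
        if (!is_array($parts) || !isset($parts['scheme'], $parts['host'])) {
            continue; // relative or malformed link: assumed to be internal
        }
        if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
            continue; // mailto:, javascript:, ftp: and so on
        }
        if (normaliseHost($parts['host']) !== $client) {
            $outbound[$href] = true; // array keys de-duplicate repeated links
        }
    }
    return array_keys($outbound);
}
```

Running this over every crawled page of the client domain and merging the results would give the "x" figure for the summary in step 3; the "y" figure would come from the reciprocal check in step 2.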
This will be web-based and needs to be as fast & simple yet accurate as possible. It also needs to obey all [url removed, login to view] standards.
We are in a hurry. [[url removed, login to view]] is the closest thing available that we have found, but it does not do exactly what we want.
(A significant difference is that all outbound links need to be entered manually, which is not what we want.)
**DO NOT WASTE OUR TIME** (or your own) bidding on this project if it is not something you already know how to do quickly & correctly. We don't want any bids from coders who intend to learn as they go. Thanks for your understanding.
Good English is also a requirement. This project needs to be completed by Fri 24th Sept with no room for time-slippage.
**Edit 1 - This project was incorrectly listed at "$500 and above"** - Please advise if you can do it for less than $500. (Apologies for the error)
**Edit 2 - Following are answers to some Qs raised.**
A - Results are to be emailed weekly to the email address associated with the entered domain and also stored in a MySQL database.
B - We do not want a clone of [url removed, login to view] - Just something that does as requested.
C - Timeouts are ok to ensure spider stability
D - A lack of multi-threading is not a major issue for us
F - Daily cron jobs are ok to use.
G - Must be written in PHP
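To make step 2 and answer C more concrete, here is a minimal sketch of a reciprocal-link search with a page cap and a per-request timeout. All function names (`fetchPage`, `hostOf`, `hrefsIn`, `resolveUrl`, `findReciprocalLink`) are hypothetical. The sketch assumes a breadth-first crawl, a 10-second default timeout, `allow_url_fopen` being enabled, and it only follows absolute and root-relative links; a production spider would need fuller URL resolution.

```php
<?php
// Hypothetical fetcher: page body, or null on failure or timeout (answer C).
function fetchPage(string $url, int $timeoutSeconds = 10): ?string
{
    $context = stream_context_create(['http' => ['timeout' => $timeoutSeconds]]);
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Host of a URL, lowercased and without a leading "www.", or null if absent.
function hostOf(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) ? preg_replace('/^www\./', '', strtolower($host)) : null;
}

// All href values of <a> elements in the given HTML.
function hrefsIn(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument rejects empty input
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // tolerate invalid markup
    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $hrefs[] = trim($anchor->getAttribute('href'));
    }
    return $hrefs;
}

// Simplified resolver: absolute and root-relative links only (ports ignored).
function resolveUrl(string $href, string $baseUrl): ?string
{
    if (preg_match('#^https?://#i', $href)) {
        return $href;
    }
    $base = parse_url($baseUrl);
    if (strncmp($href, '/', 1) === 0 && strncmp($href, '//', 2) !== 0
        && isset($base['scheme'], $base['host'])) {
        return $base['scheme'] . '://' . $base['host'] . $href;
    }
    return null;
}

// Crawl the external site breadth-first from $startUrl, fetching at most
// $maxPages pages. Returns the URL of the first page that links back to
// $clientHost, or null if no such page is found within the limit.
function findReciprocalLink(string $startUrl, string $clientHost, callable $fetch, int $maxPages = 300): ?string
{
    $externalHost = hostOf($startUrl);
    $client = preg_replace('/^www\./', '', strtolower($clientHost));
    $queue = [$startUrl];
    $seen = [$startUrl => true];
    $fetched = 0;

    while ($queue !== [] && $fetched < $maxPages) {
        $url = array_shift($queue);
        $fetched++;
        $html = $fetch($url);
        if ($html === null) {
            continue; // failed or timed-out page still counts toward the cap
        }
        foreach (hrefsIn($html) as $href) {
            $target = resolveUrl($href, $url);
            if ($target === null) {
                continue;
            }
            $host = hostOf($target);
            if ($host === $client) {
                return $url; // reciprocal link found: stop crawling this site
            }
            if ($host === $externalHost && !isset($seen[$target])) {
                $seen[$target] = true;
                $queue[] = $target;
            }
        }
    }
    return null;
}
```

A call such as `findReciprocalLink($outboundUrl, 'client.com', 'fetchPage')` would be made once per external domain. Passing the fetcher in as a callable keeps the timeout policy in one place and lets the crawl logic be exercised with canned pages instead of live requests.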
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
Apache/[url removed, login to view]