In Progress

489808 PHP crawler

Hi all,

I need a PHP crawler that does the following:

- The crawler has at least 3 important files: [url removed, login to view], [url removed, login to view], [url removed, login to view] (I prefer one class per site: Alexa, Yahoo, Bing, Google)

- It should crawl websites in a specified format

- It writes all found data to the database

- It crawls a site only if its timestamp in the DB is older than 7 days

- The input list contains websites in a format like this:

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

- HTTPS must be supported

- If a second scan detects a change, it does not delete the database record. It creates a new record, and only the newest one is shown.
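
The 7-day re-crawl rule and the "never delete, always append" rule could be sketched roughly as follows. This is only an illustration under assumptions: the function names `isStale`, `recordScan` and `latestRecord` are hypothetical, and a plain PHP array stands in for the database table, whose real schema is not specified in this brief.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: a site is due for crawling when it was never crawled
// or when its last crawl is older than $maxAgeDays.
function isStale(?DateTimeImmutable $lastCrawled, DateTimeImmutable $now, int $maxAgeDays = 7): bool
{
    if ($lastCrawled === null) {
        return true;
    }
    return $lastCrawled < $now->modify("-{$maxAgeDays} days");
}

// Hypothetical helper: return the newest stored record for a domain, or null.
function latestRecord(array $history, string $domain): ?array
{
    $latest = null;
    foreach ($history as $row) {
        if ($row['domain'] === $domain) {
            $latest = $row; // rows are appended in time order, so the last match is the newest
        }
    }
    return $latest;
}

// Hypothetical append-only store: $history stands in for a database table.
// Existing rows are never deleted or overwritten; a changed scan adds a new row.
function recordScan(array $history, string $domain, array $data, DateTimeImmutable $now): array
{
    $latest = latestRecord($history, $domain);
    if ($latest !== null && $latest['data'] == $data) {
        // Unchanged scan: keep the history as it is (a real table might
        // refresh a separate "last checked" column here instead).
        return $history;
    }
    $history[] = ['domain' => $domain, 'data' => $data, 'crawled_at' => $now];
    return $history;
}
```

Only the row returned by `latestRecord` would be displayed, while older rows stay available as history.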

What does it need to crawl?

- Website Title

- Meta Description

- Meta Keywords

- Server Banner (example Server: Apache/[url removed, login to view] (FreeBSD) mod_hcgi/0.8.0 mod_ssl/[url removed, login to view] OpenSSL/[url removed, login to view] DAV/2)

- Alexa Rank (Traffic Rank 1 Month, World Traffic Rank, Review count, Average Load Time)

- Google (PageRank, indexed pages [site:[url removed, login to view]])

- Add Wappalyzer and store its details in our DB as well

- Whois information (nameservers, owner, street, city)

- RIPE information (DNS lookup IP, RIPE database query)

- IP lookup of the site
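
As a rough illustration of the first few items in this list (title, meta description, meta keywords, server banner), a minimal sketch might look like the one below. The function names `extractPageMeta` and `extractServerBanner` are hypothetical, the HTML is assumed to be already downloaded, and the PHP DOM extension is assumed to be available. The external services (Alexa, Google, Wappalyzer, Whois, RIPE) are not covered here.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: pull the title, meta description and meta keywords
// out of an HTML string that has already been fetched.
function extractPageMeta(string $html): array
{
    $result = ['title' => null, 'description' => null, 'keywords' => null];
    if ($html === '') {
        return $result; // DOMDocument::loadHTML rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is often invalid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Attribute values such as "Description" vary in case between sites.
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'description' || $name === 'keywords') {
            $result[$name] = trim($meta->getAttribute('content'));
        }
    }
    return $result;
}

// Hypothetical helper: find the Server banner in a list of raw response header lines.
function extractServerBanner(array $headerLines): ?string
{
    foreach ($headerLines as $line) {
        if (stripos($line, 'Server:') === 0) {
            return trim(substr($line, strlen('Server:')));
        }
    }
    return null;
}
```

The header lines passed to `extractServerBanner` could come, for example, from PHP's `get_headers()`, which returns the response headers as an array of strings. Character encoding handling is left out of this sketch.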

The Config File:

- A way to activate and deactivate classes (services)

- A way to edit the database configuration

- A way to edit the expiry time of a domain

- My brain is bad today... so more items may come up during the project
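
One possible shape for such a config file is sketched below. Every key name, the DSN, the credentials and the `enabledServices` helper are placeholders chosen for illustration, not part of the brief. The 7-day value mirrors the re-crawl rule stated earlier.

```php
<?php
declare(strict_types=1);

// Hypothetical config layout; all keys and values are placeholders.
$config = [
    // Switch individual service classes on or off.
    'services' => [
        'alexa'      => true,
        'google'     => true,
        'yahoo'      => false,
        'bing'       => false,
        'wappalyzer' => true,
        'whois'      => true,
        'ripe'       => true,
    ],
    // Database connection settings (placeholder values).
    'database' => [
        'dsn'      => 'mysql:host=localhost;dbname=crawler',
        'user'     => 'crawler',
        'password' => 'change-me',
    ],
    // A domain is re-crawled once its record is older than this many days.
    'expire_days' => 7,
];

// Hypothetical helper: list the names of all services switched on in the config.
function enabledServices(array $config): array
{
    // array_filter without a callback drops falsy entries and keeps the keys.
    return array_keys(array_filter($config['services'] ?? []));
}
```

The crawler could then loop over `enabledServices($config)` and instantiate only the classes whose names appear in that list.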

Example PHP output is needed

Skills: Anything Goes, Apache, Editing, PHP, SEO, Web Security, Website Management


About the Employer:
(33 reviews) Wil, Switzerland

Project ID: #2235719