I am looking for a solid web crawler that has one task, and one task only: identify the different page layouts on a site.
Some sites, especially webshops, have category pages, subcategory pages, product pages, checkout pages...
The crawler should not identify the purpose of a page, but it should be able to take a site with 500,000 pages and identify how many different page layouts there are.
In the end, it should produce a list of every URL with a layout ID added (as XML).
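To make the expected output concrete, here is a minimal sketch of what such a list could look like when written with Python's standard `xml.etree.ElementTree`. The element name `page`, the attribute name `layout`, the function `build_layout_report` and the example URLs are assumptions for illustration only; the actual XML schema is open.

```python
import xml.etree.ElementTree as ET


def build_layout_report(url_to_layout):
    """Return an XML string listing each URL together with its layout ID.

    `url_to_layout` is a dict mapping URL -> layout ID (hypothetical input shape).
    """
    root = ET.Element("pages")
    for url, layout_id in sorted(url_to_layout.items()):
        # One <page> element per URL; the layout ID is stored as an attribute.
        page = ET.SubElement(root, "page", {"layout": str(layout_id)})
        page.text = url
    return ET.tostring(root, encoding="unicode")


# Example data, invented for illustration.
report = build_layout_report({
    "https://example.com/": 1,
    "https://example.com/shoes": 2,
    "https://example.com/shoes/sneaker-42": 3,
})
print(report)
```

For a site with hundreds of thousands of pages, writing the entries incrementally to a file would probably be preferable to building the whole tree in memory.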
Performance and speed of the scraper, as well as how intelligently it tells one page apart from another, are the main ingredients of this scraper.
Some sites have very similar pages. However, having the scraper identify an element as a menu, submenu or navigation, and thereby ignore that element, is very much wanted...
I don't want to scrape a site with 200,000 pages and have the scraper come up with 110,000 different categories of pages.
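One possible way to approach this, sketched below with only the Python standard library, is to reduce each page to the set of tag paths in its HTML, skip anything inside navigation-like elements, and hash the result so that pages with the same skeleton get the same layout ID. This is a rough sketch, not a finished design: the list in `IGNORED_TAGS`, the names `SkeletonParser`, `layout_fingerprint` and `assign_layout_ids`, and the sample pages are all assumptions, and the parser assumes reasonably well-formed HTML.

```python
import hashlib
from html.parser import HTMLParser

# Assumption: these tags hold menus/navigation or non-layout content.
IGNORED_TAGS = {"nav", "header", "footer", "script", "style"}
# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SkeletonParser(HTMLParser):
    """Collects the distinct tag paths of a page, skipping ignored subtrees."""

    def __init__(self):
        super().__init__()
        self.path = []        # currently open tags outside ignored subtrees
        self.skip_depth = 0   # > 0 while inside an ignored subtree
        self.paths = set()    # distinct paths such as "html/body/div/ul/li"

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if not self.skip_depth:
                self.paths.add("/".join(self.path + [tag]))
            return
        if self.skip_depth or tag in IGNORED_TAGS:
            self.skip_depth += 1
            return
        self.path.append(tag)
        self.paths.add("/".join(self.path))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.path:
            # Pop back to the matching open tag (tolerates unclosed children).
            while self.path.pop() != tag:
                pass


def layout_fingerprint(html):
    """Hash the set of tag paths; repeated items do not change the result."""
    parser = SkeletonParser()
    parser.feed(html)
    parser.close()
    joined = "\n".join(sorted(parser.paths))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()[:12]


def assign_layout_ids(pages):
    """Map each URL to a small integer layout ID (`pages`: dict URL -> HTML)."""
    seen = {}
    result = {}
    for url, html in pages.items():
        fingerprint = layout_fingerprint(html)
        result[url] = seen.setdefault(fingerprint, len(seen) + 1)
    return result


# Invented sample pages: two category pages and one product page.
pages = {
    "/shoes": "<html><body><nav><ul><li><a>Home</a></li></ul></nav>"
              "<div><ul><li>A</li><li>B</li></ul></div></body></html>",
    "/hats": "<html><body><nav><a>Home</a></nav>"
             "<div><ul><li>C</li></ul></div></body></html>",
    "/shoes/1": "<html><body><nav><a>Home</a></nav>"
                "<div><h1>Sneaker</h1><p>Text</p></div></body></html>",
}
layout_ids = assign_layout_ids(pages)
print(layout_ids)
```

Because the fingerprint uses a set of paths rather than a full tag sequence, a category page listing 10 products and one listing 200 products collapse into the same layout, which is one way to avoid ending up with 110,000 "layouts". Exact hashing is still strict; grouping near-identical skeletons would need a similarity measure on top of this.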
7 freelancers are bidding an average of $427 for this job
I have been working as a .NET developer for the last six years. I also have experience with SharePoint. I think I am well suited to this work. My core skill set includes C#, SQL Server, .NET Framework, SharePoint and HTML.
Let me help you out with this task. I did a similar kind of task in a semester project for my BS (CS) degree.
I have an MS in CS and 10 years of working experience in the web and search engine fields. I am experienced in web crawler development.
We are a team of .NET experts. We can do this project for you.
Hello, I have experience in web page analysis and I can do it well. Please see PM.
Hi, we are a group of people working from both India and the US with knowledge of PHP, C#, ASP.NET, data processing, SQL Server, MSSql, DB2, Joomla and Drupal. We have done several similar projects and we are really interested in...