Crawling and scraping job posts


I need software that does two main things, both of which are “MUST HAVE”:

- scrape / parse / aggregate job postings, mainly from career pages but also from newspapers or job boards;

- scrape / parse / aggregate content from documents in MS Word or PDF format (mainly CVs) and export the data in a structured format (XML, CSV…) that I can use to upload the data to my website.

“Must have” features of the project:

Spider jobs from websites (HTML or XML), ATS or via FTP.

Jobs taxonomy for categorization based on job titles & keywords.

The job ads should then be converted into the correct format for my job board. Semantic analysis should be used to ensure the content is accurately mapped onto my job board’s category schemes.
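To illustrate the categorization requirement, here is a minimal Python sketch. It assumes a hand-made taxonomy (the `TAXONOMY` dictionary, its categories, and the scoring weights are all hypothetical) and uses plain keyword matching, which is much simpler than real semantic analysis:

```python
import re

# Hypothetical taxonomy: category name -> keywords that signal it.
TAXONOMY = {
    "IT": ["developer", "software", "programmer", "sysadmin"],
    "Sales": ["sales", "account manager"],
    "Finance": ["accountant", "auditor", "controller"],
}

def categorize(title, description=""):
    """Return the best-scoring category, or "Uncategorized" if nothing matches."""
    title_l = title.lower()
    desc_l = description.lower()
    best, best_score = "Uncategorized", 0
    for category, keywords in TAXONOMY.items():
        score = 0
        for kw in keywords:
            pattern = r"\b" + re.escape(kw) + r"\b"
            # Assumption: a hit in the title counts 3, a hit in the description 1.
            score += 3 * len(re.findall(pattern, title_l))
            score += len(re.findall(pattern, desc_l))
        if score > best_score:
            best, best_score = category, score
    return best
```

For example, `categorize("Senior Software Developer")` falls into the hypothetical "IT" category, while a title with no known keyword stays "Uncategorized" and could be queued for manual review.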

The capture has to take in as much data as can be parsed from a job site, and it must contain the entire job advertisement.

Incremental scraping feature only downloads new jobs.

Synchronize jobs to guarantee that only open jobs remain posted.
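The incremental and synchronization requirements could be sketched roughly as below. This is only an illustration: the helper names `job_key` and `diff_crawl` are hypothetical, and it assumes that a posting is identified by its URL plus title and that the keys of the previous run are stored somewhere.

```python
import hashlib

def job_key(url, title):
    """Stable identifier for a posting (assumption: URL plus title identifies a job)."""
    return hashlib.sha256(f"{url}\n{title}".encode("utf-8")).hexdigest()

def diff_crawl(known_keys, crawled_jobs):
    """Compare the current crawl with the keys stored from the previous run.

    Returns (new_jobs, closed_keys, current_keys):
    - new_jobs: postings not seen before, the only ones that need downloading/posting
    - closed_keys: previously known postings that have disappeared, to be unpublished
    - current_keys: the key set to store for the next run
    """
    current = {job_key(j["url"], j["title"]): j for j in crawled_jobs}
    new_jobs = [job for key, job in current.items() if key not in known_keys]
    closed_keys = set(known_keys) - set(current)
    return new_jobs, closed_keys, set(current)
```

On each scheduled run, only `new_jobs` would be posted and every key in `closed_keys` would be removed from the job board, so that only open jobs remain.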

Filter jobs by keywords so only relevant jobs are posted.

Auto-replace keywords in content & clean up job formatting.
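As a rough sketch of keyword filtering and clean-up, using only the Python standard library: the function names, the include/exclude lists, and the replacement map are hypothetical, and the regex-based tag stripping is a simplification that a real HTML parser would handle more robustly.

```python
import re
from html import unescape

def is_relevant(text, include, exclude=()):
    """Keep a job if it contains any include keyword and no exclude keyword.

    Note: this is simple substring matching, so "java" also matches "javascript".
    """
    lowered = text.lower()
    if any(word.lower() in lowered for word in exclude):
        return False
    return any(word.lower() in lowered for word in include)

def clean_job_text(raw_html, replacements):
    """Strip tags, decode entities, apply keyword replacements, tidy whitespace."""
    text = re.sub(r"(?i)<br\s*/?>|</p>", "\n", raw_html)  # keep paragraph breaks
    text = re.sub(r"<[^>]+>", "", text)                    # drop remaining tags
    text = unescape(text).replace("\xa0", " ")             # &amp; -> &, nbsp -> space
    for old, new in replacements.items():
        # A function replacement avoids backslash escapes in `new` being interpreted.
        text = re.sub(re.escape(old), lambda _m, new=new: new, text, flags=re.IGNORECASE)
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\s*\n\s*", "\n", text)
    return text.strip()
```

A job would first pass `is_relevant` and then have its description run through `clean_job_text` before posting.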

Schedule regular spidering / posting sessions.

Auto-post via XML, EXCEL or CSV to single or multiple websites or forward to a list of emails.

Post to HTTP interface, via API, SOAP, to FTP or email.
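The export step could look roughly like the sketch below, using Python's standard `csv` and `xml.etree.ElementTree` modules. The field list and the `<jobs>` / `<job>` element names are assumptions; the real schema would follow the target job board. Delivery over HTTP, FTP, SOAP, or email is not shown.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical schema; the real field names depend on the target job board.
FIELDS = ["title", "company", "location", "url", "description"]

def jobs_to_csv(jobs):
    """Serialize a list of job dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for job in jobs:
        # Missing fields become empty cells; unknown fields are left out.
        writer.writerow({f: job.get(f, "") for f in FIELDS})
    return buf.getvalue()

def jobs_to_xml(jobs):
    """Serialize a list of job dicts to an XML string: <jobs><job>...</job></jobs>."""
    root = ET.Element("jobs")
    for job in jobs:
        node = ET.SubElement(root, "job")
        for f in FIELDS:
            ET.SubElement(node, f).text = job.get(f, "")
    return ET.tostring(root, encoding="unicode")
```

Both functions return text, so the same output can be written to a file, attached to an email, or sent to an upload endpoint.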

!!!The way I want it to work: I will enter the address of the hiring company's main career page (or, even better, just the website address, and the engine will find the career page by itself; this is optional, not compulsory)

(for example: [url removed, login to view];Action=U&var=, [url removed, login to view], [url removed, login to view], [url removed, login to view])

and the software will scrape the entire posting of every job opening found on the career pages of the website. The software should be built so that I only have to give it the web address of the career section to be parsed, or only the website address, and it does all the rest. This is really a MUST HAVE.

Very important: I want perfect accuracy of the scraped content (text, URLs, images). The engine has to be able to scrape at least 500000 career pages / day (which will come from around [url removed, login to view] websites / day), convert them into XML, Excel, CSV or similar files, and post them on my job board website. The minimum acceptable accuracy is 70%: at least 70% of the scraped / parsed career pages must be error-free.
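To put these numbers in perspective, here is a back-of-the-envelope calculation in Python. The 2-second average time per page used in the example is purely an assumption, and the function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

def required_rate(pages_per_day):
    """Average pages per second needed to finish within one day."""
    return pages_per_day / SECONDS_PER_DAY

def required_concurrency(pages_per_day, avg_seconds_per_page):
    """Parallel fetches needed (rate x time per page), rounded up."""
    return math.ceil(required_rate(pages_per_day) * avg_seconds_per_page)

def min_error_free(pages_per_day, accuracy_percent):
    """Smallest number of error-free pages that meets the accuracy threshold."""
    return math.ceil(pages_per_day * accuracy_percent / 100)
```

With 500000 pages / day, `required_rate` gives about 5.8 pages per second. If a page took 2 seconds on average to fetch and parse, `required_concurrency(500_000, 2.0)` would mean about 12 pages in flight at all times, and the 70% threshold would correspond to at least 350000 error-free pages per day.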

The same accuracy (70%) has to be met for the scraping / parsing of documents. The software must be able to process at least 10000 documents (CVs) / day and upload them to the website.

The software should be able to do at least what the following two do:

[url removed, login to view]

[url removed, login to view]

Another very important thing: before paying for your work, I have to make sure that the product meets all of the above requirements. A 30-day free trial or a demo is compulsory.

Feel free to ask or comment on any of the above.

Kind regards


Skills: Data Mining, Data Processing, Web Scraping, Internet Research

About the employer:
(0 reviews) Romania

Project ID: #4426311

10 freelancers are bidding an average of $992 for this job


I can help with your project. Please check PMB and our ratings/reviews to get an idea of our experience. Please let me know if you have any queries.

Bid: 250
(44 reviews)

Hi, thank you for the invite. I have done similar work many times. Kindly send me more details of the work.

Bid: 630
(31 reviews)

Seasoned web scraper. I have worked on many similar projects and have extensive experience in data mining. I can finish this task in a short time, with the best quality.

Bid: 500
(18 reviews)

Sir, I can do the project. Please refer to PMB. Looking forward to further discussion of this matter. With thanks and regards

Bid: 500
(24 reviews)

Hello, I can help you. Thanks

Bid: 250
(10 reviews)

Hi there, I would like to help you with this task. Thank you

Bid: 715
(1 review)

Hello, I have over 8 years of experience in web development. I can do this job. Please feel free to ask if you have any questions. Thank you

Bid: 262
(3 reviews)

I can try to do it.

Bid: 750
(0 reviews)

The kind of generality you want, being able to scrape job information from any website, cannot come within the price range you have indicated. Typically, manual configuration takes 1 to 2 hours per site; even at $2…

Bid: 5000
(0 reviews)

Dear Hiring Manager, I'm interested in the position of lead for your job. I have six months of experience in the freelancing sector and am very good in the data entry sector. You can also see my profile. I'm very excited to as…

Bid: 300
(0 reviews)