We require a CV parser that will extract:
2) First line of address
3) Second line of address
6) Post code/zip code/area code/district code
7) Email address
8) Telephone number
11) Personal objectives
12) Hobbies and interests
15) Career history - for each company they have worked in we need separate xml records for:
15.1) company name
15.2) job title
15.3) start date
15.4) end date - NULL if this is their current job
15.5) job description
16) Education history - for each education or training course we need separate xml records for:
16.1) start date
16.2) end date
16.3) Education level - for example, in England this could be: Degree, Secondary, Higher, Other.
16.4) If degree, then degree level - for example, in England this could be: high school, non postgrad, associate, post grad, bachelor, master, doctorate, other
16.5) School/college/institution name
16.6) Qualification - e.g. GCSE, BSc, etc.
16.7) Subject and grade. For example, Maths grade A.
16.8) Any summary for that qualification that may be available.
17) Any skills that can be detected from the CV - for example, Java, Sage, MS Word, Running etc.
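To illustrate what "separate xml records" per job could look like, here is a minimal sketch. It is written in Python purely for brevity (the deliverable itself must be PHP), and the element names (CareerRecord, CompanyName and so on) are placeholders invented for this example, not HR-XML element names. The real output must follow the HR-XML standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    company_name: str
    job_title: str
    start_date: str
    end_date: Optional[str]  # None means this is the candidate's current job
    description: str


def job_to_xml(job: Job) -> ET.Element:
    """Build one XML record for one job (placeholder element names)."""
    record = ET.Element("CareerRecord")
    fields = [
        ("CompanyName", job.company_name),
        ("JobTitle", job.job_title),
        ("StartDate", job.start_date),
        ("EndDate", job.end_date),
        ("JobDescription", job.description),
    ]
    for tag, value in fields:
        child = ET.SubElement(record, tag)
        # A missing end date is written out as the literal text NULL.
        child.text = "NULL" if value is None else value
    return record


def career_history_to_xml(jobs: List[Job]) -> List[str]:
    """Return one serialized XML record per job."""
    return [ET.tostring(job_to_xml(job), encoding="unicode") for job in jobs]
```

The education history (16.1 to 16.8) could be handled the same way, with one record per course.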
I expect an authenticated REST interface (written in PHP) that we can send CVs to. I suggest you collect a wide range of CV formats from the internet and make sure the parser can handle them. It must parse any CV format sent to it, so it requires a lot of intelligent pattern matching.
It needs to open the CV using Apache Tika (which can open PDFs, Word and other text documents), parse the CV, and then output the parsed data in HR-XML format (you should research the elements of this standard before bidding).
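As a rough idea of the simplest kind of pattern matching involved, the sketch below pulls an email address and a UK-style post code out of plain text. It is in Python for brevity only (the deliverable must be PHP), the function name is made up for this example, and both patterns are deliberately simplified assumptions. A real parser would need far more robust rules, and other countries' post codes are not covered.

```python
import re
from typing import Dict, Optional

# Simplified patterns, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
UK_POSTCODE_RE = re.compile(
    r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b", re.IGNORECASE
)


def extract_contact_fields(text: str) -> Dict[str, Optional[str]]:
    """Return the first email and UK-style post code found, or None for each."""
    email = EMAIL_RE.search(text)
    postcode = UK_POSTCODE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "postcode": postcode.group(0) if postcode else None,
    }
```

Fields such as job titles and dates are much harder than this, because they depend on layout and context rather than on a fixed shape.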
The parser also needs to be able to parse different languages, including:
It should be written primarily in PHP on a LAMP stack. It should be able to parse a complete CV within 1.5 seconds at most.
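One simple way to check the 1.5 second limit during development is to time each parse against a budget, as in the sketch below. It is in Python for brevity only, and the function and parameter names are made up for this example. The parse argument stands in for whatever the real parsing function turns out to be.

```python
import time
from typing import Any, Callable, Tuple


def parse_within_budget(
    parse: Callable[[str], Any], cv_text: str, budget_seconds: float = 1.5
) -> Tuple[Any, float, bool]:
    """Run parse(cv_text) and report the elapsed time and whether it met the budget."""
    start = time.perf_counter()
    result = parse(cv_text)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= budget_seconds
```

Running this over a large and varied sample of CVs would show whether the slowest cases still fit inside the limit.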
I will provide a full spec to the winning bidder. I expect the parser to read keywords from a database, but not to rely on those keywords to parse the CV, as the CV may not contain them. Remember, everyone writes their resume/CV differently :-)
I would be willing to pay a deposit, but the remainder of the payment would depend on a demonstrably working system. I would also want to inspect the code at regular intervals throughout the project to ensure that the developer's code and design quality meet my client's needs.
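The sketch below shows one possible reading of "use the keywords but do not rely on them": match known keywords first, then fall back to a structural cue, here a line beginning with "Skills:". It is in Python for brevity only, the function name is made up, the keyword list stands in for whatever the database returns, and the "Skills:" heuristic is just an assumed example of a fallback rule.

```python
import re
from typing import Iterable, List


def detect_skills(cv_text: str, known_keywords: Iterable[str]) -> List[str]:
    """Find known keywords, then add items from any 'Skills:' line."""
    found: List[str] = []
    lowered = cv_text.lower()

    # Pass 1: whole-word matches against the known keywords.
    for keyword in known_keywords:
        pattern = r"(?<!\w)" + re.escape(keyword.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.append(keyword)

    # Pass 2: fallback that does not depend on the keyword list.
    for line in cv_text.splitlines():
        match = re.match(r"\s*skills\s*[:\-]\s*(.+)", line, re.IGNORECASE)
        if not match:
            continue
        for item in re.split(r"[,;]", match.group(1)):
            item = item.strip().rstrip(".")
            already_found = {skill.lower() for skill in found}
            if item and item.lower() not in already_found:
                found.append(item)
    return found
```

With the known keywords "Java" and "MS Word", a CV line such as "Skills: Java, Sage, Kite surfing" would still yield "Sage" and "Kite surfing" even though neither is in the keyword list.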
I would expect the application to interact with a DB but via memcache.
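One common way to put a cache in front of a database is the cache-aside pattern, sketched below under the assumption that this is the kind of arrangement meant here. It is in Python for brevity only. A plain dictionary stands in for memcache, load_from_db stands in for the real database query, and the class name and key format are made up for this example.

```python
from typing import Callable, Dict, List


class KeywordStore:
    """Cache-aside lookup: try the cache first, then fall back to the database."""

    def __init__(
        self,
        cache: Dict[str, List[str]],
        load_from_db: Callable[[str], List[str]],
    ) -> None:
        self.cache = cache                # stand-in for a memcache client
        self.load_from_db = load_from_db  # stand-in for the real DB query

    def get_keywords(self, category: str) -> List[str]:
        key = "keywords:" + category
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        # Cache miss: read from the database and store the result for next time.
        value = self.load_from_db(category)
        self.cache[key] = value
        return value
```

Repeated requests for the same category are then served from the cache, which helps keep each parse inside the time limit.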
Note: I do not believe this is a simple project. It is for software engineers rather than script kiddies. Please only provide a quote if you believe you can really do this, and please be realistic about your timescales.