We provide over 420 structurally(!) different PDF files. Each PDF has unstructured parts (e.g. text, pictures) and structured parts (attribute + value pairs).
From the structured parts, we need the attributes and values of each PDF in its own XML file (one XML file per PDF), plus one single XML Schema (XSD) file covering all attributes from all PDFs.
Three example PDFs and more detailed requirements, including proposed algorithms, are attached.
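To make the expected output concrete, here is a minimal sketch in Python (standard library only) of the two deliverables: one XML file per PDF and one shared XSD. It is not the attached algorithm. The element names `document` and `attribute`, the `source` and `name` XML attributes, and the function names `to_xml` and `build_xsd` are hypothetical. The sketch also assumes the attribute/value pairs have already been extracted from the PDF, which is the hard part and is not shown here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Serialize schema elements with the conventional "xs" prefix.
ET.register_namespace("xs", XS)


def to_xml(pdf_name, pairs):
    """Build the XML text for one PDF from its attribute/value pairs."""
    root = ET.Element("document", {"source": pdf_name})
    for name, value in pairs.items():
        # The PDF attribute name is stored as data, not as a tag name,
        # so names with spaces or symbols stay valid XML.
        item = ET.SubElement(root, "attribute", {"name": name})
        item.text = value
    return ET.tostring(root, encoding="unicode")


def build_xsd(all_pairs):
    """Build one XSD that lists every attribute name seen in any PDF."""
    names = sorted({name for pairs in all_pairs for name in pairs})

    def xs(tag):
        return "{%s}%s" % (XS, tag)

    schema = ET.Element(xs("schema"))

    # Enumeration of all attribute names found across all PDFs.
    simple = ET.SubElement(schema, xs("simpleType"), {"name": "attributeName"})
    restriction = ET.SubElement(simple, xs("restriction"), {"base": "xs:string"})
    for name in names:
        ET.SubElement(restriction, xs("enumeration"), {"value": name})

    # <document source="..."> containing any number of <attribute name="...">.
    doc = ET.SubElement(schema, xs("element"), {"name": "document"})
    doc_type = ET.SubElement(doc, xs("complexType"))
    seq = ET.SubElement(doc_type, xs("sequence"))
    item = ET.SubElement(seq, xs("element"), {
        "name": "attribute", "minOccurs": "0", "maxOccurs": "unbounded"})
    item_type = ET.SubElement(item, xs("complexType"))
    content = ET.SubElement(item_type, xs("simpleContent"))
    ext = ET.SubElement(content, xs("extension"), {"base": "xs:string"})
    ET.SubElement(ext, xs("attribute"), {
        "name": "name", "type": "attributeName", "use": "required"})
    ET.SubElement(doc_type, xs("attribute"), {
        "name": "source", "type": "xs:string", "use": "required"})
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample data standing in for two extracted PDFs.
    extracted = {
        "a.pdf": {"Weight": "5 kg"},
        "b.pdf": {"Color": "red", "Weight": "3 kg"},
    }
    for pdf_name, pairs in extracted.items():
        print(to_xml(pdf_name, pairs))
    print(build_xsd(extracted.values()))
```

Keeping the PDF attribute name in a `name` XML attribute, rather than turning it into a tag name, is one possible design choice; it means the single XSD only has to enumerate the names instead of declaring one element per attribute. The detailed requirements in the attachment take precedence over this sketch.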
Provide documentation of your algorithm(s).
All software must be self-contained. No outside services should be needed. You will install all software on my server(s), and you will need to install and configure everything necessary to make the scripts work.
Code must be modular, cleanly organized, and easy for a novice like myself to follow.
You will stay within your time schedule. No exceptions. If you miss your deadline, I reserve the right to declare the agreement null and void and to get the escrow funds back.
All rights to the software belong exclusively to me. You may not share any part of this project with anyone else. I want 100% of the source code. No encrypted files.