Closed

Develop a script to generate OCR with ALTO standards for PDF and other outputs

The main scope of the project is to create a Python, Bash, or Perl script that runs OCR over a master folder of TIFF files or a PDF file.

The workflow will be:

1. Read a master directory of uncompressed .tiff files

2. Start OCR processing

3. Generate Output files:

3.1. Output #1: Master PDF 1.4 (PDF/A-1)

3.2. Output #2: Access PDF 1.4 (PDF/A-1), never larger than 20 MB

3.3. Output #3: ALTO XML with all the OCR information

3.4. Output #4: TXT with all the OCR text results

3.5. Output #5: TXT with all the OCR text results

4. Create an XML file with information about the process

4.1. The XML has to record every processing step and any failure during processing.
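As a rough illustration of steps 1 to 4, here is a minimal Python sketch. It assumes Tesseract as the OCR engine (the posting does not name one), and the function names and the XML element names (`process`, `step`) are hypothetical placeholders, not a format agreed with the employer. It only builds the OCR command and the process log; it does not address PDF/A-1 conformance or the 20 MB limit.

```python
import datetime
import xml.etree.ElementTree as ET
from pathlib import Path


def find_tiffs(master_dir):
    """Return the TIFF files in master_dir, sorted by path (step 1)."""
    return sorted(
        p for p in Path(master_dir).iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )


def ocr_command(tiff_path, out_dir):
    """Build an assumed Tesseract call that writes ALTO, TXT and PDF (steps 2-3)."""
    out_base = Path(out_dir) / Path(tiff_path).stem
    # "alto", "txt" and "pdf" are Tesseract config names; "alto" needs Tesseract 4.1+.
    return ["tesseract", str(tiff_path), str(out_base), "alto", "txt", "pdf"]


def new_log(master_dir):
    """Create the root element of the process log (step 4). Names are hypothetical."""
    return ET.Element("process", {"source": str(master_dir)})


def record_step(log, name, status, detail=""):
    """Append one step record with a timestamp; status is 'ok' or 'failed' (step 4.1)."""
    step = ET.SubElement(log, "step", {
        "name": name,
        "status": status,
        "time": datetime.datetime.now().isoformat(timespec="seconds"),
    })
    if detail:
        step.text = detail
    return step
```

Each command could then be run with `subprocess.run(cmd, capture_output=True, text=True)`, recording a `failed` step with the captured error text whenever the return code is non-zero, and the log written out with `ET.ElementTree(log).write(path, encoding="utf-8", xml_declaration=True)`.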

The project scope is to produce a script in Bash, Perl, or Python, based on any open-source OCR tools, that performs these tasks. To complete the project, the script has to be tested, checked, and documented in detail. There is no room to deviate from these requirements. During development, the developer has to give us a report of the fields and values we have to work with, and we will help with their definition. This can become a long-term project with many more image processing tasks if the results are good.

Some valid open-source tools that can be used are:

- [login to view URL]

- [login to view URL]

- [login to view URL]

Reference documentation links:

- [login to view URL]

- [login to view URL]

- [login to view URL]

We can provide image samples and give much more detail about the script to be developed.

Other options can be used, but they need to be validated by us before development starts.

The project will be divided into two milestones, defined as follows:

Milestone 1 is to complete:

- Point 1

- Point 2

- Point 3.1

- Point 3.3

Milestone 2 is to complete the project:

- Performance tuning and correction of any errors from Milestone 1

- Point 3.2

- Point 3.4

- Point 3.5

- Point 4

- Documentation

- Testing
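For the testing item, the size limit on the access PDF (Point 3.2) is one requirement that can be checked automatically. The sketch below is only an assumption about how such a check might look: the function name is hypothetical, and "20mb" is read here as 20 MiB, which should be confirmed with the employer.

```python
from pathlib import Path

# Assumption: "20mb" is interpreted as 20 MiB (20 * 1024 * 1024 bytes).
MAX_ACCESS_PDF_BYTES = 20 * 1024 * 1024


def access_pdf_within_limit(pdf_path, limit=MAX_ACCESS_PDF_BYTES):
    """Return True when the file at pdf_path is no larger than limit bytes."""
    return Path(pdf_path).stat().st_size <= limit
```

A failed check would be a natural entry for the process XML from Point 4, so that an oversized access PDF is recorded as a failure.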

Feel free to ask any questions.

Skills: Python, XML, Shell Script, OCR, PDF

About the Employer:
(26 reviews) Marín, Spain

Project ID: #32722135

14 freelancers are bidding an average of $564 for this job

(39 Reviews)
6.8
Devrits

Hi Hiring manager I am Full Stack Principal Architect,Expert Image processing,OCR Most popular spells that I use: C#, Python, .NET, WPF, WCF, VSTO, SQL, OCR, PHP, Java, React.js, Node.js, Laravel. I have more than 15 …

$750 USD in 4 days
(35 Reviews)
6.0
mannanmaan1425

Note: I am available to start right now! Dear Hiring Manager, I have read out your job description and the requirements for the job.I am a professional Artificial intelligence and machine learning programmer . I …

in 1 day
(30 Reviews)
5.3
umairkaramat24

Hello, I read your project details and really interested in your mentioned job. I have 5+ years’ experience doing similar jobs related to these skills Python, Shell Script, PDF, XML and OCR. I think its doable job, and …

$750 USD in 8 days
(8 Reviews)
4.1
ayesha0124

Hi there, How r u? I have had a look and i am sure that i can handle this project well as i have experience in XML, PDF, OCR, Shell Script and Python. I have worked on similar projects before too. Please initiate the c …

$750 USD in 24 days
(1 Review)
4.2
(2 Reviews)
4.0
mykrsolovlo

How are you? We have AI & Data Science,Django team who are highly experienced in Machine learning and Deep Learning and can deliver products as per your requirements. We have done many real time projects like Semantic …

$700 USD in 7 days
(5 Reviews)
3.7
demvitalii3

Dear, sir. How are you? ~~~ Computer Vision Professional is here. ~~~ I've a good interest about your project as a computer vision professional who has been specializing in this field for over 8 years. I recently devel …

$700 USD in 7 days
(8 Reviews)
3.2
yinshu2020

----------------Professional OCR Expert! Best Result in Time!----------- Dear sir. I've read your project description very carefully. I've extensive experience in OCR, so I believe that I can provide excellent result i …

$500 USD in 7 days
(2 Reviews)
2.3
vladilavsuhovoy1

Hi! I am an expert Python engineer. I am familiar with Python and I have a lot of work experiences in OCR, XML, Shell Script, Python and PDF. I can start right away. I want to discuss for this project in detail. Plea …

$500 USD in 5 days
(2 Reviews)
1.5
aafreenkhan208

Hi, I'm Aafreen Khan! Hope you’re doing well. I'll complete your project in the way you'll fall in love with because I've been working as a Full Stack Web & Software developer for 5 years. I provide end to end solutio …

$500 USD in 7 days
(0 Reviews)
0.0
aiworks4u

Hello to Spain, your project sounds very interesting for me and i could already realize numerous comparable projects. I have a lot of experience with your required tasks like: - OCR (OpticalDocumentRecognition …

$251 USD in 3 days
(0 Reviews)
0.0
SyedAdeel2020

Greetings! I have reviewed your project details and as you need. I believe that I can assist you with this project. I have been working as a website developer for the past 8+ years and I have impressed a lot of my cl …

$500 USD in 7 days
(0 Reviews)
0.0
kofanovsan

Hello, Jorge! I have read your project requirements very carefully and with great interest. I am a python expert with 5 years of experience. Recently, I have performed tasks to interpret pdf documents with python. http …

$500 USD in 5 days
(0 Reviews)
0.0