Closed

Search for headings in pages in a PDF using python

I want to extract titles from PDF pages and match them against a search query. See the attached file for an example.

In the attached file, if I search for "Balance Sheet", the code should be able to return page 232.

So the input will be a string and the output will be a page number (an integer).

Note that "balance sheet" may appear in multiple locations, but we want to return only the pages where it appears in the title.

If you have previously used pdfminer then this should be easy for you. I'm open to other core languages like Java.

You can also explore the pdftitle library, if that works.

Speed and accuracy are the important things. We tried doing it with PyPDF, but it was not accurate enough, so keep that in mind.

We can provide many other example documents if needed.
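
One possible way to approach the requirement above (a query string in, page numbers out, matching only titles) is sketched below. This is only a minimal sketch under stated assumptions, not a tested solution for the attached file: it assumes that a title is set in a noticeably larger font than the body text on the same page, and it uses pdfminer.six to read font sizes. The function names, the `ratio` threshold, and the font-size heuristic itself are illustrative assumptions and would need tuning on real documents.

```python
from statistics import median
from typing import Iterable, Iterator, List, Tuple

Line = Tuple[str, float]  # (line text, largest font size on that line)


def iter_page_lines(pdf_path: str) -> Iterator[Tuple[int, List[Line]]]:
    """Yield (1-based page number, lines) for each page, using pdfminer.six."""
    # Imported lazily so the matching logic below works without pdfminer.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    for page_no, layout in enumerate(extract_pages(pdf_path), start=1):
        lines: List[Line] = []
        for element in layout:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                sizes = [ch.size for ch in line if isinstance(ch, LTChar)]
                text = line.get_text().strip()
                if text and sizes:
                    lines.append((text, max(sizes)))
        yield page_no, lines


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so matching ignores case and spacing."""
    return " ".join(text.lower().split())


def heading_lines(lines: List[Line], ratio: float = 1.2) -> List[str]:
    """Assumption: a heading is at least `ratio` times the page's median font size."""
    if not lines:
        return []
    body_size = median(size for _, size in lines)
    return [text for text, size in lines if size >= ratio * body_size]


def find_title_pages(
    pages: Iterable[Tuple[int, List[Line]]], query: str, ratio: float = 1.2
) -> List[int]:
    """Return the page numbers whose heading lines contain the query."""
    wanted = normalize(query)
    return [
        page_no
        for page_no, lines in pages
        if any(wanted in normalize(h) for h in heading_lines(lines, ratio))
    ]
```

With a real file, the call would look like `find_title_pages(iter_page_lines("report.pdf"), "Balance Sheet")`, where `report.pdf` is a placeholder name. Pages that mention the phrase only in body-sized text are skipped, which is the behaviour the posting asks for. Pages with a single line or a uniform font size yield no headings under this heuristic.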

Skills: Python, Data Mining, PDF, Java

About the Client:
(2 reviews) Gurgaon, India

Project ID: #32749279

14 freelancers are bidding an average of ₹24821 for this job

(130 Reviews)
7.0
(37 Reviews)
6.6
(91 Reviews)
5.8
suyashdhoot

Hi, I am a very experienced statistician, data scientist and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several comp …

₹35000 INR in 7 days
(38 Reviews)
6.0
VladProkopchuk

Hello Sir! I think I'm a great fit for this project because I have an interest in it and can deliver on time, according to your specifications.

₹25000 INR in 7 days
(10 Reviews)
4.4
JaibhanSinghGaur

Hello sir, I can make this for you. I am a Python developer with more than 2 years of experience. I have done many projects in the past. I can work on: 1. Web Scraping / Data Science / ML 2. Django 3. App development 4. …

₹12500 INR in 2 days
(37 Reviews)
4.6
(3 Reviews)
4.5
RomanRut

Hello sir, I've read your job posting carefully. I will search for the title in the PDF successfully. Here are my Python skills - Data Visualization (Cryptocurrency trading bot, stock prediction, Prediction Algorithm for Spo …

₹37500 INR in 3 days
(1 Review)
2.6
yinshu2020

Professional Python & PDF Processing Expert! Best Result in Time! Dear sir, I've read your project description very carefully. I have extensive experience in Python & PDF Processing, so I belie …

₹25000 INR in 7 days
(2 Reviews)
2.1
MUKUND12

I want to volunteer for your project of encoding and decoding. If you feel I am worth it, you can give it a try. I will share an image of the output for your confidence and only then ask for payment. If you want you can …

₹35000 INR in 7 days
(0 Reviews)
0.0
mldlaids

Hi. I am a data scientist. I am very familiar with deep learning APIs such as TensorFlow, fastai and MXNet. I have good hands-on experience with Advanced R and Python, BI tools and technologies, AI, and Big Data. I have qu …

₹25000 INR in 7 days
(0 Reviews)
0.0
nsv91

I am a software developer and will be able to do the above-mentioned task in 7 days.

₹15000 INR in 7 days
(0 Reviews)
0.0
aakaakar

We can build this using Tesseract and OpenCV. Using NLP, we can also use pdfminer. Alternatively, we can use AWS Textract.

₹25000 INR in 7 days
(0 Reviews)
0.0
HafetzAzahari

I am an expert in data entry, typing, editing, etc. If you hire me for this project, I assure you that I will complete it on time. Thank you.

₹25000 INR in 7 days
(0 Reviews)
0.0