Needed AI to Extract Rate Formula from Text Description in PDF

Hello. This is a unique problem. Please provide a detailed proposal. Vague applications will be ignored. Speak to the problem. Looking for people with creative ideas.

The task is to extract a rate formula from a textual description in a PDF file.

In Texas, the electricity market is deregulated. Rates are defined by a document called an Electricity Facts Label (EFL). Several examples of EFLs are attached. Each of these PDFs describes, in words, a math formula.

There are thousands of these EFLs.

The Rate Formulas PDF file (attached) gives several examples of different descriptions, and a graph of the formulas that result.

Rates are a function of kWh, i.e. R(x), where x is the usage in kilowatt-hours.

EFLs include a spot pricing table at 500, 1000, and 2000 kWh. This shows the rate value at those precise points, i.e. R(500), R(1000), and R(2000). This is useful for testing whether an accurate rate formula has been found.
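A minimal sketch of that spot-table check, written in Python purely for illustration (the final version would be C#). The plan numbers, the `matches_spot_table` helper, the tolerance, and the assumption that R(x) is an average price in cents per kWh are all hypothetical and not taken from any attached EFL.

```python
from typing import Callable, Dict


def matches_spot_table(rate: Callable[[float], float],
                       spot_table: Dict[int, float],
                       tolerance: float = 0.05) -> bool:
    """Return True if rate(x) agrees with every published spot value.

    The tolerance absorbs rounding in the published table (assumed here
    to be rounded to one decimal place).
    """
    return all(abs(rate(kwh) - expected) <= tolerance
               for kwh, expected in spot_table.items())


def example_rate(kwh: float) -> float:
    """Hypothetical plan: $9.95 base charge plus 8.0 cents per kWh,
    expressed as an average price in cents per kWh."""
    return (995.0 + 8.0 * kwh) / kwh


# Made-up spot table for the hypothetical plan above.
spot_table = {500: 10.0, 1000: 9.0, 2000: 8.5}

if __name__ == "__main__":
    print(matches_spot_table(example_rate, spot_table))
```

A candidate formula that fails this check at any of the three points can be rejected without a human looking at it.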

C# source code has been attached. There are two console applications.

1) PowerToChooseScraper. This program will download all the EFLs currently in the market. Just give it a target folder and it will download the PDFs there. This program may have a few minor bugs, but it should work for you.

2) PTC. This is old code. It is a first-draft attempt at a program to parse the PDFs and extract the rate formulas. The code hasn't been touched for many years. At the time it was created, it was looking good: not 100%, but it was getting ~65% accuracy.

I do not care if the existing PTC code is used or not. I also don't care if your work is in C# or something else, but whatever the solution, the final working version will end up in C#. If you want to use a language other than C# for developing the initial logic, I'll ask why. If using ML techniques, that could be a good reason.

This is a unique problem because it could be approached in a lot of ways. It could maybe be solved using ML/learning techniques, or maybe with word similarity algorithms like Jaro-Winkler. The PTC code works by trying multiple approaches. It runs in a loop, stepping through methods, until one of them finds a solution. The approaches attempted are all fairly rudimentary. No learning algorithms have been attempted.
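The loop-over-methods idea can be sketched roughly as follows. This is not the PTC code. It is an illustrative Python sketch in which the two strategies, their regular expressions, the sample text, and the spot values are all hypothetical. Each strategy either proposes a candidate R(x) or gives up, and a candidate is accepted only if it reproduces the spot table.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Rate = Callable[[float], float]
Strategy = Callable[[str], Optional[Rate]]

ENERGY_RE = re.compile(r"energy charge[^0-9]*(\d+(?:\.\d+)?)\s*cents", re.I)
BASE_RE = re.compile(r"base charge[^0-9]*(\d+(?:\.\d+)?)", re.I)


def flat_rate_strategy(text: str) -> Optional[Rate]:
    """Hypothetical: assume the energy charge is the whole rate."""
    m = ENERGY_RE.search(text)
    if m is None:
        return None
    cents = float(m.group(1))
    return lambda kwh: cents


def base_plus_energy_strategy(text: str) -> Optional[Rate]:
    """Hypothetical: monthly base charge (dollars) plus energy charge."""
    base, energy = BASE_RE.search(text), ENERGY_RE.search(text)
    if base is None or energy is None:
        return None
    base_cents = float(base.group(1)) * 100.0
    cents = float(energy.group(1))
    return lambda kwh: (base_cents + cents * kwh) / kwh


def find_rate(text: str, spot_table: Dict[int, float],
              strategies: List[Strategy],
              tolerance: float = 0.05) -> Optional[Tuple[str, Rate]]:
    """Try each strategy in order; accept the first verified candidate."""
    for strategy in strategies:
        candidate = strategy(text)
        if candidate is None:
            continue
        if all(abs(candidate(k) - v) <= tolerance
               for k, v in spot_table.items()):
            return strategy.__name__, candidate
    return None  # nothing verified: flag for human review


STRATEGIES: List[Strategy] = [flat_rate_strategy, base_plus_energy_strategy]
sample_text = ("Base Charge: $9.95 per billing cycle. "
               "Energy Charge: 8.0 cents per kWh.")
sample_spots = {500: 10.0, 1000: 9.0, 2000: 8.5}

if __name__ == "__main__":
    result = find_rate(sample_text, sample_spots, STRATEGIES)
    print(result[0] if result else "no solution")
```

In this made-up sample the flat-rate strategy proposes a candidate that fails the spot check, so the loop falls through to the next strategy. An ML-based or similarity-based method could be added to the same list without changing the loop.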

I also do not expect 100% accuracy, just as close as possible, around 95%. It's possible some EFLs have human errors in them, where the numbers are actually wrong and don't make sense. In that case the goal is to detect the error. If a solution can't be found, we want to flag the EFL for a human to review and determine what is going on. Over time we can improve the accuracy.

I'm looking for the discrete logic that processes a single PDF and outputs the rate formula, or an error code if it can't be determined. The larger infrastructure to then download and process these files, database the results, etc., is a separate thing outside the scope of this project.
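One possible shape for that per-PDF output, sketched in Python for illustration only. The status codes, field names, and the string form of the formula are assumptions made for this sketch, not a specification of the required interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExtractionStatus(Enum):
    """Hypothetical error codes for a single EFL."""
    OK = 0
    NO_TEXT = 1               # PDF produced no usable text
    NO_FORMULA_FOUND = 2      # no method proposed a candidate formula
    SPOT_TABLE_MISMATCH = 3   # candidates found, none matched R(500/1000/2000)


@dataclass(frozen=True)
class ExtractionResult:
    status: ExtractionStatus
    formula: Optional[str] = None  # e.g. "(995 + 8.0*x) / x" when status is OK

    @property
    def needs_human_review(self) -> bool:
        """Anything other than OK is flagged for a person to look at."""
        return self.status is not ExtractionStatus.OK


ok_result = ExtractionResult(ExtractionStatus.OK, "(995 + 8.0*x) / x")
bad_result = ExtractionResult(ExtractionStatus.SPOT_TABLE_MISMATCH)

if __name__ == "__main__":
    print(ok_result.needs_human_review, bad_result.needs_human_review)
```

Separating "nothing found" from "found but inconsistent with the spot table" would make it easier to spot EFLs whose published numbers are themselves wrong.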

I will be working with you directly on this. I am an expert in C#, ML, and well versed in these EFLs. I can help guide your approach.

Skills: C# Programming, PDF, Machine Learning (ML), Data Mining, Data Extraction


About the Employer:
(0 reviews) New York, United States

Project ID: #31566346

10 freelancers are bidding an average of $593 for this job

(89 Reviews)

Hi, I am Smithangshu Ghosh, a C#.Net developer with more than 7 years of experience. I have seen that you posted this project twice, so I am placing my bid on the more recent one. I only bid on those projects which I b...

$655 USD in 5 days
(8 Reviews)
(43 Reviews)
(10 Reviews)

Hi, we have checked your job description carefully and we can give it a try. We have rich experience in Python, ML, DL, etc. We are sure that we can deliver the perfect result you want, on time and within your budget. Our...

$700 USD in 7 days
(1 Review)

Hello, I have seen that you need an experienced AI expert for "Needed AI to Extract Rate Formula from Text Description in PDF". I am a professional AI expert with more than 10 years of experience. I have carefully unde...

$500 USD in 14 days
(3 Reviews)
(0 Reviews)

Hey, I have checked your requirement and understand it as well. I have done SIMILAR work in the past. Do you want to see the DEMO WORK? I will show you. Thanks.

$500 USD in 7 days
(0 Reviews)

Hi. I did a very similar project for another client a few months ago. I am sure I can do the same for you. Kindly drop me a message in chat so we can discuss this in more detail.

$500 USD in 5 days
(0 Reviews)

I have more than 10 years of experience in web and Windows application development, and I have also worked on PDF data extraction using iTextSharp with regex pattern matching.

$600 USD in 10 days
(0 Reviews)