I have a PDF that is a scan of an old book (old enough to be in the public domain), and I want the text from it. You may use OCR to help, but I want the exact text, so you will have to correct OCR errors manually. As you go, I want each chapter entered into the website at www.sefaria.org. You'll need to create an account to edit (don't worry, it's free and requires nothing other than a name). I'll create the record in the database for you; you'll just need to copy and paste your final output into the website.
Note that I only want the text of the book itself; I don't need the page headers or page numbers.
I would like the introduction as well.
Since I'm only interested in the text, don't bother preserving the breaks in split words (words that were hyphenated across a line break in the original); just enter them as whole words.
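If you start from OCR output, rejoining split words can be partly scripted. Here is a minimal sketch in Python, assuming the OCR text keeps the original line breaks and marks split words with a plain hyphen at the end of the line. The function name `join_split_words` is only illustrative and not part of any existing tool.

```python
import re

# A word fragment ending in a hyphen at the end of a line, followed by
# its continuation at the start of the next line (optional spaces or
# tabs around the line break are tolerated).
_SPLIT_WORD = re.compile(r"(\w+)-[ \t]*\n[ \t]*(\w+)")


def join_split_words(text: str) -> str:
    """Rejoin words that were hyphenated across a line break.

    Assumption: every end-of-line hyphen marks a split word. Genuine
    compounds that happen to break at the hyphen (e.g. "well-\\nknown")
    will lose their hyphen and need to be restored by hand.
    """
    return _SPLIT_WORD.sub(r"\1\2", text)


if __name__ == "__main__":
    sample = "the chap-\nter begins"
    print(join_split_words(sample))  # prints "the chapter begins"
```

This only handles the split words; the result still has to be checked against the scan, since the exact text is what matters.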
The Internet Archive also has an OCRed copy you can use.
Stats on the size of the book:
* The PDF is 284 pages, but the margins are very small (only about a 5-inch page).
* about 45,000 words
* about 250,000 characters (including spaces)
The PDF is attached so you can see what you're getting yourself into.