Cancelled

Extracting Text from HTML with Encoding Resolution


I need a script that takes an HTML file as input (in any of several possible encodings such as UTF or ASCII, in English and non-English languages) and outputs only the text in ASCII encoding, preserving characters such as üöà. It is part of a small web crawler that collects a given number of texts for linguistic analysis.

The main focus is encoding.

Perl 5.8/5.10

Best Regards
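
For illustration, the pipeline described above (detect the encoding, decode, strip markup, emit text) might be sketched as follows. The sketch is in Python for brevity, although the posting asks for Perl. The function names (`detect_encoding`, `html_to_text`, `to_ascii`), the detection order (byte-order mark, then `<meta>` charset, then a UTF-8 trial, then a Latin-1 fallback), and the reading of "ASCII output that preserves üöà" as numeric character references are all assumptions, not requirements stated in the posting.

```python
import codecs
import re
from html.parser import HTMLParser

# A byte-order mark is the strongest encoding signal, so it is checked first.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches both <meta charset="..."> and <meta ... content="...; charset=...">.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.I
)


def detect_encoding(raw: bytes) -> str:
    """Guess the encoding of an HTML byte string (assumed heuristic order)."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    match = _META_CHARSET.search(raw[:2048])
    if match:
        name = match.group(1).decode("ascii")
        try:
            codecs.lookup(name)  # reject labels Python does not know
            return name
        except LookupError:
            pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"  # assumed last resort: never fails, may be wrong


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &uuml; becomes ü
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(raw: bytes) -> str:
    """Decode HTML bytes and return whitespace-normalised visible text."""
    text = raw.decode(detect_encoding(raw), errors="replace")
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def to_ascii(text: str) -> str:
    """One possible ASCII form: non-ASCII characters as numeric references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

For example, a Latin-1 page containing `Gr\xfc\xdfe &agrave; M&uuml;nchen` and a matching `<meta charset>` tag would come out of `html_to_text` as `Grüße à München`. Joining text nodes with a space keeps adjacent paragraphs apart but can split a word that is interrupted by inline markup, so a real crawler would likely need block-level awareness.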

Skills: Perl


About the Employer:
(0 reviews) Paris, France

Project ID: #701904

6 freelancers are bidding an average of $73 for this job


See PM.

Delivery in 1 day; bid in USD
(39 reviews)

I am an expert in data extraction with Perl. Looking forward to working on this project.

Delivery in 1 day; bid in USD
(1 review)

Please see PM.

Bid: 70
(5 reviews)

I have the code base ready. I do a lot of non-English crawling on sites that use all sorts of character encodings (ISO-8859-1 to UTF-8) and have managed to solve this problem for good.

Delivery in 1 day; bid in USD
(0 reviews)

I have 7 years of experience with Perl and can finish this project on time.

Bid: 45
(0 reviews)

I have read your requirements carefully. I have 2 years of experience in Perl scraping, including well-known websites such as Expedia, Orbitz, and Travelocity.

Bid: 50
(0 reviews)