We need to extract individual articles, together with their metadata, from Wikipedia by parsing the latest XML dump. Since the Wikipedia XML dump is a very large file (about 25 GB after decompression), parsing it requires a workaround to avoid memory issues.
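One common workaround is streaming the dump instead of loading it into memory, e.g. with Java's built-in StAX pull parser. The sketch below is an assumption about how a bidder might approach it (the brief does not prescribe a method): it walks a tiny inline XML fragment shaped like a MediaWiki dump and prints each page title; for the real 25 GB file the `StringReader` would be replaced by a (possibly decompressing) file stream, and memory use stays constant because only one event is held at a time.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

public class DumpStream {
    public static void main(String[] args) throws Exception {
        // Stand-in for the real dump: a tiny fragment with the same page structure.
        String xml = "<mediawiki>"
                + "<page><title>A</title><id>1</id></page>"
                + "<page><title>B</title><id>2</id></page>"
                + "</mediawiki>";
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder buf = new StringBuilder();
        while (r.hasNext()) {
            switch (r.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    buf.setLength(0); // start collecting text for this element
                    break;
                case XMLStreamConstants.CHARACTERS:
                    buf.append(r.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    // emit one record per <title>; other fields would be handled the same way
                    if ("title".equals(r.getLocalName())) {
                        System.out.println("title=" + buf);
                    }
                    break;
            }
        }
        r.close();
    }
}
```

SAX (push) would work equally well; StAX is shown only because its pull loop is shorter to read.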
7 freelancers are bidding on average $193 for this job
I can do it. I have participated in one project parsing HTML/XML content, and I have a lot of experience with XML processing, including JAXB, DOM, Dom4j, SAX, and so on.
I am an experienced Java programmer, and I could complete this project.