We need to extract individual articles, along with their metadata, from Wikipedia by parsing the latest XML dump. Because the Wikipedia XML dump is a very large file (about 25GB after decompressing), the parser needs a workaround to avoid memory issues.
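One common workaround for the memory issue is to stream the file and discard each article once it has been handled, rather than loading the whole tree. The following is only a minimal sketch in Python, not a required approach: the function names `iter_articles` and `local_name` are made up for illustration, and it assumes the dump wraps each article in a `<page>` element containing `<title>`, `<id>`, `<timestamp>` and `<text>`.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_articles(source):
    """Yield one dict per <page> element while streaming the input.

    `source` is a file path or a binary file object. The field names
    below are assumptions about the dump layout, not a full schema.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != "page":
            continue
        article = {"title": None, "id": None, "timestamp": None, "text": None}
        for node in elem.iter():
            name = local_name(node.tag)
            # Keep the first match only, so the page id is not
            # overwritten by the revision or contributor id.
            if name in article and article[name] is None:
                article[name] = node.text
        yield article
        root.clear()  # drop finished pages so memory use stays roughly flat
```

A caller would loop over `iter_articles(path_to_dump)` and write each article out as it arrives. Because `iterparse` also accepts a file object, passing the result of `bz2.open(path, "rb")` should allow reading the compressed dump directly, though that has not been checked against the full file here.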
7 freelancers are bidding on average $193 for this job
I can do it. I have worked on one project parsing HTML/XML content, and I have a lot of experience with XML processing, including JAXB, DOM, dom4j and SAX.