Hi,
I have some really big XML files (over 1 GB in some cases) and I need the content of one specific node, let's say everything inside every existing THIS CONTENT tag. Each such node's content should end up on its own single line in a newly created text file.
Using a DOM approach or XSLT is NOT an option, because both need too much memory to run. I need a solution that gets by on a standard PC with 256 to 512 MB of RAM.
I guess this should be possible with SAX, because you don't have to keep parts of the tree in memory: just write the line to the file, forget it, and go on.
I do not need a fancy GUI; a command-line app is enough. The parameters are the input XML file, the text file to create, and the tag to search for.
I have done a few apps that parse XML using expat (a very lightweight SAX-like parser). I could write an application in Python that parses the file using less than 10 MB of memory (in fact much less).
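
To give an idea of the approach, here is a rough sketch, not a finished tool. It assumes Python 3 and the standard library's xml.parsers.expat module. The function name extract_tag and the script name extract.py are made up for illustration. Two behaviours are my own assumptions rather than anything from your description: line breaks and runs of whitespace inside a node are collapsed to single spaces so each node fits on one line, and if the tag is nested inside itself only the outermost node produces a line.

```python
import sys
import xml.parsers.expat


def extract_tag(xml_path, txt_path, tag):
    """Stream xml_path and write the text of every <tag> element as one line."""
    depth = 0    # > 0 while we are inside a matching element
    chunks = []  # text pieces of the element currently being collected

    with open(xml_path, "rb") as src, \
            open(txt_path, "w", encoding="utf-8") as dst:

        def start(name, attrs):
            nonlocal depth
            if name == tag:
                depth += 1

        def end(name):
            nonlocal depth
            if name == tag and depth > 0:
                depth -= 1
                if depth == 0:
                    # Assumption: collapse whitespace so the node is one line.
                    dst.write(" ".join("".join(chunks).split()) + "\n")
                    chunks.clear()  # forget the node and go on

        def chars(data):
            # expat may deliver one text node in several pieces.
            if depth > 0:
                chunks.append(data)

        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = start
        parser.EndElementHandler = end
        parser.CharacterDataHandler = chars
        # ParseFile reads the input in blocks; no tree is ever built.
        parser.ParseFile(src)


if __name__ == "__main__":
    if len(sys.argv) == 4:
        extract_tag(sys.argv[1], sys.argv[2], sys.argv[3])
    else:
        print("usage: extract.py INPUT.xml OUTPUT.txt TAG", file=sys.stderr)
```

It would be called as: python extract.py big.xml out.txt sometag (the file and tag names are placeholders). Memory use should depend on the size of the largest single matching node, not on the size of the file, since only the text of the node currently being read is held.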
See PMB for an example of prior work.
David