Splitting up xml file

Status
Not open for further replies.

stdf23173

Technical User
Jan 4, 2005
7
NL
I have searched the web a little but could not find a clear answer to the following:

My document is 4 MB and about 1000 pages long, and XMLMind has trouble processing it.
Can I split it up so that I can work on the individual sections more easily?

grtz

Simon
 
You can do it, but your program would need to know the structure of the file in order to split it cleanly.

Basic steps would be:
1) Use a SAX parser to read the file (you shouldn't use a DOM for a 4 MB file)

2) Count the number of records going by (use the endElement event)

3) When you get to "x" records, write what you've read so far to a file, then close off the XML document (add the closing elements so that it is well-formed)
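
The three steps above could be sketched roughly as follows. This is only a minimal illustration, not a finished tool: the class name XmlSplitter, the "page" record element and the chunk size are made up for the example, records are assumed to be direct children of the root element, and the root's own attributes, comments and processing instructions are dropped. It collects each chunk as a String to keep the example short; for a real 4 MB file you would write each chunk to its own file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: splits a document into smaller well-formed documents.
class XmlSplitter extends DefaultHandler {
    private final String recordName;      // assumed name of the repeating record element
    private final int recordsPerChunk;    // the "x" from step 3
    private final List<String> chunks = new ArrayList<>();
    private final StringBuilder buf = new StringBuilder();
    private String rootName;
    private int depth = 0;
    private int count = 0;

    private XmlSplitter(String recordName, int recordsPerChunk) {
        this.recordName = recordName;
        this.recordsPerChunk = recordsPerChunk;
    }

    static List<String> split(InputStream in, String recordName, int recordsPerChunk) {
        XmlSplitter handler = new XmlSplitter(recordName, recordsPerChunk);
        try {
            // Step 1: stream the file through a SAX parser instead of building a DOM.
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not split document", e);
        }
        return handler.chunks;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        depth++;
        if (depth == 1) {           // remember the root so every chunk can be wrapped in it
            rootName = qName;
            return;
        }
        buf.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            buf.append(' ').append(atts.getQName(i))
               .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        buf.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 1) {            // ignore whitespace between records
            buf.append(escape(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        depth--;
        if (depth == 0) {           // end of the root: emit whatever is left over
            flush();
            return;
        }
        buf.append("</").append(qName).append('>');
        // Step 2: count records as their end tags go by.
        if (depth == 1 && qName.equals(recordName) && ++count == recordsPerChunk) {
            flush();                // Step 3: "x" records reached
        }
    }

    // Wraps the buffered records in the root element so the chunk is well-formed.
    private void flush() {
        if (buf.length() == 0) return;
        chunks.add("<" + rootName + ">" + buf + "</" + rootName + ">");
        buf.setLength(0);
        count = 0;
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String xml = "<book><page>1</page><page>2</page><page>3</page></book>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        split(in, "page", 2).forEach(System.out::println);
    }
}
```

Copying the root element into every chunk is what "add the closing elements" amounts to here; if your records are nested deeper than one level you would need to track and re-emit the whole chain of open ancestors instead.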

Chip H.


 
The Java Web Services Developer Pack 1.5 now includes a StAX parser, which gives you an XMLStreamReader.

"What is the SJSXP?

The Sun Java Streaming XML Parser is a high-speed implementation of StAX. BEA Systems, working in conjunction with Sun Microsystems, Inc., as well as XML-guru James Clark, Stefan Haustein, and Aleksandr Slominski (XmlPull developers), and others in the Java Community Process developed StAX as an implementation of JSR 173. StAX is a parser independent Java API based on a set of common interfaces.

The SJSXP is included with version 1.5 of the Java Web Services Developer Pack. The first thing that you're likely to notice about SJSXP is that it is based on a streaming API, which does not need to read an entire document before a developer can access any of the nodes. It also does not adhere to the principle of starting the parser and allowing the parser to "push" data to the event listener methods. Instead, SJSXP implements a "pull" method, where the parser maintains a pointer of sorts to the currently-scanned location in the document--this is often called a cursor. You simply ask the parser for the node that the cursor currently points to."
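
To make the "pull" idea from the quote concrete, here is a small sketch using the standard javax.xml.stream API (StAX). The class name PullDemo, the method textOf and the "title" element are invented for the example, and it assumes the elements you ask for contain only text, because getElementText() fails on nested child elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: collects the text of every element with a given name.
class PullDemo {
    static List<String> textOf(String xml, String elementName) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // We drive the loop ourselves: next() advances the cursor one event at a time.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    // Reads the text content and leaves the cursor on the matching end tag.
                    out.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read document", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<book><title>A</title><page>x</page><title>B</title></book>";
        System.out.println(textOf(xml, "title"));
    }
}
```

Compared with SAX there are no callbacks: the calling code decides when to move the cursor, which makes it easy to stop early or skip over parts of a large file.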


More info:
 