Woodstox is a full-featured, high-performance Open Source Java XML processor. It implements the Streaming API for XML, Stax (JSR-173, javax.xml.stream), and is available under two Open Source licenses (Apache License, LGPL).
Some of the noteworthy features of Woodstox 4.1 (the latest stable release) are:
- Full XML 1.0 and 1.1 support, including complete namespace and DTD support (the only Java parser besides Xerces that can make this claim)
- Implements both Stax and SAX (2.0) APIs
- Support for validation using all major streamable schema languages: DTD, W3C Schema and RelaxNG; both for reading AND writing.
- Complete Stax2 extension API implementation, including the efficient Typed Access API for working with typed data, not just text (from integral numbers all the way to Base64-encoded binary content!)
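To give a feel for the Stax programming model mentioned above, here is a minimal sketch that uses only the standard javax.xml.stream API. It does not call any Woodstox-specific class: it relies on `XMLInputFactory.newInstance()` picking up whichever Stax implementation is available at runtime, which is assumed to be Woodstox when its jar is on the classpath. The class name `StaxElementNames` and the helper `elementNames` are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxElementNames {

    // Hypothetical helper: returns the local names of all start elements,
    // in document order, by pulling events from a Stax stream reader.
    static List<String> elementNames(String xml) throws XMLStreamException {
        // Standard Stax factory lookup; the concrete implementation is
        // whatever the runtime provides (assumed: Woodstox if present).
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // next() advances the cursor and returns the event type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><item>1</item><item>2</item></root>";
        System.out.println(elementNames(xml));
    }
}
```

Because the code is written against the Stax interfaces only, the same source should work unchanged with any compliant Stax parser.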
FasterXML offers full services for solving demanding high-throughput XML processing use cases using Woodstox, as well as related tools.
The Tutorial page has links to tutorials for using Woodstox (or Stax parsers in general).
The Documentation page has links to more in-depth documentation.
The Download page has the artifacts you need to use Woodstox.
Woodstox Project Page at GitHub
The Cowtown Blog covers various topics, including the development and use of Woodstox.
Woodstox also forms the basis of some command-line tools:
DTDFlatten is a tool for "flattening" DTDs -- that is, pre-processing DTDs such that all entity expansions are resolved, resulting in a single ("flat") DTD file, with no DTD entities left. This is useful for troubleshooting, as well as optimizing deployments.
ValidateXML is a command-line utility for validating XML documents against arbitrary DTDs, allowing overriding of DTDs defined in DOCTYPE declarations.
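The idea of validating against a DTD other than the one named in the DOCTYPE declaration can be sketched with the standard JDK SAX API. This is not how ValidateXML is implemented and it does not use Woodstox classes; it only illustrates the general technique of substituting a DTD through an entity resolver. The class name `DtdOverrideValidation` and the method `validate` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class DtdOverrideValidation {

    // Hypothetical helper: validates xml against overrideDtd, ignoring the
    // system identifier given in the document's DOCTYPE declaration.
    // Returns the list of validity error messages (empty if valid).
    static List<String> validate(String xml, String overrideDtd) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // enable DTD validation
        SAXParser parser = factory.newSAXParser();

        List<String> problems = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Simplification: every external entity request is answered
                // with the override DTD, whatever identifier was asked for.
                return new InputSource(new StringReader(overrideDtd));
            }

            @Override
            public void error(SAXParseException e) {
                // Validity violations are reported as recoverable errors.
                problems.add(e.getMessage());
            }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return problems;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>";
        String xml = "<!DOCTYPE note SYSTEM \"note.dtd\"><note><to>Ann</to></note>";
        System.out.println(validate(xml, dtd).isEmpty() ? "valid" : "invalid");
    }
}
```

Documents that are not well-formed still cause `parse` to throw, since only validity errors are collected here.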