Jackson Best Practices: Performance
Jackson is designed to add a minimal amount of overhead to your JSON processing. But as with other high-performance engines, actual real-life performance depends on usage patterns. Here are some things you can do to ensure efficient operation.
1. Reuse when appropriate
Some objects that Jackson uses are either expensive to create, or can reuse other things that are expensive to create. This is specifically true for the following objects:
- ObjectMapper: object mappers cache serializers and deserializers, which are created the first time a handler is needed for a given type (more precisely, the mapper holds a reference to Provider objects that do the caching). If you do not reuse mappers, new serializers and deserializers need to be created each time, and these are expensive operations due to the amount of introspection and annotation processing involved.
- JsonFactory: all JsonParser and JsonGenerator instances used for the Streaming API and Data Binding are constructed by a JsonFactory, and heavy-weight objects like symbol tables are reused by these factories. As such, it is beneficial to reuse the factories. (NOTE: ObjectMappers hold their own references to JsonFactory instances.)
Fortunately, reuse of these objects is very easy: they are thread-safe provided that all configuration is done before any use (and from a single thread). After initial configuration, use is fully thread-safe and does not need to be explicitly synchronized.
Some other commonly used objects may also be reused, although the benefits are much smaller:
- ObjectReader/ObjectWriter: these are light-weight immutable objects that can be safely shared between threads (and thus reused) as well; however, there are no performance drawbacks from creating new instances, beyond basic construction of the objects (which simply populates fields and is very cheap).
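The reuse advice above can be sketched as a single shared mapper that is configured once and then used from any thread. This is only an illustration, assuming Jackson 2.x (the com.fasterxml.jackson packages) is on the classpath; the class name SharedMapper and its two helper methods are hypothetical.

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

// Hypothetical holder class: one mapper for the whole application.
final class SharedMapper {
    // All configuration happens here, in the static initializer,
    // before any other thread can see or use the mapper.
    private static final ObjectMapper MAPPER = new ObjectMapper()
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

    private SharedMapper() { }

    // Cached (de)serializers are reused on every call.
    static Map<?, ?> readMap(String json) {
        try {
            return MAPPER.readValue(json, Map.class);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String write(Object value) {
        try {
            return MAPPER.writeValueAsString(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the design is that nothing reconfigures MAPPER after class initialization, so callers need no synchronization of their own.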
2. Clean up after use
As a general rule, you should close all closeable objects once you are done with them. This means closing JsonParser and JsonGenerator instances.
Why does this matter? Some resources (internal buffers) can be released when instances are closed, and will then be reusable for other instances (this is handled by the JsonFactory that creates the instances). This matters most when handling small JSON documents, but it is good practice to follow in general.
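One way to make sure streaming objects are always closed is try-with-resources. The following is a minimal sketch, assuming Jackson 2.x and Java 7 or later; the class CloseAfterUse and its methods are hypothetical names used only for illustration.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

final class CloseAfterUse {
    // The factory is shared; parsers and generators are short-lived.
    private static final JsonFactory FACTORY = new JsonFactory();

    // Counts the tokens in a document; the parser is closed on exit,
    // including when an exception is thrown.
    static int countTokens(String json) {
        try (JsonParser parser = FACTORY.createParser(json)) {
            int count = 0;
            while (parser.nextToken() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a one-field object; closing the generator also flushes it.
    static String writeGreeting(String name) {
        StringWriter out = new StringWriter();
        try (JsonGenerator gen = FACTORY.createGenerator(out)) {
            gen.writeStartObject();
            gen.writeStringField("hello", name);
            gen.writeEndObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```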
3. Use the "least processed" forms of input sources (and output targets)
In general, Jackson is optimized for use cases where input is given in its raw form, and it can do all necessary transformations and buffering very efficiently. Consider the following:
- Jackson can decode and encode UTF-8 fast, faster than the default JDK encoder would (and very likely faster than code you would write). So if your input comes in as an InputStream (like what servlets get), pass it as such, and do NOT construct an InputStreamReader for it. This is not going to help anyone.
  - Not only is this the fastest way to do it, it is also less likely to lead to problems, since Jackson can reliably auto-detect the actual physical encoding (as per the JSON specification and the encodings it allows).
- Further: if all you have is a File or URL to read input from, pass that reference and do not bother constructing a matching input stream -- Jackson knows how to do this, and can then also ensure that the resource gets closed when the JsonParser or JsonGenerator gets closed.
- Don't bother buffering input: Jackson does it anyway (correctly and efficiently).
Another way to put this is that it is unlikely that you can help much by pre-processing things. It is a bit like not trying to "help" the plumber who comes to unclog your toilet -- let the expert do his/her job.
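As a rough illustration of the points above, the sketch below hands an InputStream to Jackson untouched, with no InputStreamReader and no BufferedInputStream in between. It assumes Jackson 2.x; the class RawInput and its method are hypothetical.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class RawInput {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Pass the raw byte stream as-is: Jackson detects the encoding
    // from the leading bytes and does its own buffering.
    static JsonNode read(InputStream in) {
        try {
            return MAPPER.readTree(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same method accepts UTF-8 and UTF-16 encoded content without the caller naming a charset, which is what wrapping the stream in a Reader would force you to do.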
4. Change defaults only when you need to
As a general rule, default settings are chosen for optimal (or at least good enough!) performance, and all "slow features" need to be specifically enabled. Although most configurable features do not have a significant performance impact, some do, usually because they implement less often needed functionality and thus have not been extensively optimized. Because of this, it makes sense to use default settings unless you specifically need non-default settings.
Examples of settings that can reduce performance.
As with other minor performance concerns, it makes sense to measure the effects if you really want to optimize performance -- it may be that you cannot see enough of a difference for the settings to really matter.
5. If you need to re-process, replay, don't re-parse
In cases where you cannot process JSON content in one fell swoop, you do not have to (and should not!) serialize intermediate results back into textual JSON: reading and writing textual JSON is still somewhat costly, even with a fast processor like Jackson. Instead, you can use more efficient intermediate forms, such as:
- If you have a JSON tree (JsonNode) that needs further processing, you can construct a JsonParser over it by calling JsonParser jp = treeRoot.traverse();
- If you just need to read through the same stream multiple times, you can construct an org.codehaus.jackson.util.TokenBuffer (com.fasterxml.jackson.databind.util.TokenBuffer in Jackson 2.x), fill it with tokens, and re-read it as many times as you want, with very little overhead.
As an added bonus, TokenBuffer is itself a JsonGenerator, so you can even serialize regular POJOs into a TokenBuffer.
Use of either JsonNode or TokenBuffer can boost performance nicely in cases where incremental processing is necessary. Further, due to the wide range of available conversions, the resulting code can be simple and easy to follow as well.
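Both intermediate forms can be sketched as follows. This is an illustration only, assuming Jackson 2.3 or later (where TokenBuffer lives in com.fasterxml.jackson.databind.util and has a two-argument constructor); the class ReplayExample and its methods are hypothetical.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.util.TokenBuffer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

final class ReplayExample {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Tree form: parse once, then expose the tree as a token stream
    // instead of writing it out as text and parsing it again.
    static String nameFromTree(String json) {
        try {
            JsonNode root = MAPPER.readTree(json);
            try (JsonParser parser = root.traverse()) {
                Map<?, ?> map = MAPPER.readValue(parser, Map.class);
                return String.valueOf(map.get("name"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Token form: serialize a value into a TokenBuffer once, then bind
    // from the buffered tokens twice. Returns the sum of the two map sizes.
    static int bindTwice(Object value) {
        try (TokenBuffer buffer = new TokenBuffer(MAPPER, false)) {
            MAPPER.writeValue(buffer, value);
            int total = 0;
            for (int pass = 0; pass < 2; pass++) {
                // Each asParser() call starts from the first buffered token.
                try (JsonParser parser = buffer.asParser()) {
                    total += MAPPER.readValue(parser, Map.class).size();
                }
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```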
6. Static typing
Static types are slightly more efficient for serialization: for example, if a class is known to be final, the serializer may be able to skip some type checking. So although it probably does not make sense to add 'final' modifiers just for this purpose (the speed difference is unlikely to be huge; typically in the range of 10 - 20% for raw serialization), it is nice to know that this can have some performance benefit.
7. Prefer ObjectReader/ObjectWriter over ObjectMapper
Although the main reason to prefer these objects is thread-safety (and thus correctness), there may be performance benefits as well. Jackson versions 2.1 and above can reuse root-level (de)serializers, making reuse of ObjectReaders and ObjectWriters a bit more efficient than using ObjectMapper.
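A possible shape for this is to build a typed ObjectReader and ObjectWriter once and share them. The sketch below assumes Jackson 2.6 or later (for readerFor and writerFor); PointCodec and its nested Point class are hypothetical and exist only for this illustration.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.ObjectWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PointCodec {
    // Hypothetical POJO used only for this example.
    public static final class Point {
        public int x;
        public int y;
        public Point() { }
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();
    // Immutable and thread-safe: created once for the root type, then shared.
    private static final ObjectReader READER = MAPPER.readerFor(Point.class);
    private static final ObjectWriter WRITER = MAPPER.writerFor(Point.class);

    static Point parse(String json) {
        try {
            return READER.readValue(json);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String format(Point point) {
        try {
            return WRITER.writeValueAsString(point);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```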
8. Use ObjectReader.readValues for sequences
When reading a root-level sequence of POJOs, the readValues() method of ObjectReader can be much more efficient than explicit iteration using the ObjectMapper.readValue() method.
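For instance, a whitespace-separated sequence of root-level objects could be read like this. It is a minimal sketch assuming Jackson 2.6 or later; the class SequenceExample and the "id" property are made up for the illustration.

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SequenceExample {
    private static final ObjectReader MAP_READER =
            new ObjectMapper().readerFor(Map.class);

    // Collects the "id" property of every root-level object in the input.
    static List<Object> ids(String content) {
        List<Object> ids = new ArrayList<>();
        // One parser and one deserializer serve the whole sequence;
        // closing the iterator closes the underlying parser.
        try (MappingIterator<Map<String, Object>> values =
                     MAP_READER.readValues(content)) {
            while (values.hasNext()) {
                ids.add(values.next().get("id"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }
}
```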