Design goals

The goals of WKB4J are:

  1. (very) fast retrieval of data from a database using the WKB format,
  2. (very) fast conversion of raw WKB data into a geographical toolkit,
  3. compatibility with several different geographical toolkits.

There are many geographical toolkits out there, each one modeling the very same things (points, lines, polygons...) in slightly different ways. For historical reasons, some are very close to the WKB format (JTS, for example), while others are further away (OpenMap). The goal of WKB4J is to allow the author of each toolkit to integrate WKB4J easily into their own toolkit while maintaining strong performance.

Overall design

This was accomplished by splitting the driver into three parts: WKBReader, WKBParser and WKBFactories.

The WKBReader retrieves raw data from a data source. The data is then passed to the WKBParser. Right now the only possible data source is a PostgreSQL database, but fetching data from files or other databases would be trivial to add.

The WKBParser sequentially processes the WKB stream and emits calls as it encounters items in the data stream.

Those calls are performed on a WKBFactory provided by the user. This architecture draws from both the Strategy pattern (the WKBFactory is provided by the user, so it can really be anything) and the SAX API (an event-based API for XML processing). A WKBFactory is free to respond to calls as it sees fit, including ignoring them. This allows the author of each WKBFactory to implement the most efficient processing code for their own toolkit. For example, some toolkits handle only immutable objects, while others allow objects to be constructed incrementally.
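
The flow described above can be sketched as follows. This is only an illustration, not the WKB4J API: the names SketchFactory, SketchParser and RecordingFactory and their methods are hypothetical, and the parser handles just the WKB Point (type 1) and LineString (type 2) layouts.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; the real WKBFactory methods may differ.
interface SketchFactory {
    void beginGeometry(int wkbType);
    void addPoint(double x, double y);
    void endGeometry();
}

// Minimal event-emitting parser: reads the stream once and calls the factory.
final class SketchParser {
    static final int POINT = 1;
    static final int LINESTRING = 2;

    static void parse(byte[] wkb, SketchFactory factory) {
        ByteBuffer buf = ByteBuffer.wrap(wkb);
        // First byte of WKB: 0 = big-endian (XDR), 1 = little-endian (NDR).
        buf.order(buf.get() == 0 ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        int type = buf.getInt();
        factory.beginGeometry(type);
        switch (type) {
            case POINT:
                factory.addPoint(buf.getDouble(), buf.getDouble());
                break;
            case LINESTRING:
                int pointCount = buf.getInt();
                for (int i = 0; i < pointCount; i++) {
                    factory.addPoint(buf.getDouble(), buf.getDouble());
                }
                break;
            default:
                throw new IllegalArgumentException("Unsupported WKB type: " + type);
        }
        factory.endGeometry();
    }
}

// A user-supplied factory is free to do anything with the calls; this one records them.
final class RecordingFactory implements SketchFactory {
    final List<String> events = new ArrayList<>();

    @Override
    public void beginGeometry(int wkbType) { events.add("begin:" + wkbType); }

    @Override
    public void addPoint(double x, double y) { events.add("point:" + x + "," + y); }

    @Override
    public void endGeometry() { events.add("end"); }
}
```

A factory for a toolkit with immutable objects would typically buffer the points and build the object in endGeometry, while a toolkit with mutable objects could append each point as it arrives.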

In order to simplify the job of the implementor of a custom factory, several default implementations are provided:

  • AbstractWKBFactory: this factory contains code that keeps tabs on the flow of events and complains when something is wrong (for example, if the WKB stream tries to insert a GeometryCollection in a LineString). It can be seen as a "validating" WKBFactory. If you're sure that nothing is wrong with your data, you can choose not to derive from this class and just enjoy the speed. Still, the overhead of this class is very low.
  • LoggingWKBFactory: a Decorator around an AbstractWKBFactory that logs each call to a Log4J system. During development of a new WKBFactory, you can subclass LoggingWKBFactory in order to visualize the flow of calls. Once you're reasonably sure that your WKBFactory is correct, you can bypass LoggingWKBFactory by extending AbstractWKBFactory instead. Normal call events are logged at the Info level. Once again, the overhead of this class is low.
  • EmptyWKBFactory: this factory is just empty, for the author to fill in the blanks. The author of a new WKBFactory shouldn't extend it but just copy it into their own application.
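
As a rough illustration of the validating and logging ideas above, here is a minimal sketch. All names (NestingEvents, ValidatingEvents, LoggingEvents) are hypothetical and the event interface is reduced to begin/end calls. The nesting rule is deliberately simplified to "only Multi* and GeometryCollection types may contain other geometries", and java.util.logging stands in for Log4J so that the example is self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.logging.Logger;

// Hypothetical event interface, reduced to begin/end calls for brevity.
interface NestingEvents {
    void beginGeometry(int wkbType);
    void endGeometry();
}

// Validating base class: tracks open geometries and rejects illegal nesting.
abstract class ValidatingEvents implements NestingEvents {
    // WKB type codes: 1 Point, 2 LineString, 3 Polygon, 4 MultiPoint,
    // 5 MultiLineString, 6 MultiPolygon, 7 GeometryCollection.
    static final int FIRST_CONTAINER_TYPE = 4;

    private final Deque<Integer> open = new ArrayDeque<>();

    @Override
    public void beginGeometry(int wkbType) {
        Integer parent = open.peek();
        // Simplified rule: Point, LineString and Polygon cannot hold sub-geometries.
        if (parent != null && parent < FIRST_CONTAINER_TYPE) {
            throw new IllegalStateException(
                "Geometry of type " + wkbType + " cannot be nested in type " + parent);
        }
        open.push(wkbType);
    }

    @Override
    public void endGeometry() {
        if (open.isEmpty()) {
            throw new IllegalStateException("endGeometry without matching beginGeometry");
        }
        open.pop();
    }

    // Number of geometries currently open.
    int depth() {
        return open.size();
    }
}

// Decorator: logs each call at Info level, then delegates to the wrapped factory.
final class LoggingEvents implements NestingEvents {
    private static final Logger LOG = Logger.getLogger("wkb.sketch");
    private final NestingEvents delegate;

    LoggingEvents(NestingEvents delegate) {
        this.delegate = delegate;
    }

    @Override
    public void beginGeometry(int wkbType) {
        LOG.info("beginGeometry(" + wkbType + ")");
        delegate.beginGeometry(wkbType);
    }

    @Override
    public void endGeometry() {
        LOG.info("endGeometry()");
        delegate.endGeometry();
    }
}
```

The sketch wraps the factory by composition, so logging can be removed by passing the inner factory directly to the parser; the subclassing approach described above reaches the same result by changing the parent class.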

3D Support

Currently, there is no support for height information in the Simple Features Specification, but it seems that many projects (including PostGIS) implement the Two-and-a-half-D extensions for Simple Features proposal. One important point: it is possible to mix 2D geometries with 3D geometries in "container" geometries (GeometryCollection, MultiPolygon, MultiLineString, MultiPoint), but you can't mix 2D points with 3D points in the same geometry, since lists of points are actually arrays of doubles without any boundary between points.

Since it isn't standard, it isn't implemented in the normal WKBFactories and WKBParser. Instead, it is specified in the WKBFactories3D interfaces and the WKBParser3D. Factories that wish to support this specification should implement the WKBFactories3D interface. JTSFactories and PostGISFactory do, but OpenMapFactory doesn't, since OpenMap doesn't support height information by default. If your data contains 3D points, you should use the WKBParser3D instead of WKBParser.
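
To make the "arrays of doubles without any boundary" point concrete, here is a minimal sketch. CoordinateSketch and its methods are hypothetical, and how WKBParser3D actually learns the dimension of a geometry is not covered here, so the sketch simply takes it as a parameter. Because the coordinates arrive as one flat run of doubles, a single stride (2 or 3) has to apply to every point of the geometry.

```java
import java.nio.ByteBuffer;

final class CoordinateSketch {
    // Reads pointCount points of the given dimension (2 or 3) into one flat array.
    // Nothing in the data marks where one point ends and the next begins.
    static double[] readPoints(ByteBuffer buf, int pointCount, int dimension) {
        double[] coords = new double[pointCount * dimension];
        for (int i = 0; i < coords.length; i++) {
            coords[i] = buf.getDouble();
        }
        return coords;
    }

    // Returns the height of point number index; only meaningful with a stride of 3.
    static double z(double[] coords, int dimension, int index) {
        if (dimension < 3) {
            throw new IllegalArgumentException("2D coordinates carry no height");
        }
        return coords[index * dimension + 2];
    }
}
```

Reading the same bytes with the wrong stride would not fail; it would silently shift every coordinate after the first point, which is why 2D and 3D points cannot share one list.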

The implementation was straightforward, but it did require some modifications to the WKBParser to make it more generic.

Moot points

Some design decisions are up for grabs.

How does the driver decide that it has seen all the data for a given geometry and that it should move on to another geometry?

We can either let the WKBFactory decide for itself by keeping depth counters, or we can have the driver perform calls to signal where geometries begin and where they end. Depth counters add comparisons everywhere, so they put more burden on the developer of the factory. Additional calls, well, the developers have to implement them, so that isn't much better. I picked the solution of additional calls because they actually represent the stream of WKB data, while depth counters only predict what should come next.

SRIDs aren't actually part of the WKB format. Nevertheless, some toolkits (JTS, for example) might require their objects to be created with one.