R-Tree Index Packing

R-Tree Index Packing

The PostGIS spatial index currently splits pages based on longest edge. This results in trees that have sub-optimal spatial clustering in the branches, which in turn makes for sub-optimal clustering. A packed R-Tree would provide the highest performance clustering.

Clustering is the act of re-writing a data table onto physical disk in the "index order", so that records that are close together in the index tree are close together on the disk. This makes physical data access faster. For spatial data, the optimal cluster is spatially oriented – spatially close together records should be close together on the disk.

The PostGIS R-Tree does not automatically pack records into spatial clusters. Often several branches of the index tree cover the same spatial area. A packed tree would have a minimum number of branches covering a particular spatial area. There are various algorithms for R-Tree index packing which could be useful, but they generally apply to in-memory indexes. The main development effort would be in figuring out how to create a packed structure in the PostgreSQL GiST disk-based index.

Funding

This roadmap item is currently unfunded.

Get a quote now!

Get a quote or read more about core development to add your support to a road-map item.

Other Roadmap Items

Higher Dimensional Index

The PostGIS spatial index currently works exclusively in the X/Y plane. For data such as fleet tracking the “Z” or “time” dimension can serve as a useful query restriction. Higher dimensional indexes are needed.

R-Tree Index Packing

The PostGIS spatial index currently splits pages based on longest edge. This results in trees that have sub-optimal spatial clustering in the branches, which in turn makes for sub-optimal clustering. A packed R-Tree would provide the highest performance clustering.

Better Estimated Extent

Client software, particularly software that does coordinate system reprojection, requires the extent of a layer in order to integrate it with other layers. Spatial databases are generally slow at returning that information.

All roadmap items...