Indexing of multi-dimensional data has been the focus of a considerable amount of research effort over many years but no generally agreed paradigm has emerged to compare with the impact of the B-Tree, for example, on the indexing of one-dimensional data. At the same time, the need for efficient methods is ever more important in an environment where databases become larger and more complex in their structures.
Mapping multi-dimensional data to one dimension, thus enabling one-dimensional access methods to be exploited, has been suggested in the literature but for the most part interest has been confined to the Z-order curve. The possibility of using other curves, such as the Hilbert and Gray-code curves, whose characteristics differ from those of the Z-order curve, has also been suggested.
In this thesis we design and implement a working file store which is underpinned by the principle of mapping multi-dimensional data to one of a variety of space-filling curves and their variants. Data is then indexed using a B+ Tree which remains compact, regardless of the volume and number of dimensions. The implementation has entailed developing algorithms for mapping data to one dimension and, most importantly, developing algorithms to facilitate the querying of data in a flexible way. We focus on the Hilbert curve but also consider other curves and propose new alternative algorithms for querying data mapped to the Z-order curve.
The current implementation accommodates data in up to sixteen dimensions but the approach is generic and not limited to this number. We report on preliminary testing of the implemetation, which provides very encouraging results. We also undertake a brief exploration of the application of space-filling curves to the indexing of spatial data.
This paper presents and discusses a radically different approach to multi-dimensional indexing based on the concept of the space-filling curve. It reports the novel algorithms which had to be developed to create the first actual implementation of a system based on this approach, on some comparative performance tests, and on its actual use within the TriStarp Group at Birkbeck to provide a Triple Store repository. An important result that goes beyond this requirement, however, is that the performance improvement over the Grid File is greater the higher the dimension.
Mapping to one-dimensional values and then using a one-dimensional indexing method has been proposed as a way of indexing multi-dimensional data. Most previous related work uses the Z-Order Curve but more recently the Hilbert Curve has been considered since it has superior clustering properties. Any approach, however, can only be of practical value if there are effective methods for executing range and partial match queries. This paper describes such a method for the Hilbert Curve.
The Hilbert Curve describes a method of mapping between one and $n$ dimensions. Such mappings are of interest in a number of application domains including image processing and, more recently, in the indexing of multi-dimensional data. Relatively little work, however, has been devoted to techniques for mapping in more that 2 dimensions. This paper presents a technique for constructing state diagrams to facilitate mappings and is a specialization of an incomplete generic process described by Bially. Although the storage requirements for state diagrams increase exponentially with the number of dimensions, they are useful in up to about 9 dimensions.
This report reproduces and briefly discusses an algorithm proposed by Butz for calculating a mapping between one-dimensional values and n-dimensional values regarded as being the coordinates of points lying on Hilbert Curves. It suggests some practical improvements to the algorithm and presents an algorithm for calculating the inverse of the mapping, from n-dimensional values to one-dimensional values.