Abstract: The cost estimation of spatial queries has a considerable computational cost, especially in situations involving simple and multiway spatial joins, as various execution plans must be considered for these types of queries. To select the most efficient plan, it is common to employ spatial histograms, which provide an estimate of the number of objects that will be retrieved in each query. However, this estimate faces inaccuracies due to three main sources of error: multiple object counting, approximation of objects using the Minimum Bounding Rectangle (MBR), and the assumption of uniformity in the equations used for estimation. This work proposes an algorithm capable of generating synthetic datasets that reproduce these errors (jointly and individually) and evaluates the unique contribution of each of them to the inaccuracy of the estimates of the IHWAF, Euler Improved, MinSkew, and EulerSkew histograms. We conducted a set of experiments using window queries – which are the basis of more complex estimates involving joins and multi-joins – on the generated datasets and observed that, regardless of the type (line or polygon) and extent of the dataset, the multiple object counting error is mainly responsible for the inaccuracy in the histogram selectivity estimation. This is followed by the uniformity assumption error, which is more pronounced in smaller datasets. In summary, the analysis of the results highlights the importance of considering these factors in improving histogram techniques and, consequently, the process of optimizing spatial queries, improving computational efficiency and the accuracy of the results obtained.

Keywords: Histogram; Selectivity Estimation Error; Window Query; Dataset Generator.

Complete monograph. Copyright © 2024. All rights reserved.

Disclaimer: Although the student carefully wrote the original abstract, and it was revised and improved, English is not him or the advisor' mother language. The original work is written in Portuguese.

Citation: Gustavo Rezende Gouveia. Caracterização do Erro de Estimativa de Seletividade para Consultas Espaciais de Janela. Monografia. Bacharelado em Ciência da Computação. Universidade Federal de Jataí. Jataí, GO, Brasil. 2024. 59p.

Copy citation in bibtex format.