Characterization of Selectivity Estimation Accuracy in Distributed Multiway Spatial Queries
Abstract: Devices capable of collecting spatial data are increasingly present in our daily lives, but the systems responsible for processing this data have not evolved at the same pace. These data and systems play a crucial role in decision-making across many areas of human activity. In medicine, for example, they can be used to check whether a given cancer has regressed in response to oncological therapies. The multiway join is an important and complex type of spatial query used to process such data. It involves a series of steps whose efficient execution is fundamental to reducing the use of computing resources. As in conventional relational database systems, spatial systems have a component called the query optimizer, which uses the estimated selectivity to choose the best execution plan. This work evaluated the accuracy of selectivity estimation across the steps of the multiway spatial join. To this end, an experiment was conducted with five queries, each involving ten real datasets. The selectivity of each query was also estimated using two methods proposed in the literature. The results were assessed with a metric and visualized in graphs to facilitate analysis. We concluded that the selectivity error propagates exponentially along the query steps when the estimation method provides poor estimates. This does not occur when the estimation method is more accurate, in which case the propagation tends to be more controlled. Because the propagation of errors along query steps leads to the selection of poor execution plans, we expect the results of this research to clarify the behavior of selectivity estimation and to provide valuable insights for future studies in this area.
Keywords: Multiway Spatial Join; Selectivity Estimation; Error Propagation; Distributed Processing.
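The exponential propagation mentioned in the abstract can be illustrated with a minimal sketch. It is not the monograph's metric or experimental setup. It assumes that each join step misestimates its selectivity by the same multiplicative factor, and that a query over ten datasets runs as nine pairwise join steps. The function name `propagated_error` and both factors are hypothetical.

```python
def propagated_error(per_step_factor: float, steps: int) -> list[float]:
    """Return the cumulative estimation-error factor after each join step.

    Assumption (not from the monograph): every step is off by the same
    multiplicative factor, so the errors compound as factor ** step.
    """
    errors = []
    cumulative = 1.0
    for _ in range(steps):
        cumulative *= per_step_factor  # error of this step multiplies the previous one
        errors.append(cumulative)
    return errors


# Hypothetical factors: a poor estimator (2x off per step) and a more accurate one (5% off).
poor = propagated_error(2.0, 9)
good = propagated_error(1.05, 9)

for step, (p, g) in enumerate(zip(poor, good), start=1):
    print(f"step {step}: poor={p:.2f}x  good={g:.2f}x")
```

Under these assumptions the poor estimator is off by a factor of 512 after nine steps, while the more accurate one stays at about 1.55, which matches the qualitative contrast the abstract describes.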
Complete monograph. Copyright © 2024. All rights reserved.
Disclaimer: Although the student wrote the original abstract carefully, and it was later revised and improved, English is not the native language of either the student or the advisor. The original work is written in Portuguese.
Citation: Leonardo Paiva Vieira. Caracterização da Precisão da Estimativa de Seletividade em Consultas Distribuídas de Multijunção Espacial. Monograph. Bacharelado em Ciência da Computação. Universidade Federal de Jataí. Jataí, GO, Brasil. 2024. 51p.