Abstract: Spatial data processing has grown significantly in recent years, and computing devices equipped with GPS (Global Positioning System) and network connectivity (2G, 3G, and others), such as mobile phones, smartphones, and sensors, are increasingly common and affordable. Spatial data is widely available: geolocated images, open data from federal, state, and municipal governments, maps of commercial stores, and georeferenced data collected by governmental entities, among others. All these data enable us to produce new information. An example of spatial data processing is the spatial join query, which finds correlated information in two or more datasets. Such a query can be quite complex to process due to the amount of data involved, so distributed execution is often recommended in the literature. In a distributed system, a query is partitioned into a set of tasks so that multiple machines can process it, and the distribution of the tasks is optimized so that the query executes in the shortest time possible. One parameter used for this division of tasks is selectivity. This research analyzes methods that estimate the selectivity of a query. Several experiments were performed to compare the selectivity estimates of Euler and Grid histograms. Our experiments showed that, in a distributed system scenario with distinct grid resolutions for each dataset, the Euler histogram performs worse than the Grid histogram. In other scenarios, with an aligned grid for both datasets, the superiority of the Euler histogram was confirmed, as reported in the original article.

Keywords: Spatial Databases; Geographic Information Systems; Distributed Processing; Spatial Join.

Complete monograph. Copyright © 2018. All rights reserved.

Disclaimer: Although the student carefully wrote the original abstract, and it was revised and improved, English is not the native language of either the student or the advisor. The original work is written in Portuguese.

Citation: Andrey Gonçalves França. Accuracy of Selectivity Estimation for Distributed Spatial Join Tasks using Euler Histograms. Monograph. Bacharelado em Ciências da Computação. Universidade Federal de Goiás, Regional Jataí. Jataí, GO, Brasil. 2018. 63p.
