Distributed Spatial Approximation Tree - SAT*

José Peñarrieta (1), Patricio Morriberón (1), Ernesto Cuadros-Vargas (1,2)

e-mails: jpenarrieta@inf.ucsp.edu.pe, pmorriberon@inf.ucsp.edu.pe, ecuadros@spc.org.pe

(1) San Pablo Catholic University
(2) Peruvian Computer Society

Abstract

The problem of classifying elements by similarity has a lot of applications. In this paper we propose a new Metric Access Method (MAM) called "Distributed Spatial Approximation Tree (SAT*)" based on the "Spatial Approximation Tree (SAT)" [Navarro, 2002]. SAT is based on approaching "spatially" the searched objects. However, this MAM cannot assure an optimal distribution because it chooses its root randomly. For example, it can choose an extreme of the dataset as the root, so the remaining objects would be at the same side of the root, heading to very inefficient queries.

We present as a possible solution to this problem, an algorithm called "Centroid Selection Algorithm (CSA)" which is based on the idea of choosing the center of the dataset as the root. The advantage of SAT* is that it assures a much better distribution of the data structure, heading to more efficient queries. Experiments show that distance calculations are reduced up to 48% compared with SAT.


PDF de este artículo
PDF de JPC2006 (incluye todos los artículos)