The Sixth DIMACS Implementation Challenge:
Near Neighbor Searches
Workshop Presenters
- Title:
"Nearest Neighbor Search in Vector Quantization"
- Authors:
- Kevin Zatloukal (kevinz@cs.washington.edu)
- Mary Holland Johnson (mhjohns@ee.washington.edu)
- Richard Ladner (ladner@cs.washington.edu)
- Organization:
University of Washington
- Proposal Summary:
- Data Set:
- Software Contributions:
- Title:
"Analysis of Approximate Nearest Neighbor Searching with Clustered
Point Sets"
- Authors:
- Songrit Maneewongvatana, songrit@cs.umd.edu
- David Mount, mount@cs.umd.edu
- Organization:
University of Maryland, College Park
- Proposal Summary:
- Data Set:
- Software Contributions:
- Title:
"Analysis and Improvement of the SR-tree: an Index Structure for
Nearest Neighbor Searches"
- Authors:
- Norio Katayama (katayama@rd.nacsis.ac.jp)
- Shin'ichi Satoh (satoh@rd.nacsis.ac.jp)
- Organization:
National Center for Science Information Systems, Japan
- Proposal Summary:
Experimental evaluation of the Sphere/Rectangle-Tree (SR-Tree) for
high-dimensional nearest neighbor queries. This is a secondary-memory
data structure, corresponding to the nested hierarchy of the space
decomposition while permitting overlap. For more details, see
article in Proc. 1997 ACM SIGMOD, pp 369-380, or see their website at
www.rd.nacsis.ac.jp/~katayama/homepage/research/srtree/English.html
, which includes an online demonstration.
- Data Set:
40700 images were collected from the photo and the video archives of
NASA. As for videos, cuts are detected based on the transition of the
color histogram and then representative images are extracted. The
similarity of images is measured in terms of the color histogram.
Munsell color space (hue, saturation, intensity) is used. It is
divided into nine subspaces: black, gray white and six colors.
Histograms of four sub-regions, i.e., upper-left, upper-right,
lower-left and lower-right, are calculated for an image in order to
take account of the composition of the image. Four histograms are
concatenated to compose a 36-dimensional feature vector. Then,
feature vectors are reduced to 20-dimensional vectors with the
principal component analysis. Similarity of images is measured by
Euclidean distance among these 20-dimensional vectors. As an
experiment, 37000 of these vectors are randomly selected to be in the
original data set, with the remaining 3700 used as query points.
- Software Contributions:
- SR-Tree algorithm
- Data Set Contributions:
- Image vectors (Euclidean)
- Title:
"Experimentation with an Approximate Nearest Neighbours Search
Algorithm based on the Extended General Spacefilling Curves
Heuristic"
- Authors:
- Juan-Carlos Pérez (jcperez@disca.upv.es)
- Enrique Vidal (evidal@iti.upv.es)
- Organization:
Universidad Plitécnica de Valencia
- Proposal Summary:
- Data Set:
- Software Contributions:
- Title:
"Nearest Neighbor Searching in Metric Spaces: Some Experimental
Results"
- Authors:
- Kenneth
L. Clarkson (clarkson@research.bell-labs.com)
- Organization:
Bell Laboratories, Lucent Technologies
- Proposal Summary:
- Data Set:
- Software Contributions:
- Title:
"Excluded Middle Vantage Point Forests for Nearest Neighbor
Search"
- Authors:
- Peter N. Yianilos (pny@research.nj.nec.com)
- Organization:
NEC Research Institute
- Proposal Summary:
- Data Set:
- Software Contributions:
- Title:
"Evaluating Some Fast Nearest Neighbor Search Algorithms in Metric
Spaces"
- Authors:
- Luisa Micó (mico@dlsi.ua.es)
- Jose Oncina (oncina@dlsi.ua.es)
- Organization:
Universidad de Alicante
- Proposal Summary:
Evaluation of various AESA-based algorithms (in particular, LAESA
and TLAESA). AESA-based algorithms find the exact nearest neighbor in
arbitrary metric spaces. The general methodology invovles choosing a
representative sample of "base prototypes" from the original set.
Preprocessing of the data set is done based on this sample. At query
time, branch and bound techniques are used for finding the nearest
neighbor.
Variants in the algorithms depend on the exact data structures used
for representing the preprocessed information, and in the manner in
which the set of base prototypes is chosen.
- Data Set:
- Software Contributions:
Sixth
DIMACS Implementation Challenge.
Previous
DIMACS Implementation Challenges.
DIMACS Home Page.
Maintained by Michael Goldwasser
challenge6@dimacs.rutgers.edu
Document last modified on November 6, 1998.