The Sixth DIMACS Implementation Challenge:
Near Neighbor Searches
In conjunction with its Special Year
on Massive Data Sets, the Center for Discrete Mathematics and
Theoretical Computer Science
(DIMACS) invites participation in an Implementation Challenge
focusing on Near Neighbor Searches. The Implementation
Challenge will take place between January 1998 and January 1999.
Participants are invited to carry out research projects related to the
problem area and to present research papers at a DIMACS workshop to be
held on January 15, 1999 (see Workshop
Information). A refereed workshop proceedings will be published
as part of the AMS-DIMACS
book series.
Finding near neighbors according to various distance measures arises
in many application areas, including but not limited to finding
similar web documents, images, DNA sequences, lines of text, or audio
and video clips, and more generally in areas of statistics, data
mining, information retrieval, pattern recognition and data
compression. The goal of the Implementation Challenge is to determine
how algorithms depend on the structure of the underlying data sets, as
well as to determine realistic algorithm performance where worst case
analysis is overly pessimistic and probabilistic models are too
unrealistic. We invite research projects in the following two
directions.
Providing Interesting Data Sets and Distance
Measures. We invite researchers from various application
areas to provide interesting data sets for near neighbor problems.
Contributions could consist either of sample data sets from a true
application or of realistic instance generators resembling practical
data sets. Our goal is to construct a library of test problems that
will be available for study both during and after the Challenge.
Developing Near Neighbor Algorithms. Neighbor
algorithms may be developed either for specific application domains or
for more general spaces. Algorithms may aim to find the nearest
neighbor, several nearest neighbors, or approximate neighbors.
Projects may involve either public domain or proprietary codes.
Contents:
- Steering Committee:
- Michael Goldwasser, Coordinator, Princeton University
- Jon Bentley, Bell Labs
- Ken Clarkson, Bell Labs
- David Johnson, AT&T Labs
- Catherine McGeoch, Amherst College
- Robert Sedgewick, Princeton University
Previous
DIMACS Implementation Challenges.
DIMACS Home Page.
Maintained by Michael Goldwasser
challenge6@dimacs.rutgers.edu
Document last modified on May 26, 1999.