DIMACS TR: 2001-43

QuickSAND: Quick Summary and Analysis of Network Data



Authors: Anna C. Gilbert, Yannis Kotidis, S. Muthukrishnan, and Martin. J. Strauss

ABSTRACT

Monitoring and analyzing traffic data generated from large ISP networks imposes challenges both at the data gathering phase as well as the data analysis itself. Still, both tasks are crucial for responding to day to day challenges of engineering large networks with thousands of customers. In this paper we build on the premise that approximation is a necessary evil of handling massive datasets such as network data. We propose building compact summaries of the traffic data called sketches at distributed network elements and centers. These sketches are able to respond well to queries that seek features that stand out of the data. We call such features ``heavy hitters.'' In this paper, we describe sketches and show how to use sketches to answer aggregate and trend-related queries and identify heavy hitters. This may be used for exploratory data analysis of network operations interest. We support our proposal by experimentally studying AT&T WorldNet data and performing a feasibility study on the Cisco NetFlow data collected at several routers.

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-43.ps.gz
DIMACS Home Page