To aid in comparisons between papers on the clique problem, the
DIMACS Challenge Steering Committee is making available two benchmark
programs, dfmax.c and dmclique.c.
PROGRAM 1: dfmax.c implements a simple-minded branch-and-bound
algorithm very similar to that of Carraghan and Paradalos
[Oper.Res.Lett. 9 (1990), 375-382]. Subproblems are pruned only
when the number of vertices still eligible for the clique is not
enough to unseat the current champion. This C code, written by David
Applegate and David Johnson, has a somewhat tighter inner loop than
does the C&P Fortran code, and runs significantly faster on a
variety of architectures. It assumes a 32-bit word length, but
otherwise should be reasonably portable. It appears to compile
successfully under both ANSI and non-ANSI C-compilers and
has been run on a VAX-8550, various SGI machines, and SPARCstations.
On an SGI Challenge computer (comparable to a DEC Alpha or a
Sun SPARCstation 10), it can handle a 500 vertex, p=0.5 random
graph in 3 minutes. A summary of results of the SGI Challenge for
random graphs and the DIMACS benchmarks is in the file results.dfmax.
INPUT: dfmax is designed to take as input the compressed "clq.b" format
used in the DIMACS benchmark directories. Usage: "dfmax filename".
OUTPUT: To indicate that dfmax is making progress (always useful in
a potentially exponential time algorithm), it prints a line of output
each time a vertex is eliminated from further consideration, giving the
current biggest clique found and the current running time (assuming your
computer's clock ticks 100 times per second). It also reports input time
at the beginning of a run and outputs the clique at the end.
CUSTOMIZING: The defined constant NMAX, currently set to 2400, gives the
maximum number of vertices that can be handled. If you increase this
value, note that the constants MAX_NR_VERTICES and MAX_NR_VERTICESdiv8
must also be adjusted as indicated in the code. (The code was
kludged together from at least three sources, and so does not set
standards for programming style...) To run the code as an algorithm
for finding maximum stable sets, place an "!" immediately before
the word "bitmap" in the line beginning "#define edge".
As the code currently stands, it implements the vertex pre-ordering
scheme C&P. This can be turned off by commenting out the
"#define REORDER" line. To compile, simply say "make dfmax".
BENCHMARKING: Whether you are studying exact or approximate algorithms,
this code can serve as a simple benchmark for calibrating your machine's
speed relative to those of the other participants.
If possible, please compile and run it on the supplied p=0.5 random
graphs r100.5.b, r200.5.b, r300.5.b, r400.5.b, and r500.5.b and report
your times. For comparison purposes, the SGI Challenge "user" times
(not counting input time) for one run on each graph were
0.08, 1.00, 7.87, 47.68, and 179.98 seconds, respectively.
The code can also be used as a common point of reference
for papers implementing exact clique algorithms. It is easily beaten
on dense graphs (average degree exceeding |V|/2) but may be hard
to beat on sparser graphs, especially random ones.
PROGRAM 2: dmclique.c is a variant on the simple "semi-exhaustive
greedy" scheme for finding large independent sets used in the
graph coloring algorithm XRLF described in Johnson, Aragon, McGeoch,
and Schevon [Oper.Res. 39 (1991), 378-406], originally suggested by
Matula and Johri. dmclique has four integer input parameters, SETLIM,
TRIALNUM, CANDNUM, and ITERATIONS, of which the key parameters are
SETLIM and CANDNUM. The algorithm proceeds as follows:
At any given time there is a set C (initially empty) of
vertices in the current clique, and a set U consisting of all those
remaining vertices that are adjacent to all members of C.
If |U| <= SETLIM, the branch-and-bound code of dfmax (without the
initial C&P pre-ordering) is invoked to find the maximum clique in
U, and this is added to C. Otherwise, we sample CANDNUM vertices
(with replacement) and choose one with maximum degree
(ties broken in favor of the earlier index). The chosen vertex is
then added to C, U is updated, and we repeat. We then repeat
the entire process TRIALNUM*ITERATIONS times.
Algorithm dmclique algorithm has much in common with the "GRASP"
algorithm of Feo and Resende, except the latter sets SETLIM = 0 and
uses a final local optimization phase to improve the clique found,
typically with slightly better results than are obtained with dmclique.
The value of dmclique as a benchmark code (as with dfmax) is that it
represents what can be obtained by a (relatively) efficient
implementation of the most straightforward of ideas.
As with dfmax.c, dmclique.c should compile successfully on a variety
of machines. A summary of results on an SGI Challenge for random graphs
and the DIMACS Challenge benchmarks is given in the file "results.dmclique"
These tests were all done with TRIALNUM = 1 and ITERATIONS = 1000, and
the file reports the distributions of clique sizes found over those
1000 runs. Setting the parameters SETLIM and CANDNUM was by trial
and error, but as a rule of thumb, take CANDNUM between |V|/5 and |V|/4,
and start with SETLIM = 100 for density <= 0.5, SETLIM = 50 for
density = 0.7, and SETLIM = 30 for density = 0.9.
INPUT: dmclique, like dfmax, is designed to take as input the compressed
"clq.b" format used in the DIMACS benchmark directories. Usage:
"dmclique filename SETLIM TRIALNUM CANDNUM ITERATIONS".
OUTPUT: An initial output line reports the parameter settings.
The second output line gives the input time. Then there is one
output line for each successive set of TRIALNUM runs, giving the size
of the best clique found during those runs as well as the total time
taken for those runs. The best clique found is currently NOT output,
although it would be a simple matter to modify the code so it was.
CUSTOMIZING: The defined constant NMAX is handled as in dfmax,
as is the conversion to look for stable sets instead of cliques.
One possible compilation problem: the code assumes the
existence of a random number generator "random()" that returns a
random 31-bit integer and is initialized by a call to "srandom()".
If a different syntax is used on your machine, you will have to
modify the code in a couple of places. To compile, say "make dmclique".
BENCHMARKING: If your paper is on approximation algorithms
for clique, this code should serve as a useful standard for
comparison. Ideally you should re-run the tests summarized
in "results.dmclique" on your own machine (for those instances
you ran your own code on). If time does not allow, an acceptable
substitute would be to scale the times reported there by
the "dfmax r500.5.b" speed-up (slow-down) factor between your
machine and the SGI Challenge.