Consider the diagonal entries d_j, j=1,2,...,n, of the matrix D in an LDL^T factorization of an n-by-n matrix X. As a function of X, each d_j is well-defined on the closed domain of positive semidefinite matrices. We show that these functions are twice continuously differentiable and concave throughout the interior of this domain. Using these facts, we show how to formulate semidefinite programming problems as standard convex optimization problems that can be solved using an interior-point method for nonlinear programming.

Note: This research has been facilitated by the Special Year on Large Scale Discrete Optimization, via a graduate fellowship and the Workshop on Semidefinite Programming and Its Applications to Large Scale Optimization.
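
The concavity claim in the abstract above can be checked numerically on small examples. The following is a minimal sketch, assuming the plain textbook LDL^T recurrence without pivoting on symmetric positive definite input; the helper names `ldl_diagonal`, `midpoint` and `is_midpoint_concave` are illustrative and do not come from the report.

```python
def ldl_diagonal(X):
    """Diagonal d_1..d_n of D in X = L D L^T, with L unit lower triangular.

    Plain textbook recurrence without pivoting; assumes X is symmetric
    positive definite so that every pivot d_j is positive.
    """
    n = len(X)
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = X[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        L[j][j] = 1.0
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (X[i][j] - s) / d[j]
    return d


def midpoint(A, B):
    """Entrywise average (A + B) / 2 of two square matrices."""
    return [[0.5 * (a + b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def is_midpoint_concave(A, B, tol=1e-12):
    """Check d_j((A+B)/2) >= (d_j(A) + d_j(B)) / 2 for every j."""
    dm = ldl_diagonal(midpoint(A, B))
    da, db = ldl_diagonal(A), ldl_diagonal(B)
    return all(m >= 0.5 * (a + b) - tol for m, a, b in zip(dm, da, db))


A = [[4.0, 2.0], [2.0, 3.0]]
B = [[2.0, -1.0], [-1.0, 2.0]]
print(ldl_diagonal(A), ldl_diagonal(B), is_midpoint_concave(A, B))
```

Each d_j is the Schur complement of the leading (j-1)-by-(j-1) block within the leading j-by-j block, which is one way to see why the midpoint inequality should hold; the check above only samples it and proves nothing.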

%A Yuval Shavitt %A Peter Winkler %A Avishai Wool %T On the Economics of Multicasting %D January 13, 2000 %Z Wed, 2 Feb 2000 23:00:00 GMT %I DIMACS %R 2000-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-02.ps.gz %X A supplier of multicast information services will often be faced with the following problem: Broadcasting to the whole customer base (including non-paying customers) is cheaper than multicasting only to the paying customers. However, broadcasting discourages potential customers from paying. The result is an economic game in which the supplier tries to maximize profit in the face of rational, but not omniscient, behavior by customers.

In this work we build a model for such environments, which we believe is both reasonably realistic and amenable to mathematical analysis. The supplier's basic strategy is to broadcast every service for which the fraction of subscribed customers exceeds some threshold. We then model the customers' behavior, taking into account the possibility of customers receiving services for free. From this model, coupled with some mild assumptions on the supplier's cost structure, we can find the optimal setting of the supplier's broadcast threshold. The solution necessarily depends on choosing functions which describe the customers' utility for the offered services; we study in detail several such choices.

In all the examples we studied, our model predicts that the supplier's profits will be maximized if the supplier's broadcast threshold is set below 100%: The loss in revenue due to customers subscribing to fewer services is offset by the cost savings made possible by broadcasting the most popular services to all customers. We found our model to be fairly robust with respect to parameter choices. As such, we believe it can be of value to a supplier in devising a multicast/broadcast strategy, and that broadcasting when subscriptions are sufficiently high is likely to be the approach of choice in maximizing profits.

%A Alexander Kelmans %A Igor Pak %A Alexander Postnikov %T Tree and Forest Volumes of Graphs %D January 27, 2000 %Z Wed, 2 Feb 2000 23:00:00 GMT %I DIMACS %R 2000-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-03.ps.gz %X
The *tree volume of a weighted graph* $G$ is the ``sum'' of
the tree volumes of all spanning trees of $G$, and the
*tree volume of a weighted tree* $T$ is the product of the edge
weights of $T$ times the ``product'' of the letters of the Pr\"{u}fer
code of $T$ where the vertices of $G$ are viewed as independent
indeterminates that can be multiplied and commute.
The *forest volume of* $G$ is the tree volume of the graph $G^c$
obtained from $G$ by adding a new vertex $c$ and connecting every
vertex of $G$ with $c$ by an arc of weight 1.
We show that the forest volume is a natural generalization of
the Laplacian polynomial of graphs and that it also can be expressed
as the characteristic polynomial of a certain matrix similar to the
Laplacian matrix. It turns out that the forest volumes of graphs
possess many important properties of the Laplacian polynomials;
for example, the reciprocity theorem also holds for the forest volumes.
We describe two constructions of graph compositions, and show that
the forest volume of a composition can be easily found if the
``structure'' of the composition and the forest volumes of
the graph--bricks are known. As an illustration of the results on
the forest volume of graph--compositions we give a combinatorial
interpretation and proof of Hurwitz's identity.

**Keywords:**
graph,
tree, forest,
spanning tree,
Laplacian matrix and polynomial,
tree and forest volume.
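
The "product of the letters of the Prüfer code" in the definition above can be made concrete with a short sketch. It assumes vertices labelled 0..n-1 and the common convention of repeatedly removing the smallest-labelled leaf; the report's own labelling convention may differ, and the function name `prufer_code` is illustrative.

```python
import heapq
from collections import Counter


def prufer_code(n, edges):
    """Prüfer code of a labelled tree on vertices 0..n-1 (smallest leaf first)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in range(n) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        (nbr,) = adj[leaf]          # a leaf has exactly one neighbour
        code.append(nbr)
        adj[nbr].discard(leaf)
        adj[leaf].clear()
        if len(adj[nbr]) == 1:      # the neighbour has just become a leaf
            heapq.heappush(leaves, nbr)
    return code


path = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
print(prufer_code(4, path), prufer_code(4, star))
print(Counter(prufer_code(4, star)))
```

Every vertex v occurs deg(v) - 1 times in the code, so when the vertices are treated as commuting indeterminates the product of the letters is the monomial with exponent deg(v) - 1 on each vertex.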

We show new lower bounds for satisfiability and nondeterministic
linear time. Satisfiability cannot be solved on general purpose
random-access Turing machines in time *n*^{1.618}
and space *n*^{o(1)}. This improves recent results of
Fortnow and of Lipton and Viglas.

In general, for any constant *a* less than the golden ratio, we
prove that satisfiability cannot be solved in time
*n ^{a}* and space

Higher up the polynomial-time hierarchy we can get better bounds. We
show that linear-time alternating computations with at most *k*
alternations require essentially *n ^{k}* time if we only allow

An independent set $C$ of vertices in a graph is an efficient dominating set when each vertex not in $C$ is adjacent to exactly one vertex in $C$.

A countable family of nested graphs, each of which has at least one efficient dominating set, is produced for each real number in the unit interval [0,1].

These families range from star graphs, for which the efficient domination property was proved by Arumugam and Kala, to pancake graphs.

The families obtained are extended to partitions of the vertex sets into efficient dominating sets. Uniqueness holds for the given partitions whenever the countable family is not that of the star graphs.

For star graphs, all the possible ways in which such a partition occurs are established.
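
The definition of an efficient dominating set used above translates directly into a checker. This is a minimal sketch in which a graph is assumed to be given as a dictionary from each vertex to its set of neighbours; the 6-cycle is only a small illustration and is not one of the graph families studied in the report.

```python
def is_efficient_dominating_set(adj, C):
    """True if C is independent and every vertex outside C has exactly one neighbour in C."""
    C = set(C)
    if any(adj[u] & C for u in C):      # two vertices of C are adjacent
        return False
    return all(len(adj[v] & C) == 1 for v in adj if v not in C)


# 6-cycle 0-1-2-3-4-5-0
cycle6 = {v: {(v - 1) % 6, (v + 1) % 6} for v in range(6)}
print(is_efficient_dominating_set(cycle6, {0, 3}))   # antipodal vertices
print(is_efficient_dominating_set(cycle6, {0, 2}))   # vertex 1 sees both
```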

%A Peter C. Fishburn %A Fred S. Roberts %T Full Color Theorems for L(2,1)-Colorings %D March 16, 2000 %Z Thu, 15 Jun 2000 15:00:00 GMT %I DIMACS %R 2000-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-08.ps.gz %X The span $\lambda(G)$ of a graph $G$ is the smallest $k$ for which $G$'s vertices can be $L(2,1)$-colored, i.e., colored with integers in $\{0,1, \ldots, k \}$ so that adjacent vertices' colors differ by at least two, and colors of vertices at distance two differ. $G$ is full-colorable if some such coloring uses all colors in $\{0,1, \ldots, \lambda(G) \}$ and no others. We prove that all trees except stars are full-colorable. The connected graph $G$ with the smallest number of vertices exceeding $\lambda(G)$ that is not full-colorable is $C_6$. We describe an array of other connected graphs that are not full-colorable and go into detail on full-colorability of graphs of maximum degree four or less.

%A Suh-Ryung Kim %A Fred S. Roberts %T Competition Graphs of Semiorders and the Conditions $C(p)$ and $C^*(p)$ %D March 16, 2000 %Z Wed, 19 Apr 2000 16:00:00 GMT %I DIMACS %R 2000-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-09.ps.gz %X Given a digraph $D$, its competition graph has the same vertex set and an edge between two vertices $x$ and $y$ if there is a vertex $u$ so that $(x,u)$ and $(y,u)$ are arcs of $D$. Motivated by a problem of communications, we study the competition graphs of the special digraphs known as semiorders. This leads us to define conditions on digraphs called $C(p)$ and $C^*(p)$ and to study the graphs arising as competition graphs of acyclic digraphs satisfying conditions $C(p)$ or $C^*(p)$.
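
The competition graph defined above can be built directly from the arc list. This is a minimal sketch, assuming a digraph is given as a list of vertices and a list of arcs `(x, u)`; the function name and the tiny example digraph are illustrative and not taken from the report.

```python
from itertools import combinations


def competition_graph(vertices, arcs):
    """Edges {x, y} such that some vertex u has arcs (x, u) and (y, u)."""
    in_nbrs = {u: set() for u in vertices}
    for x, u in arcs:
        in_nbrs[u].add(x)
    edges = set()
    for preds in in_nbrs.values():
        # any two distinct in-neighbours of the same vertex compete
        for x, y in combinations(sorted(preds), 2):
            edges.add((x, y))
    return edges


arcs = [("a", "c"), ("b", "c"), ("b", "d")]
print(competition_graph(["a", "b", "c", "d"], arcs))
```

Here `a` and `b` are joined because both have an arc into `c`; `d` has a single in-neighbour and so contributes no edge.
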
%A Endre Boros %A Takashi Horiyama %A Toshihide Ibaraki %A Kazuhisa Makino %A Mutsunori Yagiura %T Finding Small Sets of Essential Attributes in Binary Data %D March 17, 2000 %Z Wed, 19 Apr 2000 16:00:00 GMT %I DIMACS %R 2000-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-10.ps.gz %X We consider the problem of finding support sets (i.e., sets of essential attributes) in a given data set, which consists of n-dimensional binary vectors of positive examples and negative examples. A set of attributes is a support set if positive examples and negative examples can be separated by using only the attributes in the set. Finding small support sets is an important topic in such fields as knowledge discovery, data mining, learning theory and logical analysis of data. Based on several measures of separation, we discuss why finding small support sets is important, and how to find such sets, together with results of some computational experiments. Theoretical analysis of the approximation ratios of the proposed algorithms is also provided.

%A Vince Grolmusz %T A Note on Set Systems with no Union of Cardinality 0 Modulo m %D March 27, 2000 %Z Wed, 19 Apr 2000 16:00:00 GMT %I DIMACS %R 2000-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-11.ps.gz %X Alon, Kleitman, Lipton, Meshulam, Rabin and Spencer (Graphs. Combin. 7 (1991), no. 2, 97-99) proved that for any hypergraph ${\cal F}=\{F_1,F_2,\ldots, F_{d(q-1)+1}\}$, where $q$ is a prime-power, and $d$ denotes the maximal degree of the hypergraph, there exists an ${\cal F}_0\subset {\cal F}$, such that $|\bigcup_{F\in{\cal F}_0}F|\equiv 0\pmod{q}$. We give a direct, alternative proof for this theorem, and we also give an explicit construction of a hypergraph of degree $d$ and size $\Omega(d^2)$ which does not contain a non-empty sub-hypergraph with a union of size 0 modulo 6.

%A Jie Wu %A Li Sheng %T An Efficient Sorting Algorithm for a Sequence of Kings in a Tournament %D April 4, 2000 %Z Wed, 19 Apr 2000 16:00:00 GMT %I DIMACS %R 2000-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-12.ps.gz %X A king $u$ in a tournament is a player who beats ($\rightarrow$) any other player $v$ {\em directly or indirectly}. That is, either $u \rightarrow v$ or there exists a third player $w$ such that $u \rightarrow w$ and $w \rightarrow v$. A sorting sequence of kings in a tournament of $n$ players is a sequence of players, $S=(u_1$, $u_2$, ..., $u_n)$, such that $u_i \rightarrow u_{i+1}$ and $u_i$ is a king in the sub-tournament $T_{u_i}$ induced by $u_i, u_{i+1}, ..., u_n$ for $i=1, 2,..., n-1$. The existence of a sorting sequence of kings in any tournament is shown in \cite{LouW00}, where a sorting algorithm with a complexity of $\Theta(n^{3})$ is given. In this paper, we present a constructive proof for the existence of a sorting sequence of kings of a tournament and propose an efficient algorithm with a complexity of $\Theta(n^2)$.

%A Yuri Levin %A Adi Ben-Israel %T Directional Newton Methods in %X Directional Newton methods for functions $f$ of $n$ variables are shown to converge, under typical assumptions, to a solution of $f(\mathbf{x})=0$. The rate of convergence is quadratic, for near-gradient directions, and directions along components of the gradient of $f$ with maximal modulus. These methods are applied to solving systems of equations without inversion of the Jacobian matrix.

**Key words and phrases.** Newton Method, Single equations, Systems of
equations

**Mathematics Subject Classification.** Primary 65H05, 65H10; Secondary 49M15
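
A directional Newton step for a single equation in several variables can be sketched as follows. The abstract above does not spell out the iteration, so this assumes the usual form $\mathbf{x}_{k+1} = \mathbf{x}_k - \frac{f(\mathbf{x}_k)}{\nabla f(\mathbf{x}_k)\cdot \mathbf{d}_k}\,\mathbf{d}_k$ with two illustrative choices of direction $\mathbf{d}_k$: the gradient itself, and the coordinate axis on which the gradient has maximal modulus. Function and parameter names are hypothetical.

```python
def directional_newton(f, grad, x0, direction="gradient", steps=50, tol=1e-12):
    """Iterate x <- x - f(x) / (grad f(x) . d) * d for a chosen direction d."""
    x = list(x0)
    for _ in range(steps):
        fx = f(x)
        if abs(fx) < tol:
            break
        g = grad(x)
        if direction == "gradient":
            d = g
        else:
            # unit vector along the gradient component of largest modulus
            m = max(range(len(g)), key=lambda i: abs(g[i]))
            d = [1.0 if i == m else 0.0 for i in range(len(g))]
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope == 0.0:        # directional derivative vanished, cannot step
            break
        x = [xi - (fx / slope) * di for xi, di in zip(x, d)]
    return x


def circle(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0


def circle_grad(x):
    return [2.0 * x[0], 2.0 * x[1]]


print(directional_newton(circle, circle_grad, [2.0, 1.0]))
print(directional_newton(circle, circle_grad, [2.0, 1.0], direction="max_component"))
```

Both runs end on the unit circle but at different points, since a single equation in two unknowns has a curve of solutions and the direction rule decides which one is reached.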

Distance estimation is important to many Internet applications, most
notably for a WWW client that needs to select a server among several
potential candidates. Current approaches to
distance (i.e., time delay) estimation in the
Internet are based on placing *Tracer* stations in key
locations and conducting measurements between them.
The *Tracers* construct an approximated map of the Internet after
processing the information obtained from these measurements.

This work presents a novel algorithm, based on algebraic tools,
that computes *additional* distances, which are not explicitly
measured. As such, the algorithm extracts more information
from the same amount of measurement data.

Our algorithm has several practical impacts. First, it
can reduce the number of
*Tracers* and measurements without sacrificing information.
Second, our algorithm is able to compute
distance estimates between locations where
*Tracers* cannot be placed.
This is especially important when unidirectional
measurements are conducted, since such measurements require
specialized equipment which cannot be placed everywhere.

To evaluate the algorithm's performance, we tested it
both on randomly generated topologies and on real Internet
measurements. Our results show that
the algorithm computes
up to 50-200% additional distances beyond the basic
*Tracer-to-Tracer* measurements.

We study a model of Ising spins with short range ferromagnetic and long range SK interactions. We generalize the results obtained for the standard SK model, computing in particular the high temperature pressure.

RÉSUMÉ. We study a model of binary spins that is subject to both Ising-type and Sherrington-Kirkpatrick-type interactions. We generalize results obtained for the SK model alone, computing in particular the high-temperature pressure.

%A Endre Boros %A Vladimir Gurvich %A Leonid Khachiyan %A Kazuhisa Makino %T Generating Weighted Transversals of a Hypergraph %D June 21, 2000 %Z Mon, 26 Jun 2000 15:00:00 GMT %I DIMACS %R 2000-17 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-17.ps.gz %X We consider a generalization of the notion of transversal to a finite hypergraph, so-called {\em weighted transversals}. Given a non-negative weight vector assigned to each hyperedge of the input hypergraph, we define a weighted transversal as a minimal vertex set which intersects a collection of hyperedges of sufficiently large total weight. We show that the hypergraph of all weighted transversals is dual-bounded, i.e., the size of its dual hypergraph is polynomial in the number of weighted transversals and the size of the input hypergraph. Our bounds are based on new inequalities of extremal set theory and threshold Boolean logic, which may be of independent interest. For instance, we show that for any threshold frequency, the number of maximal frequent sets of columns in a binary matrix is bounded by the number of minimal infrequent sets of columns in the same matrix multiplied by the number of its rows. We also prove that the problem of generating all weighted transversals for a given hypergraph is polynomial-time reducible to the generation of all ordinary transversals for another hypergraph, i.e., to the well-known hypergraph dualization problem. As a corollary, we obtain an incremental quasi-polynomial-time algorithm for generating all weighted transversals for a given hypergraph. This result includes as special cases the generation of all the minimal Boolean solutions to a given system of non-negative linear inequalities and the generation of all minimal infrequent sets of columns for a given binary matrix.

%A Shigang Chen %A Yuval Shavitt %T A Scalable Distributed QoS Multicast Routing Protocol %D July 26, 2000 %Z Thu, 3 Aug 2000 15:00:00 GMT %I DIMACS %R 2000-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-18.ps.gz %X Many Internet multicast applications such as teleconferencing and remote diagnosis have Quality-of-Service (QoS) requirements. Such requirements can be additive (end-to-end delay), multiplicative (loss rate) or of a bottleneck nature (bandwidth). For these applications, QoS multicast routing protocols are important in enabling new receivers to join a multicast group. However, current routing protocols are either too restrictive in their search for a feasible path between a new receiver and the multicast tree, or burden the network with excessive overhead.

In this paper we propose S-QMRP, a new Stateless QoS Multicast Routing Protocol that supports all three QoS requirement types. S-QMRP is scalable because it has very small communication overhead and requires no state outside the multicast tree; yet, it retains a high success probability. S-QMRP achieves a favorable tradeoff between routing performance and overhead by carefully selecting the network sub-graph in which it conducts the search for a path that can support the QoS requirement, and by auto-tuning the selection according to the current network conditions. S-QMRP does not require any global network state to be maintained and can operate on top of any unicast routing protocol. Our extensive simulation shows that S-QMRP performs better than the previously suggested protocols.

%A A. Barg %A D. Nogin %T Bounds for Packings of Spheres in the Grassmann Manifolds %D July 27, 2000 %Z Sat, 5 Aug 2000 13:00:00 GMT %I DIMACS %R 2000-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-19.ps.gz %X We derive the Varshamov--Gilbert and Hamming bounds for packings of spheres (codes) in the Grassmann manifolds over $\mathbb R$ and $\mathbb C$. The distance between two $k$-planes is defined as $\rho(p,q)=(\sin^2\theta_1+\dots+\sin^2\theta_k)^{1/2}$, where $\theta_i, 1\le i\le k$, are the principal angles between $p$ and $q$.
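
The distance $\rho(p,q)$ defined above can be evaluated without computing the principal angles one by one. The sketch below covers the real case only and assumes each $k$-plane is given by $k$ linearly independent spanning vectors. It uses the identity $\sum_i \cos^2\theta_i = \|P^{T}Q\|_F^2$ for orthonormal bases $P$ and $Q$, so that $\rho^2 = k - \|P^{T}Q\|_F^2$; the helper names are illustrative.

```python
import math


def orthonormalize(vectors):
    """Gram-Schmidt: orthonormal basis of the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm < 1e-12:
            raise ValueError("spanning vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


def chordal_distance(P, Q):
    """rho(p, q) = sqrt(sin^2(theta_1) + ... + sin^2(theta_k)) for real k-planes."""
    p, q = orthonormalize(P), orthonormalize(Q)
    if len(p) != len(q):
        raise ValueError("both planes must have the same dimension k")
    k = len(p)
    # sum of cos^2(theta_i) = squared Frobenius norm of the matrix of inner products
    cos2 = sum(sum(a * b for a, b in zip(u, v)) ** 2 for u in p for v in q)
    return math.sqrt(max(0.0, k - cos2))


e12 = [[1, 0, 0, 0], [0, 1, 0, 0]]
e34 = [[0, 0, 1, 0], [0, 0, 0, 1]]
print(chordal_distance(e12, e34))
print(chordal_distance(e12, [[1, 1, 0, 0], [1, -1, 0, 0]]))
```

Two orthogonal 2-planes in $\mathbb R^4$ have both principal angles equal to $\pi/2$, and two different bases of the same plane are at distance zero up to rounding.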

%A A. Barg %A G. Cohen %A S. Encheva %A G. Kabatiansky %A G. Zemor %T A Hypergraph Approach to the Identifying Parent Property: the Case of Multiple Parents %D July 27, 2000 %Z Fri, 11 Aug 2000 15:00:00 GMT %I DIMACS %R 2000-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-20.ps.gz %X Let $C$ be a code of length $n$ over an alphabet of $q$ letters. A codeword $y$ is called a descendant of a set of $t$ codewords $x^1,\dots,x^t$ if $y_i\in\{x^1_i,\dots,x^t_i\}$ for all $i=1,\dots,n.$ A code is said to have the identifiable parent property if for any $n$-word that is a descendant of at most $t$ parents it is possible to identify at least one of them. We prove that for any $t\le q-1$ there exist sequences of such codes with asymptotically nonvanishing rate.
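
The descendant relation defined above is a coordinatewise membership test. This is a minimal sketch, assuming words are given as equal-length strings over the alphabet; it checks the relation only and does not attempt parent identification.

```python
def is_descendant(y, parents):
    """True if every coordinate y_i occurs at position i in at least one parent."""
    if any(len(p) != len(y) for p in parents):
        raise ValueError("all words must have the same length")
    return all(any(p[i] == y[i] for p in parents) for i in range(len(y)))


parents = ["0011", "0101"]
print(is_descendant("0111", parents))
print(is_descendant("1000", parents))
```

The word `0111` takes its third coordinate from the first parent and its second from the second parent, while `1000` fails already at the first coordinate.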

%A E. Boros %A K. Elbassioni %A V. Gurvich %A L. Khachiyan %T An Incremental RNC Algorithm for Generating All Maximal Independent Sets in Hypergraphs of Bounded Dimension %D August 3, 2000 %Z Sat, 5 Aug 2000 13:00:00 GMT %I DIMACS %R 2000-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-21.ps.gz %X We show that for hypergraphs of bounded edge size, the problem of extending a given list of maximal independent sets is NC-reducible to the computation of an arbitrary maximal independent set for an induced sub-hypergraph. The latter problem is known to be in RNC. In particular, our reduction yields an incremental RNC dualization algorithm for hypergraphs of bounded edge size, a problem previously known to be solvable in polynomial incremental time. We also give a similar parallel algorithm for the dualization problem on the product of arbitrary lattices which have a bounded number of immediate predecessors for each element.

%A Alexander Kelmans %A Dhruv Mubayi %A Benny Sudakov %T Asymptotically Optimal Tree--Packings in Regular Graphs %D August 8, 2000 %Z Tue, 8 Aug 2000 22:00:00 GMT %I DIMACS %R 2000-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-22.ps.gz %X Let $T$ be a tree with $t$ vertices. Clearly, an $n$ vertex graph contains at most $n/t$ vertex disjoint trees isomorphic to $T$. In this paper we show that for every $\epsilon>0$, there exists a $D(\epsilon,t)>0$ such that, if $d>D(\epsilon,t)$ and $G$ is a simple $d$-regular graph on $n$ vertices, then $G$ contains at least $(1-\epsilon)n/t$ vertex disjoint trees isomorphic to $T$.

**Keywords:** graph, hypergraphs, packing, trees, matchings

In this paper we show that every simple cubic graph on $n$ vertices has a list of at least $n/4$ disjoint 2-edge paths and that this bound is sharp. Our proof provides a polynomial time algorithm for finding such a list in a simple cubic graph.

**Keywords:**
packing, 2--edge path, cubic graph, polynomial time algorithm.

We build a natural language question-answering system to teach autistic patients to reason about mental states. According to our model of autism, some reasoning patterns concerning the perception of the concepts of knowing, believing and intention are corrupted. Based on this model, we have suggested an autism diagnosis and reasoning rehabilitation strategy in which professional psychologists explained a set of multi-agent scenarios to autistic children. Experiments showed that acquiring mental concepts based on our formalism helps autistic children not only to improve their judgment when interacting with other people, but also stimulates their emotional development.

Further evaluation of the model and the rehabilitation technology is conducted with an automatic training toolkit implemented on the Internet. Asking questions about the mental states of the characters in a scene or textual scenario helps to reveal and to train the corrupted autistic reasoning. The natural language technology of semantic headers is applied, where the textual answer (explanation) is assigned a mental formula, which is matched against the representation of an input question or command.

%A A.A. Sapozhenko %T On the Number of Independent Sets in Bipartite Graphs with %D August 27, 2000 %Z Mon, 28 Aug 2000 18:00:00 GMT %I DIMACS %R 2000-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-25.ps.gz %X The asymptotics of the number of independent sets is found for bipartite graphs whose minimum degree is of the same order as the number of vertices.

%A Piotr Berman %A Bhaskar DasGupta %A S. Muthukrishnan %T On the Exact Size of the Binary Space Partitioning of Sets of Isothetic Rectangles with Applications %D September 12, 2000 %Z Fri, 29 Sep 2000 02:00:00 GMT %I DIMACS %R 2000-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-26.ps.gz %X We show an upper bound of 3n on the size of the Binary Space Partitioning (BSP) tree for a set of n isothetic rectangles, and an upper bound of 2n if the rectangles tile the underlying space. This improves the previous bounds of 4n. The BSP tree is one of the most popular data structures, and even the ``small'' factor improvements of 4/3 or 2 that we show improve the performance of applications relying on the BSP tree. Furthermore, our upper bounds yield improved approximation algorithms for several rectangular tiling problems in the literature. We also show a lower bound of 2n in the worst case for a BSP for n isothetic rectangles, and a lower bound of 1.5n if they must form a tiling of the space.

%A Karhan Akcoglu %A James Aspnes %A Bhaskar DasGupta %A Ming-Yang %T Opportunity Cost Algorithms for Combinatorial Auctions %D September 19, 2000 %Z Fri, 29 Sep 2000 02:00:00 GMT %I DIMACS %R 2000-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-27.ps.gz %X Two general algorithms based on opportunity costs are given for approximating a revenue-maximizing set of bids an auctioneer should accept, in a combinatorial auction in which each bidder offers a price for some subset of the available goods and the auctioneer can only accept non-intersecting bids. Since this problem is difficult even to approximate in general, the algorithms are most useful when the bids are restricted to be connected node subsets of an underlying object graph that represents which objects are relevant to each other. The approximation ratios of the algorithms depend on structural properties of this graph and are small constants for many interesting families of object graphs. The running times of the algorithms are linear in the size of the bid graph, which describes the conflicts between bids. Extensions of the algorithms allow for efficient processing of additional constraints, such as budget constraints that associate bids with particular bidders and limit how many bids from a particular bidder can be accepted.

%A Xiaodong Sun %T Explicit Interpolation Sets Using Perfect Hash Families %D October 9, 2000 %Z Mon, 30 Oct 2000 22:00:00 GMT %I DIMACS %R 2000-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-28.ps.gz %X Let $S$ be a set of functions with common domain $D$. We say that $X$, a subset of $D$, is an interpolation set for $S$ if the values on $X$ uniquely determine a function $f$ in $S$. Recently, Piotr Indyk gave an explicit construction of interpolation sets of size $O(k^4 \log ^4 n)$ for the family of boolean functions on $n$ variables which depend symmetrically on at most $k$ variables. Using perfect hash families, we give a variant of Indyk's method that yields an explicit construction of interpolation sets of size $O(k^{10}\log n\log\log n/\log\log\log n)$ for this family of functions.

%A Boris Galitsky %T Technique of semantic headers: a manual for knowledge engineers %D November 20, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-29.ps.gz %X This manual addresses the issues of knowledge representation in the form of semantic headers (SH) for conversion of a textual document into a form appropriate for question answering. A set of semantic headers is intended to formally represent the essential idea of a document with respect to possible questions, such that this document would serve as an answer. The knowledge base then includes the textual answers, structured by means of assigned semantic headers. The formal representation of the natural language (NL) query is then matched against the knowledge base in the form of a set of these headers.

The logical mechanisms of matching query translations against semantic headers and of deriving the semantic headers themselves are presented, as well as applications of meta-reasoning, default reasoning, reasoning about action and time, and graph representation of answer classification. This technique shows superior performance over knowledge systems based on syntactic matching of NL queries with previously prepared NL representations of canonical queries, and over knowledge systems based on fully formalized knowledge. Our approach gives higher precision of answers than the former because it involves semantic information to a higher degree. At the same time, in logically complex and poorly structured domains the SH technique gives more complete answers, is more consistent under context deviation, and is more efficient than the latter approach because full knowledge formalization is not required.

The manual of the semantic header technique is intended to assist linguists and knowledge engineers in the creation of question answering systems for sophisticated vertical domains. The many coding samples and extended discussions help the reader to understand more deeply the peculiarities of using logic programming in knowledge representation. The tutorial part is followed by a qualification test using multiple-choice questions.

%A Larry Shepp %T A Model for Stock Price Fluctuations Based on Information %D October 19, 2000 %Z Mon, 30 Oct 2000 22:00:00 GMT %I DIMACS %R 2000-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-30.ps.gz %X This paper is dedicated to the memory of Aaron Wyner, the well known information theorist. It gives a new model for stock price fluctuations based on a concept of ``information''. In contrast, the usual Black-Scholes-Merton-Samuelson model is based on the {\em explicit} assumption that information plays {\em no} role in stock prices. The new model is based on the non-uniformity of information in the market and the time delay until new information becomes generally known.

The new model is expected to give more accurate predictions of future prices and more accurate formulas for hedge option valuations. The new valuations have been calculated for the various standard options inside the new model in a recent PhD thesis by Xin Guo. Because the concept of information is the driving one in the new model it seemed appropriate to discuss this in the present volume even though ``information'' is used in a somewhat different sense here than in communication theory.

In communication theory, information is {\em intended} to be communicated. One is concerned with designing a means for the successful transfer of messages from a source to the receiver and the value of the information is in its successful transmission. In the stock market, the reverse is usually true; information is hoarded and is of value to the owner {\em only} until it becomes known to others. Some messages are avidly communicated in the market but are really meant to {\em disinform} the receivers. This may of course also be true occasionally in communication theory, which usually avoids concerning itself with the {\em meaning} of the message to be transmitted. Despite these differences in the role of information in the two situations, there are similarities in the mathematics used to study them. Certainly probabilistic models play a prominent role in both theories.

In Shannon's communication theory a quantitative value is assigned to a channel (its rate or its capacity). Is there anything similar in the market? We are able to assign a quantitative value to knowing an item of information {\it which is unknown to others}, say that a company is in a strong (or weak) position at a given point in time, via pricing of options on the company's stock. This paper is intended as a first step towards the goal of making the use of information into a quantitative tool.

Problems with explicit solutions are of value in obtaining insights. At the end of the paper we compare several problems of mathematical interest in order to better understand which optimal stopping problems have explicit solutions.

%A S. Bastea %A R. Esposito %A J. L. Lebowitz %A R. Marra %T Binary Fluids with Long Range Segregating Interaction I: Derivation of Kinetic and Hydrodynamic Equations %D October 24, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-31.ps.gz %X We study the evolution of a two component fluid consisting of ``blue'' and ``red'' particles which interact via strong short range (hard core) and weak long range pair potentials. At low temperatures the equilibrium state of the system is one in which there are two coexisting phases. Under suitable choices of space-time scalings and system parameters we first obtain (formally) a mesoscopic kinetic Vlasov-Boltzmann equation for the one particle position and velocity distribution functions, appropriate for a description of the phase segregation kinetics in this system. Further scalings then yield Vlasov-Euler and incompressible Vlasov-Navier-Stokes equations. We also obtain, via the usual truncation of the Chapman-Enskog expansion, compressible Vlasov-Navier-Stokes equations.

%A Peter C. Fishburn %A Fred S. Roberts %T Minimal Forbidden Graphs for L(2,1)-Colorings %D October 30, 2000 %Z Mon, 30 Oct 2000 22:00:00 GMT %I DIMACS %R 2000-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-32.ps.gz %X The span $\lambda(G)$ of a graph G is the smallest k for which G's vertices can be L(2,1)-colored, i.e., colored with integers in {0,1,...,k} so that adjacent vertices' colors differ by at least two, and colors of vertices at distance two differ. We study minimal forbidden graphs, graphs with the property that every proper subgraph has smaller span.
Fixing the maximum degree, we observe that the first nontrivial case involves minimal forbidden graphs of maximum degree 3 and span 5. We present examples to illustrate the variety of graphs with this property and in particular provide several infinite families of such graphs.

%A E. Boros %A V. Gurvich %A R. Meshulam %T Difference Graphs %D November 1, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-33.ps.gz %X Intersection and measured intersection graphs are quite common in the literature. In this paper we introduce the analogous concept of measured difference graphs: Given an arbitrary hypergraph $\mathcal{H} = \{H_1,...,H_n\}$, let us associate to it a graph on vertex set $[n]=\{1,2,...,n\}$ in which $(i,j)$ is an edge iff the corresponding sets $H_i$ and $H_j$ are ``sufficiently different''. More precisely, given an integer threshold $k$, we consider three definitions, according to which $(i,j)$ is an edge iff (1) $|H_i \setminus H_j| + |H_j \setminus H_i| \geq 2k$, (2) $\max\{|H_i \setminus H_j|,|H_j \setminus H_i|\} \geq k$, and (3) $\min\{|H_i \setminus H_j|,|H_j \setminus H_i|\} \geq k$. It is not difficult to see that each of the above definitions yields a hereditary graph class, which is monotone with respect to $k$. We show that for every graph $G$ there exists a large enough $k$ such that $G$ arises with any of the definitions above. We prove that with the first two definitions one may need $k=\Omega(\log n)$ in any such realization of certain graphs on $n$ vertices. However, we do not know a graph $G$ which could not be realized by the last definition with $k=2$.
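
The three edge rules for measured difference graphs given above can be compared on a small example. This is a minimal sketch, assuming the hypergraph is a list of Python sets and that vertices are numbered from 0 rather than from 1 as in the abstract; the rule names `"sum"`, `"max"` and `"min"` are illustrative labels for definitions (1), (2) and (3).

```python
from itertools import combinations


def difference_graph(H, k, rule):
    """Edge set of the measured difference graph of H under the given rule."""
    edges = set()
    for i, j in combinations(range(len(H)), 2):
        a, b = len(H[i] - H[j]), len(H[j] - H[i])
        if rule == "sum":       # definition (1)
            ok = a + b >= 2 * k
        elif rule == "max":     # definition (2)
            ok = max(a, b) >= k
        elif rule == "min":     # definition (3)
            ok = min(a, b) >= k
        else:
            raise ValueError("rule must be 'sum', 'max' or 'min'")
        if ok:
            edges.add((i, j))
    return edges


H = [{1, 2}, {2, 3}, {1, 2, 3, 4}]
for rule in ("sum", "max", "min"):
    print(rule, sorted(difference_graph(H, 1, rule)))
```

Since $\min \ge k$ implies $a + b \ge 2k$, which in turn implies $\max \ge k$, the edge sets are nested for a fixed $k$: rule (3) gives a subgraph of rule (1), which gives a subgraph of rule (2).
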
%A Srikrishna Divakaran %A Michael Saks %T An Online Scheduling Problem with Job Set-ups %D November 13, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-34.ps.gz %X Machine scheduling and paging are two fundamental problems that have been studied extensively from the view of online algorithms. In both problems, one processes a sequence $J_1,\ldots,J_n$ of {\em jobs} (or {\em page requests}) that arrive over time. In the paging problem, requests must be answered in the order of arrival because the requests arise in the context of the execution of a sequential program. The need for decision making comes from the presence of a cache which can store a limited number of pages, where serving a cached page is considerably cheaper than an uncached one. Thus algorithms for paging are focused around the choice of caching policy. In scheduling, there is no cache, but the scheduler has (at least limited) freedom to reorder the jobs, and choosing the order is the major issue in the design of online scheduling algorithms. In computer systems, however, there are important situations where both caching and job reordering are available. One very natural setting where this happens is handling web page requests at a high volume web site. As in the usual paging problem, the server must manage a cache. But here the requests are (essentially) independent, so the server is free to reorder them.

In this paper, we begin the study of online algorithms for scheduling systems where both caching and reordering are available. We consider a simple, but already interesting and nontrivial, case: that of a single machine with a cache of unit size, and uniform page size. This particular special case is also natural in the usual machine scheduling context where jobs are classified into one of $\{1,2,...,F\}$ job types, and before processing a job, the machine must be configured (or set-up) according to the type of the job. This configuring (or set-up) is required unless the previously served job is of the same type as the current one. The current configuration of the machine in this scheduling problem corresponds exactly to the contents of the unit cache in the page server problem.

In this context, we consider the online problem of scheduling a sequence $J=(J_{1},J_{2},...,J_{N})$ of jobs with release times, processing times and sequence independent set-up times, on a single machine, to minimize the maximum flow time. A partitioning of the jobs into $F$ job types is given. A set-up is required at the start of each batch, where a batch is the largest set of contiguously scheduled jobs of the same job type. This problem is a special case of the maximum lateness problem, a problem known to be $NP$-hard even for very restrictive special cases and for which there are no known polynomial time $O(1)$ approximation algorithms. In this paper, we present a polynomial time $O(1)$-competitive online algorithm for the maximum flow time problem and also prove the $NP$-hardness of the offline version of this problem. %A Oliver Penrose %A George Stell %T Close to close packing %D November 13, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-35.ps.gz %X For various lattice gas models with nearest neighbour exclusion (and, in one case, second-nearest neighbour exclusion as well), we obtain lower bounds on $m$, the average number of particles on the non-excluded lattice sites closest to a given particle. They are all of the form $$ m/m_{cp} \ge 1 - \hbox{~const.}(N_{cp}/N - 1) $$ where $N$ is the number of occupied sites, $m_{cp}$ is the coordination number, and $N_{cp}$ is the value of $N$ at close packing. An analogous result exists for hard disks in the plane. %A E. Boros %A V. Gurvich %A L. Khachiyan %A K.
Makino %T An inequality limiting the number of maximal frequent sets %D November 13, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-37 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-37.ps.gz %X Given an arbitrary $m\times n$ binary matrix $A$ and a threshold $t \in \{1, \ldots, m\}$, a subset $C$ of the columns is called $t$-{\em frequent} if there are at least $t$ rows having a $1$ in each of these columns, and otherwise $C$ is called $t$-{\em infrequent}. Denoting by $\alpha$ the number of maximal $t$-frequent, and by $\beta$ the number of minimal $t$-infrequent column sets in the given matrix $A$, we prove that the inequality $\alpha \leq (m-t+1)\beta$ holds. This inequality is sharp, and allows for an incremental quasi-polynomial algorithm for generating all minimal $t$-infrequent sets, while the analogous problem for maximal $t$-frequent sets is NP-hard. Our proof is based on an inequality from extremal set theory, which may be of independent interest. %A Ayman Khalfalah %A Sachin Lodha %A Endre Szemeredi %T Tight Bound for the Density of Sequences of Integers the Sum of No Two of which is a Perfect Square %D November 16, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-39 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-39.ps.gz %X P. Erd\H{o}s and D. Silverman~\cite{eg-80} proposed the problem of determining the maximal density attainable by a set $S$ of positive integers having the property that no two distinct elements of $S$ sum up to a perfect square. J. P. Massias exhibited such a set consisting of all $x \equiv 1$ (mod $4$) with $x \equiv 14, 26, 30$ (mod $32$) in~\cite{m}. In~\cite{los-82}, J. C. Lagarias, A. M. Odlyzko and J. B.
Shearer showed that for any positive integer $n$, one cannot find more than $\frac{11}{32} n$ residue classes (mod $n$) such that the sum of any two is never congruent to a square (mod $n$), thus essentially proving that Massias' set has the best possible density. They~\cite{los-83} also proved that the density of such a set $S$ is never more than $0.475$ when we allow general sequences.

We improve on the lower bound for general sequences, essentially proving that it is not $0.475$, but arbitrarily close to $\frac{11}{32}$, the same as that for sequences made up of only arithmetic progressions. %A Tiina Heikkinen %T Resource Allocation in a Distributed Network %D November 17, 2000 %Z Tue, 21 Nov 2000 02:00:00 GMT %I DIMACS %R 2000-40 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-40.ps.gz %X

This paper discusses distributed resource allocation in a communication system. Some recent approaches to decentralized resource allocation in a communication system are summarized. A characterization for resource control and pricing in a distributed wireless network with transmit power control is presented.

Power control is modelled as a noncooperative externality game between the users. In the absence of a price each user has the incentive to increase the transmit power to increase the quality of service as measured by the signal-to-noise ratio. However, the increase in transmission energy of a representative user by definition deteriorates the signal-to-noise ratio for all other users, thus imposing an externality cost. The Nash equilibrium in an uncontrolled distributed network is one where the system fails due to excessive congestion. However, by introducing a simple linear pricing penalty, the distributed system can be made to achieve the Pareto-optimal quality of service.

Convergence to an efficient Nash equilibrium under pricing is observed in a numerical example assuming that the
users apply a standard learning automaton.

%A F. R. McMorris
%A Fred S. Roberts
%A Chi Wang
%T The Center Function on Trees
%D December 7, 2000
%Z Fri, 15 Dec 2000 22:00:00 GMT
%I DIMACS
%R 2000-41
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-41.ps.gz
%X When $(X, d)$ is a finite metric space and $\pi =
(x_1 , \ldots, x_k ) \in
X^k$, a central element for $\pi$ is an element $x$ of $X$ for
which max$\{ d(x, x_i ): i = 1 ,\ldots ,k\}$ is minimum. The function
that
returns the set of all central elements for
any tuple $\pi$ is called the center function on $X$. In this note, the
center function on
finite trees is characterized.
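
The definition of the center function can be read off directly as a brute-force computation. The sketch below is only an illustration of that definition: the name `center` and the path example are assumptions, and nothing here reflects the characterization proved in the note.

```python
def center(points, d, profile):
    """Return the set of central elements of `profile` in the metric space (points, d).

    A central element x minimizes max{d(x, x_i) : x_i in profile}.
    """
    eccentricity = {x: max(d(x, xi) for xi in profile) for x in points}
    best = min(eccentricity.values())
    return {x for x, e in eccentricity.items() if e == best}


# Illustrative tree: the path 0 - 1 - 2 - 3 - 4, where d(x, y) = |x - y|.
path = range(5)
dist = lambda x, y: abs(x - y)

print(center(path, dist, (0, 4)))     # the single midpoint
print(center(path, dist, (0, 3)))     # two adjacent vertices tie
print(center(path, dist, (1, 1, 4)))  # repeated entries are allowed in a tuple
```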
%A Italo J. Dejter
%T Perfect Dominating Sets of Grids
%D December 13, 2000
%Z Fri, 15 Dec 2000 22:00:00 GMT
%I DIMACS
%R 2000-42
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-42.ps.gz
%X Let m and n be positive integers. The algorithmic search for perfect
dominating sets of the rectangular grid G(m,n) with an initial
condition S' defined as an admissible subset of a side G(m,1) of
G(m,n) is considered. A binary decision algorithm that generates
all perfect dominating sets of G(m,n) satisfying (or containing) such
S' is presented, and some related questions and conjectures are posed,
leading to consideration of their periodic extendibility to the plane
lattice as well as to extremal properties of the grid depth n for a
fixed grid width m under the presence of such dominating sets.
%A Ashwin Nayak
%A Ashvin Vishwanath
%T Quantum Walk on the Line
%D December 13, 2000
%Z Fri, 15 Dec 2000 22:00:00 GMT
%I DIMACS
%R 2000-43
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-43.ps.gz
%X Motivated by the immense success of random walk and Markov
chain methods in the design of classical
algorithms, we consider {\em quantum\/} walks on graphs.
We analyse in detail the behaviour of unbiased quantum walk on the line,
with the example of a typical walk, the ``Hadamard walk''. In
particular, we show that
after~$t$ time steps, the probability distribution on the line induced
by the Hadamard walk is almost uniformly distributed over the
interval~$[-t/\sqrt{2},\;t/\sqrt{2}]$. This implies that the same walk
defined on the circle mixes in {\em linear\/} time.
This is in direct contrast with the quadratic mixing time for the
corresponding classical walk.
We conclude by indicating
how our techniques may be applied to more general graphs.
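
A small simulation can help make the Hadamard walk concrete. The sketch below assumes one common convention, which the abstract does not spell out: the walker starts at position 0 with the coin in the "left" state, the Hadamard coin is applied first, and then the walker shifts left or right according to the coin. The function name `hadamard_walk` is hypothetical.

```python
import math


def hadamard_walk(t):
    """Return {position: probability} after t steps of the Hadamard walk.

    Assumed convention: start at 0 with coin 0 ("left"); each step applies
    the Hadamard coin and then moves coin 0 to x - 1 and coin 1 to x + 1.
    """
    s = 1 / math.sqrt(2)
    amp = {(0, 0): 1 + 0j}  # amp[(x, c)] = amplitude at position x, coin c
    for _ in range(t):
        new = {}
        for (x, c), a in amp.items():
            # Hadamard: |0> -> (|0> + |1>)/sqrt(2), |1> -> (|0> - |1>)/sqrt(2)
            to_left = s * a
            to_right = s * a if c == 0 else -s * a
            new[(x - 1, 0)] = new.get((x - 1, 0), 0) + to_left
            new[(x + 1, 1)] = new.get((x + 1, 1), 0) + to_right
        amp = new
    prob = {}
    for (x, _), a in amp.items():
        prob[x] = prob.get(x, 0.0) + abs(a) ** 2
    return prob


t = 100
p = hadamard_walk(t)
inside = sum(pr for x, pr in p.items() if abs(x) <= t / math.sqrt(2))
second_moment = sum(x * x * pr for x, pr in p.items())
print("mass inside [-t/sqrt(2), t/sqrt(2)]:", inside)
print("second moment:", second_moment, "(classical unbiased walk: t =", t, ")")
```

The printed second moment can be compared with the value $t$ of the classical unbiased walk, which is one way to see the faster spreading described above.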
%A Alexander Kelmans
%T Packing Trees in Graphs
%D December 15, 2000
%Z Fri, 15 Dec 2000 22:00:00 GMT
%I DIMACS
%R 2000-44
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-44.ps.gz
%X Let ${\cal G}^s_r $ denote the set of graphs with each vertex of degree at least
$r$ and at most $s$, $v(G)$ the number of vertices, and
$\tau _k (G)$ the maximum number of disjoint $k$--edge trees in $G$.
In this paper we show that
\\[0.5ex]
$(a1)$ if $G \in {\cal G}^s_2 $ and $s \ge 4$ then $\tau _2 (G) \ge v(G)/(s+1)$,
\\[0.5ex]
$(a2)$ if $G \in {\cal G}^3_2 $ and $G$ has no 5--vertex components then
$\tau _2 (G) \ge v(G)/4$,
\\[0.5ex]
$(a3)$ if $G \in {\cal G}^s_1 $ and $G$ has no $k$--vertex component
where $k \ge 2$ and $s \ge 3$ then
$\tau _k(G) \ge (v(G) - k)/(sk - k +1)$, and
\\[0.5ex]
$(a4)$ the above bounds are attained for infinitely many connected graphs.
\\[0.5ex]
\indent
Our proofs provide polynomial time algorithms for finding the corresponding
packings in a graph.
%A I. E. Zverovich
%A I. I. Zverovich
%T On the ratio of the stability number and the domination number
%D January 5, 2000
%Z Fri, 2 Feb 2001 01:00:00 GMT
%I DIMACS
%R 2000-45
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-45.ps.gz
%X Let $\alpha(G)$ and $\gamma(G)$ be the stability number and domination number
of a graph $G$, respectively.
Since $\alpha(G) \ge \gamma(G)$, $\alpha(G)/\gamma(G) \ge 1$.
For each $r \ge 1$, we define a graph $G$ to be an
{\em $\alpha/\gamma \le r$-perfect graph} if $\alpha(H)/\gamma(H) \le r$
for each induced subgraph $H$ of $G$.

We show that the class of all $\alpha/\gamma \le r$-perfect graphs is determined by a unique forbidden induced subgraph.

This makes it possible to approximate $\alpha(G)$ and $\gamma(G)$ in the corresponding classes. %A I. E. Zverovich %T Graph Bipartition with Hereditary Properties %D January 5, 2000 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2000-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-46.ps.gz %X

A hereditary class $P$ is called {\it finitely generated} if the set of all minimal forbidden induced subgraphs for $P$ is finite. For a pair of hereditary classes $P$ and $Q$, we define a hereditary class $P * Q$ of all graphs $G$ which have a partition $A \cup B = V(G)$ such that $G(A) \in P$ and $G(B) \in Q$ ($G(X)$ is the subgraph of $G$ induced by $X \subseteq V(G)$). We investigate the problem of recognizing finitely generated classes of the form $P * Q$.

We use the following model. Let $H^0$ and $H^1$ be hypergraphs with the same vertex set $V$. The ordered pair $H = (H^0, H^1)$ is called a {\em bihypergraph}. A bihypergraph $H = (H^0, H^1)$ is called {\em bipartite} if there is an ordered partition $V^0 \cup V^1 = V(H)$ such that $V^i$ is stable in $H^i$, $i = 0, 1$. If the maximum cardinality of hyperedges in $H$ is at most $r$ and every $k$-subset of $V(H)$ contains at least one hyperedge then $H \in C(k, r)$.

We prove that there exists a finite number of minimal non-bipartite bihypergraphs in $C(k, r)$ (when $k$ and $r$ are fixed).

Let $P$ and $Q$ be hereditary classes of graphs. Suppose that the stability number $\alpha (H)$ is bounded above for all $H \in P$, and the clique number $\omega (H)$ is bounded above for all $H \in Q$. An ordered partition $A \cup B = V(G)$ is called a {\it ramseian $P * Q$-partition} if $G(A) \in P$ and $G(B) \in Q$. Let ${\rm Ramsey}(P * Q)$ be the set of all graphs which have a ramseian $P * Q$-partition.

Generalizing a result of Gy\'arf\'as \cite{Gyarfas98}, we prove that if both $P$ and $Q$ are finitely generated then ${\rm Ramsey}(P * Q)$ is also finitely generated. In particular, every class of $(\alpha, \beta)$-polar graphs (which are generalizations of split graphs) has a finite forbidden induced subgraph characterization. %A Danny Dolev %A Osnat Mokryn %A Yuval Shavitt %A Innocenty Sukhov %T An Integrated Architecture for the Scalable Delivery of %D January 5, 2000 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2000-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-47.ps.gz %X The competition for clients' attention requires sites to update their content frequently. As a result, a large percentage of web pages are semi-dynamic, i.e., they change quite often and stay static between changes. The cost of maintaining consistency for such pages discourages caching solutions. We suggest here an integrated architecture for the scalable delivery of frequently changing hot pages. Our scheme enables sites to dynamically select whether to cyclically multicast a hot page or to unicast it, and to switch between multicast and unicast mechanisms in a transparent way. Our scheme defines a new protocol, called HTTPM. In addition, it uses currently deployed protocols, and dynamically directs browsers seeking a URL to multicast channels, while using existing DNS mechanisms. Thus, we enable sites to deliver content to a growing number of users at less cost and during denial of service attacks, while reducing load on core links. We report simulation results that demonstrate the advantages of the integrated architecture, and its significant impact on server and network load, as well as client delay.
%A Vadim Mottl %A Sergey Dvoenko %A Oleg Seredin %A Casimir Kulikowski %A Ilya Muchnik %T Alignment scores in a regularized support vector classification method for fold recognition of remote protein families %D January 5, 2001 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2001-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-01.ps.gz %X

One of the fundamental principles of molecular biology says that the primary structure of a protein, i.e. the sequence of amino acid residues forming its polypeptide chain, carries an essential amount of information for unambiguously establishing its spatial structure. Despite the fact that each protein has its own spatial structure, it is a typical phenomenon that the fold pattern remains basically the same within large groups of evolutionarily allied proteins, so that the number of essentially different spatial structures is much less than that of known proteins. Since spatial structures are classified in one manner or another, the estimation of the spatial structure of a given protein reduces to allocating it to one of a finite set of classes, i.e. the problem falls into the competence area of pattern recognition.

The traditional methodology of pattern recognition presupposes that the object whose class membership is to be recognized is represented by a vector of some numerical features and is considered as a point in the respective linear vector space. However, the actual diversity of amino acid properties that may play an important part in forming the spatial structure of a protein is so immensely rich that the choice of suitable numerical features is a problem in its own right, and the key one here. We consider here an alternative featureless approach to recognition of the spatial structure of proteins. It is proposed to judge the membership of a protein in one of the classes of spatial structures directly on the basis of measuring the proximity of its amino acid chain to those of some other proteins whose spatial structure is known.

To infer the decision rule of recognition from a training sample of proteins of known structure, we apply the traditional support vector method of machine learning on the basis of treating amino acid chains as elements of a Hilbert space, i.e. a linear space with inner product, in which role the pairwise alignment scores are used. The inevitable difficulty of the small size of training samples is overcome by a special regularization technique that makes use of some available a priori information on the sought-for decision rule.

The proposed approach to the problem of fold class recognition is illustrated by results of processing a collection of 396 mutually distant protein domains of 51 fold classes chosen from the SCOP database. %A Leonid Shvartser %A Casimir Kulikowski %A Ilya Muchnik %T Multiple sequence alignment using the quasi-concave function optimization based on the DIALIGN combinatorial structures %D January 5, 2001 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2001-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-02.ps.gz %X Multiple sequence alignment is usually considered as an optimization problem, which has a statistical and a structural component. It is known that in the problem of protein sequence alignment a processed sample is too small and not representative in the statistical sense, though this information can be sufficient if an appropriate structural model is used. In order to utilize this information, a new structural description of the union of pairwise alignment results has been developed. It is shown that if the structure is restored then Multiple Sequence Alignment is achieved. The introduced structure represents the set of local maxima of a quasi-concave set function on a lower semilattice, which in turn is a union of set-theoretical intervals. This union is a set of the consistent subsets of diagonals, introduced by B. Morgenstern, A. Dress, and T. Werner (1996). An algorithm for searching for local maxima on the proposed structure has been developed. It consists of an alternation of Forward and Backward passes. The Backward pass in this algorithm is rigorous, while the Forward pass is based on heuristics. A multiple alignment of 5 protein sequences is used as an illustration of the proposed algorithm.
%A Vince Grolmusz %T Constructing Set-Systems with Prescribed Intersection Size %D January 11, 2001 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2001-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-03.ps.gz %X Let $f$ be an $n$ variable polynomial with positive integer coefficients, and let ${\cal H}=\{H_1,H_2,\ldots,H_m\}$ be a set-system on the $n$-element universe. We define set-system $f({\cal H})=\{G_1,G_2,\ldots,G_m\}$, and prove that $f(H_{i1}\cap H_{i2}\cap\ldots\cap H_{ik})=|G_{i1}\cap G_{i2}\cap\ldots\cap G_{ik}|$, for any $1\leq k\leq m$, where $f(H_{i1}\cap H_{i2}\cap\ldots\cap H_{ik})$ denotes the value of $f$ on the characteristic vector of $H_{i1}\cap H_{i2}\cap\ldots\cap H_{ik}$.

The construction of $f({\cal H})$ is a straightforward polynomial--time algorithm from ${\cal H}$ and polynomial $f$. In this paper we use this algorithm for constructing set-systems with prescribed intersection sizes modulo an integer.

As a by-product of our method, some Ray-Chaudhuri--Wilson-like theorems are proved. %A Vince Grolmusz %T Set-Systems with Restricted Multiple Intersections and Explicit Ramsey Hypergraphs %D January 11, 2001 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2001-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-04.ps.gz %X We give generalizations of the Deza-Frankl-Singhi Theorem in the case of multiple intersections. More exactly, we prove that if ${\cal H}$ is a set-system such that, for some $k$, the $k$-wise intersections occupy only $\ell$ residue classes modulo a prime $p$, while the sizes of the members of ${\cal H}$ are not in these residue classes, then the size of ${\cal H}$ is at most $$(p-1)(k-1)\sum_{i=1}^{|L|}{n\choose i}$$ This result strengthens a result of F\"uredi (1983), and gives a partial answer to a question of T. S\'os (1976).

As an application, we give explicit constructions for (multi-colored) Ramsey hypergraphs. To the best of our knowledge, this is the first explicit construction of a Ramsey hypergraph in the literature. %A Italo J. Dejter %T Distribution of Distances in Star Graphs %D January 22, 2001 %Z Fri, 2 Feb 2001 01:00:00 GMT %I DIMACS %R 2001-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-05.ps.gz %X The distribution of distances in the star graph ST_n, where n is an integer >1, is determined from any fixed vertex, in particular to the vertices of each one of the n so-called efficient dominating sets, or 1-perfect codes. %A I. E. Zverovich %T Weighted well-covered graphs and complexity questions %D February 6, 2001 %Z Thu, 5 Apr 2001 05:00:00 GMT %I DIMACS %R 2001-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-06.ps.gz %X

A weighted graph $G$ is called well-covered if all its maximal stable sets have the same weight. This common weight is the value of $G$. Let $S$ be a stable set of $G$ (possibly, $S = \emptyset$). The subgraph $G - N[S]$ is called a co-stable subgraph of $G$. We denote by ${\rm CSub}(G)$ the set of all co-stable subgraphs of $G$ (considered up to isomorphism). A class of weighted graphs $\mathcal{P}$ is called co-hereditary if it is closed under taking co-stable subgraphs, i.e., $G \in \mathcal{P}$ implies ${\rm CSub}(G) \subseteq \mathcal{P}$.

We note that the class $\mathcal{WW}$ of all weighted well-covered graphs is co-hereditary and characterize $\mathcal{WW}$ in terms of forbidden co-stable subgraphs.

Then we use a reduction from Satisfiability to show that the following decision problems are NP-complete.

**Decision Problem 1 (Co-Stable Subgraph).**

*Instance:* A graph $G$ and a set $U \subseteq V(G)$ that induces
a subgraph $H$.

*Question:* Is $H$ a co-stable subgraph of $G$?

**Decision Problem 2 (Co-Stable Subgraph $H$).**

*Instance:* A graph $G$.

*Question:* Is $H$ a co-stable subgraph of $G$?
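
The notions used in this abstract (maximal stable sets, weighted well-coveredness, and the co-stable subgraph $G - N[S]$) can be illustrated by brute force on small graphs. This is a sketch of the definitions only, with exponential running time; the function names and the adjacency-dictionary representation are assumptions, and it is unrelated to the NP-completeness reductions or the bounded-degree algorithm.

```python
from itertools import combinations


def is_stable(adj, S):
    """True if no two vertices of S are adjacent."""
    return all(v not in adj[u] for u, v in combinations(S, 2))


def maximal_stable_sets(adj):
    """Yield every maximal stable set of the graph {vertex: set of neighbours}."""
    verts = sorted(adj)
    for r in range(len(verts) + 1):
        for combo in combinations(verts, r):
            S = set(combo)
            # Maximal: every vertex outside S has a neighbour in S.
            if is_stable(adj, S) and all(adj[v] & S for v in adj if v not in S):
                yield S


def is_well_covered(adj, weight):
    """True if all maximal stable sets have the same total weight."""
    return len({sum(weight[v] for v in S) for S in maximal_stable_sets(adj)}) == 1


def co_stable_subgraph(adj, S):
    """Return G - N[S], assuming S is a stable set of G."""
    closed = set(S).union(*(adj[v] for v in S))
    return {v: adj[v] - closed for v in adj if v not in closed}


# Illustrative example: the path a - b - c.
P3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(is_well_covered(P3, {"a": 1, "b": 2, "c": 1}))
print(is_well_covered(P3, {"a": 1, "b": 1, "c": 1}))
print(co_stable_subgraph(P3, {"a"}))
```

On the path, the maximal stable sets are $\{b\}$ and $\{a, c\}$, so the graph is well-covered exactly when the weight of $b$ equals the sum of the other two weights.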

Let $\Delta(G)$ be the maximum vertex degree of a graph $G$. We show that recognizing weighted well-covered graphs with bounded $\Delta(G)$ can be done in polynomial time. %A I. E. Zverovich %A I. I. Zverovich %T Finding a maximum stable set in pentagraphs %D February 6, 2001 %Z Thu, 5 Apr 2001 05:00:00 GMT %I DIMACS %R 2001-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-07.ps.gz %X

*Penta* is the configuration shown in Figure 1(a), where
continuous lines represent edges and dotted lines represent non-edges.
The vertex $u$ in Figure 1(a) is called the *center* of Penta.
A graph $G$ is called a *pentagraph* if every induced subgraph $H$
of $G$ has a vertex $v$ which is not a center of induced Penta in $H$.

The class of pentagraphs is a common generalization of chordal (triangulated) graphs and Mahadev graphs.

We construct a polynomial-time algorithm that either finds a maximum stable set of $G$ or concludes that $G$ is not a pentagraph. %A Boris Galitsky %A Sergey Shelepin %T On the inter-residue correlation patterns and their role in classification of protein families %D February 9, 2001 %Z Thu, 5 Apr 2001 05:00:00 GMT %I DIMACS %R 2001-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-08.ps.gz %X

We build a novel method to calculate and analyze the correlations in mutational behavior between different positions in a multiple sequence alignment. The inter-dependence between the residues for a protein family is represented as a matrix of correlation values that is invariant with respect to specific amino acids, the number of sequences representing a family, the length of sequences, residue variability and the uniformity of data set representation. Common and distinguishing properties of a few protein families, including immunoglobulins, are revealed, based on the geometry of correlation matrices. We analyze the specific texture of these matrices, inherent to the specific families, and suggest a way to distinguish protein from non-protein sets of sequences.

The role of the correlation matrix technique in classification is discussed. We suggest that the classification criteria should be based on the residues at the positions with the highest overall correlation with the other positions. Revealing the positions with various correlation strengths helps to reconstruct the phylogeny of protein families. %A Omer Reingold %A Salil Vadhan %A Avi Wigderson %T Entropy Waves, the Zig-Zag Graph Product, and New Constant-Degree Expanders and Extractors %D February 23, 2001 %Z Thu, 5 Apr 2001 05:00:00 GMT %I DIMACS %R 2001-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-09.ps.gz %X

The main contribution of this work is a new type of graph product, which we call the {\em zig-zag product}. Taking a product of a large graph with a small graph, the resulting graph inherits (roughly) its size from the large one, its degree from the small one, and its expansion properties from both! Iteration yields simple explicit constructions of constant-degree expanders of every size, starting from one constant-size expander.

Crucial to our intuition (and simple analysis) of the properties of this graph product is the view of expanders as functions which act as ``entropy wave'' propagators --- they transform probability distributions in which entropy is concentrated in one area to distributions where that concentration is dissipated. In these terms, the graph product affords the constructive interference of two such waves.

A variant of this product can be applied to extractors, giving the
first explicit extractors whose seed length depends
(poly)logarithmically on only
the entropy deficiency of the source (rather than its length)
and that extract almost all
the entropy of high min-entropy sources. These high min-entropy
extractors have
several interesting applications, including the first constant-degree
explicit expanders which beat the ``eigenvalue bound.''
%A Peter C. Fishburn
%A Fred S. Roberts
%T No-Hole L(2,1)-Colorings
%D March 7, 2001
%Z Thu, 5 Apr 2001 05:00:00 GMT
%I DIMACS
%R 2001-10
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-10.ps.gz
%X An L(2,1)-coloring of a graph G is a coloring of G's vertices with integers in {0,1, ..., k} so that adjacent vertices' colors differ by
at least two and colors of distance-two vertices differ.
We refer to an L(2,1)-coloring as a coloring.
The span \lambda(G) of G is the smallest k for which G has a coloring, a *span coloring* is a coloring whose greatest color is \lambda (G), and the *hole index* \rho (G) of G is the minimum number of colors in {0,1, ..., \lambda (G) } not used in a span coloring.
We say that G is *full-colorable* if \rho (G) =0.
More generally, a coloring of G is a *no-hole coloring* if it uses all colors between 0 and its maximum color.
Both colorings and no-hole colorings were motivated by channel assignment problems.
We define the *no-hole span* \mu (G) of G as \infty if G has no no-hole
coloring;
otherwise \mu (G) is the minimum k for which G has a no-hole coloring
using colors in {0,1, ..., k}.
We prove that
G is full-colorable if it has \lambda (G) +1 vertices.
In addition, if G is not full-colorable and if it has at least
\lambda (G) +2 vertices, then \mu (G) \le \lambda (G) + \rho (G).
Moreover, for each m >= 1 there is a graph with \rho (G) = m
and \mu (G) = \lambda (G) + \rho (G).
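
The quantities \lambda(G), \rho(G) and \mu(G) defined above can be computed by exhaustive search on very small graphs. The sketch below is only a restatement of those definitions in code; the function names and the helper `path_graph` are hypothetical, and the search is exponential in the number of vertices.

```python
import math
from itertools import product


def is_coloring(adj, color):
    """Check the L(2,1) conditions for the graph {vertex: set of neighbours}."""
    for u in adj:
        for v in adj[u]:
            if abs(color[u] - color[v]) < 2:      # adjacent: differ by >= 2
                return False
            for w in adj[v]:
                if w != u and color[w] == color[u]:  # common neighbour: differ
                    return False
    return True


def colorings(adj, k):
    """Yield all L(2,1)-colorings with colors in {0, ..., k}."""
    verts = sorted(adj)
    for values in product(range(k + 1), repeat=len(verts)):
        color = dict(zip(verts, values))
        if is_coloring(adj, color):
            yield color


def span(adj):
    k = 0
    while next(colorings(adj, k), None) is None:
        k += 1
    return k


def hole_index(adj):
    lam = span(adj)
    return min(lam + 1 - len(set(c.values()))
               for c in colorings(adj, lam) if max(c.values()) == lam)


def no_hole_span(adj):
    # A no-hole coloring with maximum color k needs at least k + 1 vertices.
    for k in range(span(adj), len(adj)):
        if any(len(set(c.values())) == k + 1 for c in colorings(adj, k)):
            return k
    return math.inf


def path_graph(n):
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


for n in (3, 5):
    G = path_graph(n)
    print(n, span(G), hole_index(G), no_hole_span(G))
```

The path on five vertices has span 4 and therefore exactly \lambda (G) +1 vertices, so it is an instance of the first result stated above.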
%A Vince Grolmusz
%A Benny Sudakov
%T k-wise Set-Intersections and k-wise Hamming-Distances
%D March 7, 2001
%Z Thu, 5 Apr 2001 05:00:00 GMT
%I DIMACS
%R 2001-11
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-11.ps.gz
%X We prove a version of the Ray-Chaudhuri--Wilson and
Frankl-Wilson
theorems for $k$-wise intersections and also generalize a classical
code-theoretic result of Delsarte for
$k$-wise Hamming distances. A set of code-words $a^1,a^2,\ldots,a^k$
of length $n$ have $k$-wise Hamming-distance $\ell$, if there are
exactly $\ell$ such coordinates, where not all of their coordinates
coincide (alternatively, exactly $n-\ell$ of their coordinates are the
same). We show a Delsarte-like upper bound: codes with few $k$-wise
Hamming-distances must contain few code-words.
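
The $k$-wise Hamming distance defined above is straightforward to compute. The following sketch assumes code-words are given as equal-length strings; the function name is illustrative.

```python
def kwise_hamming_distance(words):
    """Number of coordinates in which the given code-words do not all coincide."""
    n = len(words[0])
    if any(len(w) != n for w in words):
        raise ValueError("all code-words must have the same length")
    return sum(1 for column in zip(*words) if len(set(column)) > 1)


# For k = 2 this is the ordinary Hamming distance.
print(kwise_hamming_distance(["0000", "0011"]))
# For k = 3, a coordinate counts as soon as any two of the three words differ there.
print(kwise_hamming_distance(["0000", "0011", "0101"]))
```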
%A Endre Boros
%A Khaled Elbassioni
%A Vladimir Gurvich
%A Leonid Khachiyan
%A Kazuhisa Makino
%T Dual-Bounded Generating Problems: All Minimal Integer Solutions for a Monotone System of Linear Inequalities
%D April 10, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-12
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-12.ps.gz
%X We consider the problem of enumerating all minimal integer
solutions of a monotone system of linear inequalities. We first show
that for any monotone system of $r$ linear inequalities in $n$
variables, the number of maximal infeasible integer vectors is at
most $rn$ times the number of minimal integer solutions to the
system. This bound is accurate up to a $polylog(r)$ factor and
leads to a polynomial-time reduction of the enumeration problem to
a natural generalization of the well-known dualization problem for
hypergraphs, in which dual pairs of hypergraphs are replaced by
dual collections of integer vectors in a box. We provide a
quasi-polynomial algorithm for the latter dualization problem.
These results imply, in particular, that the problem of
incrementally generating
all minimal integer solutions to a monotone system of linear
inequalities can be solved in quasi-polynomial time.
%A Daniel Kral
%A Jana Maxova
%A Pavel Podbrdsky
%A Robert Samal
%T On Hamiltonian Cycles in Strong Products of Graphs
%D April 11, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-13
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-13.ps.gz
%X

We prove that the strong product $G_1\times\cdots\times G_n$ contains a hamiltonian cycle for $n\ge\Delta$ whenever all $G_i$ are connected graphs of maximum degree at most $\Delta$; in particular $G^{\Delta(G)}$ contains a hamiltonian cycle. For large $\Delta$ we prove the same statement for $n\approx c\Delta$ for any $c>\ln (25/12) +1/60$.

The research was done as part of the Research Experience for Undergraduates programme; this is a joint programme of DIMACS at Rutgers University and DIMATIA at Charles University. The REU supervisors were J\'anos Koml\'os and Endre Szemer\'edi from Rutgers and Jan Kratochv\'{\i}l and Jaroslav Ne\v{s}et\v{r}il from Charles University.
%A Endre Boros
%A Khaled Elbassioni
%A Vladimir Gurvich
%A Leonid Khachiyan
%T An inequality for polymatroid functions and its applications
%D April 12, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-14
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-14.ps.gz
%X

An integral-valued set function $f:2^V \mapsto \ZZ$ is called polymatroid if it is submodular, non-decreasing, and $f(\emptyset)=0$. Given a polymatroid function $f$ and an integer threshold $t\geq 1$, let $\alpha=\alpha(f,t)$ denote the number of maximal sets $X \subseteq V$ satisfying $f(X) < t$, let $\beta=\beta(f,t)$ be the number of minimal sets $X \subseteq V $ for which $f(X) \ge t$, and let $n=|V|$. We show that if $\beta \ge 2$ then $\alpha \le \beta^{(\log t)/c}$, where $c=c(n,\beta)$ is the unique positive root of the equation $1=2^c(n^{c/\log\beta}-1)$. In particular, our bound implies that $\alpha \le (n\beta)^{\log t}$ for all $\beta \ge 1$. We also give examples of polymatroid functions with arbitrarily large $t, n, \alpha$ and $\beta$ for which $\alpha \ge \beta^{(.551 \log t)/c}$.

More generally, given a polymatroid function $f:2^V \mapsto \ZZ$ and an integral threshold $t \ge 1$, consider an arbitrary hypergraph $\cH$ such that $|\cH| \ge 2$ and $f(H) \ge t$ for all $H \in \cH$. Let $\cS$ be the family of all maximal independent sets $X$ of $\cH$ for which $f(X) < t$. Then $|\cS| \leq |\cH|^{(\log t)/c(n,|\cH|)}$. As an application, we show that given a system of polymatroid inequalities $f_1(X) \ge t_1,\ldots,f_m(X) \ge t_m$ with quasi-polynomially bounded right hand sides $t_1,\ldots,t_m$, all minimal feasible solutions to this system can be generated in incremental quasi-polynomial time. In contrast to this result, the generation of all maximal infeasible sets is an NP-hard problem for many polymatroid inequalities of small range.
%A Juan Garay
%A Joseph (Seffi) Naor
%A B\"ulent Yener
%A Peng Zhao
%T On-line Admission Control and Job Scheduling with Preemption
%D April 20, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-15
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-15.ps.gz
%X This paper studies the effect of preemption on the throughput of a single ``communication channel'' where requests arrive with a given processing time and slack. The problem is to decide which requests to serve so as to maximize the channel's utilization. This simple model captures many situations, both at the application level (e.g., delivery of video) and at the network/transmission levels (e.g., scheduling of jobs at a switch). The problem is on-line in nature, and thus we use competitive analysis for measuring the performance of our scheduling algorithms. We consider two modes of operation, with and without commitment, and derive upper and lower bounds for each case. Since the competitive analysis is based on the worst-case scenario, the average-case performance of the on-line algorithms is examined by a simulation study.
%A Olga Ourioupina
%A Boris Galitsky
%T Application of default reasoning to semantic processing under
%D April 25, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-16
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-16.ps.gz
%X

We build a natural language question-answering system in which the query representation formula is transformed in accordance with a set of default rules. This transformation is required to disambiguate entities in a vertical domain, where they usually have a principal meaning and a set of foreign ones. Default reasoning is argued to be applicable to a variety of semantic processing algorithms, including the semantic header approach. The use of default rules can be regarded as pragmatic machinery, complementary to the syntactic and semantic processing of complex queries in a vertical domain. A methodology for building the set of default rules is developed; these rules can add essential entities to, or eliminate misleading ones from, the representation of an input query.

An implementation of operational semantics for default reasoning is
suggested that transforms the query representation by means of conflicting
default rules. A technique for keyword-based question answering is
developed, in which default rules link the set of potential queries with the
initially coded canonical ones. The task of automatic annotation is then
posed as building the set of keyword semantic headers and the set of
canonical semantic headers for an answer.

%A Dean H. Lorenz
%A Ariel Orda
%A Danny Raz
%A Yuval Shavitt
%T How good can IP routing be?
%D April 26, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-17
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-17.ps.gz
%X

In the traditional IP scheme, both the packet forwarding and the routing protocols are source invariant, i.e., their decisions depend on the destination IP address and not on the source address. Recent protocols, such as MPLS, as well as traditional circuit based protocols like PNNI allow routing decisions to depend both on the source and destination addresses. In fact, much of the theoretical work on routing assumes per-flow forwarding and routing, i.e., the forwarding decision is based both on the source and destination addresses.

The benefit of per-flow forwarding is well accepted, as are the practical implications of its deployment. Nevertheless, no quantitative study has been carried out on the performance differences between the two approaches.

This work aims at investigating the toll in terms of performance
degradation that is incurred by source invariant schemes, as opposed
to the per-flow alternatives. We show, both theoretically and by
simulations, that source invariant routing can be significantly worse
than per-flow routing. Realizing that static shortest path algorithms
are not optimal even among the source invariant routing algorithms,
we develop novel routing algorithms that are based on dynamic weights, and
empirically study their performance in an Internet-like environment.

%A Shuhong Gao
%A D. Frank Hsu
%T Short containers in Cayley graphs
%D May 8, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-18
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-18.ps.gz
%X The star diameter of a graph measures the minimum distance from any source
node to several other target nodes in the graph. For a class of Cayley
graphs from abelian groups, a good upper bound for their star diameters is
given in terms of the usual diameters and the orders of elements in the
generating subsets. This bound is tight for several classes of graphs
including hypercubes and directed $n$-dimensional tori. The technique
used is the so-called disjoint ordering for a system of subsets, due to
Gao, Novick and Qiu (1998).
%A Philip MacKenzie
%A Michael K. Reiter
%T Networked Cryptographic Devices Resilient to Capture
%D May 10, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-19
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-19.ps.gz
%X We present a simple technique by which a device that performs private
key operations (signatures or decryptions) in networked applications,
and whose local private key is activated with a password or PIN, can
be immunized to offline dictionary attacks in case the device is
captured. Our techniques do not assume tamper resistance of the
device, but rather exploit the networked nature of the device, in that
the device's private key operations are performed using a simple
interaction with a remote server. This server, however, is
untrusted---its compromise does not reduce the security of the
device's private key unless the device is also captured---and need not
have a prior relationship with the device. We further extend this
approach with support for "key disabling", by which the rightful
owner of a stolen device can disable the device's private key even if
the attacker already knows the user's password.
%A Rahul Shah
%A Ravi Jain
%A S. Rajagopalan
%A Farooq Anjum
%T Mobile Filters for Efficient Dissemination of Personalized Information Using Content-Based Multicast
%D June 16, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-20
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-20.ps.gz
%X There has been a surge of interest in the delivery of personalized
information to users (e.g. personalized stocks or travel information),
particularly as the amount of information readily available from sources
like the World Wide Web increases, and mobile users with limited terminal
device capabilities increasingly desire updated, targeted information in
real time. When the number of information recipients is large and there
is sufficient commonality in their interests, as is often the case, it is
worthwhile to use multicast rather than unicast to deliver the
information. However, traditional multicast services, e.g. at the IP
level, do not consider the structure and semantics of the information in
the multicast process. We consider the use of Content-Based Multicast
(CBM) where extra content filtering is performed at the interior nodes of
the multicast tree so as to reduce network bandwidth usage and delivery
delay, as well as to reduce the computation required at the sources and
sinks. Note that filtering could be performed at the IP level or, more
likely, at higher software layers e.g. in applications such as
publish-subscribe and event notification systems.

In this paper we evaluate the situations in which CBM is worthwhile. The benefits of CBM depend critically upon how well filters are placed at interior nodes of the multicast tree, and the costs depend upon those introduced by filters themselves. Further, we consider the benefits of allowing the filters to be mobile so as to respond to user mobility or changes in user interests, and the corresponding costs of filter mobility. We consider two criteria: minimizing total network bandwidth utilization and minimizing mean information delivery delay. For each criterion we also develop a heuristic that runs faster than the optimal algorithm. Finally, we evaluate all the algorithms by means of simulation experiments.

Our results indicate that filters can be effective in substantially reducing bandwidth and delay. We also find filter mobility is worthwhile if there is sufficient locality in the interests of users, or there is marked large-scale user mobility. We conclude with suggestions for further work.
%A Richard Lyons
%T Generalized Fitting Subgroups and p-Layers of Finite Groups
%D June 21, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-21
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-21.ps.gz
%X The similarities between Bender's generalized Fitting subgroup, Bender's E(G) subgroup and the Gorenstein-Walter $p$-layer of a finite group are axiomatized.
%A Alair Pereira do Lago
%T Local Groups in Free Burnside Groupoids
%D June 26, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-22
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-22.ps.gz
%X

The study of repeated sequences is the basis of the study of \fb{groups} and \fb{semigroups}, and a better knowledge of Burnside algebras can be very useful in the analysis of many properties of sequences containing repetitions. The main theorem announced here establishes an important connection between \fb{groups} and \fb{semigroups}. To establish this connection, we use properties of graphs, categories, and combinatorics on words to obtain the main algebraic theorem.

Let $\grafo{G}$ be~a (possibly infinite) strongly connected graph and let $\T$ be a set of monoid identities such that any monoid satisfying $\T$ is also a group. Let $\cat{B}$ be~the free groupoid on $\grafo{G}$ satisfying $\T$. Then, the local groups $\local{B}{v}$, for $v\in \vertices{G}$, are all isomorphic to a free group satisfying $\T$. Furthermore, it is free over a generating set which can be effectively characterized and whose cardinality is the cyclomatic number of the graph $\grafo{G}$.
%A Amr Elmasry
%T Generalized Pairing Heaps
%D July 31, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-23
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-23.ps.gz
%X We give a generalized form for the standard implementation of the pairing heaps. A new parameter k is introduced. When the node with the minimum value is to be deleted from the heap, the operations to combine the resulting sub-trees into one tree depend on the value of k. When the value of k is equal to 2, the implementation is equivalent to the standard pairing heaps' implementation. We show that, for any constant k, this general form achieves the same bounds as the standard implementation. Finally, experimental results show that, by tuning the value of k, the number of comparisons involved in this operation can be reduced.
%A Amr Elmasry
%T Optimal Adaptive Sorting
%D July 31, 2001
%Z Thu, 16 Aug 2001 09:00:00 GMT
%I DIMACS
%R 2001-24
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-24.ps.gz
%X We introduce Binomialsort, an adaptive sorting algorithm that is optimal with respect to the number of inversions. The number of comparisons performed by Binomialsort, on an input sequence of length n that has I inversions, is at most $2n \log{\frac{I}{n}} + O(n)$. The bound on the number of comparisons is further reduced to $\frac{3}{\log{3}} n \log{\frac{I}{n}} + O(n)$ by using a new structure, which we call trinomial queues. Experimental results show that our sorting algorithm is practical, efficient and easy to implement.
%A Amir Dembo
%A Abram Kagan
%A Lawrence A. Shepp
%T Remarks on the maximum correlation coefficient
%D August 15, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-25
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-25.ps.gz
%X The maximum correlation coefficient between partial sums of independent identically distributed random variables with finite second moment equals the classical (Pearson) correlation coefficient between the sums and, thus, does not depend on the distribution of the random variables. Besides proving this, relations between linearity of regression of each of two random variables on the other and the maximum correlation coefficient are discussed.
%A Graham Cormode
%A S. Muthukrishnan
%T The String Edit Distance Matching Problem with Moves
%D August 15, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-26
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-26.ps.gz
%X

The edit distance between two strings S and R is defined to be the minimum number of character inserts, deletes and changes needed to convert R to S. Given a text string t of length n, and a pattern string p of length m, informally, the string edit distance matching problem is to compute the smallest edit distance between p and substrings of t. A well known dynamic programming algorithm takes time O(nm) to solve this problem, and it is an important open problem in Combinatorial Pattern Matching to significantly improve this bound.

We relax the problem so that (a) we allow an additional operation, namely, substring moves, and (b) we approximate the string edit distance up to a factor of O(log n log* n). (log* n is the number of times the log function is applied to n to produce a constant.) Our result is a neat linear time deterministic algorithm for this version of the problem. This is the first known significantly subquadratic algorithm for a string edit distance problem in which the distance involves nontrivial alignments. Our results are obtained by embedding strings into L_1 vector space using a simplified parsing technique we call Edit Sensitive Parsing (ESP). This embedding is approximately distance preserving, and we show many applications of this embedding to string proximity problems including nearest neighbors, outliers, and streaming computations with strings.
%A Vincent Mousseau
%A Luis Dias
%T Fuzzy outranking relations in ELECTRE providing manageable disaggregation procedures
%D August 15, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-27
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-27.ps.gz
%X

In ELECTRE methods, the construction of an outranking relation $S$ amounts to validating or invalidating, for any pair of alternatives $(a,b) \in A$, an assertion $aSb$. This comparison is grounded on the evaluation vectors of both alternatives, and on additional information concerning the DM's preferences, accounting for two conditions: concordance and non-discordance.

In decision processes using these methods, the analyst should interact with DM(s) in order to elicit values for preferential parameters. This can be done either directly or through a disaggregation procedure that infers the parameter values from holistic judgements provided by the DM(s). Inference is usually performed through an optimization program that accounts for the aggregation model and minimizes an ``error function''. Although disaggregation approaches have been widely used in additive models, only a few advances have been made towards a disaggregation approach for ELECTRE methods. This probably reflects the ``optimization unfriendly'' character of the most recent ELECTRE methods.

In this paper we are concerned with a slight adaptation of the fuzzy outranking relation used in ELECTRE III and ELECTRE TRI that preserves the original ideas and is more optimization-friendly for parameter inference programs. Such modification is shown to preserve the original discordance concept. We show that the modified outranking relation makes it easier to solve inference programs.
%A Linchun Gao
%A Andras Prekopa
%T On Performance Prediction of Cellular Telephone Networks
%D August 29, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-28
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-28.ps.gz
%X This paper considers large cellular mobile networks in which the arrival rates of calls for different cells are different. Lower and upper bounds on how well the given number of channels would satisfy the arriving calls are presented. Numerical examples are given.
%A Amr Elmasry
%T Adaptive Properties of Pairing Heaps
%D September 14, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-29
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-29.ps.gz
%X We show that the pairing heap benefits from the presortedness in the input sequence, and sorts adaptively. In particular, we show that given an input sequence of size $n$ that is already sorted in ascending order from right to left, repeatedly deleting the smallest element, using the pairing heap operation, requires at most $7n$ comparisons. We also show that starting with any initial heap structure that has the property that the corresponding sequence of the heap is sorted in ascending order, at most $7n$ comparisons are required to produce the sorted sequence by repeatedly deleting the smallest element from the heap. This latter result implies an easy proof for Tarjan's sequential access theorem for splay trees, with a better bound of $3.5n$ rotations. Finally, experimental results are reported that support the adaptivity of the pairing heap, comparing sorting using the pairing heap with Splaysort and Binomialsort.
%A Amr Elmasry
%T Scrambled pairing and square-root trees
%D September 14, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-30
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-30.ps.gz
%X

An n-node forest of trees is called a square-root forest, if it has the following structure. For a given positive integer $k$ the forest has $2k$ trees. The first $k+1$ are single nodes. For the other $k-1$ trees, the root of tree $l$ has $l$ single-node children, for all $l$ from 1 to $k-1 $. If $n \neq k+\frac{k \; (k+1)}{2}$ the definition is slightly different.

Given a forest of $\tau$ trees, the rank of a tree is defined to be the number of children of the root of this tree. A phase of operations is defined as first linking the trees in pairs, then replacing the tree with the largest rank with its sub-trees together with a new single-node tree. The pairing is done by first sorting the trees by rank and numbering them from $1$ to $\tau$, then linking the root of tree $l$ to the root of tree $l+\lceil \tau/2 \rceil$, for all $l$ from $1$ to $\lfloor \tau/2 \rfloor$. We give a combinatorial proof that after applying $O(n^{1.5})$ phases, the forest will converge to the square-root forest.

It is proven in \cite{fsst} that using any pairing strategy, the amortized cost of deleting the item with the minimum value from a heap is $O(\sqrt{n})$. Our pairing strategy gives $\Theta(\sqrt{n})$ amortized cost for this operation.
%A B. Derrida
%A J. L. Lebowitz
%A E. R. Speer
%T Large Deviation of the Density Profile in the Symmetric Simple Exclusion Process
%D September 20, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-31
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-31.ps.gz
%X We consider an open one dimensional lattice gas on sites $i=1,\dots,N$, with particles jumping independently with rate $1$ to neighboring interior empty sites, the {\it simple symmetric exclusion process}. The particle fluxes at the left and right boundaries, corresponding to exchanges with reservoirs at different chemical potentials, create a stationary nonequilibrium state (SNS) with a steady flux of particles through the system. The mean density profile in this state, which is linear, describes the typical behavior of a macroscopic system, i.e., this profile occurs with probability 1 when $N \to \infty$. The probability of microscopic configurations corresponding to some other profile $\rho(x)$, $x = i/N$, has the asymptotic form $\exp[-N {\cal F}(\{\rho\})]$; $\cal F$ is the {\it large deviation functional}. In contrast to equilibrium systems, for which ${\cal F}_{eq}(\{\rho\})$ is just the integral of the appropriately normalized local free energy density, the $\cal F$ we find here for the nonequilibrium system is a nonlocal function of $\rho$. This gives rise to the long range correlations in the SNS predicted by fluctuating hydrodynamics and suggests similar non-local behavior of $\cal F$ in general SNS, where the long range correlations have been observed experimentally.
%A Endre Boros
%A Robert E. Jamison
%A Renu Laskar
%A Henry Martyn Mulder
%T On 3-Simplicial Vertices in Planar Graphs
%D September 24, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-32
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-32.ps.gz
%X A vertex $v$ in a graph $G = (V,E)$ is {\em $k$-simplicial} if the neighborhood $N(v)$ of $v$ can be vertex-covered by $k$ or fewer complete graphs. The main result of the paper states that a planar graph of order at least four has at least four 3-simplicial vertices of degree at most five. This result is a strengthening of the classical corollary of Euler's Formula that a planar graph of order at least four contains at least four vertices of degree at most five.
%A Endre Boros
%A Peter L. Hammer
%T Pseudo-Boolean Optimization
%D September 24, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-33
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-33.ps.gz
%X This survey examines the state of the art of a variety of problems related to pseudo-Boolean optimization, i.e. to the optimization of set functions represented by closed algebraic expressions. The main parts of the survey examine general pseudo-Boolean optimization, the specially important case of quadratic pseudo-Boolean optimization (to which every pseudo-Boolean optimization can be reduced), several other important special classes, and approximation algorithms.
%A Bela Csaba
%A Sachin Lodha
%T A Randomized Online Algorithm for the k-Server Problem on a Line
%D October 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-34
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-34.ps.gz
%X We give an $O(n^{2 \over 3}\log{n})$-competitive randomized $k$--server algorithm when the underlying metric space is given by $n$ equally spaced points on a line. For $n = k + o(k^{3 \over 2}/\log{k})$, this algorithm is $o(k)$--competitive.
%A Giovanni Di Crescenzo
%A Clemente Galdi
%T Secret Sharing and Hypergraph Decomposition
%D November 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-35
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-35.ps.gz
%X

A secret sharing scheme is a protocol by which a dealer distributes a secret among a set of participants in such a way that only qualified sets of them can reconstruct the value of the secret, whereas any non-qualified subset of participants obtains no information at all about the value of the secret. Secret sharing schemes have always played a very important role in the construction of higher level cryptographic primitives and protocols.

In this paper we investigate the construction of efficient secret sharing schemes by using a technique called {\em hypergraph decomposition}, extending in a non-trivial way the previously studied graph decomposition technique.

The application of this technique allows us to obtain secret sharing schemes for particular and large classes of access structures (such as hyperpaths, hypercycles, hyperstar and hypertrees) with dramatically improved efficiency over previous results. Specifically, in all cases, schemes obtained using previous techniques generate shares of exponential blowup in size, while for ours the blowup is always polynomial, in some cases even constant. Our schemes are also complemented by bounds on the blowup that are essentially tight.

In the course of the formulation of the hypergraph decomposition technique, we also obtain an elementary characterization of the ideal access structures among the hyperstars, which is of independent interest.
%A Ali Shokoufandeh
%A Yi Zhao
%T On a Tiling Conjecture of Komlos for 3-Chromatic Graphs
%D November 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-36
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-35.ps.gz
%X Given two graphs $G$ and $H$, an $H$-{\em matching} of $G$ (or a {\em tiling} of $G$ with $H$) is a subgraph of $G$ consisting of vertex-disjoint copies of $H$. For an $r$-chromatic graph $H$ on $h$ vertices, we write $u=u(H)$ for the smallest possible color-class size in any $r$-coloring of $H$. The {\em critical chromatic number} of $H$ is the number $\chi_{cr}(H)=(r-1)h/(h-u)$. A conjecture of Koml\'{o}s states that for every graph $H$, there is a constant $K$ such that if $G$ is any $n$-vertex graph of minimum degree at least $\left(1-(1/\chi_{cr}(H))\right)n$, then $G$ contains an $H$-matching that covers all but $K$ vertices of $G$. In this paper we prove that the conjecture holds for all sufficiently large values of $n$, when $H$ is a 3-chromatic graph.
%A Philip MacKenzie
%A Michael K. Reiter
%T Delegation of Cryptographic Servers for Capture-Resilient Devices
%D November 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-37
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-37.ps.gz
%X A device that performs private key operations (signatures or decryptions), and whose private key operations are protected by a password, can be immunized against offline dictionary attacks in case of capture by forcing the device to confirm a password guess with a designated remote server in order to perform a private key operation. Recent proposals for achieving this allow untrusted servers and require no server initialization per device. In this paper we extend these proposals to enable dynamic delegation from one server to another; i.e., the device can subsequently use the second server to secure its private key operations. One application is to allow a user who is traveling to a foreign country to temporarily delegate to a server local to that country the ability to confirm password guesses and aid the user's device in performing private key operations, or in the limit, to temporarily delegate this ability to a token in the user's possession. Another application is proactive security for the device's private key, i.e., proactive updates to the device and servers to eliminate any threat of offline password guessing attacks due to previously compromised servers.
%A Yi Zhao
%T On a Conjecture of Loebl
%D November 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-38
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-38.ps.gz
%X Martin Loebl conjectured that if in a graph $G$ on $n$ vertices at least half the vertices have degree at least $n/2$, then $G$ contains, as subgraphs, all trees with at most $n/2$ edges. We prove the conjecture for sufficiently large $n$.
%A I. E. Zverovich
%T Extended $(P_5, {\overline P_5})$-free graph
%D November 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-39
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-39.ps.gz
%X Let $G$ and $H$ be graphs. A {\em substitution} of $H$ in $G$ instead of a vertex $v \in V(G)$ is the graph $G(v \to H)$, which consists of the disjoint union of $H$ and $G - v$ with the additional edge-set $\{xy: x \in V(H), y \in N_G(v)\}$.

For a hereditary class of graphs ${\mathcal P}$, the {\em substitutional closure} of ${\mathcal P}$ is defined as the class ${\mathcal P}^*$ consisting of all graphs which can be obtained from graphs in ${\mathcal P}$ by repeated substitutions.

We characterize the substitutional closure $(P_5 \cup K_1, {\overline P_5} \cup K_1)$-free graphs in terms of forbidden induced subgraphs.

The weighted stability problem is polynomially solvable for this class.
%A I. E. Zverovich
%T Characterization of superbipartite graphs
%D November 1, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-40
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-40.ps.gz
%X Let $G$ and $H$ be graphs. A {\em substitution} of $H$ in $G$ instead of a vertex $v \in V(G)$ is the graph $G(v \to H)$, which consists of the disjoint union of $H$ and $G - v$ with the additional edge-set $\{xy: x \in V(H), y \in N_G(v)\}$.

For a hereditary class of graphs ${\mathcal P}$, the {\em substitutional closure} of ${\mathcal P}$ is defined as the class ${\mathcal P}^*$ consisting of all graphs which can be obtained from graphs in ${\mathcal P}$ by repeated substitutions.

Let ${\mathcal B}$ be the class of all graphs $G$ such that $G - N[v]$ is a bipartite graph for every vertex $v$ of $G$. Here $N[v]$ denotes the closed neighborhood of $v$. A graph is called a {\em superbipartite graph} if it is in the substitutional closure of ${\mathcal B}$.
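Membership in the base class ${\mathcal B}$ can be tested directly from its definition. The following sketch assumes graphs are stored as adjacency dictionaries; the names `is_bipartite` and `in_class_B` are hypothetical, and the code checks only ${\mathcal B}$, not its substitutional closure.

```python
def is_bipartite(adj):
    """Two-colour every component by depth-first search."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in colour:
                    colour[w] = 1 - colour[u]
                    stack.append(w)
                elif colour[w] == colour[u]:
                    return False
    return True


def in_class_B(adj):
    """True if G - N[v] is bipartite for every vertex v of G."""
    for v in adj:
        closed = adj[v] | {v}  # closed neighbourhood N[v]
        rest = {u: adj[u] - closed for u in adj if u not in closed}
        if not is_bipartite(rest):
            return False
    return True


five_cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
                 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
```

The 5-cycle lies in ${\mathcal B}$ although it is not bipartite, because deleting any closed neighbourhood leaves a single edge. Two disjoint triangles do not, since deleting one triangle leaves the other.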

We characterize superbipartite graphs in terms of forbidden induced subgraphs. Note that the weighted stability number in ${\mathcal B}^*$ can be found in polynomial time.
%A Piotr Berman
%A Sridhar Hannenhalli
%A Marek Karpinski
%T 1.375-Approximation Algorithm for Sorting by Reversals
%D November 7, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-41
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-41.ps.gz
%X
Analysis of genomes evolving by inversions leads to a general combinatorial problem of {\em Sorting by Reversals}, MIN-SBR, the problem of sorting a permutation by a minimum number of reversals. This combinatorial problem has a long history, and a number of other motivations. It has recently been studied in great detail in computational molecular biology. Following a series of preliminary results, Hannenhalli and Pevzner developed the first exact polynomial time algorithm for the problem of sorting signed permutations by reversals, and a polynomial time algorithm for a special case of unsigned permutations. The best known approximation algorithm for MIN-SBR, due to Christie, gives a performance ratio of 1.5. In this paper, by exploiting the polynomial time algorithm for sorting signed permutations and by developing a new approximation algorithm for maximum cycle decomposition of breakpoint graphs, we improve the performance ratio for MIN-SBR to 1.375. Besides its biological and combinatorial importance, better approximation algorithms for MIN-SBR have become particularly challenging recently because this problem has been proven to be NP-hard by Caprara, and MAX-SNP hard by Berman and Karpinski, thus excluding the existence of a polynomial time approximation scheme (PTAS) for that problem.
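To make the MIN-SBR objective concrete, the sketch below computes the exact reversal distance of a small unsigned permutation by breadth-first search, together with the number of breakpoints. It is a brute-force illustration with exponential running time, not the 1.375-approximation algorithm of the report; the function names are assumptions made here.

```python
from collections import deque


def breakpoints(perm):
    """Count adjacent pairs in 0, perm, n+1 that are not consecutive integers."""
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def reversal_distance(perm):
    """Minimum number of reversals sorting perm (brute force, tiny inputs only)."""
    start = tuple(perm)
    target = tuple(sorted(perm))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        if p == target:
            return dist[p]
        # Try every reversal of a contiguous segment of length at least 2.
        for i in range(len(p)):
            for j in range(i + 2, len(p) + 1):
                q = p[:i] + p[i:j][::-1] + p[j:]
                if q not in dist:
                    dist[q] = dist[p] + 1
                    queue.append(q)


d = reversal_distance([3, 1, 2])
b = breakpoints([3, 1, 2])
```

Since a single reversal removes at most two breakpoints, half the breakpoint count (rounded up) is a lower bound on the distance, which the example meets with equality.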
%A Piotr Berman
%A Marek Karpinski
%T Efficient Amplifiers and Bounded Degree Optimization
%D November 7, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-42
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-42.ps.gz
%X
This paper studies the existence of efficient (small size) amplifiers for proving explicit inapproximability results for bounded degree and bounded occurrence combinatorial optimization problems, and gives an explicit construction for such amplifiers. We also use this construction later to improve the currently best known approximation lower bounds for bounded occurrence instances of linear equations mod 2, and for bounded degree (regular) instances of MAX-CUT. In particular we prove the approximation lower bound of 152/151 for the exactly 3-occurrence E3-OCC-E2-LIN-2 problem and for the MAX-CUT problem on 3-regular graphs, E3-MAX-CUT, and the approximation lower bound of 121/120 for the E3-OCC-2-LIN-2 problem. As an application we are also able to improve the best known approximation lower bound for E3-OCC-MAX-E2SAT.
%A Anna C. Gilbert
%A Yannis Kotidis
%A S. Muthukrishnan
%A Martin J. Strauss
%T QuickSAND: Quick Summary and Analysis of Network Data
%D November 8, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-43
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-43.ps.gz
%X
Monitoring and analyzing traffic data generated from large ISP networks poses challenges both in the data gathering phase and in the data analysis itself. Still, both tasks are crucial for responding to day to day challenges of engineering large networks with thousands of customers. In this paper we build on the premise that approximation is a necessary evil of handling massive datasets such as network data. We propose building compact summaries of the traffic data called sketches at distributed network elements and centers.
These sketches are able to respond well to queries that seek features that stand out of the data. We call such features ``heavy hitters.'' In this paper, we describe sketches and show how to use sketches to answer aggregate and trend-related queries and identify heavy hitters. This may be used for exploratory data analysis of network operations interest. We support our proposal by experimentally studying AT&T WorldNet data and performing a feasibility study on the Cisco NetFlow data collected at several routers. %A Anna C. Gilbert %A Sudipto Guha %A Piotr Indyk %A S. Muthukrishnan %A Martin J. Strauss %T Near-Optimal Sparse Fourier Representations via Sampling %D November 15, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-44 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-44.ps.gz %X We give an algorithm for finding a Fourier representation r of B terms for a given discrete signal a of length N, such that the sum-square error is within the factor (1+ epsilon) of best possible sum-square error. Our algorithm can access the signal a by reading its values on a sample set T, chosen randomly from a (non-product) distribution of our choice, independent of the signal a. That is, we sample non-adaptively. The total time cost of the algorithm is polynomial in B log(N) log(M)/epsilon (where M is a standard input precision parameter), which implies a similar bound for the number of samples. %A Dragan Stevanovic %T On the components of NEPS of connected bipartite graphs %D November 20, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-45.ps.gz %X

Back in 1983, D.~Cvetkovi\'c posed the conjecture that the components of NEPS of connected bipartite graphs are almost cospectral. In 2000, we showed that this conjecture does not hold for infinitely many bases of NEPS, and we posed a necessary condition on the base of NEPS for NEPS to have almost cospectral components. At the same time, D.~Cvetkovi\'c posed a weaker version of his original conjecture, which claims that each eigenvalue of NEPS is also an eigenvalue of each component of NEPS.

Here we prove this weaker conjecture, give an upper bound on the multiplicity of an eigenvalue of NEPS as an eigenvalue of its component, give a new sufficient condition for the almost cospectrality of components of NEPS of connected bipartite graphs, and characterize the bases of NEPS which satisfy this condition.
%A Dragan Stevanovic
%T Antipodal graphs of small diameter
%D November 27, 2001
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-46
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-46.ps.gz
%X

A proper metric space $X=(X,d)$ is called {\em antipodal} if -- with $[x,y]=\{z\in X\,\colon\,d(x,y)=d(x,z)+d(z,y)\}$ -- for every $x\in X$ there exists some $y\in X$ such that $[x,y]=X$. A connected undirected finite graph~$G$ is called {\em antipodal} if its associated graph metric is antipodal.
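For a finite connected graph, the antipodality condition above can be checked by computing all shortest-path distances. The sketch below is an illustration under the assumption that the graph is connected and given as an adjacency dictionary; `bfs_dist`, `is_antipodal` and `cycle` are names introduced here, not taken from the report.

```python
from collections import deque


def bfs_dist(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_antipodal(adj):
    """True if every x has some y whose interval [x, y] is the whole vertex set."""
    d = {v: bfs_dist(adj, v) for v in adj}

    def interval_is_everything(x, y):
        # z lies in [x, y] exactly when d(x, y) = d(x, z) + d(z, y).
        return all(d[x][y] == d[x][z] + d[z][y] for z in adj)

    return all(any(interval_is_everything(x, y) for y in adj) for x in adj)


def cycle(n):
    """Cycle on vertices 0..n-1 (n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
```

Under this check the even cycle `cycle(6)` is antipodal, since every vertex lies on a shortest path to the opposite vertex, whereas the odd cycle `cycle(5)` is not.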

Here we characterize antipodal graphs of diameter~$3$ and show that almost every graph is an induced subgraph of some antipodal graph of diameter~$3$. %A Pierre Hansen %A Hadrien Melot %A Dragan Stevanovic %T Integral Complete Split Graphs %D November 27, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-47.ps.gz %X We give characterizations of integral graphs in the family of complete split graphs and a few related families of graphs. %A Dragan Stevanovic %A Gilles Caporossi %T On the separator of fullerenes %D December 7, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-48 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-48.ps.gz %X The separator of a graph is the difference between the two largest eigenvalues of its adjacency matrix. The program {\em Graffiti}, developed by S.~Fajtlowicz, posed the conjecture that the separator of any fullerene is at most~$1$. Here we prove this conjecture by using the {\it interlacing theorem} in an interesting manner and then extend this method to show that the dodecahedron has the largest separator amongst all fullerenes. %A Srikrishnan Divakaran %A Michel Saks %T Approximation algorithms for Offline scheduling with set-ups %D December 21, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-49 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-49.ps.gz %X In this paper, we present new approximation results for the offline problem of scheduling $n$ independent jobs with sequence independent set-ups and common release times on a single machine system for the following optimality criteria :

- total weighted completion time

- maximum lateness

For the total weighted completion time criterion, we present a $2$-approximation algorithm, the first known polynomial time constant approximation algorithm for this problem. For this criterion, when the number of job types is arbitrary, the computational complexity is open and prior to our results there were no known polynomial time constant approximation algorithms even when all jobs have unit weight and all job types have unit set-up times.

For the maximum lateness criterion, we present an algorithm that obtains an optimal solution in a relaxed framework where our algorithm is given strictly more resources (i.e. a machine with higher processing speed) than the optimal offline algorithm to which it is compared. For this criterion, when the number of job types is arbitrary, this problem is known to be $NP$-hard even when there are two jobs in each type or when all job types have unit set-up times, there are at most three jobs in each type and there are at most three distinct due dates. Also, for this criterion, when the number of job types and the due dates are arbitrary, there are no constant approximation algorithms unless $P=NP$. %A Srikrishnan Divakaran %A Michel Saks %T Online scheduling with release times and set-ups %D December 21, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-50 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-50.ps.gz %X In this paper, we address the problem of online scheduling of $n$ independent jobs with release times and sequence independent set-ups on a single machine system with the objective of minimizing the maximum flow time. For this problem, we present an $O(1)$-competitive online algorithm. We also present the proof of $NP$-completeness of the offline problem. %A Ali Shokoufandeh %A Yi Zhao %T Proof of a Tiling Conjecture of Komlos %D December 21, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-51 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-51.ps.gz %X Given two graphs $G$ and $H$, an $H$-{\em matching} of $G$ (or a {\em tiling} of $G$ with $H$) is a subgraph of $G$ consisting of vertex-disjoint copies of $H$. For an $r$-chromatic graph $H$ on $h$ vertices, we write $u=u(H)$ for the smallest possible color-class size in any $r$-coloring of $H$. The {\em critical chromatic number} of $H$ is the number $\chi_{cr}(H)=(r-1)h/(h-u)$. 
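As a small worked instance of the formula $\chi_{cr}(H)=(r-1)h/(h-u)$, the sketch below evaluates it with exact rational arithmetic, together with the fraction $1-1/\chi_{cr}(H)$ that appears in the minimum-degree condition of the conjecture. The function names and the two example graphs are chosen here for illustration only.

```python
from fractions import Fraction


def critical_chromatic_number(r, h, u):
    """chi_cr(H) = (r - 1) h / (h - u) for an r-chromatic graph on h vertices
    whose smallest possible colour class has size u."""
    return Fraction((r - 1) * h, h - u)


def min_degree_fraction(r, h, u):
    """The fraction 1 - 1/chi_cr(H) of n used as the minimum-degree bound."""
    return 1 - 1 / critical_chromatic_number(r, h, u)


# Path on three vertices: r = 2, h = 3, smallest colour class u = 1.
path_chi = critical_chromatic_number(2, 3, 1)
path_bound = min_degree_fraction(2, 3, 1)

# Triangle: r = 3, h = 3, u = 1.
triangle_chi = critical_chromatic_number(3, 3, 1)
triangle_bound = min_degree_fraction(3, 3, 1)
```

For the path this gives $\chi_{cr}=3/2$ and a minimum-degree fraction of $1/3$; for the triangle, $\chi_{cr}=3$ coincides with the ordinary chromatic number and the fraction is $2/3$.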
A conjecture of Koml\'{o}s states that for every graph $H$, there is a constant $K$ such that if $G$ is any $n$-vertex graph of minimum degree at least $\left(1-(1/\chi_{cr}(H))\right)n$, then $G$ contains an $H$-matching that covers all but $K$ vertices of $G$. In this paper we prove that the conjecture holds for all sufficiently large values of $n$. %A A. Barg %A G. R. Blakley %A G. Kabatiansky %T Digital fingerprinting codes: Problems statements, constructions, identification of traitors %D November 21, 2001 %Z Mon, 31 Dec 2001 09:00:00 GMT %I DIMACS %R 2001-52 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-52.ps.gz %X We consider a general fingerprinting problem of digital data under which coalitions of users can alter or erase some bits in their copies in order to create an illegal copy. Each user is assigned a fingerprint which is a word in a fingerprinting code of size M (the total number of users) and length n. We present binary fingerprinting codes secure against size-t coalitions which enable the distributor (decoder) to recover at least one of the users from the coalition with probability of error exp(-Omega(n)) for M=exp(Omega(n)). This is an improvement over the best known schemes that provide the error probability no better than exp(-Omega(n^{1/2})) and for this probability support at most exp(O(n^{1/2})) users. The construction complexity of codes is polynomial in n. We also present versions of these constructions that afford poly(n)=poly log(M) identification algorithms of a member of the coalition, improving over the best previously known complexity of Omega(M).

For the case t=2 we construct codes of exponential size with even stronger performance, namely, the distributor can either recover both users from the coalition with probability 1-exp(-Omega(n)), or identify one traitor with probability 1.
%A I. E. Zverovich
%T Weighted stability in hypergraphs and weighted domination in split graphs
%D January 1, 2002
%Z Mon, 31 Dec 2001 09:00:00 GMT
%I DIMACS
%R 2001-53
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-53.ps.gz
%X
We define a family of hereditary subclasses of split graphs where the weighted domination problem is polynomially solvable. Our approach is similar to that of Balas and Yu \cite{BalasY89} for maximal stable sets in graphs.

We apply the result to the subproblem of finding the weighted stability number of a hypergraph.
%A Vince Grolmusz
%T A Trade-Off for Covering the Off-Diagonal Elements of Matrices
%D January 22, 2002
%Z Tue Apr 23 18:22:25 EDT 2002
%I DIMACS
%R 2002-01
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-01.ps.gz
%X
We would like to cover all the off-diagonal elements of an $n\times n$ matrix by not necessarily contiguous rectangular submatrices; the diagonal elements cannot be covered. It is not difficult to give a cover with $2\lceil \log n\rceil$ rectangles, where some off-diagonal elements are covered as many as $\lceil \log n\rceil$ times, or another cover, using $n$ rectangles, in which every off-diagonal element of the matrix is covered only once. We show that one cannot attain {\it both} low covering multiplicity and a small number of covering rectangles at the same time: We prove a trade-off between these two numbers.
%A D. Frank Hsu
%A Xingde Jia
%T Additive Bases and Extremal Problems in Groups, Graphs and Networks
%D January 22, 2002
%Z Tue Apr 23 18:22:26 EDT 2002
%I DIMACS
%R 2002-02
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-02.ps.gz
%X
Bases in sets and groups and their extremal problems have been studied in additive number theory such as the postage stamp problem. On the other hand, Cayley graphs based on specific finite groups have been studied in algebraic graph theory and applied to constant efficient communication networks such as circulant networks with minimum diameter (or transmission delay). In this paper we establish a framework which defines and unifies additive bases in groups, graphs and networks and survey results on the bases and their extremal problems. Some well known and well studied problems such as harmonious graphs and perfect addition sets are also shown to be special cases of the framework.
%A Victoria Ungureanu %T Regulating E-Commerce through Contract Certificates %D January 25, 2002 %Z Tue Apr 23 18:22:26 EDT 2002 %I DIMACS %R 2002-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-03.ps.gz %X

Enforcing e-commerce contracts is difficult because an enterprise may be concurrently bound by a very large number of commercial agreements, and because these agreements are likely to change over time or to be annulled. We argue that it is not feasible to establish a dedicated server for each contract, nor is it possible to construct a composition of such commercial agreements. To deal with these problems we propose to embed contracts in certificates. We show that disseminating contracts as certificates facilitates deployment, annulment and revision of contracts. We propose a language for stating contract terms, and present several formal examples.

We describe here our implementation, which can be used as an extension to a web server, or as a separate server with interface to application. The proposed model does not require any modification of the current certificate infrastructure, and only minor modifications to servers. %A Melvin F. Janowitz %T The Controversy About Continuity of Clustering Algorithms %D February 5, 2002 %Z Tue Apr 23 18:22:27 EDT 2002 %I DIMACS %R 2002-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-04.ps.gz %X It is not meaningful to talk about continuous cluster algorithms for use with data having only ordinal significance. An ordinal version of continuity for such data is introduced and studied. %A Melvin F. Janowitz %T The split systems generated by a connected graph %D February 5, 2002 %Z Tue Apr 23 18:22:28 EDT 2002 %I DIMACS %R 2002-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-05.ps.gz %X The bridges of a connected graph G are shown to generate splits of G that are compatible in the sense that they generate a tree. The results are extended to generalized bridges, and the theory is applied to develop new clustering algorithms. These algorithms operate by removing at each level any bridges (or generalized bridges) from the threshold graph at that level before applying any clustering criterion. %A Zdenek Dvorak %A Jan Kara %A Daniel Kral %A Ondrej Pangrac %T On Pattern Coloring of Cycle Systems %D February 19, 2002 %Z Tue Apr 23 18:22:29 EDT 2002 %I DIMACS %R 2002-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-06.ps.gz %X

A k-cycle system is a system of cyclically ordered k-tuples of a finite set. A pattern is a sequence of letters. A coloring of a k-cycle system with respect to a set of patterns of length k is proper iff each cycle is colored consistently with one of the patterns, i.e. the same/distinct letters correspond to (the) same/distinct color(s). The feasible set of a cycle system is the set of all l's such that there exists a proper coloring of it using exactly l colors.
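The consistency condition between a coloured cycle and a pattern can be written as a short predicate. The sketch below is one reading of the definition and makes an explicit assumption: each cycle is compared with a pattern position by position as listed, and cyclic rotations are not tried. The names `consistent` and `is_proper` are hypothetical.

```python
def consistent(pattern, colours):
    """Same letters must get the same colour, distinct letters distinct colours."""
    if len(pattern) != len(colours):
        return False
    k = len(pattern)
    return all(
        (pattern[i] == pattern[j]) == (colours[i] == colours[j])
        for i in range(k)
        for j in range(i + 1, k)
    )


def is_proper(cycles, patterns, colouring):
    """True if every cycle is coloured consistently with at least one pattern.

    Assumption: cycles are tuples read from their listed first element.
    """
    return all(
        any(consistent(p, [colouring[v] for v in cyc]) for p in patterns)
        for cyc in cycles
    )


colouring = {1: "red", 2: "red", 3: "blue"}
proper = is_proper([(1, 2, 3)], ["aab", "abc"], colouring)
colours_used = len(set(colouring.values()))
```

In the example the single cycle matches the pattern `aab`, so the colouring is proper and uses exactly two colours.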

For all combinations of a pattern set P and a number l, we either find a polynomial algorithm or prove NP--completeness of the problem whether a given cycle system with a set of patterns P can be colored by at most l colors. We further construct a cycle system with a prescribed feasible set for almost every set of patterns containing only two different letters. %A Gary Gordon %T Expected rank in antimatroids %D February 21, 2002 %Z Tue Apr 23 18:22:30 EDT 2002 %I DIMACS %R 2002-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-07.ps.gz %X We consider a probabilistic antimatroid $A$ on the ground set $E$, where each element $e \in E$ may succeed with probability $p_e$. We focus on the expected rank $ER(A)$ of a subset of $E$ as a polynomial in the $p_e$. General formulas hold for arbitrary antimatroids, and simpler expressions are valid for certain well-studied classes, including trees, rooted trees, posets, and finite subsets of the plane. We connect the Tutte polynomial of an antimatroid to $ER(A)$. When $S$ is a finite subset of the plane with no three points collinear, we derive an expression for the expected rank which has surprising symmetry properties. Corollaries include new formulas involving the beta invariant of subsets of $S$ and new proofs of some known formulas. %A David Avis %A Vasek Chvatal %T On a conjecture of Baiou and Balinski %D March 14, 2002 %Z Tue Apr 23 18:22:31 EDT 2002 %I DIMACS %R 2002-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-08.ps.gz %X We disprove a conjecture of Baiou and Balinski concerning a variation on the Birkhoff-von Neumann theorem. 
%A Vince Grolmusz
%T Pairs of Codes with Prescribed Hamming Distances and Coincidences
%D March 21, 2002
%Z Tue Apr 23 18:22:31 EDT 2002
%I DIMACS
%R 2002-09
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-09.ps.gz
%X
In this work we describe a fast algorithm for generating pairs of q-ary codes with prescribed pairwise Hamming-distances and coincidences (for a letter $s\in\{0,1,\ldots,q-1\}$, the number of $s$-coincidences between codewords $a$ and $b$ is the number of letters $s$ in the same positions both in $a$ and $b$). The method is a generalization of a method for constructing set-systems with prescribed intersection sizes (V. Grolmusz: Constructing Set-Systems with Prescribed Intersection Sizes, DIMACS Technical Report No. 2001-03), where only the case $q=2$ and $s=1$ was examined. We also generate codes with prescribed $k$-wise coincidences and Hamming-distances.
%A Boris Galitsky
%T Use of virtual semantic headers to improve the coverage of natural language question answering domains
%D March 24, 2002
%Z Tue Apr 23 18:22:32 EDT 2002
%I DIMACS
%R 2002-10
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-10.ps.gz
%X
We report on the knowledge representation mechanism designed for a natural language question-answering system to function in such poorly-formalized and heterogeneous domains as the financial, legal, pharmaceutical and psychological. The system is oriented to provide customized expert advice, given the pre-designed set of textual templates and the database that contains customers’ profiles and preferences, parameters of products and services, etc. Question-answering is based on matching a formal representation of a query against the formalized representations of answers’ essential ideas (semantic headers of these answers). Semantic headers are designed to be independent of the query phrasing and are the means of approximate reasoning while generating the most relevant advice.
A semantic skeleton of an answer includes semantic headers and deductive links between them and their entities, based on common-sense domain knowledge. A semantic skeleton also includes virtual semantic headers, which do not have to be explicitly programmed but are generated on the fly, using the clauses of the semantic skeleton, to be matched with a question.

We present the evaluation of the released question-answering system,
advising the customers of H&R Block and CBS Market Watch on various taxes
and associated legal issues starting from 1999. Domain development and
maintenance implications of semantic header technique, answer accuracy,
meaning deviations and overall customer impressions are analyzed.
%A Sorin Alexe
%A Eugene Blackstone
%A Peter L. Hammer
%A Hemant Ishwaran
%A Michael S. Lauer
%A Claire E. Pothier Snader
%T Coronary Risk Prediction by Logical Analysis of Data
%D March 28, 2002
%Z Tue Apr 23 18:22:34 EDT 2002
%I DIMACS
%R 2002-11
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-11.ps.gz
%X
The objective of this study was to distinguish within a population
of patients with known or suspected coronary artery disease groups at high
and at low mortality rates. The study was based on Cleveland Clinic
Foundation's dataset of 9454 patients, of whom 312 died during an
observation period of 9 years. The Logical Analysis of Data method was
adapted to handle the disproportionate sizes of the two groups of patients,
and the inseparable character of this dataset -- characteristic of many
medical problems. As a result of the study, we have identified a high-risk
group of patients representing 1/5 of the population, with a mortality
rate 4 times higher than the average, and including 3/4 of the patients
who died. The low-risk group identified in the study, representing
approximately 4/5 of the population, had a mortality rate 3 times lower
than the average.
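The rates implied by these figures can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract and assumes that "3 times lower" means one third of the average rate; the variable names are illustrative.

```python
patients = 9454
deaths = 312

# Average mortality over the 9-year observation period.
average_rate = deaths / patients

# High-risk group: mortality 4 times the average.
high_risk_rate = 4 * average_rate

# Low-risk group: assumed here to mean one third of the average.
low_risk_rate = average_rate / 3

print(f"average:   {average_rate:.2%}")
print(f"high risk: {high_risk_rate:.2%}")
print(f"low risk:  {low_risk_rate:.2%}")
```

The group fractions in the abstract (1/5, 4/5, 3/4) are rounded, so these rates should be read as approximate.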
%A Nathaniel Dean
%T Mathematical Programming Formulation of Rectilinear Crossing Minimization
%D March 28, 2002
%Z Tue Apr 23 18:22:34 EDT 2002
%I DIMACS
%R 2002-12
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-12.ps.gz
%X
In a {\em rectilinear drawing} of a simple graph $G$ each vertex is
mapped to a distinct point in the plane and each edge is represented by
a straight-line segment with appropriate ends. The goal of rectilinear
crossing minimization is to find a rectilinear drawing of $G$ with as
few edge crossings as possible.
A new approach to rectilinear crossing minimization is presented
including a formulation of the problem as a mathematical program with
a linear objective function and simple quadratic constraints. Then we
size the problem for various graph families.
These sizings provide a mechanism for ranking crossing number problems
according to their apparent complexity.
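The quantity being minimized can be illustrated by counting crossings in a given rectilinear drawing. The sketch below is not the mathematical programming formulation of the report; it is a direct orientation-test count that assumes the vertices are in general position (no three collinear), and its function names are introduced here.

```python
from itertools import combinations


def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """Proper crossing of segments ab and cd, assuming general position."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def count_crossings(pos, edges):
    """Number of crossing pairs among edges that share no endpoint."""
    total = 0
    for (u, v), (x, y) in combinations(edges, 2):
        if len({u, v, x, y}) < 4:
            continue  # edges with a common vertex do not count as crossing
        if segments_cross(pos[u], pos[v], pos[x], pos[y]):
            total += 1
    return total


k4_edges = list(combinations(range(4), 2))
square = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
triangle_with_centre = {0: (0, 0), 1: (4, 0), 2: (2, 4), 3: (2, 1)}
```

Drawing $K_4$ with its vertices on a square produces one crossing (the two diagonals), while placing the fourth vertex inside a triangle produces none.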
%A I. E. Zverovich
%A I. I. Zverovich
%T Graph Ramseian bipartitions and weighted stability
%D April 14, 2002
%Z Tue Apr 23 18:22:35 EDT 2002
%I DIMACS
%R 2002-13
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-13.ps.gz
%X
Let *P* and *Q* be hereditary classes of graphs.
The ordered pair (*P*, *Q*) is called {\em Ramseian} if both *P* and *Q*
are polynomially recognizable, *P* is \alpha-bounded, and *Q* is \omega-bounded.
Let (*P*, *Q*) be an ordered pair of graph classes.
We denote by *P***Q* the class of all graphs G such that there exists a partition
A \cup B = V(G) with

- G(A) \in *P* and
- G(B) \in *Q*.

A class of graphs

Our main result is the following theorem.

**Theorem.** *Let (P, Q) be a Ramseian pair.*

(i) If Q is an \alpha_w-polynomial class then the class P*Q is also \alpha_w-polynomial.

(ii) If Q is an \alpha_w-complete class then the class P*Q is also \alpha_w-complete.

Similar results for \omega_w-polynomial classes and \omega_w-complete classes easily follow (\omega_w(G) is the weighted clique number of a graph G). Finally, a recent result of Alekseev and Lozin (2002) is a particular case of our main theorem.

**Keywords**: Hereditary class, forbidden induced subgraphs,
Ramseian partition, weighted stability number.
%A Mohamed Haouari
%A Mohammad A. Al-Fawzan
%T A Bi-objective Model for Maximizing the Quality in Project Scheduling
%D April 14, 2002
%Z Tue Apr 23 18:22:36 EDT 2002
%I DIMACS
%R 2002-14
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-14.ps.gz
%X
Traditionally, the Resource Constrained Project Scheduling Problem (RCPSP)
is investigated in the operations research literature from the makespan
minimization perspective. However, a recent survey conducted in the United
States revealed the surprising fact that the majority of project planners
consider maximization of the quality of project schedules as the most
important objective (Icmeli Tukel and Rom, 1998). In this paper, the
integration of quality in project scheduling is investigated. For that
purpose, the problem is modeled as a bi-objective resource-constrained
project scheduling problem. A new objective defined as the schedule
robustness is introduced as a quality measure. The maximization of this
objective is considered along with the makespan minimization. A tabu
search algorithm is developed in order to generate an approximate set of
efficient solutions. Several variants of the algorithm are tested and
compared on a large set of benchmark problems. The results are analyzed
using statistical design of experiments techniques.
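The notion of an efficient (non-dominated) solution for the two objectives above can be sketched as a simple filter. This is not the tabu search of the report; it only shows how a set of candidate schedules, each summarized by a hypothetical pair (makespan, robustness), is reduced to its efficient subset.

```python
def dominates(a, b):
    """a dominates b if it is no worse in both objectives and differs from b.

    Each schedule is a pair (makespan, robustness): makespan is minimized,
    robustness is maximized.
    """
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def efficient_set(schedules):
    """Keep the schedules that no other schedule dominates."""
    return [s for s in schedules
            if not any(dominates(t, s) for t in schedules)]


# Illustrative values only, not data from the report.
candidates = [(10, 3), (12, 5), (12, 4), (15, 5), (11, 3)]
front = efficient_set(candidates)
```

In the example, `(12, 4)` and `(15, 5)` are dominated by `(12, 5)`, and `(11, 3)` is dominated by `(10, 3)`, leaving two efficient schedules.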

**Keywords:** resource-constrained project scheduling, multi-objective
combinatorial optimization, tabu search, quality, design of experiments.
%A E. Boros
%A V. Gurvich
%T On Nash-solvability in pure stationary strategies of finite games with perfect information which may have cycles
%D April 18, 2002
%Z Tue Apr 23 18:22:37 EDT 2002
%I DIMACS
%R 2002-15
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-15.ps.gz
%X
Let $g$ be a positional game with perfect information modelled by
a directed graph $G = (V,E)$ which is finite but may have
directed cycles.
A local cost function $f(i,e)$ is given for every player $i \in I$
and for every move $e \in E$. The players are allowed only pure
stationary strategies, that is the move in any position is
deterministic, and may depend only on this present position, not
on the preceding positions or moves. Hence, the resulting play $p$
is a directed path which begins in the initial position $v_0$ and
either (i) ends in a final position or (ii) results in a simple
directed cycle. Given a play $p$, for every player $i \in I$ the
effective cost $f(i,p)$ is defined as the sum of all local costs
along the path $f(i,p) = \sum_{e \in p} f(i,e)$ in case of (i), or
as $f(i,p) = + \infty$ in case of (ii). Let us call a local cost
function $f$ and the corresponding game terminal if $f(i,e) = 0$
for every player $i \in I$ and for every move $e$ that does not
lead to a final position. The players wish to minimize their
effective costs, thus in particular they should try to avoid
cycles. A game is called Nash-solvable if it has at least one Nash
equilibrium in pure stationary strategies and properly
Nash-solvable if the corresponding play results in a final
position, not in a cycle. It is easy to demonstrate that already
zero-sum games with two players may not be solvable. Yet,
Nash-solvability turns into an exciting open problem under the
following simple additional condition: ($\clubsuit$) all local
costs are non-negative, i.e. $f(i,e) \geq 0$ for every player $i
\in I$ and for every move $e \in E$, or under the seemingly weaker but
in fact equivalent condition: ($\spadesuit$) $\sum_{e \in c}
f(i,e) \geq 0$, for every player $i \in I$ and for every simple
directed cycle $c$ in $G$.
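The definition of a play and of the effective cost $f(i,p)$ can be traced in code. The sketch below assumes that the combined pure stationary strategies of all players are given as one dictionary mapping each non-final position to the chosen next position, and that local costs are stored per player and per move; these data layouts and the function names are assumptions made here.

```python
import math


def play(strategy, start, finals):
    """Follow the chosen moves from start.

    Returns (moves, terminated): terminated is False when the play
    revisits a position, i.e. results in a directed cycle.
    """
    moves, seen, pos = [], {start}, start
    while pos not in finals:
        nxt = strategy[pos]
        moves.append((pos, nxt))
        if nxt in seen:
            return moves, False
        seen.add(nxt)
        pos = nxt
    return moves, True


def effective_cost(cost, player, strategy, start, finals):
    """Sum of local costs along a terminating play, +infinity for a cycle."""
    moves, terminated = play(strategy, start, finals)
    if not terminated:
        return math.inf
    return sum(cost[player][e] for e in moves)


# Tiny illustrative game: v0 -> a -> t, with t final.
cost = {1: {("v0", "a"): 2, ("a", "t"): 1, ("a", "v0"): 0},
        2: {("v0", "a"): 0, ("a", "t"): 5, ("a", "v0"): 0}}
terminating = effective_cost(cost, 1, {"v0": "a", "a": "t"}, "v0", {"t"})
cycling = effective_cost(cost, 1, {"v0": "a", "a": "v0"}, "v0", {"t"})
```

With the first strategy player 1 pays 2 + 1 = 3; with the second the play cycles between `v0` and `a`, so the effective cost is infinite.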

In all examples, which we were able to analyze, games satisfying
condition ($\clubsuit$) turned out to be properly Nash-solvable,
yet the Nash-solvability of such games is an open problem. In this
paper we prove proper Nash-solvability for the following special
cases: (a) play-once games, i.e. games in which every player
controls exactly one position, (b) terminal games with only 2
players, and (c) terminal games with only 2 terminal moves. We
also show that in each of these cases a Nash equilibrium can be
constructed in polynomial time in the size of the graph $G$.

**Keywords**: cycle, cyclic game,
directed graph, directed cycle, directed path, effective cost,
local cost, Nash equilibrium,
perfect information, positional game,
potentials, pure strategies, saddle point, stationary strategies, strong
equilibrium, shortest path.
%A Viacheslav Prelov
%A Sergio Verd\'u
%T Second-order Asymptotics of Mutual Information
%D April 19, 2002
%Z Tue Apr 23 18:22:38 EDT 2002
%I DIMACS
%R 2002-16
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-16.ps.gz
%X
A formula for the second-order expansion of the input-output
mutual information of multidimensional channels as the
signal-to-noise ratio goes to zero is obtained. While the
additive noise is assumed to be Gaussian, we deal with very
general classes of input and channel distributions. As special
cases, these channel models include fading channels, channels
with random parameters and channels with almost Gaussian noise.
When the channel is unknown at the receiver, the second term
in the asymptotic expansion depends not only on the
covariance matrix of the input signal but also on the fourth
mixed moments of its components. The second-order asymptotics of
mutual information finds application in the analysis of the
bandwidth-power tradeoff achieved by specific (not necessarily
optimum) input signaling in the wideband regime.

**Keywords:** Mutual Information, Channel Capacity, Fading
Channels, Nonlinear Channels, Low-power communication.
%A Alberto Roverato
%A Guido Consonni
%T Compatible Prior Distributions for DAG models
%D April 19, 2002
%Z Tue Apr 23 18:22:39 EDT 2002
%I DIMACS
%R 2002-17
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-17.ps.gz
%X
The application of certain Bayesian techniques, such as the Bayes Factor
and model averaging, requires the specification of prior distributions on
the parameters of alternative models. We propose a new method for
constructing compatible priors on the parameters of models nested in a
given DAG (Directed Acyclic Graph) model, using a conditioning
approach. We define a class of parameterisations consistent with the
modular structure of the DAG and derive a procedure, invariant within this
class, which we name reference conditioning.

*Keywords:* Bayes factor; Directed acyclic graph; Fisher information
matrix; Graphical model; Invariance; Jeffreys conditioning; Group
reference prior; Reference conditioning; Reparameterisation.
%A Boris Galitsky
%A Andrey Dementiev
%A Sergey Shelepin
%T The set of pair-wise correlated sequence positions contributes to determination of the immunoglobulin fold
%D April 23, 2002
%Z Tue Apr 23 18:22:40 EDT 2002
%I DIMACS
%R 2002-18
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-18.ps.gz
%X
In this study, we resolve the apparent contradiction that a rather limited
number of residues or classes of amino acids (about 10) determines the
fold (immunoglobulin-like) of a sequence about 100 residues long.
The immunoglobulin fold comprises protein superfamilies with rather
dissimilar sequences, with less than 10% identity; their sequences
can be aligned only by taking the 3D structure into account.
Therefore, we believe that discovering additional common features of
the sequences is necessary to explain the existence of a common fold for
these (SCOP) superfamilies. Analysis of pair-wise interconnection
between residues of the multiple sequence alignment helped us to reveal
a set of mutually correlated positions, inherent to almost every
superfamily of the fold. Hence, the set of constant positions plus
the set of variable but mutually correlated ones can serve as a basis for
a common 3D structure among rather dissimilar protein
sequences.
%A Vasek Chvatal
%T Sylvester-Gallai theorem and metric betweenness
%D April 26, 2002
%Z Fri Jul 12 15:43:03 EDT 2002
%I DIMACS
%R 2002-19
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-19.ps.gz
%X
Sylvester conjectured in 1893 and Gallai proved some forty
years later that every finite set S of points in the plane includes
two points such that the line passing through them includes either no
other point of S or all other points of S. There are several ways of
extending the notion of lines from Euclidean spaces to arbitrary
metric spaces. We present one of them and conjecture that, with lines
in metric spaces defined in this way, the Sylvester-Gallai theorem
generalizes as follows: in every finite metric space, there is a line
consisting of either two points or all the points of the space. Then
we present slight evidence in support of this rash conjecture and
finally we discuss the underlying ternary relation of metric
betweenness.
%A Matthew Andrews
%A Antonio Fern\'andez
%A Ashish Goel
%A Lisa Zhang
%T Source Routing and Scheduling in Packet Networks
%D Apr 26, 2002
%Z Fri Jul 12 15:44:42 EDT 2002
%I DIMACS
%R 2002-20
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-20.ps.gz
%X

We study {\em routing} and {\em scheduling} in packet-switched networks. We assume an adversary that controls the injection time, source, and destination for each packet injected. A set of paths for these packets is {\em admissible} if no link in the network is overloaded. We present the first on-line routing algorithm that finds a set of admissible paths whenever this is feasible. Our algorithm calculates a path for each packet as soon as it is injected at its source using a simple shortest path computation. The length of a link reflects its current congestion. We also show how our algorithm can be implemented under today's Internet routing paradigms.

When the paths are known (either given by the adversary or computed as above) our goal is to schedule the packets along the given paths so that the packets experience small end-to-end delays. The best previous delay bounds for deterministic and distributed scheduling protocols were exponential in the path length. In this paper we present the first deterministic and distributed scheduling protocol that guarantees a polynomial end-to-end delay for every packet.

Finally, we discuss the effects of combining routing with scheduling. We first show that some unstable scheduling protocols remain unstable no matter how the paths are chosen. However, the freedom to choose paths can make a difference. For example, we show that a ring with parallel links is stable for all greedy scheduling protocols if paths are chosen intelligently, whereas this is not the case if the adversary specifies the paths.
%A Mayur Datar
%A S. Muthukrishnan
%T Estimating Rarity and Similarity over Data Stream Windows
%D Apr 30, 2002
%Z Fri Jul 12 15:51:09 EDT 2002
%I DIMACS
%R 2002-21
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-21.ps.gz
%X
In the windowed data stream model, we observe items coming in over time. At any time $t$, we consider the window of the last $N$ observations $a_{t-(N-1)},a_{t-(N-2)},\ldots,a_t$, each $a_i \in \{1,\ldots,u\}$; we are allowed to ask queries about the data in the window, for example, to compute the minimum or the median of the items in the window. A crucial restriction is that we are only allowed $o(N)$ (often polylogarithmic in $N$) storage space, that is, space smaller than the window size, so the items within the window cannot be archived. The windowed data stream model arose out of the need to formally reason about the underlying data analysis problems in applications like inter-networking and transaction processing.

In this paper, we study two basic problems in the windowed data stream model. The first is the estimation of the rarity of items in the window. While previous work has studied simple distributional parameters such as the number of distinct items in the window, no prior work has addressed the general problem of estimating the rarity. Our second problem is one of estimating similarity between two data stream windows using the Jaccard coefficient. Prior work has focused on $L_p$ norms; set similarity measures such as the Jaccard coefficient have not been studied before in the windowed data stream model. The problems of estimating rarity and similarity have many applications in mining massive data.

We present novel, simple algorithms for estimating
rarity and similarity on windowed data streams, accurate up to a
factor $1 \pm \epsilon$ using space only
logarithmic in the window size. In both cases,
our solutions are based on
modifications of the powerful min-wise hashing technique.
We expect our solutions to find applications in practice.
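
As a rough illustration of the min-wise hashing idea mentioned above, the following Python sketch estimates the Jaccard coefficient of two static sets. It is not the windowed algorithm of the report: the function names, the linear hash family, and the parameter `k` are assumptions made for illustration, and the sketch keeps whole sets in memory instead of using logarithmic space.

```python
import random

PRIME = 2_147_483_647  # a large prime; items are assumed to be integers in [0, PRIME)


def make_hash_params(k, seed=0):
    """Draw k pairs (a, b) defining hash functions x -> (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def minhash_signature(items, hash_params):
    """Return, for each hash function, the minimum hash value over a non-empty set."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hash_params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of hash functions whose minima agree; this approximates |A & B| / |A | B|."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


if __name__ == "__main__":
    params = make_hash_params(k=200, seed=42)
    window_a = set(range(0, 100))
    window_b = set(range(50, 150))  # true Jaccard coefficient is 50 / 150
    estimate = estimate_jaccard(
        minhash_signature(window_a, params),
        minhash_signature(window_b, params),
    )
    print(f"estimated Jaccard coefficient: {estimate:.3f}")
```

Linear hash functions are only approximately min-wise independent, so the estimate carries some bias in addition to sampling error; a larger `k` reduces the variance.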
%A Rosario Gennaro
%A Yael Gertner
%A Jonathan Katz
%T Bounds on the Efficiency of Encryption and Digital Signatures
%D May 7, 2002
%Z Fri Jul 12 15:51:37 EDT 2002
%I DIMACS
%R 2002-22
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-22.ps.gz
%X
A central focus of modern cryptography is to investigate the weakest
possible assumptions under which various cryptographic algorithms exist.
Typically, a proof that a ``weak'' primitive (e.g., a one-way
function) implies the existence of some ``strong'' algorithm (e.g., a
private-key encryption scheme) proceeds by giving an explicit construction
of the latter from the former.
Beyond merely showing such a construction, an equally important research
direction is to explore the *efficiency* of the
construction.
One might argue that this line of research has become even more important
now that
minimal assumptions are known for many (but not all) algorithms of
interest.

Protocols for encryption (in both the public- and private-key setting) and for digital signatures are fundamental to cryptography. In this work, we show the first lower bounds on the efficiency of constructions of these protocols based on black-box access to one-way or trapdoor one-way permutations. If $S$ is the assumed security of the permutation $\pi$ (i.e., no adversary of size $S$ can ``break'' $\pi$ in the appropriate sense on a fraction larger than $1/S$ of its inputs), our results show that:

- Any public-key encryption algorithm for $m$-bit messages must query $\pi$ at least $\Omega(m/\log S)$ times. This matches the known upper bound.
- Any private-key encryption algorithm for $m$-bit messages which uses a $k$-bit key must query $\pi$ at least $\Omega(\frac{m-k}{\log S})$ times. This matches the known upper bound.
- Any signature verification algorithm for $m$-bit messages must query $\pi$ at least $\Omega(m/\log S)$ times.

We prove our results in an extension of the Impagliazzo-Rudich model. That is, we show that any black-box construction beating our lower bounds would imply the unconditional existence of a one-way function.
%A Endre Boros
%A Khaled Elbassioni
%A Vladimir Gurvich
%A Leonid Khachiyan
%T Generating Dual-Bounded Hypergraphs
%D May 10, 2002
%Z Fri Jul 12 15:51:49 EDT 2002
%I DIMACS
%R 2002-23
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-23.ps.gz
%X

This paper surveys some recent results on the generation of implicitly given hypergraphs and their applications in Boolean and integer programming, data mining, reliability theory, and combinatorics.

Given a monotone property $\pi$ over the subsets of a finite set $V$, we consider the problem of incrementally generating the family $\mathcal{F}_{\pi}$ of all minimal subsets satisfying property $\pi$, when $\pi$ is given by a polynomial-time satisfiability oracle. For a number of interesting monotone properties, the family $\mathcal{F}_{\pi}$ turns out to be {\em uniformly dual-bounded}, allowing for the incrementally efficient enumeration of the members of $\mathcal{F}_{\pi}$.

Important applications include the efficient generation of minimal infrequent sets of a database (data mining), minimal connectivity ensuring collections of subgraphs from a given list (reliability theory), minimal feasible solutions to a system of monotone inequalities in integer variables (integer programming), minimal spanning collections of subspaces from a given list (linear algebra) and maximal independent sets in the intersection of matroids (combinatorial optimization).

In contrast to these results, the analogous problem of generating the family of all maximal subsets not having property $\pi$ is NP-hard for most of the monotone properties $\pi$ considered in the paper.
%A Brenda J. Latka
%T A Classification of Antichains of Finite Tournaments
%D May 10, 2002
%Z Fri Jul 12 15:52:02 EDT 2002
%I DIMACS
%R 2002-24
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-24.ps.gz
%X
Tournament embedding is an order relation on the class of finite tournaments. An antichain is a set of finite tournaments that are pairwise incomparable in this ordering. We say an antichain ${\cal A}$ can be extended to an antichain ${\cal B}$ if ${\cal A}\subseteq {\cal B}$. Those finite antichains that cannot be extended to antichains of arbitrarily large finite cardinality are exactly those that contain a member of each of four families of tournaments. We give an upper bound on the cardinality of extensions of such antichains. This bound depends on the maximum order of the tournaments in the antichain.
%A Yevgeniy Dodis
%A Jonathan Katz
%A Shouhuai Xu
%A Moti Yung
%T Strong Key-Insulated Signature Schemes
%D May 30, 2002
%Z Fri Jul 12 15:52:16 EDT 2002
%I DIMACS
%R 2002-25
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-25.ps.gz
%X

Signature computation is frequently performed on insecure devices --- e.g., mobile phones --- operating in an environment where the private (signing) key is likely to be exposed. Strong key-insulated signature schemes are one way to mitigate the damage done when this occurs. In the key-insulated model \cite{DKXY02}, the secret key stored on an insecure device is refreshed at discrete time periods via interaction with a physically-secure device which stores a ``master key''. All signing is still done by the insecure device, and the public key remains fixed throughout the lifetime of the protocol. In a strong $(t, N)$-key-insulated scheme, an adversary who compromises the insecure device and obtains secret keys for up to $t$ periods is unable to forge signatures for any of the remaining $N-t$ periods. Furthermore, the physically-secure device (or an adversary who compromises only this device) is unable to forge signatures for \emph{any} time period.

We present here constructions of strong key-insulated signature schemes based on a variety of assumptions. First, we demonstrate a generic construction of a strong $(N-1, N)$-key-insulated signature scheme using any standard signature scheme. We then give a construction of a strong $(t, N)$-signature scheme whose security may be based on the discrete logarithm assumption in the random oracle model. This construction offers faster signing and verification than the generic construction, at the expense of $O(t)$ key update time and key length. Finally, we construct strong $(N-1,N)$-key-insulated schemes based on any ``trapdoor signature scheme'' (a notion we introduce here); our resulting construction in fact serves as an identity-based signature scheme as well. This leads to very efficient solutions based on, e.g., the RSA assumption in the random oracle model.
%A Khaled M. Elbassioni
%T An Algorithm for Dualization in Products of Lattices
%D May 30, 2002
%Z Fri Jul 12 15:52:33 EDT 2002
%I DIMACS
%R 2002-26
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-26.ps.gz
%X
Let $\mathcal{L}=\mathcal{L}_1\times\cdots\times\mathcal{L}_n$ be the product of $n$ lattices, each of which has a bounded width. Given a subset $\mathcal{A}\subseteq\mathcal{L}$, we show that the problem of extending a given partial list of maximal independent elements of $\mathcal{A}$ in $\mathcal{L}$ can be solved in quasi-polynomial time.
%A Endre Boros
%A Khaled Elbassioni
%A Vladimir Gurvich
%A Leonid Khachiyan
%T Extending the Balas-Yu Bounds on the Number of Maximal Independent Sets in Graphs to Hypergraphs and Lattices
%D Jun 3, 2002
%Z Fri Jul 12 15:52:45 EDT 2002
%I DIMACS
%R 2002-27
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-27.ps.gz
%X
A result of Balas and Yu (1989) states that the number of maximal independent sets of a graph $G$ is at most $\delta^{p} + 1$, where $\delta$ is the number of pairs of vertices in $G$ at distance 2, and $p$ is the cardinality of a maximum induced matching in $G$. In this paper, we give an analogue of this result for hypergraphs and, more generally, for subsets of vectors $\mathcal{B}$ in the product of $n$ lattices $\mathcal{L}=\mathcal{L}_1\times\cdots\times\mathcal{L}_n$, where the notion of an induced matching in $G$ is replaced by a certain binary tree each internal node of which is mapped into $\mathcal{B}$. We show that our bounds may be nearly sharp for arbitrarily large hypergraphs and lattices. As an application, we prove that the number of maximal infeasible vectors $x \in \mathcal{L}=\mathcal{L}_1\times\cdots\times\mathcal{L}_n$ for a system of polymatroid inequalities $f_1(x) \ge t_1,\ldots,f_r(x) \ge t_r$ does not exceed $\max\{Q,\beta^{\log t/c(2Q,\beta)}\}$, where $\beta$ is the number of minimal feasible vectors for the system, $Q=|\mathcal{L}_1|+\ldots+|\mathcal{L}_n|$, $t=\max\{t_1,\ldots,t_r\}$, and $c(\rho,\beta)$ is the unique positive root of the equation $2^c(\rho^{c/\log \beta}-1)=1$. This bound is nearly sharp for the Boolean case $\mathcal{L}=\{0,1\}^n$, and it allows for the efficient generation of all minimal feasible sets to a given system of polymatroid inequalities with quasi-polynomially bounded right-hand sides $t_1, \ldots, t_r$.
%A I. E. Zverovich
%T The Domination Number of $(K_p, P_5)$-Free Graphs
%D Jun 7, 2002
%Z Fri Jul 12 15:52:56 EDT 2002
%I DIMACS
%R 2002-28
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-28.ps.gz
%X
We prove that, for each $p \ge 1$, there exists a polynomial time algorithm for finding a minimum domination set in the class of all $(K_p, P_5)$-free graphs.
%A L. Becchetti
%A S. Diggavi
%A S. Leonardi
%A A. Marchetti-Spaccamela
%A S. Muthukrishnan
%A T. Nandagopal
%A A. Vitaletti
%T Parallel Scheduling Problems in Next Generation Wireless Networks
%D Jun 7, 2002
%Z Fri Jul 12 15:55:31 EDT 2002
%I DIMACS
%R 2002-29
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-29.ps.gz
%X
Next generation 3G/4G wireless data networks allow multiple codes (or channels) to be allocated to a single user, where each code can support multiple data rates. Providing fine-grained QoS to users in such networks poses the two-dimensional challenge of assigning {\em both} power (rate) and codes for every user. This gives rise to a new class of parallel scheduling problems. We abstract general downlink scheduling problems suitable for proposed next generation wireless data systems. This includes a communication-theoretic model for multirate wireless channels. In addition, while the conventional focus has been on throughput maximization, we attempt to optimize the maximum response time of jobs, which is more suitable for streams of user requests. We present provable results on the algorithmic complexity of these scheduling problems. In particular, we are able to provide very simple, online algorithms for approximating the optimal maximum response time. This relies on resource augmented competitive analysis. We also perform an experimental study with realistic data of channel conditions and user requests to show that our algorithms are more accurate than our worst case analysis shows, and they provide fine-grained QoS to users effectively.
%A Yan Wang
%A Xin Gui Fang
%A D. Frank Hsu
%T On the Edge-Forwarding Indices of Frobenius Graphs
%D Jun 19, 2002
%Z Fri Jul 12 15:55:32 EDT 2002
%I DIMACS
%R 2002-30
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-30.ps.gz
%X
A $G$-Frobenius graph, as defined recently by Fang, Li, and Praeger, is a connected orbital graph of a Frobenius group $G=K{:}H$ with Frobenius kernel $K$ and Frobenius complement $H$. $\Gamma$ is also shown to be a Cayley graph, $\Gamma=Cay(K,S)$ for $K$ and some subset $S$ of the group $G$. On the other hand, a network $N$ with a routing function $R$, written as $(N,R)$, is an undirected graph $N$ together with a routing $R$ which consists of a collection of simple paths connecting every pair of vertices in the graph. The edge-forwarding index $\pi(\Gamma)$ of a network $(N,R)$, defined by Heydemann, Meyer, and Sotteau, is a parameter to describe the maximum load of edges of $N$. In this paper, we study the edge-forwarding index of Frobenius graphs. In particular, we obtain the edge-forwarding index of a $G$-Frobenius graph $\Gamma$ with rank$(G)\leq 50$ and those of $\Gamma$ which has $type-(n_1,n_2,...,n_d)$ where $d=n,(1,2,3,...,n); d=2n-1,(1,2,...,n-1,n,n-1,...,2,1); d=2n,(1,2,...,n-1,n,n,n-1,...2,1)$, respectively.
%A Brenda J. Latka
%T The Well-quasi-ordered Class of Tournaments Defined by the Forbidden Subtournament $N_5$
%D Jul 21, 2002
%Z Fri Oct 11 18:50:13 EDT 2002
%I DIMACS
%R 2002-31
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-31.ps.gz
%X
The tournament $N_5$ can be obtained from the transitive tournament on $\{1,\ldots,5\}$, with the natural order, by reversing the edges between successive vertices. Tournaments that do {\em not} have $N_5$ as a subtournament are said to {\em omit} $N_5$. We describe the structure of tournaments that omit $N_5$ and use this with Kruskal's Tree Theorem to prove that this class of tournaments is well-quasi-ordered.
The proof involves an encoding of the indecomposable tournaments omitting $N_5$ by a finite alphabet.

The main technical problem is giving an explicit description of the indecomposable tournaments omitting $N_5$. The key to the proof that the explicit description is complete is the observation that for any indecomposable tournament $T$ with $n>1$ vertices, there is a proper indecomposable subtournament of $T$ with $n-2$ or $n-1$ vertices. Thus the claim can be verified by a natural inductive procedure; it suffices to check that for any tournament $T$ in the explicitly given list, any indecomposable extension of $T$ by at most 2 vertices that omits $N_5$ will also be found in our list.
%A Alexander Barg
%A Gilles Zémor
%T Error exponents of expander codes under linear-complexity decoding
%D Jul 21, 2002
%Z Fri Oct 11 18:50:14 EDT 2002
%I DIMACS
%R 2002-32
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-32.ps.gz
%X

A class of codes is said to reach capacity $\mathcal{C}$ of the binary symmetric channel if for any rate $R<\mathcal{C}$ and any $\varepsilon>0$ there is a sufficiently large $N$ such that codes of length $\ge N$ and rate $R$ from this class provide error probability of decoding at most $\varepsilon$, under some decoding algorithm.

The study of the error probability of expander codes was initiated in Barg and Z{\'e}mor (2002), where it was shown that they attain capacity of the binary symmetric channel under a linear-time iterative decoding with error probability falling exponentially with code length $N$. In this work we study variations on the expander code construction and focus on the most important region of code rates, close to the channel capacity. For this region we estimate the decrease rate (the error exponent) of error probability of decoding for randomized ensembles of codes. The resulting estimate gives a substantial improvement of previous results for expander codes and some other explicit code families.
%A R. Collado
%A A. Kelmans
%A D. Krasner
%T On Convex Polytopes in the Plane "Containing" and "Avoiding" Zero
%D Jul 26, 2002
%Z Fri Oct 11 18:50:14 EDT 2002
%I DIMACS
%R 2002-33
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-33.ps.gz
%X

Let S be a finite set of points in the plane, X be a subset of S, and z be a point not in S. X is a z-containing set if z is in the interior of the convex hull of X. Also X is a z-avoiding set if z is not contained in the interior of the convex hull of X. Let C(S) and A(S) denote the sets of minimal z-containing and maximal z-avoiding subsets of S.

E. Boros and V. Gurvich raised the following question:

Is it true that |A(S)| is less than or equal to 2d |C(S)| where d is the dimension of the space?

This inequality is obviously true for d = 1.

In this paper we verify the inequality for d = 2 by proving the following stronger result:

Theorem: Let S be a finite set of points in the plane and z a point not in S. Suppose that z is in the interior of the convex hull of S. Then |A(S)| is less than or equal to 3 |C(S)| + 1. Moreover, the equality holds if and only if |S| = 4, the convex hull of S is a quadrilateral Q, and z is the point of intersection of the two diagonals of Q.
%A Endre Boros
%A Vladimir Gurvich
%A Steven Jaslar
%A Daniel Krasner
%T Stable Matchings in Three-Sided Systems with Cyclic Preferences
%D Aug 16, 2002
%Z Fri Oct 11 18:50:15 EDT 2002
%I DIMACS
%R 2002-34
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-34.ps.gz
%X
We consider generalizations of the Gale-Shapley (1962) Stable Marriage Problem to three-sided families. Alkan (1988) gave an example which shows that in this case stable matchings do not always exist. Here we provide a simpler example demonstrating this fact. Danilov (2001) proved that stable matchings always exist for the special case of certain acyclic preferences and he raised the problem for another special case involving cyclic preferences. Here we show that the answer is still negative by constructing a three-sided system with lexicographically cyclic preferences for which no stable matching exists. Finally, we also consider possible generalizations to $m$-sided families, for $m>3$.

Keywords: Gale-Shapley Theorem, stable matching, stable marriage, cyclic preferences.
%A Graham Cormode
%A S. Muthukrishnan
%T Estimating Dominance Norms of Multiple Data Streams
%D Aug 16, 2002
%Z Fri Oct 11 18:50:15 EDT 2002
%I DIMACS
%R 2002-35
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-35.ps.gz
%X
Data streams often consist of multiple signals. Consider a stream of multiple signals (i,a_i,j) where i's correspond to the domain, j's index the different signals and a_i,j >= 0 to the value of the j'th signal at point i. We study the problem of determining the dominance norms over the multiple signals, in particular the max-dominance norm, defined as sum_i max_j {a_i,j}. It is used in applications to estimate the ``worst case influence'' of multiple processes, for example in IP traffic analysis, electrical grid monitoring and the financial domain. Besides finding many applications, it is a natural measure: it generalizes the notion of union of data streams and may be alternately thought of as estimating the L1 norm of the upper envelope of multiple signals.
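
To make the definition of the max-dominance norm concrete, here is a minimal Python sketch that computes sum_i max_j {a_i,j} exactly. It is only a reference computation, not the sublinear-space algorithm of the report: it stores one value per domain point, and the encoding of stream items as `(i, j, value)` triples, each pair `(i, j)` appearing once, is an assumption made for illustration.

```python
from collections import defaultdict


def max_dominance(stream):
    """Exact max-dominance norm of a stream of (i, j, value) triples.

    i is the domain point, j the signal index, and value >= 0 the value of
    signal j at point i. Memory use is linear in the number of distinct i.
    """
    upper_envelope = defaultdict(int)  # point i -> largest value seen so far
    for i, _signal, value in stream:
        if value > upper_envelope[i]:
            upper_envelope[i] = value
    # The norm is the L1 norm of the upper envelope of the signals.
    return sum(upper_envelope.values())


if __name__ == "__main__":
    stream = [(0, 0, 3), (0, 1, 5), (1, 0, 2), (1, 1, 1)]
    print(max_dominance(stream))
```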

We present the first known data stream algorithm for estimating max-dominance of multiple signals. In particular, we use workspace and time-per-item that are both sublinear (in fact, poly-logarithmic) in the input size. The algorithm is simple and implementable; its analysis relies on using properties of stable random distributions with small parameter alpha, which may be a technique of independent interest. In contrast we show that other dominance norms -- min-dominance sum_i min_j {a_i,j}, count-dominance (|{i | a_i > b_i}|) or relative-dominance (sum_i a_i/max{1,b_i}), are all impossible to estimate accurately with sublinear space.
%A Alexander Barg
%A Gregory Kabatiansky
%T A class of i.p.p. codes with efficient identification
%D Aug 30, 2002
%Z Fri Oct 11 18:50:18 EDT 2002
%I DIMACS
%R 2002-36
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-36.ps.gz
%X
Let C be a code of length n over a q-ary alphabet. An n-word y is called a descendant of a set of t codewords x^1,...,x^t if y_i \in {x^1_i,...,x^t_i} for all i=1,...,n. A code is said to have the t-identifying parent property (t-i.p.p.) if for any n-word y that is a descendant of at most t parents it is possible to identify at least one of them.
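
The descendant relation defined above can be checked directly. The short Python sketch below is only an illustration of that definition; the function name and the representation of words as equal-length strings are assumptions, and it does not implement parent identification.

```python
def is_descendant(y, parents):
    """Return True if word y is a descendant of the given set of codewords.

    y is a descendant when, in every coordinate i, the symbol y[i] equals
    the i-th symbol of at least one parent. All words must have the same length.
    """
    return all(any(y[i] == x[i] for x in parents) for i in range(len(y)))


if __name__ == "__main__":
    parents = ["0000", "1111"]
    print(is_descendant("0101", parents))  # every symbol comes from some parent
    print(is_descendant("0201", parents))  # symbol 2 appears in no parent
```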

An explicit construction is presented of t-i.p.p. codes of rate bounded away from zero, for which identification can be accomplished with complexity poly(n).
%A Paul Lemke
%A Steven S. Skiena
%A Warren D. Smith
%T Reconstructing Sets From Interpoint Distances
%D Sep 5, 2002
%Z Fri Oct 11 18:50:19 EDT 2002
%I DIMACS
%R 2002-37
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-37.ps.gz
%X
Which point sets realize a given distance multiset? Interesting cases include the ``turnpike problem'' where the points lie on a line, the ``beltway problem'' where the points lie on a loop, and multidimensional versions. We are interested both in the algorithmic problem of determining such point sets for a given collection of distances and the combinatorial problem of finding bounds on the maximum number of different solutions. These problems have applications in genetics and crystallography.
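
For readers unfamiliar with the turnpike problem, the following Python sketch shows a standard backtracking approach: repeatedly take the largest unexplained distance and try to place a point at that distance from either end. It is a simplified illustration under stated assumptions (integer distances, leftmost point fixed at 0, hypothetical function name `turnpike`), not necessarily the algorithm analyzed in the report.

```python
from collections import Counter


def turnpike(distances):
    """Return all point sets on a line, as sorted tuples starting at 0,
    whose pairwise distances form exactly the given multiset."""
    remaining = Counter(distances)
    if not remaining:
        return {(0,)}
    width = max(remaining)  # the largest distance spans the two end points
    remaining[width] -= 1
    solutions = set()

    def place(points):
        if sum(remaining.values()) == 0:
            solutions.add(tuple(sorted(points)))
            return
        largest = max(d for d, count in remaining.items() if count > 0)
        # The largest remaining distance must involve one of the two end points.
        for candidate in {largest, width - largest}:
            needed = Counter(abs(candidate - p) for p in points)
            if all(remaining[d] >= count for d, count in needed.items()):
                remaining.subtract(needed)
                points.append(candidate)
                place(points)
                points.pop()
                remaining.update(needed)  # undo the choice before backtracking

    place([0, width])
    return solutions


if __name__ == "__main__":
    # Pairwise distances of the points 0, 2, 5, 9.
    print(sorted(turnpike([2, 3, 4, 5, 7, 9])))
```

The search returns a point set together with its mirror image, since both realize the same distances. The worst-case running time of this sketch is exponential in the number of points.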

We give an extensive survey and bibliography in an effort to connect the independent efforts of previous researchers, and present many new results. In particular, we give improved combinatorial bounds for the turnpike and beltway problems. We present a pseudo-polynomial time algorithm as well as a practical $O( 2^n n \log n )$-time algorithm that find all solutions to the turnpike problem, and show that certain other variants of the problem are NP-hard. We conclude with a list of open problems.
%A V.E. Alekseev
%A D.V. Korobitsyn
%A V.V. Lozin
%T Boundary classes of graphs for the dominating set problem
%D Sep 12, 2002
%Z Fri Oct 11 18:50:20 EDT 2002
%I DIMACS
%R 2002-38
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-38.ps.gz
%X
The notion of a boundary class has been recently introduced as a tool for classification of hereditary classes of graphs according to the time complexity of NP-hard graph problems. In the present paper we concentrate on the dominating set problem and obtain three boundary classes for it.
%A Carlos Castillo-Chavez
%A Fred S. Roberts
%T Report on DIMACS Working Group Meeting: Mathematical Sciences Methods for the Study of Deliberate Releases of Biological Agents and their Consequences
%D Sep 18, 2002
%Z Fri Oct 11 18:50:21 EDT 2002
%I DIMACS
%R 2002-39
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-39.ps.gz
%X
Report on DIMACS Working Group Meeting: Mathematical Sciences Methods for the Study of Deliberate Releases of Biological Agents and their Consequences
%A Fred S. Roberts
%T Challenges for Discrete Mathematics and Theoretical Computer Science in the Defense Against Bioterrorism
%D Sep 22, 2002
%Z Fri Oct 11 18:50:21 EDT 2002
%I DIMACS
%R 2002-40
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-40.ps.gz
%X
Dealing with bioterrorism requires detailed planning of preventive measures and responses.
Both planning and response require precise reasoning and extensive analysis of the type that mathematical scientists are very good at. Discrete mathematics and theoretical computer science, broadly defined, seem very relevant. We describe challenges for discrete mathematics and theoretical computer science raised by the need to plan for and defend against bioterrorist attacks.
%A Victoria Ungureanu
%A Phillip G. Bradford
%A Michael Katehakis
%A Benjamin Melamed
%T Deferred Assignment Scheduling in Clustered Web Servers
%D Oct 1, 2002
%Z Fri Oct 11 18:50:23 EDT 2002
%I DIMACS
%R 2002-41
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-41.ps.gz
%X
This paper proposes new scheduling policies for clustered servers, which are based on job size. The proposed algorithms are shown to be efficient, simple and easy to implement. They differ from traditional methods in the way jobs are assigned to back-end servers. The main idea is to defer scheduling as much as possible in order to make better use of the accumulated information on job sizes. Furthermore, the proposed algorithms are designed to work effectively with the class of job-size distributions often encountered on the Internet. To gauge the efficacy of the proposed algorithms, the paper presents an empirical case study that shows these algorithms perform well on input from real-life trace data measured at Internet clustered servers.
%A Boris Galitsky
%A Dmitry Vinogradov
%T Using reasoning about dynamic domains and inductive reasoning for automatic processing of claims
%D Oct 17, 2002
%Z Fri Dec 13 10:55:03 EDT 2002
%I DIMACS
%R 2002-42
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-42.ps.gz
%X
We report on a novel approach to modeling a dynamic domain with limited knowledge. Such a domain may include participating agents, for some of which we are uncertain about their motivations and decision-making principles.
Our model for such a domain includes deductive and inductive components. The former is based on situation calculus and describes the behavior of agents with complete information. The latter is a machine learning-based inductive component that uses its previous experience to predict the agents' actions.

The suggested reasoning machinery is applied to the problem of processing the claims of unsatisfied customers. The task is to predict the future action of a participating agent (the company that has upset the customer) in order to determine the course of action required to settle the claim. We believe our framework reflects the general setting of reasoning in dynamic domains under conditions of uncertainty, merging analytical and analogy-based reasoning.
%A I. E. Zverovich
%T Unique domination and domination perfect graphs
%D Oct 17, 2002
%Z Fri Dec 13 10:55:04 EDT 2002
%I DIMACS
%R 2002-43
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-43.ps.gz
%X
We review a characterization of domination perfect graphs in terms of forbidden induced subgraphs obtained by Zverovich and Zverovich \cite{ZverovichZ95} using a computer code. Then we apply it to a problem of unique domination in graphs recently proposed by Fischermann and Volkmann.
%A Andreas Brandstadt
%A Peter L. Hammer
%A Van Bang Le
%A Vadim V. Lozin
%T Bisplit Graphs
%D Oct 24, 2002
%Z Fri Dec 13 10:55:05 EDT 2002
%I DIMACS
%R 2002-44
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-44.ps.gz
%X
A graph is bisplit if it can be partitioned into a stable set and a bi-clique (i.e. a complete bipartite graph). We provide an O(nm) recognition algorithm for these graphs, and characterize them in terms of forbidden induced subgraphs. Moreover, we show that the recognition problem of the slightly larger class of weak bisplit graphs is NP-complete.
%A Vadim Mottl %A Ilya Muchnik %T Serial and tree-serial dynamic programming with application to fact identification %D Nov 15, 2002 %Z Fri Dec 13 10:55:06 EDT 2002 %I DIMACS %R 2002-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-45.ps.gz %X This report continues the series of DIMACS Technical Reports on the variational approach to constructing algorithms of data analysis that consists in systematically exploiting the fact that all the kinds of decision making rest on the intent, at least, a mental one, to minimize an appropriate measure of discrepancy between the observed data set and result of its processing in the form of a model of the original data set to be evaluated. The specificity of the optimization problems adequate to a majority of practical data analysis problems is that the variables constituting the sought-for data set model are associated with the nodes of a graph that determines its a priori structure. As a rule, the objective function to be minimized turns out to be pair-wise separable in accordance with this graph, i.e. be the sum of partial functions of one and two variables associated with nodes and edges of graph.

However, for a separable objective function of general kind, the problem of global optimization does not lend itself to an effective algorithmical solution until the variable adjacency graph is assumed to be a tree. For this latter kind of objective functions, the original optimization problem breaks up into a hierarchy of partial problems each of which consists in the optimization of a set of functions of only one variable. The proposed principle of solving tree-like pair-wise separable optimization problems on the basis of a hierarchical elimination of variables is called in this Report the tree-serial dynamic programming as a generalization of the classical serial dynamic programming.
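
The hierarchical elimination of variables described above can be illustrated with a minimal sketch. It is not the report's implementation: it assumes a finite label set shared by all nodes, nodes numbered 0..n-1, a connected tree, and hypothetical cost callbacks `node_cost` and `edge_cost` chosen here for illustration.

```python
from collections import defaultdict


def tree_serial_dp(n_nodes, edges, labels, node_cost, edge_cost, root=0):
    """Minimize sum_v node_cost(v, x_v) + sum_(p,c) edge_cost(p, c, x_p, x_c).

    `edges` must form a tree on nodes 0..n_nodes-1. `edge_cost` is always
    called as edge_cost(parent, child, x_parent, x_child).
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Depth-first order from the root: every parent precedes its children.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)

    # Upward pass: eliminate each node into its parent. Every step is an
    # optimization over a single variable, for each label of the parent.
    partial = {v: {x: node_cost(v, x) for x in labels} for v in range(n_nodes)}
    best_label = {}
    for v in reversed(order):
        p = parent[v]
        if p is None:
            continue
        choice = {}
        for xp in labels:
            xv = min(labels, key=lambda x: partial[v][x] + edge_cost(p, v, xp, x))
            choice[xp] = xv
            partial[p][xp] += partial[v][xv] + edge_cost(p, v, xp, xv)
        best_label[v] = choice

    # Downward pass: fix the root, then read off each child's best label.
    assignment = {root: min(labels, key=lambda x: partial[root][x])}
    for v in order[1:]:
        assignment[v] = best_label[v][assignment[parent[v]]]
    return partial[root][assignment[root]], assignment


if __name__ == "__main__":
    observed = [0, 2, 2, 0]  # illustrative data attached to the nodes
    total, labels_found = tree_serial_dp(
        4,
        [(0, 1), (1, 2), (1, 3)],
        [0, 1, 2],
        lambda v, x: (x - observed[v]) ** 2,   # data-fit term per node
        lambda p, c, xp, xc: abs(xp - xc),     # smoothness term per edge
    )
    print(total, labels_found)
```

With L labels and n nodes the sketch performs on the order of n·L² cost evaluations, which is linear in the number of variables for a fixed label set.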

Such a structure of the interaction of variables in an objective function is an intermediate special case between easily solvable serial problems and challenging nonserial ones. It is shown that such a moderate generalization, on the one hand, retains practically all the computational advantages of the serial problems and, on the other, can easily be used, with a bit of heuristics, to solve typical image processing problems.

As a practical problem, we address in this work the problem of face recognition from a single image. The tree-serial dynamic programming is applied to fulfill a nonlinear transformation of one image plane relative to another by spatially constrained elastic matching of two pixel grids for the purpose of featureless face identification by immediate measuring image similarity. The method provides the linear computational complexity with respect to the number of pixels without application of parallel computers.

It will be shown in the next Report that the proposed approach is adequate not only for the problem of face identification but also for the problems of face detection and tracking in an image flow, and so covers the entire concept of dynamic vision. %A Philip MacKenzie %T The PAK Suite: Protocols for Password-Authenticated Key Exchange %D Oct 24, 2002 %Z Fri Dec 13 10:55:07 EDT 2002 %I DIMACS %R 2002-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-46.ps.gz %X In this paper we give a detailed formal description of the PAK password-authenticated key exchange protocol and some variants, and provide complete proofs of security which we believe are more straightforward than the original proofs. We also show a new general method (called the Z-method) for making these protocols resilient to server-compromise, so as to not allow an attacker that obtains password verification data from a server to then impersonate a user. When this method is applied to PAK, we call the resulting protocol PAK-Z. Finally, we discuss the current state-of-the-art in password-authenticated key exchange, with respect to both theory and practice. %A I. E. Zverovich %A I. I. Zverovich %T Ratio of generalized parameters %D Nov 12, 2002 %Z Fri Dec 13 10:55:08 EDT 2002 %I DIMACS %R 2002-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-47.ps.gz %X

Let ${\mathcal{P}}$ be a class of graphs. A {\em ${\mathcal{P}}$-set} in a graph $G$ is a vertex subset $X$ such that $G(X) \in {\mathcal{P}}$. We define the {\em ${\mathcal{P}}$-stability number} of a graph $G$, $\alpha_{\mathcal{P}}(G)$, as the maximum cardinality of a ${\mathcal{P}}$-set in $G$.

A {\em ${\mathcal{P}}$-dominating set} in a graph is a dominating ${\mathcal{P}}$-set. Accordingly, the {\em ${\mathcal{P}}$-domination number} of a graph $G$, $\gamma_{\mathcal{P}}(G)$, is the minimum cardinality of a ${\mathcal{P}}$-dominating set in $G$.

For each $r \ge 1$, we define a graph $G$ to be an {\em $\alpha_{\mathcal{P}}/\gamma_{\mathcal{Q}}(r)$-perfect graph} if $\alpha_{\mathcal{P}}(H)/\gamma_{\mathcal{Q}}(H) \le r$ for each induced subgraph $H$ of $G$.

We characterize all classes of $\alpha_{\mathcal{P}}/\gamma_{\mathcal{Q}}(r)$-perfect graphs in terms of forbidden induced subgraphs for all hereditary classes ${\mathcal{P}}$ and ${\mathcal{Q}}$ containing all edgeless graphs, and for all real numbers $r \ge 1$. We propose a number of related open problems and conjectures. %A N. V. R. Mahadev %A Fred S. Roberts %T Consensus List Colorings of Graphs and Physical Mapping of DNA %D Nov 13, 2002 %Z Fri Dec 13 10:55:09 EDT 2002 %I DIMACS %R 2002-48 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-48.ps.gz %X In graph coloring, one assigns a color to each vertex of a graph so that neighboring vertices get different colors. We shall talk about a bioconsensus problem relating to graph coloring and discuss the applicability of the ideas to the DNA physical mapping problem. In many applications of graph coloring, one gathers data about the acceptable colors at each vertex. A list coloring is a graph coloring so that the color assigned to each vertex belongs to the list of acceptable colors associated with that vertex. We consider the situation where a list coloring cannot be found. If the data contained in the lists associated with each vertex are made available to individuals associated with the vertices, it is possible that the individuals can modify their lists through trades or exchanges until the group of individuals reaches a set of lists for which a list coloring exists. We describe several models under which such a consensus set of lists might be attained. In the physical mapping application, the lists consist of the sets of possible copies of a target DNA molecule from which a given clone was obtained and trades or exchanges correspond to correcting errors in data. %A Gabriela Alexe %A Sorin Alexe %A Peter L. Hammer %A Alexander Kogan %T Comprehensive vs. 
Comprehensible Classifiers in Logical Analysis of Data %D Dec 4, 2002 %Z Fri Dec 13 10:55:11 EDT 2002 %I DIMACS %R 2002-49 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-49.ps.gz %X The main objective of this paper is to compare the classification accuracy provided by large, comprehensive collections of patterns (rules) derived from archives of past observations, with that provided by small, comprehensible collections of patterns. This comparison is carried out here on the basis of an empirical study, using several publicly available datasets. The results of this study show that the use of comprehensive collections allows a slight increase of classification accuracy, and that the cost of comprehensibility is small. %A Gabriela Alexe %A Peter L. Hammer %T Spanned Patterns for the Logical Analysis of Data %D Dec 4, 2002 %Z Fri Dec 13 10:55:11 EDT 2002 %I DIMACS %R 2002-50 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-50.ps.gz %X In a finite dataset consisting of positive and negative observations represented as real valued n-vectors, a positive (negative) pattern is an interval in R^n with the property that it contains sufficiently many positive (negative) observations, and sufficiently few negative (positive) ones. A pattern is spanned if it does not properly include any other interval containing the same set of observations. Although large collections of spanned patterns can provide highly accurate classification models within the framework of the Logical Analysis of Data, no efficient method for their generation is currently known. We propose in this paper an incrementally polynomial time algorithm for the generation of all spanned patterns in a dataset, which runs in linear time in the output; the algorithm resembles closely the Blake and Quine consensus method for finding the prime implicants of Boolean functions. The efficiency of the proposed algorithm is tested on various publicly available datasets.
In the last part of the paper, we present the results of a series of computational experiments which show the high degree of robustness of spanned patterns. %A Sorin Bastea %A Raffaele Esposito %A Joel L. Lebowitz %A Rossana Marra %T Hydrodynamics of Binary Fluid Phase Segregation %D Dec 4, 2002 %Z Fri Dec 13 10:55:11 EDT 2002 %I DIMACS %R 2002-51 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-51.pdf %X Starting with the Vlasov-Boltzmann equation for a binary fluid mixture, we derive an equation for the velocity field u when the system is segregated into two phases (at low temperatures) with a sharp interface between them. u satisfies the incompressible Navier-Stokes equations together with a jump boundary condition for the pressure across the interface which, in turn, moves with a velocity given by the normal component of u. Numerical simulations of the Vlasov-Boltzmann equations for shear flows parallel and perpendicular to the interface in a phase segregated mixture support this analysis. We expect similar behavior in real fluid mixtures. %A Gabriela Alexe %A Sorin Alexe %A Yves Crama %A Stephan Foldes %A Peter L. Hammer %A Bruno Simeone %T Consensus algorithms for the generation of all maximal bicliques %D Dec 8, 2002 %Z Fri Dec 13 10:55:11 EDT 2002 %I DIMACS %R 2002-52 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-52.ps.gz %X We describe a new algorithm for generating all maximal bicliques (i.e. complete bipartite, not necessarily induced subgraphs) of a graph. The algorithm is inspired by, and is quite similar to, the consensus method used in propositional logic. We show that some variants of the algorithm are totally polynomial, and even incrementally polynomial. The total complexity of the most efficient variant of the algorithms presented here is polynomial in the input size, and only linear in the output size. 
Computational experiments demonstrate its high efficiency on randomly generated graphs with up to 2,000 vertices and 20,000 edges. %A Dieter Rautenbach %T A Note on Linear Discrepancy and Bandwidth %D Dec 8, 2002 %Z Fri Dec 13 10:55:12 EDT 2002 %I DIMACS %R 2002-53 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-53.ps.gz %X Fishburn, Tanenbaum and Trenk \cite{FTT01a} define the linear discrepancy ${\rm ld}(P)$ of a poset $P=(V,<_P)$ as the minimum integer $k\geq 0$ for which there exists a bijection $f:V\to\{ 1,2,\ldots,|V|\}$ such that $u<_P v$ implies $f(u) < f(v)$ and $u||_P v$ implies $|f(u)-f(v)|\leq k$. In \cite{FTT01b} they prove that the linear discrepancy of a poset equals the bandwidth of its cocomparability graph.
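
The definition of linear discrepancy above can be made concrete with a brute-force sketch. This is only an illustration of the definition, not a method from the report: it enumerates all bijections, so it is usable for very small posets only, and the function name and input encoding (a transitively closed set of pairs `(u, v)` meaning $u <_P v$) are assumptions made here.

```python
from itertools import permutations


def linear_discrepancy(elements, less_than):
    """Return ld(P) by exhaustive search over linear extensions.

    `less_than` is a set of pairs (u, v) meaning u <_P v and is assumed
    to be transitively closed, so two elements are incomparable exactly
    when neither ordered pair is present.
    """
    elements = list(elements)
    best = None
    for perm in permutations(elements):
        f = {v: i + 1 for i, v in enumerate(perm)}
        # Keep only bijections that respect the order: u <_P v => f(u) < f(v).
        if any(f[u] >= f[v] for (u, v) in less_than):
            continue
        # Largest gap between two incomparable elements under this f.
        k = 0
        for i, u in enumerate(elements):
            for v in elements[i + 1:]:
                if (u, v) not in less_than and (v, u) not in less_than:
                    k = max(k, abs(f[u] - f[v]))
        if best is None or k < best:
            best = k
    return best


if __name__ == "__main__":
    # a <_P b, with c incomparable to both.
    print(linear_discrepancy("abc", {("a", "b")}))
```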

Here we provide partial solutions to some problems formulated in \cite{FTT01a} about the linear discrepancy and the bandwidth of cocomparability graphs. %A Vadim Lozin %A Dieter Rautenbach %T On the band-, tree- and clique-width of graphs with bounded vertex degree %D Dec 8, 2002 %Z Fri Dec 13 10:55:12 EDT 2002 %I DIMACS %R 2002-54 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-54.ps.gz %X The band-, tree- and clique-width are of primary importance in algorithmic graph theory due to the fact that many problems being NP-hard for general graphs can be solved in polynomial time when restricted to graphs where one of these parameters is bounded. It is known that for any fixed $\Delta\geq 3$, all three parameters are unbounded for graphs with vertex degree at most $\Delta$. In this paper, we distinguish representative subclasses of graphs with bounded vertex degree that have bounded band-, tree- or clique-width. Our proofs are constructive and lead to efficient algorithms for a variety of NP-hard graph problems when restricted to those classes. %A T. Laustsen %A F. Verstraete %A S. J. van Enk %T Local vs. joint measurements for the entanglement of assistance %D Dec 8, 2002 %Z Fri Dec 13 10:55:15 EDT 2002 %I DIMACS %R 2002-55 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-55.ps.gz %X We consider a variant of the entanglement of assistance, as independently introduced by D.P. DiVincenzo {\em et al.} ({\tt quant-ph/9803033}) and O. Cohen (Phys. Rev. Lett. {\bf 80}, 2493 (1998)). Instead of considering three-party states in which one of the parties, the assistant, performs a measurement such that the remaining two parties are left with on average as much entanglement as possible, we consider four-party states where two parties play the role of assistants. 
We answer several questions that arise naturally in this scenario, such as (i) how much more entanglement can be produced when the assistants are allowed to perform joint measurements, (ii) for what type of states are local measurements sufficient, (iii) is it necessary for the second assistant to know the measurement outcome of the first, and (iv) are projective measurements sufficient or are more general POVMs needed? %A Tongyin Liu %A Yanpei Liu %T On Generating Three-connected Planar Graphs %D Dec 8, 2002 %Z Fri Dec 13 10:55:15 EDT 2002 %I DIMACS %R 2002-56 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-56.ps.gz %X In this paper, a new method of generating three-connected planar graphs in terms of generators is provided. It is also shown that a planar graph is a three-connected planar graph if, and only if, it has a wheel or a pseudo-wheel as a generator. %A Ying Liu %T Lower bounds for the size of binary space partition of rectangles %D Dec 8, 2002 %Z Fri Dec 13 10:55:15 EDT 2002 %I DIMACS %R 2002-57 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-57.ps.gz %X The binary space partition (BSP) tree is one of the most popular data structures in computational geometry. We proved that the lower bound on the exact size of BSP trees for a set of $n$ isothetic rectangles in the plane is $2n-o(n)$ when the rectangles tile the underlying space; this closed the gap between the upper and lower bounds for this case. Also, in general, for a set of $n$ isothetic rectangles in the plane, we improved the lower bound from $\frac{9}{4}n-o(n)$ to $\frac{7}{3}n-o(n)$. %A D. Frank Hsu %A Jacob Shapiro %A Isak Taksa %T Methods of data fusion in information retrieval: rank vs. 
score combination %D Feb 3, 2003 %Z Fri Apr 18 15:56:00 EDT 2003 %I DIMACS %R 2002-58 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2002/2002-58.doc.gz %X Combination of multiple evidences (multiple query formulations, multiple retrieval schemes or systems) has been shown (mostly experimentally) to be effective in data fusion in information retrieval. However, the question of why and how combination should be done still remains largely unanswered. In this paper, we provide a model for simulation and analysis in the study of data fusion in the information retrieval domain. A rank-score function is defined and the concept of a Cayley graph is used in the design and analysis of our framework. Our model and results have led to better understanding of the data fusion phenomena in the information retrieval domain. In particular, we have shown (analytically) and in simulation that combination using rank performs better than combination using score under certain conditions. %A E. Boros %A V. Gurvich %A Y. Liu %T Comparison of convex hulls and box hulls %D Jan 22, 2003 %Z Fri Apr 18 16:20:18 EDT 2003 %I DIMACS %R 2003-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-01.ps.gz %X A {\it convex hull} of a set of points $X$ is the minimal convex set containing $X$. A {\it box $B$} is an interval $B=\{\vx|\vx\in [\va, \vb], \va, \vb \in \R^n\}$. A {\it box hull} of a set of points $X$ is defined to be the minimal box containing $X$. Because both convex hulls and box hulls are closure operations of points, classical results for convex sets can naturally be extended for box hulls. We consider here the extensions of theorems by Carath\'{e}odory, Helly and Radon to box hulls and obtain exact results. %A Vincent Mousseau %A Lu\'is C. Dias %A Jos\'e R. 
Figueira %T On the Notion of Category Size in Multiple Criteria Sorting Models %D Jan 22, 2003 %Z Fri Apr 18 16:20:19 EDT 2003 %I DIMACS %R 2003-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-02.ps.gz %X We consider the Multiple Criteria Sorting Problem, that aims at assigning each alternative in a finite set $A$ to one of the predefined categories. We propose a new concept of category size that refers to ``{\it the proportion by which an evaluation vector corresponding to a realistic alternative is assigned to the category}''.

Sorting problems usually refer to absolute evaluation (the assignment of an alternative does not depend on the others), as opposed to ranking and choice problems in which the very purpose is to compare alternatives against each other. Considering constraints on category size leads to the definition of a Constrained Sorting Problem, which refers to both absolute and relative evaluation.

After identifying decision situations in which category size is useful for modelling purposes, this paper defines the concept of category size and proposes a way to compute the size of categories, even when the set of alternatives and/or the preference information is/are imprecisely known. We show how this notion can be used in a preference elicitation process. Finally, in order to illustrate the use of this concept, we propose a procedure to infer the values for preference parameters that accounts for specifications (provided by the DM) about the size of categories, in the context of the UTADIS sorting model. %A V. Lozin %A D. Rautenbach %T Chordal Bipartite Graphs of Bounded Tree- and Clique-width %D Jan 27, 2003 %Z Fri Apr 18 16:20:20 EDT 2003 %I DIMACS %R 2003-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-03.ps.gz %X A bipartite graph is chordal bipartite if every cycle of length at least six has a chord. In the class of chordal bipartite graphs the tree-width and the clique-width are unbounded.

Our main results are that chordal bipartite graphs of bounded vertex degree have bounded tree-width and that $k$-fork-free chordal bipartite graphs have bounded clique-width, where a $k$-fork is the graph arising from a $K_{1,k+1}$ by subdividing one edge once. This implies polynomial-time solvability for a variety of algorithmical problems for these graphs. %A Xuhui Ao %A Naftaly Minsky %T Flexible Regulation of Distributed Coalitions %D Feb 10, 2003 %Z Fri Apr 18 16:20:21 EDT 2003 %I DIMACS %R 2003-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-04.ps.gz %X There is a growing tendency for organizations to form coalitions in order to collaborate---by sharing some of their resources, or by coordinating some of their activities. The question addressed in this paper is: how should such coalitions be regulated?

Our approach to this question is based on the following definition of the governance of coalitions: A coalition C is a set {E1,E2,...,En} of enterprises, which interoperate under an ensemble of policies [Pc, {Pi}], where Pc is the coalition policy that governs the coalition as a whole, and Pi is the local policy of enterprise Ei (for each i), which governs its participation in the coalition. This means, in particular, that every interaction between an agent Xi of enterprise Ei and an agent Xj of Ej must comply with the local policies Pi and Pj, as well as with the coalition policy Pc.

We also require that the policy-ensemble [Pc, {Pi}] of a coalition would satisfy the following principle of flexibility: Each local policy Pi can be defined and changed independently of other local policies in this ensemble, and without any knowledge of them.

We will describe a regulatory mechanism for coalitions, which provides for efficient and decentralized enforcement of a wide range of policies that might govern a coalition, and which satisfies our principle of flexibility. %A Piotr Berman %A Paul Bertone %A Bhaskar DasGupta %A Mark Gerstein %A Ming-Yang Kao %A Michael Snyder %T Fast Optimal Genome Tiling with Applications to Microarray Design and Homology Search %D Feb 10, 2003 %Z Fri Apr 18 16:20:23 EDT 2003 %I DIMACS %R 2003-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-05.ps.gz %X In this paper we consider several variations of the following basic tiling problem: given a sequence of real numbers with two size bound parameters, we want to find a set of tiles of maximum total weight such that each tile satisfies the size bounds. A solution to this problem is important to a number of computational biology applications such as selecting genomic DNA fragments for PCR-based amplicon microarrays and performing homology searches with long sequence queries. Our goal is to design efficient algorithms with linear or near-linear time and space in the normal range of parameter values for these problems. For this purpose, we first discuss the solution to a basic online interval maximum problem via a sliding window approach and show how to use this solution in a non-trivial manner for many of the tiling problems introduced. We also discuss NP-hardness results and approximation algorithms for generalizing our basic tiling problem to higher dimensions. Finally, computational results from applying our tiling algorithms to genomic sequences of five model eukaryotes are reported.
%A Andras Prekopa %A Xiaoling Hou %T A Data Mining Problem in Stochastic Programming %D Mar 5, 2003 %Z Fri Apr 18 16:20:24 EDT 2003 %I DIMACS %R 2003-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-06.ps.gz %X In this paper we consider a linear programming problem where some or all technology coefficients are deterministic but their values are unknown. Samples are taken to estimate these coefficients and the problem is to determine the optimal sample sizes. If we replace the unknown coefficients by their estimates, then we obtain a random linear programming problem the optimum value of which is also random. We want to find sample sizes such that the confidence interval constructed from the samples for the unknown deterministic optimum value covers it with a prescribed large probability and, subject to this constraint, the total cost of sampling is minimum. %A Xiaomin Chen %T On a Conjecture of Chvatal %D Mar 11, 2003 %Z Fri Apr 18 16:20:25 EDT 2003 %I DIMACS %R 2003-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-07.ps.gz %X The Sylvester-Gallai theorem asserts that every finite set $S$ of points in two-dimensional Euclidean space includes two points, $a$ and $b$, such that either no other point of $S$ is on the line $ab$, or the line $ab$ contains all the points in $S$. Recently, V. Chv\'{a}tal extended the notion of lines to arbitrary metric spaces and made a conjecture that generalizes the Sylvester-Gallai theorem. We prove this conjecture for subspaces of $\ell_1^2$, the two-dimensional space with the $\ell_1$-metric. %A Wen-Hua Ju %A David Madigan %A Steven L. 
Scott %T On Bayesian Learning of Sparse Classifiers %D Mar 20, 2003 %Z Fri Apr 18 16:20:26 EDT 2003 %I DIMACS %R 2003-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-08.ps.gz %X Figueiredo (2001) and Figueiredo and Jain (2001) described a particular sparseness-inducing Bayesian model for probit regression. For several standard datasets, they reported predictive performance for their model that was as good as, or better than, previously reported results. This paper explores several aspects of the Figueiredo and Jain model in an attempt to better understand its performance. We modify the Figueiredo and Jain approach in three ways. First, we introduce an alternative prior distribution. Second, we propose a fully Bayesian MCMC learning algorithm. Third, we replace their kernel based classifier with a linear classifier. We measure the impact of these modifications on three publicly available test data sets. Preliminary results indicate that while each change can produce a noticeable impact on test error rates, no one approach dominates the others in all cases. %A Krishnan Kumaran %A Lijun Qian %T Uplink Scheduling in CDMA Packet-Data Systems %D Apr 4, 2003 %Z Fri Apr 18 16:20:27 EDT 2003 %I DIMACS %R 2003-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-09.ps.gz %X Uplink scheduling in wireless systems is gaining importance due to emerging uplink-intensive data services (ftp, image uploads etc.), which could be hampered by the currently in-built asymmetry in favor of the downlink. In this work, we propose and study algorithms for efficient uplink packet-data scheduling in a CDMA cell. The algorithms attempt to maximize system throughput under transmit power limitations on the mobiles assuming instantaneous knowledge of user queues and channels. However, no channel statistics or traffic characterization is necessary.
Apart from increasing throughput, the algorithms also improve fairness of service among users, hence reducing chances of buffer overflows for poorly located users.

The major observation arising from our analysis is that it is advantageous on the uplink to schedule ``strong'' users one-at-a-time, and ``weak'' users in larger groups. This contrasts with the downlink where one-at-a-time transmission for all users has been shown to be the preferred mode in much previous work. Based on the optimal schedules, we propose less complex and more practical approximate methods, both of which offer significant performance improvement compared to one-at-a-time transmission, and the widely acclaimed {\em Proportional Fair} (PF) algorithm, in simulations. When queue content cannot be fed back, we propose a simple modification of PF, {\em Uplink PF} (UPF), that offers similar improvement. %A E. Boros %A V. Gurvich %T Perfect Graphs, Kernels, and Cores of Cooperative Games %D Apr 1, 2003 %Z Fri Apr 18 16:20:28 EDT 2003 %I DIMACS %R 2003-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-10.ps.gz %X A kernel in a directed graph is an independent set that is reachable from every other vertex by an arc. A directed graph in which every induced subgraph has a kernel is called kernel-perfect. The existence of a kernel in directed graphs, and in particular, kernel-perfect orientations of undirected graphs is strongly related to perfect graphs, and has several applications in combinatorics and game theory. These and related results are the subject of this survey. Though some of these results are independent of the Strong Perfect Graph Conjecture, the recent solution of this conjecture and the efficient recognition of perfect graphs have several important implications, in particular in game theory.

**Key words:** kernel, kernel-solvable graphs, perfect graphs,
normal hypergraphs, Strong Perfect Graph Conjecture, Berge-Duchet
conjecture; cooperative games, coalition, stable family of coalitions,
game form, effectivity function, stable effectivity functions, Scarf
theorem.
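
The kernel definition in the abstract above can be checked mechanically on small digraphs. The following is a minimal brute-force sketch, not taken from the survey; the function names and the encoding of a digraph as a vertex list plus a set of arcs `(u, v)` are assumptions made for illustration.

```python
from itertools import combinations


def is_kernel(vertices, arcs, candidate):
    """True if `candidate` is independent and every other vertex has an arc into it."""
    k = set(candidate)
    arcs = set(arcs)
    independent = not any(u in k and v in k for (u, v) in arcs)
    absorbing = all(
        any((v, w) in arcs for w in k) for v in set(vertices) - k
    )
    return independent and absorbing


def find_kernels(vertices, arcs):
    """Enumerate all kernels by testing every vertex subset (exponential time)."""
    vertices = list(vertices)
    found = []
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_kernel(vertices, arcs, subset):
                found.append(set(subset))
    return found


if __name__ == "__main__":
    four_cycle = {(0, 1), (1, 2), (2, 3), (3, 0)}
    three_cycle = {(0, 1), (1, 2), (2, 0)}
    print(find_kernels(range(4), four_cycle))
    print(find_kernels(range(3), three_cycle))
```

The directed 3-cycle has no kernel, which is the standard small example of a digraph that is not kernel-perfect.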
%A Graham Cormode
%A S. Muthukrishnan
%T Radial Histograms for Spatial Streams
%D Apr 1, 2003
%Z Fri Apr 18 16:20:29 EDT 2003
%I DIMACS
%R 2003-11
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-11.ps.gz
%X Many data streams relate to geographic or spatial information such as the
tracking of moving objects, or location-specific measurements and queries.
While several techniques have been developed recently for processing
numerical, text or XML streams, new techniques are needed to process
spatial queries on streaming geometric data.

We propose a novel data structure, the Radial Histogram, to process streams including spatial data points. It allows a number of geometric aggregates involving the spread and extent of the points in the data streams---diameter, furthest neighbors, convex hulls---to be accurately estimated to arbitrary precision. By using multiple Radial Histograms for a set of given facilities, we can process data streams consisting of massive numbers of client points. We can then accurately estimate a number of spatial aggregates involving relative placement of clients with respect to the facilities, including reverse nearest neighbors, spatial joins and more.

Radial histograms use very small space and are exceedingly simple to implement. Nevertheless, we prove that they guarantee accurate estimation for the spatial aggregate queries mentioned above. An experimental evaluation shows them to perform very well in practice. %A I. E. Zverovich %A V. E. Zverovich %T Basic perfect graphs and their extensions %D Apr 1, 2003 %Z Fri Apr 18 16:20:30 EDT 2003 %I DIMACS %R 2003-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-12.ps.gz %X We characterize basic graphs introduced by Chudnovsky, Robertson, Seymour, and Thomas \cite{ChudnovskyRST} in terms of forbidden induced subgraphs. Then we apply the Reducing Pseudopath Method of Zverovich \cite{ZverovichRRR14-2001} to find forbidden induced subgraphs for the substitutional closure of the class of basic graphs, thus obtaining a characterization of a hereditary subclass of perfect graphs. %A Gary McGraw %T From the Ground Up: The DIMACS Software Security Workshop %D Apr 4, 2003 %Z Fri Apr 18 16:20:31 EDT 2003 %I DIMACS %R 2003-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-13.ps.gz %X The DIMACS Software Security Workshop %A Piotr Berman %A Bhaskar DasGupta %A Ming-Yang Kao %T Tight Approximability Results for Test Set Problems in Bioinformatics %D Apr 11, 2003 %Z Fri Apr 18 16:20:32 EDT 2003 %I DIMACS %R 2003-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-14.ps.gz %X In this paper, we investigate several versions of a test set problem. In the simplest version, given a family of tests, viewed as subsets of the universe, we want to select the smallest subfamily that separates every two elements of the universe. We show that this problem is exactly as difficult to approximate as the set cover problem, up to a factor of 1 plus or minus o(1).
The same holds for several more elaborate versions of the problem that have applications in bioinformatics; for example, in one version the universe consists of strings over a finite alphabet and each test checks the presence of a specific substring. We also show that if we allow tests to be formed as unions of the given tests, the problem has essentially the same level of computational intractability as graph coloring. %A Victoria Ungureanu %A Michael Katehakis %A Benjamin Melamed %T Towards an Efficient Cluster-based E-Commerce Server %D Apr 16, 2003 %Z Fri Apr 18 16:20:34 EDT 2003 %I DIMACS %R 2003-15 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-15.ps.gz %X Cluster-based server architecture combines good performance and low cost, and is commonly used for applications involving high volume traffic of requests. Essentially, a cluster-based server consists of a front-end dispatcher and several back-end servers. The dispatcher receives incoming requests, and then assigns them to back-end servers, which are responsible for request processing. The many benefits of cluster-based servers make them a good choice for e-commerce applications as well. However, applying this type of architecture to e-commerce applications is hindered by the fact that an e-commerce cluster has the additional task of validating requests, i.e. it has to verify that requests comply with contract terms. The problem is further complicated by the fact that contract terms may be expressed in terms of dynamic, mutable state. The problem we are addressing in this paper is how e-commerce requests can be assigned for processing so that load can be balanced among back-end servers and validation of requests can be done in an efficient manner.

To deal with this problem we propose that each stateful contract be assigned a priori to a back-end server, called the base of this contract, which is responsible for maintaining its state. The dispatcher assigns requests in the following manner. A request governed by a stateless contract is assigned to the least loaded server. A request governed by a stateful contract is assigned to its base whenever the base is not overloaded. If the base is heavily loaded, the dispatcher assigns the request to the least loaded server. Only in this last case does the back-end server need to contact the base to retrieve the state. To enable a back-end server to detect when the state is needed, we propose a contract formulation that explicitly states the contract type. %A Victoria Ungureanu %A Phillip G. Bradford %A Michael Katehakis %A Benjamin Melamed %T Class-Dependent Assignment in Cluster-based Servers %D Apr 16, 2003 %Z Fri Apr 18 16:20:35 EDT 2003 %I DIMACS %R 2003-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-16.ps.gz %X A cluster-based server consists of a front-end dispatcher and several back-end servers. The dispatcher receives incoming requests, and then assigns them to back-end servers for processing. Our goal is to devise an assignment policy that has good response time performance, and is practical to implement in that the amount of information used by the dispatcher is relatively small, so that the attendant computation and communication overheads are low. In contrast to extant assignment policies that apply the same assignment policy to all incoming jobs, our approach calls for the dispatcher to classify incoming jobs as long or short, and then use class-dependent assignment policies. Specifically, we propose a policy, called CDA (Class Dependent Assignment), where short jobs are assigned Round-Robin as soon as they arrive, while long jobs are deferred and assigned only when a back-end server becomes idle. 
Furthermore, when processing a long job, a back-end server is not assigned any other jobs.
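
The CDA rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name, the size cutoff `long_threshold`, and the choice to raise an error when every server is reserved by a long job are assumptions made only for this example, since the abstract does not specify them.

```python
from collections import deque


class CDADispatcher:
    """Toy Class Dependent Assignment dispatcher (illustrative only)."""

    def __init__(self, n_servers, long_threshold):
        self.active = [set() for _ in range(n_servers)]  # jobs currently on each server
        self.has_long = [False] * n_servers              # server is running a long job
        self.deferred = deque()                          # long jobs waiting for an idle server
        self.next_rr = 0                                 # Round-Robin pointer for short jobs
        self.long_threshold = long_threshold             # hypothetical long/short cutoff

    def where(self, job):
        """Return the server holding `job`, or None if it is deferred or unknown."""
        for s, jobs in enumerate(self.active):
            if job in jobs:
                return s
        return None

    def _place_deferred(self):
        # Long jobs are handed out only to servers that are completely idle.
        for s, jobs in enumerate(self.active):
            if not self.deferred:
                return
            if not jobs:
                jobs.add(self.deferred.popleft())
                self.has_long[s] = True

    def submit(self, job, size):
        """Assign a job; return its server index, or None if it was deferred."""
        if size >= self.long_threshold:
            self.deferred.append(job)
            self._place_deferred()
            return self.where(job)
        n = len(self.active)
        for step in range(n):
            s = (self.next_rr + step) % n
            if not self.has_long[s]:        # never share a server with a long job
                self.next_rr = (s + 1) % n
                self.active[s].add(job)
                return s
        # Assumption: behaviour when all servers hold long jobs is not specified.
        raise RuntimeError("all servers are reserved by long jobs")

    def finish(self, job):
        """Record completion of a job and hand deferred long jobs to idle servers."""
        s = self.where(job)
        if s is None:
            return
        self.active[s].discard(job)
        if not self.active[s]:
            self.has_long[s] = False
        self._place_deferred()
```

With two servers, two short jobs land on servers 0 and 1 in turn; a long job submitted next is deferred until one of the servers becomes idle, and that server then receives no further short jobs until the long job finishes.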

Our approach is motivated by empirical evidence suggesting that the sizes of files traveling on the Internet follow power-law distributions, where long jobs constituting a small fraction of all incoming jobs actually account for a large fraction of the overall load. To gauge the performance of the proposed policy, we exercised it on empirical data traces measured at Internet sites serving the 1998 World Cup. Since the assignment of long jobs incurs computational overhead as well as extra communication overhead, we studied the performance of CDA as a function of the fraction of jobs classified as long. Our study shows that classification of even a small fraction of jobs as long can have a profound impact on overall response time performance. More specifically, our experimental results show that if fewer than 3% of the jobs are classified as long, then CDA outperforms traditional policies, such as Round-Robin, by two orders of magnitude. From an implementation viewpoint, these results support our contention that CDA-based assignment is a practical policy combining low overhead and greatly improved performance. %A Endre Boros %A Khaled Elbassioni %A Vladimir Gurvich %A Leonid Khachiyan %T On the Complexity of Some Enumeration Problems for Matroids %D Apr 16, 2003 %Z Fri Apr 18 16:20:37 EDT 2003 %I DIMACS %R 2003-17 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-17.ps.gz %X We present an incremental polynomial-time algorithm for enumerating all circuits of a matroid or, more generally, all minimal spanning sets for a flat. We also show the NP-hardness of several related enumeration problems. %A Flip Korn %A S. 
Muthukrishnan %A Yunyue Zhu %T IPSOFACTO: A Visual Correlation Tool for Aggregate Network Traffic Data %D May 20, 2003 %Z Fri Jun 20 16:33:46 EDT 2003 %I DIMACS %R 2003-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-18.ps.gz %X IP network operators collect aggregate traffic statistics on network interfaces via the Simple Network Management Protocol (SNMP). This is part of routine network operations for most ISPs; it involves a large infrastructure with multiple network management stations polling information from all the network elements and collating a real-time data feed. This demo will present a tool that manages the live SNMP data feed on a fully operational large ISP at industry scale. The tool primarily serves to study correlations in the network traffic, by providing a rich mix of ad-hoc querying based on a user-friendly correlation interface as well as canned queries based on the expertise of network operators with field experience. The tool is called IPSOFACTO for {\em IP Stream-Oriented FAst Correlation for Traffic Overseeing}. %A Flip Korn %A S. Muthukrishnan %A Yunyue Zhu %T Checks and Balances: Monitoring Data Quality Problems in Network Traffic Databases %D June, 2003 %Z Tue Aug 26 12:19:46 EDT 2003 %I DIMACS %R 2003-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-19.ps.gz %X

Internet Service Providers (ISPs) use real-time data feeds of aggregated traffic in their network to support technical as well as business decisions. A fundamental difficulty with building decision support tools based on aggregated traffic data feeds is one of data quality. Data quality problems stem from network-specific issues (e.g., irregular polling caused by UDP packet drops and delays, topological mislabelings, etc.), and make it difficult to distinguish between artifacts and actual phenomena, rendering data analysis based on such data feeds ineffective.

In principle, traditional integrity constraints and triggers may be used to enforce data quality. In practice, data cleaning is done outside the database and is ad-hoc. Unfortunately, these approaches are too rigid or limited for the subtle data quality problems arising from network data where existing problems morph with network dynamics, new problems emerge over time, and poor quality data in a local region may itself indicate an important phenomenon in the underlying network. We need a new approach -- both in principle and in practice -- to face data quality problems in network traffic databases.

We propose a continuous data quality monitoring approach based on probabilistic, approximate constraints (PACs). These are simple, user-specified rule templates with open parameters for tolerance and likelihood. We rely on statistical techniques to derive effective parameter values from the data, and show how to apply them for monitoring data quality. In principle, our PAC-based approach can be applied to data quality problems in any data feed. We present PACMAN, which is the system that manages PACs for the entire aggregate network traffic database in a large ISP, and show that it is very effective in monitoring data quality problems.

%A Graham Cormode %A S. Muthukrishnan %T Improved Data Stream Summaries: The Count-Min Sketch and its Applications %D June, 2003 %Z Tue Aug 26 12:20:03 EDT 2003 %I DIMACS %R 2003-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-20.ps.gz %X We introduce a new small space data structure -- the Count-Min Sketch -- for summarizing data streams. Our sketch allows fundamental queries in data stream summarization such as point, range and inner product queries to be approximately answered quickly and efficiently; in addition it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc. In both cases, the time and space bounds we show using the CM sketch significantly improve those known for previous sketch constructions -- typically from 1/epsilon^2 to 1/epsilon in factor. %A Alexis Tsoukias %T From Decision Theory to Decision Aiding Methodology %D June, 2003 %Z Tue Aug 26 12:21:12 EDT 2003 %I DIMACS %R 2003-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-21.ps.gz %X The paper presents the author's partial and personal historical reconstruction of how decision theory evolved to decision aiding methodology. The presentation shows mainly how ``alternative'' approaches to classic decision theory evolved. The paper claims that all such decision ``theories'' share a common methodological feature, which is the use of formal and abstract languages as well as of a model of rationality. Different decision aiding approaches can thus be defined, depending on the origin of the model of rationality used in the decision aiding process. The concept of decision aiding process is then introduced and analysed. The paper's ultimate claim is that all such approaches can be seen as part of a decision aiding methodology. %A I. E. 
Zverovich %T Independent domination on $2P_3$-free perfect graphs %D June, 2003 %Z Tue Aug 26 12:22:33 EDT 2003 %I DIMACS %R 2003-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-22.ps.gz %X We show that the independent domination problem is NP-complete for $2P_3$-free perfect graphs. %A Igor Zverovich %T A new kind of graph colorings %D June, 2003 %Z Tue Aug 26 12:23:13 EDT 2003 %I DIMACS %R 2003-24 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-24.ps.gz %X A proper k-coloring C1, C2, ..., Ck of a graph G is called strong if for every vertex u in V(G) there exists an index i in {1, 2, ..., k} such that u is adjacent to every vertex of Ci. We consider classes SCOLOR(k) of strongly k-colorable graphs and show that the recognition problem of SCOLOR(k) is NP-complete for every k >= 4, but it is polynomial-time solvable for k = 3. We give a characterization of SCOLOR(3) in terms of forbidden induced subgraphs. Finally, we solve the problem of uniqueness of a strong 3-coloring. %A Igor Zverovich %T On a problem of Lesniak, Polimeni, and Vanderjagt %D June, 2003 %Z Tue Aug 26 12:24:46 EDT 2003 %I DIMACS %R 2003-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-25.ps.gz %X We give an almost complete solution to the following problem: for a fixed S, what is the minimum value p = uH (S) such that a pair (S; p) has a Hamiltonian realization? We give a criterion for a pair (S; p) to have a Hamiltonian realization. %A Igor Zverovich %T Algorithmic aspects of Reducing Pseudopath Method %D June, 2003 %Z Tue Aug 26 12:25:51 EDT 2003 %I DIMACS %R 2003-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-26.ps.gz %XLet G and H be graphs. A substitution of H in G instead of a vertex v in V (G) is the graph G(v -> H), which consists of the disjoint union of H and G - v with the additional edge-set {xy : x in V (H); y in NG (v)}. 
For a hereditary class of graphs P, the substitutional closure of P is defined as the class P* consisting of all graphs which can be obtained from graphs in P by repeated substitutions.
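
The substitution operation G(v -> H) defined above can be written out as a short sketch. The adjacency-dictionary representation and the function name `substitute` are assumptions chosen for illustration; the report itself does not prescribe a data structure.

```python
def substitute(G, v, H):
    """Return G(v -> H) for simple graphs given as {vertex: set_of_neighbours}.

    Assumes the vertex names of H do not clash with those of G - v.
    """
    if v not in G:
        raise ValueError("v must be a vertex of G")
    if (set(G) - {v}) & set(H):
        raise ValueError("V(H) must be disjoint from V(G) - {v}")

    neighbours_of_v = set(G[v])                 # N_G(v)

    # Start from G - v: drop v and every edge incident to it.
    result = {u: set(adj) - {v} for u, adj in G.items() if u != v}

    # Add a disjoint copy of H, joining each of its vertices to all of N_G(v).
    for x, adj in H.items():
        result[x] = set(adj) | neighbours_of_v

    # Make the new edges symmetric.
    for y in neighbours_of_v:
        result[y] |= set(H)

    return result
```

For example, substituting a single edge for the middle vertex of a three-vertex path joins both end vertices of the path to both end vertices of the edge.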

Let P be an arbitrary hereditary class for which a characterization in terms of forbidden induced subgraphs is known. Zverovich [20] proposed the Reducing Pseudopath Method for constructing a forbidden induced subgraph characterization of P*. However, an implementation of the method is not straightforward.

We find conditions that guarantee the existence of a very simple reducing pseudopath [for example, of length 1]. As a result, a number of known and new characterizations may be obtained immediately by applying the improved method. In particular, we essentially simplify proofs given in Brandstadt, Hoang, and Zverovich [3]. As examples, we consider some interesting hereditary classes.

%A Kazuhisa Makino %A Yushi Uno %A Toshihide Ibaraki %T Minimum Edge Ranking Spanning Trees of Split Graphs %D June, 2003 %Z Tue Aug 26 12:27:12 EDT 2003 %I DIMACS %R 2003-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-27.ps.gz %X Given a graph $G$, the minimum edge ranking spanning tree problem (MERST) is to find a spanning tree of $G$ whose edge ranking is minimum. However, this problem is known to be NP-hard for general graphs. In this paper, we show that the problem MERST has a polynomial time algorithm for split graphs, which have useful applications in practice. The result is also significant in the sense that this is the first non-trivial graph class for which the problem MERST is found to be polynomially solvable. We also show that the problem MERST for threshold graphs can be solved in linear time, where threshold graphs are known to be split. %A Boros, Endre %A Gurvich, Vladimir %T Stable Effectivity Functions and Perfect Graphs %D January 7, 1999 %Z Mon, 20 Dec 1999 16:00:00 GMT %I DIMACS %R 99-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-01.ps.gz %X We consider the problem of characterizing the stability of effectivity functions (EFF), via a combinatorial correspondence between game theoretic and well-known combinatorial concepts. To every EFF we assign a pair of hypergraphs, representing clique covers of two associated graphs, and obtain some necessary and some sufficient conditions for the stability of EFFs in terms of graph-properties. These conditions imply e.g. that to check the stability of an EFF is an NP-complete problem. We also translate some well known conjectures of graph theory into game theoretic language and vice versa. %A Johannes E. Gehrke %A S. 
Muthukrishnan %A Rajmohan Rajaraman %A Anthony Shaheen %T Scheduling to Minimize Average Stretch %D January 13, 1999 %Z Fri, 26 Feb 1999 18:00:00 GMT %I DIMACS %R 99-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-02.ps.gz %XWe consider the classical problem of online preemptive job scheduling on uniprocessor and multiprocessor machines. For a given job, we measure the quality of service provided by an algorithm by the stretch of the job, which is defined as the ratio of the amount of time that the job spends in the system to the processing time of the job. For a given sequence of jobs, we measure the performance of an algorithm by the average stretch achieved by the algorithm over all the jobs in the sequence.

We first prove that no on-line algorithm can achieve a competitive ratio that is smaller than 1.03. The main contribution of this paper is to show that the shortest remaining processing time algorithm (SRPT) is O(1)-competitive with respect to average stretch for both uniprocessors and multiprocessors. For uniprocessors, we prove that SRPT is 2-competitive. We also establish an essentially matching lower bound on the competitive ratio of SRPT on uniprocessors. For multiprocessors, we show that the competitive ratio of SRPT is at most 14.
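
To make the stretch metric concrete, the following minimal sketch simulates preemptive SRPT on a single processor and reports the average stretch, that is, the mean over all jobs of time in system divided by processing time. The function name and the `(release_time, processing_time)` input format are assumptions for this example; the sketch only evaluates the metric and does not reproduce the paper's competitive analysis.

```python
import heapq


def srpt_average_stretch(jobs):
    """Average stretch of preemptive SRPT on one processor.

    jobs: non-empty list of (release_time, processing_time), processing_time > 0.
    """
    if not jobs:
        raise ValueError("need at least one job")
    n = len(jobs)
    order = sorted(range(n), key=lambda i: jobs[i][0])   # job indices by release time
    ready = []          # min-heap of (remaining_time, job_index)
    t = 0.0             # current time
    k = 0               # next job (in release order) not yet admitted
    total_stretch = 0.0

    while k < n or ready:
        if not ready:
            # Processor is idle: jump ahead to the next release.
            t = max(t, jobs[order[k]][0])
        while k < n and jobs[order[k]][0] <= t:
            i = order[k]
            heapq.heappush(ready, (jobs[i][1], i))
            k += 1

        # Run the job with the shortest remaining time until it ends
        # or until the next release, whichever comes first.
        remaining, i = heapq.heappop(ready)
        next_release = jobs[order[k]][0] if k < n else float("inf")
        run = min(remaining, next_release - t)
        t += run
        if run < remaining:
            heapq.heappush(ready, (remaining - run, i))   # preempted
        else:
            release, processing = jobs[i]
            total_stretch += (t - release) / processing    # stretch of job i

    return total_stretch / n
```

For the instance `[(0, 4), (1, 1)]`, the short job preempts the long one at time 1 and finishes at time 2 with stretch 1, while the long job finishes at time 5 with stretch 5/4, so the average stretch is 1.125.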

%A Ilya Muchnik %A Vadim Mottl %A Vladimir Levyant %T Massive Data Set Analysis in Seismic Explorations for Oil and Gas in Crystalline Basement Interval %D January 19, 1999 %Z Fri, 26 Feb 1999 18:00:00 GMT %I DIMACS %R 99-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-03.ps.gz %X On the basis of the optimization-based approach to the analysis of massive ordered data sets, a new method is proposed for computer-aided interpretation of seismic exploratory data from the so-called crystalline basement of the Earth mantle, which underlies the relatively thin sedimentary cover that has so far been almost the exclusive object of seismic explorations. The seismic exploratory data sets, seismic sections and cubes, are a class of, respectively, two- and three-dimensional data arrays, which are analyzed in the course of gas and oil reserves prospecting with the purpose of studying the structure of the underground rock mass. The seismic data sets consist of synchronous records of reflected seismic signals registered by a large number of geophones (seismic sensors) placed along a straight line or at the nodes of a rectangular lattice on the Earth's surface. A series of explosions, the responses to which are averaged in a special manner, usually serves as the source of the initial seismic pulse. The vertical time axis forming the resulting two- or three-dimensional picture is identified with depth, so that the peculiarities of the reflected signal under the respective sensor carry information on the local properties of the rock mass at the respective point of the underground medium. In contrast to the overlying sedimentary cover, the absence of pronounced reflecting surfaces in a crystalline body makes it very difficult to infer geological information from the basement interval of the seismic picture. 
The new method of seismic data analysis proposed in this work is aimed at finding fractured zones of the basement rock mass capable of accumulating oil or gas. The essence of the method consists in numerically evaluating distinctions in the local spatial texture of the seismic picture that are caused by differences in the physical properties of fractured and monolithic rock. The problem of estimating the local texture over the whole data array at once is set as that of minimizing an objective function whose arguments are the texture model parameters at all the elements of the array. A special separable structure of the objective function provides a high speed of the optimization procedure. %A Avishai Wool %A Bulent Yener %T Combinatorial Design of Multi-Ring Networks with Combined Routing and Flow Control %D January 20, 1999 %Z Fri, 26 Feb 1999 18:00:00 GMT %I DIMACS %R 99-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-04.ps.gz %XIn this paper we present a novel design technique, and combined routing and flow control algorithms, for congestion-free packet switched networks.

The design is based on the construction of multiple virtual rings, under the constraint that the path between any two nodes is either confined to a single ring, or traverses exactly two rings (passing through a single bridge node).

Our best designs are constructed by using Finite Generalized Quadrangles of combinatorial design theory, together with a scaling algorithm for realizing networks of arbitrary size. The target topology is obtained by taking the edge union of the multiple virtual rings.

Capitalizing on the underlying topological properties, we design routing, flow and access control protocols. We prove that the proposed network architecture, coupled with our protocols, ensures that (i) no loss due to congestion occurs inside a network, under arbitrary traffic patterns; (ii) all the packets reach their destinations within bounded time; and (iii) the bandwidth is allocated fairly and no host is starved.

Moreover, we achieve these desirable properties much more efficiently than earlier proposals. Our best designs have a maximum route length of O(N^{1/3}) and require O(N^{1/3}) ports to be used at each node for an N-node network. The designs attain these bounds while manifesting a high degree of redundancy, with multiple disjoint paths between all pairs of nodes. This significantly improves upon the designs of [YOY94b,YOY98], where both the route length and number of ports are Omega(sqrt(N)), and only a single path exists between every pair of nodes.

We also provide a theoretical bound for the impact of fairness and congestion-freeness on network utilization. We used this design technique to generate network designs of various sizes. We then implemented our protocols and verified their performance by simulations.

%A A. Ashikhmin %A A. Barg %A S. Litsyn %T A New Upper Bound on Codes Decodable into Size-2 Lists %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-05.ps.gz %X A new asymptotic upper bound on the size of binary codes with the property described in the title is derived. The proof relies on the properties of the distance distribution of binary codes established in earlier related works of the authors. %A A. Ashikhmin %A A. Barg %A S. Litsyn %T A New Upper Bound on the Reliability Function of the Gaussian Channel %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-06.ps.gz %XWe derive a new upper bound on the exponent of error probability of decoding for the best possible codes in the Gaussian channel. This bound is tighter than the known upper bounds (the sphere-packing and minimum-distance bounds proved in Shannon's classical 1959 paper and their low-rate improvement by Kabatiansky and Levenshtein). The proof is accomplished by studying asymptotic properties of codes on the Euclidean $n$-dimensional sphere. First we prove that the distance distribution of codes of large size necessarily contains a large component. A general theorem establishing this estimate is proved simultaneously for codes on the Euclidean sphere and in real and complex projective spaces.

To derive specific estimates of the distance distribution, we study the asymptotic behavior of Jacobi polynomials $P_k^{\alpha,\beta}$ as $k\to \infty$ and at least one of the upper indices grows linearly in $k$. This group of results provides the exact behavior of the exponent of Jacobi polynomials in the entire orthogonality segment.

Since on the average there are many code vectors in the vicinity of the transmitted vector $\bfx$, one can show that the probability of confusing $\bfx$ and one of these vectors cannot be too small. This proves a lower bound on the error probability of decoding and the upper bound announced in the title.

%A Andrew Odlyzko %T The Internet and Other Networks: Utilization Rates and Their Implications %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-07.ps.gz %XCosts of communications networks are determined by the maximal capacities of those networks. On the other hand, the traffic those networks carry depends on how heavily those networks are used. Hence utilization rates and utilization patterns determine the costs of providing services, and therefore are crucial in understanding the economics of communications networks.

A comparison of utilization rates and costs of various networks helps disprove many popular myths about the Internet. Although packet networks are often extolled for the efficiency of their transport, it often costs more to send data over internal corporate networks than using modems on the switched voice network. Packet networks are growing explosively not because they utilize underlying transport capacity more efficiently, but because they provide much greater flexibility in offering new services.

Study of utilization patterns shows there are large opportunities for increasing the efficiency of data transport and making the Internet less expensive and more useful. On the other hand, many popular techniques, such as some Quality of Service measures and ATM, are likely to be of limited usefulness.

%A Andrew Odlyzko %T The Economics of the Internet: Utility, Utilization, Pricing, and Quality of Service %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-08.ps.gz %X Can high quality be provided economically for all transmissions on the Internet? Current work assumes that it cannot, and concentrates on providing differentiated service levels. However, an examination of patterns of use and economics of data networks suggests that providing enough bandwidth for uniformly high quality transmission may be practical. If this turns out not to be possible, only the simplest schemes that require minimal involvement by end users and network administrators are likely to be accepted. On the other hand, there are substantial inefficiencies in the current data networks, inefficiencies that can be alleviated even without complicated pricing or network engineering systems. %A Andrew Odlyzko %T Smart and Stupid Networks: Why the Internet Is Like Microsoft %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-09.ps.gz %X Is the Internet growing primarily because it is a dumb network, one that simply delivers packets from one point to another? Probably not. If it were a dumb network, we surely would not need huge and rapidly growing ranks of network professionals. A more detailed look suggests that the Internet is succeeding largely for the same reasons that led the PC to dominate the mainframe, and are responsible for the success of Microsoft. Like the PC, the Internet offers an irresistible bargain to a crucial constituency, namely developers, while managing to conceal the burden it places on users. 
%A Andrew Odlyzko %T Data Networks are Lightly Utilized, and Will Stay That Way %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-10.ps.gz %X The popular press often extolls packet networks as much more efficient than switched voice networks in utilizing transmission lines. This impression is reinforced by the delays experienced on the Internet and the famous graphs for traffic patterns through the major exchange points on the Internet, which suggest that networks are running at full capacity. This paper shows the popular impression is incorrect; data networks are very lightly utilized compared to the telephone network. Even the backbones of the Internet are run at lower fractions (10% to 15%) of their capacity than the switched voice network (which operates at over 30% of capacity on average). Private line networks are utilized far less intensively (at 3% to 5%). Further, this situation is likely to persist. The low utilization of data networks compared to voice phone networks is not a symptom of waste. It comes from different patterns of use, lumpy capacity of transmission facilities, and the high growth rate of the industry. %A K. G. Coffman %A Andrew Odlyzko %T The Size and Growth Rate of the Internet %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-11.ps.gz %X The public Internet is currently far smaller, in both capacity and traffic, than the switched voice network. The private line networks are considerably larger in aggregate capacity than the Internet. They are about as large as the voice network in the U.S., but carry less traffic. On the other hand, the growth rate of traffic on the public Internet, while lower than is often cited, is still about 100% per year, much higher than for traffic on other networks. 
Hence, if present growth trends continue, data traffic in the U.S. will overtake voice traffic around the year 2002 and will be dominated by the Internet. %A Peter Fishburn %A Andrew Odlyzko %T Dynamic Behavior of Differential Pricing and Quality of Service Options for the Internet %D January 21, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-12.ps.gz %XThe simple model on which the Internet has operated, with all packets treated equally, and charges only for access links to the network, has contributed to its explosive growth. However, there is wide dissatisfaction with the delays and losses in current transmission. Further, new services such as packet telephony require assurance of considerably better service. These factors have stimulated the development of methods for providing Quality of Service (QoS), and this will make the Internet more complicated. Differential quality will also force differential pricing, and this will further increase the complexity of the system.

The solution of simply putting in more capacity is widely regarded as impractical. However, it appears that we are about to enter a period of rapidly declining transmission costs. The implications of such an environment are explored by considering models with two types of demands for data transport, differing in sensitivity to congestion. Three network configurations are considered: (1) with separate networks for the two types of traffic, (2) with a single network that provides uniformly high QoS, and (3) with a single physical network that provides differential QoS. The best solution depends on the assumptions made about demand and technological progress. However, we show that the provision of uniformly high QoS to all traffic may well be best in the long run. Even when it is not the least expensive, the additional costs it imposes are usually not large. In a dynamic environment of rapid growth in traffic and decreasing prices, these costs may well be worth paying to attain the simplicity of a single network that treats all packets equally and has a simple charging mechanism.

%A Jean-Claude Bermond %A Johny Bond %A Carole Martin %A Aleksandar Pekec %A Fred S. Roberts %T Optimal Orientations of Annular Networks %D February 10, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-13.ps.gz %X Annular Network $AN(c,s)$ is a graph representing a $c\times s$ grid in polar coordinates. We give lower bounds for the diameter of orientations of $AN(c,s)$ and provide orientations which show that bounds are tight in most cases. %A Abram Kagan %A Colin L. Mallows %A Larry A. Shepp %A Robert J. Vanderbei %A Yehuda Vardi %T Symmetrization of Binary Random Variables %D February 10, 1999 %Z Tue, 30 Mar 1999 18:00:00 GMT %I DIMACS %R 99-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-14.ps.gz %X A random variable $Y$ is called an
Optimal alphabetic binary trees have a wide variety of applications
in computer science and information systems. Fast algorithms for
building such trees in *O(n log n)* do exist. However, no existing
algorithm makes it possible to insert in (or delete from) the tree
without losing its optimality. In this paper, we propose an
algorithm to insert into or delete from an optimal binary alphabetic
tree in **linear time**
*keeping the tree optimal after insertion or deletion*.
We show that both insertion and deletion of a node can be
done in *O(n)* time provided its weight is not bigger than the higher
weight of its two neighbouring nodes. This algorithm makes it possible
to have a dynamic optimal alphabetic tree with reasonable complexity
and allows us to expand the domain of weight sequences
whose optimal alphabetic trees can be obtained in linear time.

**Key words**: Optimal alphabetic tree, insertion and deletion,
linear-time alphabetic trees.

Active networks is a framework where network elements, primarily routers and switches, are programmable. Programs that are injected into the network are executed by the network elements to achieve higher flexibility and to present new capabilities.

This work describes a novel active network architecture which primarily addresses the management challenges of modern complex networks. Its primary component is an active engine that is attached to any IP router to form an active node. The active engine we designed and implemented executes programs that arrive from the network and monitors and controls the router actions. The design is based on standards (Java, SNMP, ANEP over UDP), and can be easily deployed in today's IP networks.

The contribution of this paper is the introduction of novel architectural features such as: isolation of the active mechanism, the session concept, the ability of an active session to control non-active packets, and blind addressing. Implementing these ideas, we built a system that enables the safe execution and rapid deployment of new distributed management applications in the network layer. This system can be gradually integrated into today's IP networks, and allows smooth migration from IP to active networking.

%A Hans van Maaren %T A Short Note on Linear Autarkies, q-Horn Formulas and the Complexity Index %D May 7, 1999 %Z Fri, 21 May 1999 23:00:00 GMT %I DIMACS %R 99-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-26.ps.gz %X It is shown that the tractable class of CNF formulas solvable by Linear Autarkies properly contains the class of q-Horn formulas and that it is incomparable with SLUR. %A Greg Perkins %A Diane Souvaine %T Efficient Radio Wave Propagation for Wireless Radio Signal Prediction %D May 28, 1999 %Z Mon, 21 Jun 1999 21:00:00 GMT %I DIMACS %R 99-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-27.ps.gz %XDue to the ever increasing complexity of wireless network design, wireless network simulation engines have become a necessary tool for wireless network engineers. A signal-strength prediction system lies at the foundation of every wireless simulator, making signal-strength prediction for microcells and picocells an important research topic in the radio field. Efficient and accurate methods are needed for use in current wireless network design and future wireless network research.

This paper describes a system, PREDICT, designed for use in wireless radio signal prediction research. PREDICT tracks the expansion of a transmitter's wavefront by modeling the wavefront as an ever expanding sphere centered upon the transmitter. The sphere's expansion is tracked by discretizing the sphere's surface with a triangulation and then propagating each triangular wavefront through the environment. In the propagation prediction of these triangular wave fronts, PREDICT supports reflection, transmission and diffraction based upon the geometric optics (GO) model of electromagnetic waves. Signal prediction is performed through a special summation of the predicted multi-path at a receiver.

PREDICT's algorithms have been specifically designed for use in radio signal prediction. The current environments supported by PREDICT are urban with flat or varied terrain and a single floor of a building, where all obstacles in the environment have vertical or horizontal walls. The system design is flexible, allowing PREDICT to be expanded for use in many different environments. Modularity allows for easy modification to the GO model or implementation of alternate radio modeling methods (one can easily modify PREDICT, or use the results generated by PREDICT, with one's own radio wave modeling methods). PREDICT is publicly available for any non-commercial use.

%A E. Boros %A V. Gurvich %T On Parallel Edges in Cycles %D June 2, 1999 %Z Mon, 21 Jun 1999 21:00:00 GMT %I DIMACS %R 99-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-28.ps.gz %X Let $n\leq N$ be positive integers. We consider the problem of finding an $n$-cycle with no parallel edges in a perfect $N$-gon in the Euclidean plane. We prove that there exists no such $n$-cycle if and only if $N=n$ is even. We also show by construction that for every other pair $(N,n)$, $N \geq n \geq 3$, such an $n$-cycle exists. %A V. Gurvich %A Li Sheng %T Camel Sequences and Their Applications %D June 2, 1999 %Z Mon, 21 Jun 1999 21:00:00 GMT %I DIMACS %R 99-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-29.ps.gz %XGiven an even cyclic (0,1) sequence $s = (s_1,..., s_{2n})$ which consists of $n$ ones and $n$ zeros, let us compute $i + j (mod 2n)$ for all $n^2$ (0,1)-pairs $(s_i,s_j)$ and distribute the obtained $n^2$ numbers between $2n$ "boxes" $1,2,...,2n-1,2n=0 (mod 2n)$. Obviously, the average cardinality of a box is $\frac{n^2}{2n}=\frac{n}{2}$. The following five sequences have quite remarkable box-distributions, "almost average everywhere with two big humps":

$n=3$, $s = (110100)$, $B = (112113) = (1^2 \, 2 \, 1^2 \, 3)$;

$n=7$, $s = (11101000110100)$, $B = (33333363333337) = (3^6 \, 6 \, 3^6 \, 7)$;

$n=11$, $s = (1110001001011100010110)$, $B = (5^{10} \, 10 \; 5^{10} \, 11)$;

$n=5$, $s = (1110010100)$, $B = (3333133330) = (3^4 \, 1 \, 3^4 \, 0)$;

$n=17$, $s = (1111101000110001011011010001100010)$, $B = (9^{16} \, 1 \, 9^{16} \, 0)$.

In general, given an {\em odd} $n$, box-distributions $(\lfloor\frac{n}{2}\rfloor^{n-1} (n-1) \lfloor\frac{n}{2}\rfloor^{n-1} n)$ and $(\lceil\frac{n}{2}\rceil^{n-1} \; 1 \lceil\frac{n}{2}\rceil^{n-1} \; 0)$ as well as the generating sequences will be called the {\em camel distributions} and the {\em camel sequences}, respectively {\em up-camel} and {\em down-camel}. We conjecture that there are infinitely many of them, both types, though only five are known. The first three sequences above are up-camel, and the last two are down-camel. We consider some applications of camel sequences in extremal graph theory.

**Key Words**: *
(0,1)-sequence, camel sequence, maximin, minimax,
extremal graph theory, vertex-enumerated graphs*

This study addresses the problem of improving the quality of natural language (NL) understanding by introducing an additional criterion for the correct formal representation of an input inquiry. This criterion is based on the logical compatibility between the input inquiry, translated into the formal language, and the domain, encoded in the same language. Inquiry generality is measured as the normalized number of object tuples that satisfy the translation in a given domain. Computational experiments in various domains show that a certain range of generality indicates a proper translation. Specific heuristics are developed for transforming the translation formula to improve the value of the generality criterion.

Usually, if the generality is too high (a large number of object tuples satisfy the translation), the semantic analyzer has likely ignored some syntactic constraints for the translation formula. Conversely, when the generality is too low (no object tuples satisfy the translation), the semantic analyzer has likely introduced too many syntactic constraints, and some of them have to be eliminated.

The generality criterion helps in the frequent situations where morphological and syntactic processing is insufficient to construct the proper translation of the inquiry into the formal language. Our approach is based on the experimentally verified fact that users pose inquiries with reasonable generality. This knowledge helps us improve the semantic analysis so that it functions in flexible and expandable domains, where the lexical information is rather limited.

Compatibility feedback by means of generality control imposes specific requirements on the formal language of inquiry translation and domain knowledge representation. The semantic processor is implemented as a logic program with metalanguage support to express the complex semantic rules. The suggested approach allowed us to build an NL understanding system that handles inquiries of high complexity, involving up to 3-4 concepts.

%A Markus Jakobsson %A Ari Juels %T Millimix: Mixing in Small Batches %D June 10, 1999 %Z Thu, 22 Jul 1999 23:50:00 GMT %I DIMACS %R 99-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-33.ps.gz %X We present Millimix, a mix network that is highly efficient on small input batches. The construction is conceptually simple, and both robust and private in the face of collusion by any minority set of malicious players. Additionally, Millimix has the feature of being publicly verifiable. In other words, the mixing operation yields a transcript that demonstrates to a third party that the mix proceeded correctly. %A Michael Saks %A Aravind Srinivasan %A Shiyu Zhou %A David Zuckerman %T Low Discrepancy Sets Yield Approximate Min-Wise Independent Permutation Families %D October 24, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-34.ps.gz %X Motivated by a problem of filtering near-duplicate Web documents, Broder, Charikar, Frieze \& Mitzenmacher defined the following notion of {\em $\epsilon$-approximate min-wise independent permutation families}. A multiset ${\cal F}$ of permutations of $\{0,1,\ldots,n-1\}$ is such a family if for all $K \subseteq \{0,1,\ldots,n-1\}$ and any $x \in K$, a permutation $\pi$ chosen uniformly at random from ${\cal F}$ satisfies $$|\ \Pr[\min\{\pi(K)\}=\pi(x)]-{1\over |K|}\ |\le{\epsilon\over |K|}.$$ We show connections of such families with {\em low discrepancy sets for geometric rectangles}, and give explicit constructions of such families ${\cal F}$ of size $n^{O(\sqrt{\log n})}$ for $\epsilon=1/n^{\Theta(1)}$, improving upon the previously best-known bound of Indyk. We also present polynomial-size constructions when the min-wise condition is required only for $|K| \le 2^{O(\log^{2/3} n)}$, with $\epsilon \geq 2^{-O(\log^{2/3} n)}$.
%A Aleksandar Pekec %A Fred S. Roberts %T The Role Assignment Model Nearly Fits Most Social Networks %D June 11, 1999 %Z Thu, 22 Jul 1999 23:50:00 GMT %I DIMACS %R 99-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-35.ps.gz %X Role assignments, introduced by Everett and Borgatti \cite{Everett91}, who called them role colorings, formalize the idea, arising in the theory of social networks, that individuals of the same social role will relate in the same way to individuals playing counterpart roles. If $G$ is a graph, a $k${\em -role assignment} is a surjective function mapping each vertex to one of the integers $1,2,\ldots,k$, so that if $x$ and $y$ have the same role, then the sets of roles assigned to their neighbors are the same. We show that all graphs $G$ having no astronomical discrepancies between the minimum and the maximum degree have a $k$-role assignment. Furthermore, we introduce and study a natural measure expressing how close an onto map $f:V(G)\rightarrow\{1,\ldots, k\}$ is to being a $k$-role assignment of a graph $G=(V,E)$, and show that almost all graphs nearly have a $k$-role assignment. %A Marek Karpinski %T Randomized Complexity of Linear Arrangements and Polyhedra %D June 12, 1999 %Z Thu, 22 Jul 1999 23:50:00 GMT %I DIMACS %R 99-36 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-36.ps.gz %X We survey some of the recent results on the complexity of recognizing n-dimensional linear arrangements and convex polyhedra by randomized algebraic decision trees. We also give a number of concrete applications of these results. In particular, we derive the first nontrivial, in fact quadratic, randomized lower bounds for problems like Knapsack and Bounded Integer Programming. We further formulate several open problems and possible directions for future research.
%A Janos Pach %A Joel Spencer %A Geza Toth %T New Bounds on Crossing Numbers %D June 21, 1999 %Z Thu, 22 Jul 1999 23:50:00 GMT %I DIMACS %R 99-37 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-37.ps.gz %X
The *crossing number*, ${\mbox {cr}}(G)$, of a graph $G$ is the
least number of crossing points in any drawing of $G$ in
the plane. Denote by $\kappa(n,e)$ the minimum of ${\mbox {cr}}(G)$
taken over all graphs with $n$ vertices and at least
$e$ edges.
We prove a conjecture of P. Erd\H os and R. Guy by showing
that $\kappa(n,e)n^2/e^3$ tends to a positive constant as
$n\rightarrow\infty$ and $n\ll e\ll n^2$. Similar results
hold for graph drawings on any other surface of fixed genus.

We prove better bounds for graphs satisfying some monotone
properties. In particular, we show that if $G$ is a graph with
$n$ vertices and $e\ge 4n$ edges, which does not contain a
cycle of length *four* (resp. *six*), then its
crossing number is at least $ce^4/n^3$ (resp. $ce^5/n^4$),
where $c>0$ is a suitable constant. These results
cannot be improved, apart from the value of the constant.
This settles a question of M. Simonovits.

The Pfaffian of an oriented graph is closely linked to Perfect Matching. It is also naturally related to the determinant of an appropriately defined matrix. This relation between Pfaffian and determinant is usually exploited to give a fast algorithm for computing Pfaffians.

We present the first completely combinatorial algorithm for computing the Pfaffian in polynomial time. Our algorithm works over arbitrary commutative rings. Over integers, we show that it can be computed in the complexity class GapL; this result was not known before. Our proof techniques generalize the recent Mahajan-Vinay combinatorial characterization of determinant in novel ways.

As a corollary, we show that under reasonable encodings of a planar graph, Kasteleyn's algorithm for counting the number of perfect matchings in a planar graph is also in GapL. The combinatorial characterization of Pfaffian also makes it possible to directly establish several algorithmic and complexity theoretic results on Perfect Matching which otherwise use determinants in a roundabout way.

We also present hardness results for computing the Pfaffian of an integer skew-symmetric matrix. We show that this is hard for #L and GapL under logspace many-one reductions.
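
The relation between the Pfaffian and the determinant mentioned above can be checked numerically on a small case. The sketch below uses the standard first-row expansion of the Pfaffian and a Laplace expansion for the determinant; both take exponential time and are meant only to illustrate the identity $\mathrm{Pf}(A)^2=\det(A)$ for skew-symmetric $A$. This is not the combinatorial polynomial-time algorithm of the abstract, and the helper names are assumptions of this sketch.

```python
def submatrix(a, drop_rows, drop_cols):
    """Copy of square matrix `a` without the given row and column indices."""
    n = len(a)
    return [
        [a[r][c] for c in range(n) if c not in drop_cols]
        for r in range(n)
        if r not in drop_rows
    ]


def pfaffian(a):
    """Pfaffian of a skew-symmetric matrix via expansion along row 0."""
    n = len(a)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0  # odd-order skew-symmetric matrices have Pfaffian 0
    total = 0
    for j in range(1, n):
        sign = 1 if j % 2 == 1 else -1  # (-1)^(j+1) with 0-based j
        total += sign * a[0][j] * pfaffian(submatrix(a, {0, j}, {0, j}))
    return total


def determinant(a):
    """Determinant via Laplace expansion along row 0."""
    n = len(a)
    if n == 0:
        return 1
    return sum(
        (-1) ** j * a[0][j] * determinant(submatrix(a, {0}, {j}))
        for j in range(n)
    )


if __name__ == "__main__":
    a = [
        [0, 1, 2, 3],
        [-1, 0, 4, 5],
        [-2, -4, 0, 6],
        [-3, -5, -6, 0],
    ]
    # Pf = a01*a23 - a02*a13 + a03*a12 = 6 - 10 + 12
    print(pfaffian(a), determinant(a))
```

Because only ring additions and multiplications are used, the same expansion makes sense over any commutative ring, although its running time is far from polynomial.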

**Keywords:** pfaffian, perfect matchings, determinant,
planar graphs, GapL

Given an even cyclic $(+1,-1)$ sequence $s = (s_0,..., s_{2n-1})$ which consists of $n$ plus ones and $n$ minus ones, let us compute $i + j \ (mod\ 2n)$ for all $n^2$ $(+1,-1)$-pairs $(s_i,s_j)$ and distribute obtained $n^2$ numbers between $2n$ "boxes" $1,2,...,2n-1,2n=0 \ (mod\ 2n)$. The average cardinality of a box is $\frac{n^2}{2n}=\frac{n}{2}$. Some sequences have quite remarkable box-distributions, "almost average everywhere with two big humps". For example,

$n=5, s = (+1+1-1-1+1-1+1-1-1+1),\ B = (3333133330) = (3^4 1 \, 3^4 0)\ $

$n=7, s = (+1+1+1-1+1-1-1-1+1+1-1+1-1-1),\ B = (33333363333337) = (3^6 6 \, 3^6 7)$.

In general, given an {\em odd} $n$, the box-distributions $(\lfloor\frac{n}{2}\rfloor^{n-1} \ (n-1) \ \lfloor\frac{n}{2}\rfloor^{n-1} \ n)$ and $(\lceil\frac{n}{2}\rceil^{n-1} \ 1 \ \lceil\frac{n}{2}\rceil^{n-1} \ 0)$ as well as the sequences that generate them will be called the {\em camel distributions} and {\em camel sequences}, respectively {\em up-camel} and {\em down-camel}. For example, the first sequence above is down-camel, and the second one is up-camel. Camel sequences have applications in extremal graph theory. Here we prove that there are infinitely many 'camels' of both types. More precisely, for every prime $n=4j-1$ we construct an up-camel sequence and for every prime $n=4j+1$ a down-camel one. In both cases these sequences are related to quadratic residues and non-residues modulo $n$. We conjecture that there are no other camel sequences.
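
The box-distribution defined above is easy to compute directly. The following sketch assumes 0-based positions $s_0,\ldots,s_{2n-1}$, counts ordered pairs with a $+1$ at position $i$ and a $-1$ at position $j$, and lists the boxes in the order $1,2,\ldots,2n-1,0$. The encoding of a sequence as a string of `+` and `-` characters and the function names are assumptions of this sketch. Since the sequences are cyclic, a different indexing convention rotates the resulting distribution.

```python
def box_distribution(signs):
    """Box counts for a string of '+'/'-' characters of even length 2n.

    Box r collects the pairs (i, j) with signs[i] == '+', signs[j] == '-'
    and (i + j) % (2n) == r. Boxes are returned in the order 1, ..., 2n-1, 0.
    """
    m = len(signs)
    plus = [i for i, s in enumerate(signs) if s == "+"]
    minus = [j for j, s in enumerate(signs) if s == "-"]
    counts = [0] * m
    for i in plus:
        for j in minus:
            counts[(i + j) % m] += 1
    return counts[1:] + counts[:1]


def is_rotation(a, b):
    """True if list b is a cyclic rotation of list a."""
    return len(a) == len(b) and any(a[k:] + a[:k] == b for k in range(len(a)))


if __name__ == "__main__":
    s5 = "++--+-+--+"       # the n = 5 example
    s7 = "+++-+---++-+--"   # the n = 7 example
    print(box_distribution(s5))
    print(box_distribution(s7))
```

Under these conventions the $n=5$ example reproduces $(3^4\,1\,3^4\,0)$ exactly. For the $n=7$ example the two humps, 6 and 7, come out in swapped positions, which is a cyclic rotation of the listed $(3^6\,6\,3^6\,7)$; `is_rotation` can be used to compare distributions without committing to an indexing convention.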

**Key Words:** *
quadratic residues, camel sequences, maximin, minimax,
extremal graph theory, vertex-enumerated graphs.*

A coterie is a family of subsets such that every pair of subsets has at least one element in common but neither is a subset of the other.

A coterie $C$ is said to be non-dominated (ND) if there is no other coterie $D$ such that, for every $Q \in C$, there exists $Q' \in D$ satisfying $Q' \subseteq Q$.
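
The two definitions above translate directly into set operations. The sketch below is a minimal illustration; the function names `is_coterie` and `dominates` are assumptions, and the sketch additionally assumes that a coterie is a nonempty family of nonempty sets.

```python
from itertools import combinations


def is_coterie(family):
    """Every two members intersect and neither contains the other."""
    sets = [frozenset(q) for q in family]
    if not sets or any(len(q) == 0 for q in sets):
        return False
    return all(
        len(a & b) > 0 and not a <= b and not b <= a
        for a, b in combinations(sets, 2)
    )


def dominates(d, c):
    """True if D differs from C and every Q in C contains some Q' in D."""
    d_sets = {frozenset(q) for q in d}
    c_sets = {frozenset(q) for q in c}
    return d_sets != c_sets and all(
        any(qd <= qc for qd in d_sets) for qc in c_sets
    )


if __name__ == "__main__":
    majority = [{1, 2}, {2, 3}, {1, 3}]
    print(is_coterie(majority))                 # pairwise intersecting, no containment
    print(dominates([{2}], [{1, 2}, {2, 3}]))   # {2} is contained in both quorums
```

A coterie $C$ is then non-dominated exactly when no coterie $D$ satisfies `dominates(D, C)`. Checking this by enumerating all candidate families is exponential and is not attempted here.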

We introduce an operator $\sigma$, which transforms an ND coterie into another ND coterie. A regular coterie is a natural generalization of a ``vote-assignable" coterie, which is used in some practical applications. We show that any ``regular" ND coterie $C$ can be transformed into any other regular ND coterie $D$ by judiciously applying $\sigma$ operations to $C$ at most $|C|+|D|-2$ times.

As another application of the $\sigma$ operation, we present an incrementally-polynomial-time algorithm for generating all regular ND coteries. We then introduce the concept of a ``g-regular" function, as a generalization of availability. We show how to construct an optimum coterie $C$ with respect to a g-regular function in $O(n^3|C|)$ time. We also discuss the structures of optimum coteries with respect to a g-regular function.

%A Peter Fratzl %A Oliver Penrose %A Joel L. Lebowitz %T Modelling of Phase Separation in Alloys with Coherent Elastic Misfit %D July 30, 1999 %Z Mon, 2 Aug 1999 19:50:00 GMT %I DIMACS %R 99-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-42.ps.gz %X Elastic interactions arising from a difference of lattice spacing between two coherent phases can have a strong influence on the phase separation (coarsening) behaviour of alloys. If the elastic moduli are different in the two phases, the elastic interactions may accelerate, slow down or even stop the phase separation process. If the material is elastically anisotropic, the precipitates can be shaped like plates or needles instead of spheres and can arrange themselves into highly correlated patterns. Tensions or compressions applied externally to the specimen may have a strong effect on the shapes and arrangement of the precipitates. In this paper, we review the main theoretical approaches that have been used to model these effects and we relate them to experimental observations. The theoretical approaches considered are (i) `macroscopic' models treating the two phases as elastic media separated by a sharp interface; (ii) `mesoscopic' models in which the concentration varies continuously across the interface; and (iii) `microscopic' models which use the positions of individual atoms. %A Tiina Heikkinen %T A Minimax Game of Power Control in a Wireless Network under Incomplete Information %D August 9, 1999 %Z Wed, 11 Aug 1999 23:50:00 GMT %I DIMACS %R 99-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-43.ps.gz %X The purpose of this paper is to study optimal transmit power control and capacity of a wireless multimedia network, where there is incomplete information about the link gain coefficients.
Previously, Perron-Frobenius theory has been applied to the study of optimal power control under complete information about the link gain matrix and the interference matrix. This paper derives a characterization of optimal power vector and capacity in terms of an incomplete-information-matrix-game against nature choosing the elements of the link gain and interference matrices. A game theoretical rationale for not using power control is given. %A E. G. Coffman %A George S. Lueker %A Joel Spencer %A Peter M. Winkler %T Packing Random Rectangles %D September 2, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-44 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-44.ps.gz %X A random rectangle is the product of two independent random intervals, each being the interval between two random points drawn independently and uniformly from [0,1]. We prove that the number of items in a maximum cardinality disjoint subset of n random rectangles has the tight asymptotic bound Theta(n^{1/2}). Although tight bounds for the problem generalized to d>2 dimensions remain an open problem, we are able to show that Omega(n^{1/2}) and O((n log^{d-1}n)^{1/2}) are asymptotic lower and upper bounds. In addition, we prove that Theta(n^{d/(d+1)}) is a tight asymptotic bound for the case of random cubes. %A Paul A. Dreyer %A Xuerong Yong %T The Adjacency Matrix and Spectrum of a Graph with Negative Third Largest Eigenvalue %D September 13, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-45.ps.gz %XLet $G$ be a simple connected graph with $n$ vertices, let $\lambda_i(G)$ be the $i$th largest eigenvalue of $G$. This paper provides the following results:

1. If $\lambda_3(G)<0$, then the adjacency matrix of $G$, $A(G)$, is cogredient to the matrix:

where $a_{ij}=1$ iff $i\ne j$ and $a_{ij}=0$ otherwise, and $$l_1+l_2+\ldots+l_k\le l.$$ Furthermore, if there exists an index $i$ such that $l_i > 1$, then $ G $ has the eigenvalue $-1$ with multiplicity at least $n-2r-1$, where $2r$ is the rank of $A(G^c)$.

2. $\lambda_3(G)<0 $ implies that $ \lambda_j(G)\ge -1$, $j=3,4,\ldots,n-r $, where $2r$ is the rank of the adjacency matrix of the complement of $G$, $2r \le n-1$.

%A Paul A. Dreyer %A Christopher Malon %A Jaroslav Nesetril %T Universal H-colorable Graphs Without A Given Configuration %D September 13, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-46.ps.gz %XFor every pair of finite connected graphs $F$ and $H$, and every integer $k$, we construct a universal graph $U$ with the following properties:

\begin{enumerate} \item There is a homomorphism $\pi :U \rightarrow H$, but no homomorphism from $F$ to $U$. \item For every graph $G$ with maximal degree no more than $k$ having a homomorphism $h:G \rightarrow H$, but no homomorphism from $F$ to $G$, there is a homomorphism $\alpha :G \rightarrow U$, such that $h = \pi \circ \alpha$. \end{enumerate}

Particularly, this solves a problem regarding the chromatic number of a universal graph.

%A Bernardo Abrego %A Silvia Fernandez-Merchant %T On the Maximum Number of Equilateral Triangles II %D September 15, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-47.ps.gz %X Erdös and Purdy raised the problem of finding the maximum number of equilateral triangles determined by a set of $n$ points in ${\Bbb R}^{d}$. This question is investigated in the first part of this series. Here we study some variations where the sets under consideration are in convex or general position. Nontrivial bounds are given for these problems, as well as for the corresponding questions where the triangles at issue have unit side length. %A Harry Buhrman %A Steve Fenner %A Lance Fortnow %A Dieter van Melkebeek %T Optimal Proof Systems and Sparse Sets %D September 17, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-48 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-48.ps.gz %XWe exhibit a relativized world where the class of sparse NP sets has no complete sets. This gives the first relativized world where no optimal proof systems exist.

We also examine under what reductions the class of sparse NP sets can have complete sets. We show a close connection between these issues and reductions from sparse to tally sets. We also consider the question as to whether the sparse NP sets have a computable enumeration.

%A Bhaskar DasGupta %A Georg Schnitger %T On the Computational Power of Analog Neural Networks %D October 8, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-49 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-49.ps.gz %X We survey the computational power of analog neural networks. %A Gabriela Hristescu %A Martin Farach-Colton %T Cluster-preserving Embedding of Proteins %D October 8, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-50 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-50.ps.gz %XSimilarity searching in protein sequence databases is a standard technique for biologists dealing with a newly sequenced protein. Exhaustive search in such databases is prohibitive because of the large sizes of these databases and because pairwise comparisons are slow. Heuristic techniques, such as FASTA and BLAST, are useful because they are fast and accurate, though it has been shown that exhaustive search is more accurate. Therefore, there are times when one would like to perform an exhaustive search.

We propose an efficient method, called SparseMap, for preprocessing a database of proteins to support efficient similarity searches using expensive but sensitive distance functions, such as those based on Smith-Waterman similarity. Our method is based on a Low-dimensional Euclidean Embedding approach. We compare our method with other embedding approaches, and show that our method is faster and produces embeddings which preserve more biological information about the proteins, such as pairwise distance and biological clusters.

%A P.G. Kevrekidis %A M.I. Weinstein %T Dynamics of Lattice Kinks %D October 9, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-51 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-51.ps.gz %XWe consider a class of Hamiltonian nonlinear wave equations governing a field defined on a spatially discrete one-dimensional lattice, with discreteness parameter $d=h^{-1}$, where $h>0$ is the lattice spacing. The specific cases we consider in detail are the discrete sine-Gordon (SG) and discrete $\phi^4$ models. For finite $d$ and in the continuum limit ($d\to\infty$) these equations have static kink-like (heteroclinic) states which are stable. In contrast to the continuum case, due to the breaking of Lorentz invariance, discrete kinks cannot be ``Lorentz boosted" to obtain traveling discrete kinks. Peyrard and Kruskal pioneered the study of how a kink, initially propagating in the lattice, dynamically adjusts in the absence of an available family of traveling kinks. We study in detail the final stages of the discrete kink's evolution during which it is pinned to a specified lattice site (equilibrium position in the Peierls-Nabarro barrier). We find:

(i) for $d$ sufficiently large (sufficiently small lattice spacing), the state of the system approaches an asymptotically stable ground state static kink (centered between lattice sites).

(ii) for $d$ sufficiently small $d

For discrete SG and discrete $\phi^4$ we have: wobbling kinks which have the same spatial symmetry as the static kink as well as ``g-wobblers'' and ``e-wobblers'', which have different spatial symmetry.

The large time limit of solutions with initial data near a kink is marked by damped oscillation about one of these two types of asymptotic states. In case (i) we compute the characteristics of the damped oscillation (frequency and $d$-dependent rate of decay). In case (ii) we prove the existence of, and give analytical and numerical evidence for the asymptotic stability of, wobbling solutions.

The mechanism for decay is the radiation of excess energy, stored in {\it internal modes}, away from the kink core to infinity. This process is studied in detail using general techniques of scattering theory and normal forms. In particular, we derive a {\it dispersive normal form}, from which one can anticipate the character of the dynamics. The methods we use are very general and are appropriate for the study of dynamical systems which may be viewed as a system of discrete oscillators ({\it e.g.} kink together with its {\it internal modes}) coupled to a field ({\it e.g.} dispersive radiation or phonons). The approach is based on and extends an approach of one of the authors (MIW) and A. Soffer in previous work. Changes in the character of the dynamics, as $d$ varies, are manifested in topological changes in the phase portrait of the normal form. These changes are due to changes in the types of resonances which occur among the discrete internal modes and the continuum radiation modes, as $d$ varies.

Though derived from a time-reversible dynamical system, this normal form has a dissipative character. The dissipation is of an internal nature, and corresponds to the transfer of energy from the discrete to continuum radiation modes. The coefficients which characterize the time scale of damping (or {\it lifetime} of the internal mode oscillations) are a nonlinear analogue of ``Fermi's golden rule", which arises in the theory of spontaneous emission in quantum physics.

%A Piotr Berman %A Bhaskar DasGupta %T Improvements in Throughput Maximization for Real-Time Scheduling %D October 11, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-52 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-52.ps.gz %X We consider the problem of off-line throughput maximization for job scheduling on one or more machines, where each job has a release time, a deadline and a profit. Most of the versions of the problem discussed here were already treated by Bar-Noy et al. (Proc. 31st ACM STOC, 622-631, 1999). Our main contribution is to provide algorithms that do not use linear programming, are simple and much faster than the corresponding ones proposed in Bar-Noy et al., while either having the same quality of approximation or improving it. More precisely, compared to the results in Bar-Noy et al., our pseudo-polynomial algorithm for multiple unrelated machines and all of our strongly-polynomial algorithms have better performance ratios, and all of our algorithms run much faster, are combinatorial in nature and avoid linear programming. Finally, we show that algorithms with better performance ratios than 2 are possible if the stretch factors of the jobs are bounded. %A Adrian Dumitrescu %A Geza Toth %T Ramsey-type Results for Unions of Comparability Graphs %D October 25, 1999 %Z Wed, 27 Oct 1999 21:00:00 GMT %I DIMACS %R 99-53 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-53.ps.gz %X It is well known that the comparability graph of any partially ordered set of $n$ elements contains either a clique or an independent set of size at least $\sqrt{n}$. In this note we show that any graph of $n$ vertices which is the union of two comparability graphs on the same vertex set contains either a clique or an independent set of size at least $n^{1 \over 3}$. On the other hand, there exist such graphs for which the size of any clique or independent set is at most $n^{0.4118}$.
Similar results are obtained for graphs which are unions of a fixed number $k$ of comparability graphs. We also show that the same bounds hold for unions of perfect graphs. %A Alexander Barg %T On Polynomial Invariants of Codes, Matroids and Finite Interaction Models %D October 28, 1999 %Z Wed, 10 Nov 1999 21:00:00 GMT %I DIMACS %R 99-54 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-54.ps.gz %XA linear code can be thought of as a vector matroid represented by the columns of the code's generator matrix; a well-known result in this context is Greene's theorem on a connection between the weight polynomial of the code and the Tutte polynomial of the matroid. We examine this connection from the coding-theoretic viewpoint, building upon the rank polynomial of the code. This enables us

- to relate the weight polynomial of codes and the reliability polynomial of linear matroids and to prove bounds on the latter;
- to prove that the partition polynomial of the Potts model equals the weight polynomial of the cocycle code of the underlying graph, and
- to give a simple proof of Greene's theorem and its generalization.

Multiple sequence comparison is a basic problem for molecular biology and other sciences. In this paper, we introduce the concept of complete information set and some measurement principles for measuring multiple sequence discrepancy. Based on them, we present a new measurement method satisfying the principles for comparing multiple sequences. We show that this method can effectively distinguish different random sequences or DNA sequences, for example, distinguish DNA sequences of length 8000 by comparisons of 6-8 symbol strings or protein sequences of length 8000 by comparisons of 3-4 strings. It can also measure slight changes of a sequence, e.g., insertion or deletion of a symbol. We apply it in the study of molecular evolution; the results show that there is a hierarchic relationship among the cytochrome C protein sequences of different species, much as that in taxonomy; moreover, these results are consistent with previous studies.

**Key Words**: multiple sequence comparison, entropy, DNA, measurement

In this paper we deal with the problem of efficient learning of feedforward neural networks. First, we consider the case when the objective is to maximize the ratio r of the correctly classified points compared to all points in a training set. We show that, for an arbitrary multilayered threshold network with varying input dimension and with a constant n_1 >= 2 neurons in the first hidden layer, it is NP-hard to approximate r within a relative error of at most \varepsilon=1/(68 n_1 2^{n_1}+ 136 n_1^3+ 136 n_1^2 +170n_1). If n_1 >= 3, then the above result is true with \varepsilon=c(n_1-1)/(5n_1^3+2n_1^32^{n_1}+4n_1^5+4n_1^4) for some positive constant c, even if restricted to situations where a solution without any misclassifications exists. Restricted to architectures with only one hidden layer and two hidden neurons, approximation of r with relative error at most 1/c is NP-hard even if either (a) c=2244, the threshold activation function in the hidden layer is substituted by the classical sigmoid function, and the situation of \epsilon-separation of the output classification is assumed, or (b) c=2380 and the threshold activation function is substituted by the semilinear activation function commonly used in the neural net literature. For a single hidden layer threshold network with varying input dimension and k hidden nodes, approximating r within a relative error of at most c/k^5, for some positive constant c, is NP-hard even if restricted to situations where the number of examples is at most k^4. Finally, we consider the case when the objective is to minimize the failure ratio f in the presence of misclassification errors. We show that it is NP-hard to approximate f within any constant c>1 for a multilayered threshold network if the input biases are fixed to zero.

%A Myung Ho Kim %T A Geometric Model of Information Retrieval Systems %D December 7, 1999 %Z Wed, 15 Dec 1999 02:00:00 GMT %I DIMACS %R 99-61 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-61.ps.gz %X This decade has seen a great deal of progress in the development of information retrieval systems. Unfortunately, we still lack a systematic understanding of the behavior of the systems and their relationship with documents. In this paper we present a completely new approach to understanding information retrieval systems. Recently, it has been observed that retrieval systems in TREC 6 show some remarkable patterns in retrieving relevant documents. Based on the TREC 6 observations, we introduce a geometric linear model of information retrieval systems. We then apply the model to predict the number of relevant documents by the retrieval systems. The model is also scalable to a much larger data set. Although the model is developed based on the TREC 6 routing test data, we believe it is readily applicable to other information retrieval systems. In the Appendix, we explain a simple and efficient way of making a better system from the existing systems. %A Endre Boros %A Vladimir Gurvich %A Leonid Khachiyan %A Kazuhisa Makino %T Dual-Bounded Hypergraphs: Generating Partial and Multiple Transversals %D December 11, 1999 %Z Thu, 16 Dec 1999 03:00:00 GMT %I DIMACS %R 99-62 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-62.ps.gz %X We consider two natural generalizations of the notion of transversal to a finite hypergraph, so-called {\em multiple} and {\em partial} transversals. We show that the hypergraphs of all multiple and all partial transversals are dual-bounded in the sense that in both cases, the size of the dual hypergraph is bounded by a polynomial in the cardinality and the length of description of the input hypergraph. 
Our bounds are based on new inequalities of extremal set theory and threshold Boolean logic, which may be of independent interest. We also show that the problems of generating all multiple and all partial transversals for a given hypergraph are polynomial-time reducible to the generation of all ordinary transversals for another hypergraph, i.e., to the well-known dualization problem for hypergraphs. As a corollary, we obtain incremental quasi-polynomial-time algorithms for both of the above problems, as well as for the generation of all the minimal Boolean solutions for an arbitrary monotone system of linear inequalities. %A Endre Boros %A Martin Golumbic %A Vadim Levit %T On the Number of Vertices Belonging to all Maximum Stable Sets of a Graph %D December 21, 1999 %Z Wed, 2 Feb 2000 23:00:00 GMT %I DIMACS %R 99-63 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1999/99-63.ps.gz %X Let us denote by $\alpha (G)$ the size of a maximum stable set, and by $\mu(G)$ the size of a maximum matching of a graph $G$, and let $\xi (G)$ be the number of vertices which belong to all maximum stable sets. We shall show that $\xi (G)\geq 1+\alpha (G)-\mu (G)$ holds for any connected graph, whenever $\alpha(G)>\mu(G)$. This inequality improves on related results by Hammer, Hansen and Simeone (1982) and by Levit and Mandrescu (1998). We also prove that on the one hand, $\xi (G)>0$ can be recognized in polynomial time whenever $\mu (G)<\left| V\left( G\right)\right|/3$, and on the other hand that determining whether $\xi (G)>k$ is, in general, NP-complete for any fixed $k\geq 0$. %A Boros, Endre %A Hammer, Peter L. %A Ricca, Frederica %A Simeone, Bruno %T Closed Sets and Generators in Ternary Hamming Spaces %D January 15, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-01.ps.gz %X The $n$-dimensional ternary Hamming space is $\TT^n$, where $\TT=\{ 0,1,2\}$. 
Three points in $\TT^n$ form a line if they have in common exactly $n-1$ components. A subset of $\TT^n$ is closed if, whenever it contains two points of a line, it contains also the third one. A generator is a set, whose closure is $\TT^n$. In this paper, we investigate several properties of closed sets and generators. Two alternative proofs of our main result, stating that the minimum cardinality of a generator is $2^n$, are provided.
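
The closure operation defined above is easy to experiment with for small $n$. The following is a minimal sketch, not taken from the report: the function name `closure` and the representation of points as Python tuples are our own choices. It repeatedly adds the third point of any line that already has two points in the set.

```python
from itertools import product


def closure(points, n):
    """Close a set of points of {0,1,2}^n under the rule
    'two points of a line force the third'."""
    closed = set(points)
    frontier = list(closed)
    while frontier:
        p = frontier.pop()
        for q in list(closed):
            # p and q lie on a common line iff they differ in exactly one coordinate.
            diff = [i for i in range(n) if p[i] != q[i]]
            if len(diff) != 1:
                continue
            i = diff[0]
            third = 3 - p[i] - q[i]  # the value in {0,1,2} different from both
            r = p[:i] + (third,) + p[i + 1:]
            if r not in closed:
                closed.add(r)
                frontier.append(r)
    return closed


# The binary cube {0,1}^n has 2^n points; check that it generates the whole space.
for n in (2, 3):
    cube = set(product((0, 1), repeat=n))
    print(n, len(cube), len(closure(cube, n)) == 3**n)
```

For the small cases tried here, the $2^n$ points of the binary cube close up to all of $\TT^n$, which matches the minimum generator size stated in the abstract; the sketch does not, of course, prove that no smaller generator exists.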

The present study was motivated by some combinatorial questions concerning origin-destination matrices in transportation systems. %A Boros, Endre %A Hammer, Peter L. %A Ricca, Frederica %A Simeone, Bruno %T On the Complexity of Generation in Ternary Hamming Spaces %D January 15, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-02.ps.gz %X In this paper, a follow-up of DIMACS Report 98-01, we continue to study ternary Hamming spaces $\TT^n$, where $\TT=\{0,1,2\}$, and their generators. In particular, we are interested in the computational complexity of the generation of all vectors with respect to a specific generator $G\subseteq \TT^n$. We consider several parameters, i.e., depth and load, which provide alternative ways to measure the computational effort required in order to generate all points of $\TT^n$ starting from those of $G$. We establish lower and upper bounds on the minimum depth and the minimum load for a generator in $\TT^n$, and prove that some of them are sharp. %A Boros, Endre %A Hammer, Peter L. %A Minoux, Michel %A Rader, David %T Optimal Cell Flipping to Minimize Channel Density in VLSI Design and Pseudo-Boolean Optimization %D January 16, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-03.ps.gz %X Cell flipping in VLSI design is an operation in which some of the cells are replaced with their ``mirror images'' with respect to a vertical axis, while keeping them in the same slot. After the placement of all the cells, one can apply cell flipping in order to further decrease the total area, approximating this objective by minimizing total wire length, channel width, etc. However, finding an optimal set of cells to be flipped is usually a difficult problem. In this paper we show that cell flipping can be efficiently applied to minimize channel density in the standard cell technology. 
We show that an optimal flipping pattern can be found in $O(p(\frac{n}{c})^{c})$ time, where $n$, $p$ and $c$ denote the number of nets, pins and channels, respectively. Moreover, in the one channel case (i.e. when $c=1$) the cell flipping problem can be solved in $O(p\log n)$ time. For the multi-channel case we present both an exact enumeration scheme and a mixed-integer program that generates an approximate solution very quickly. We present computational results on examples up to 139 channels and 65 000 cells. %A Steffen, Eckhard %T Measurements of Uncolorability %D January 16, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-04.ps.gz %X

A snark is a cubic bridgeless graph with chromatic index $\chi'=4$. These graphs are also called uncolorable. We introduce parameters measuring the uncolorability of those graphs and relate them to each other.

For $k=2,3$ let $c_k$ be the maximum size of a $k$-colorable subgraph of a cubic graph $G=(V,E)$. We consider $r_3 = |E|-c_3$ and $r_2 = \frac{2}{3}|E| - c_2$. We show that on the one hand $r_3$ and $r_2$ bound each other, but on the other hand the difference between them can be arbitrarily large.

We also compare them to the oddness $\omega$ of $G$, the smallest possible number of odd circuits in a 2-factor of $G$. We construct cyclically 5-edge connected graphs where $r_3$ and $\omega$ are arbitrarily far apart, and show that for each $1 \leq c < 2$ there is a snark such that $\omega \geq c r_3$.

For $k=2,3$ let $\zeta_k$ denote the largest fraction of edges that can be edge $k$-colored. We give best possible bounds for these parameters, and relate them to each other. %A Desper, Richard %T The Set-Maxima Problem, an Overview %D January 19, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-05.ps.gz %X

Sorting problems have long been one of the foundations of theoretical computer science. Sorting problems attempt to learn properties of an unknown total order of a known set. We test the order by comparing pairs of elements, and through repeated tests deduce some order structure on the set.

The set-maxima problem is: given a family S of subsets of a set X, produce the maximal element of each member of S. Local sorting is a sub-problem of set-maxima in which S is a subset of (X choose 2), i.e., there exists a graph G with vertex set X and edge set S. We compare algorithms by estimating the number of comparisons needed, as a function of n = |X| and m = |S|.
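
As a point of reference for the comparison counts discussed here, the following is a naive baseline, not one of the algorithms reviewed in the report: it finds each maximum independently and counts every comparison, so it uses the sum of (|s| - 1) over all s in S. The names `naive_set_maxima` and `less_than` are ours, and the hidden total order is simulated by a rank table.

```python
def naive_set_maxima(family, less_than):
    """Return the maximum of each subset and the number of comparisons used."""
    comparisons = 0
    maxima = []
    for subset in family:
        items = list(subset)
        best = items[0]
        for x in items[1:]:
            comparisons += 1          # one test of the unknown order
            if less_than(best, x):
                best = x
        maxima.append(best)
    return maxima, comparisons


# Hypothetical hidden order on X = {a, b, c, d}; larger rank means larger element.
rank = {"a": 3, "b": 1, "c": 2, "d": 0}
family = [("a", "b"), ("b", "c", "d"), ("c", "d")]
print(naive_set_maxima(family, lambda x, y: rank[x] < rank[y]))
```

The baseline ignores overlaps between the subsets, so it repeats comparisons that a smarter algorithm could reuse; that reuse is exactly what the lower and upper bounds in terms of n and m are about.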

In this paper, we review the information-theory lower bounds for the set-maxima and local-sorting problems. We review deterministic algorithms which have optimally solved the set-maxima problem, as a function of m,n, in settings where extra assumptions about S have been made. Also, we review randomized algorithms for local sorting and set-maxima which achieve an optimal expected number of comparisons. %A Wool, Avishai %T Key Management for Encrypted Broadcast %D January 21, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-06.ps.gz %X

We consider broadcast applications where the transmissions need to be encrypted, such as broadband digital TV networks or Internet multicasts. In these applications the number of encrypted TV programs may be very large, but the secure memory capacity at the set-top terminals (STT) is severely limited due to the need to withstand pirate attacks and hardware tampering. Despite this, we would like to allow the vendor to offer different packages of programs to the users. A user who buys a package should be able to view every program belonging to that package, but nothing else. A flexible scheme should allow for packages of various sizes to be offered, from a single program up to all the programs.

We suggest a novel scheme to manage the encryption keys for these applications. The scheme is highly flexible, yet requires very few keys to be stored in the STTs' secure memory. The computational power required of the STTs is very low. The security of this scheme is as good as or better than that offered by current technology. %A Jansen, Klaus %T A Characterization for Parity Graphs and a Coloring Problem with Costs %D January 29, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-07.ps.gz %X

In this paper, we give a characterization for parity graphs. A graph is a parity graph, if and only if for every pair of vertices all minimal chains joining them have the same parity. We prove that $G$ is a parity graph, if and only if the cartesian product $G \times K_2$ is a perfect graph.
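
The definition above can be checked by brute force on small graphs. The sketch below assumes that a "minimal chain" means a chordless (induced) path, which is our reading and not something the abstract spells out; the function names and the adjacency-dictionary representation are also ours. It runs in exponential time and is meant only for experimenting with tiny examples.

```python
from itertools import combinations


def induced_path_parities(adj, s, t):
    """Parities (number of edges mod 2) of all chordless paths from s to t."""
    parities = set()

    def extend(path):
        last = path[-1]
        for w in adj[last]:
            if w in path:
                continue
            # w may touch only the current endpoint, otherwise the path gets a chord.
            if any(w in adj[u] for u in path[:-1]):
                continue
            if w == t:
                parities.add(len(path) % 2)  # the finished path has len(path) edges
            else:
                extend(path + [w])

    extend([s])
    return parities


def is_parity_graph(adj):
    """True if, for every vertex pair, all chordless paths share one parity."""
    return all(len(induced_path_parities(adj, s, t)) <= 1
               for s, t in combinations(sorted(adj), 2))


def cycle(n):
    """Adjacency dictionary of the cycle on vertices 0..n-1."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}


print(is_parity_graph(cycle(4)), is_parity_graph(cycle(5)))
```

Under this reading, the 4-cycle passes and the 5-cycle fails, since two vertices at distance 2 on the 5-cycle are joined by chordless paths of lengths 2 and 3.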

Furthermore, as a consequence we get a result for the polyhedron corresponding to an integer linear program formulation of a coloring problem with costs. For the case that the costs $k_{v,3} = k_{v,c}$ for each color $c \ge 3$ and vertex $v \in V$, we show that the polyhedron contains only integral $0 / 1$ extrema if and only if the graph $G$ is a parity graph. %A Jansen, Klaus %T The Mutual Exclusion Scheduling Problem for Permutation and Comparability Graphs %D January 29, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-08.ps.gz %X In this paper, we consider the mutual exclusion scheduling problem for comparability graphs. Given an undirected graph $G$ and a fixed constant $m$, the problem is to find a minimum coloring of $G$ such that each color is used at most $m$ times. The complexity of this problem for comparability graphs was mentioned as an open problem by M\"ohring (1985) and for permutation graphs (a subclass of comparability graphs) as an open problem by Lonc (1991). We prove that this problem is already NP-complete for permutation graphs and for each fixed constant $m \ge 6$. %A Ekin, Oya %A Hammer, Peter L. %A Kogan, Alexander %T Convexity and Logical Analysis of Data %D February 2, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-09.ps.gz %X A Boolean function is called $k$-convex if for any pair $x,y$ of its true points at Hamming distance at most $k$, every point ``between'' $x$ and $y$ is also true. Given a set of true points and a set of false points, the central question of Logical Analysis of Data is the study of those Boolean functions whose values agree with those of the given points. In this paper we examine data sets which admit $k$-convex Boolean extensions. 
We provide polynomial algorithms for finding a $k$-convex extension, if any, and for finding the maximum $k$ for which a $k$-convex extension exists. We study the problem of uniqueness, and provide a polynomial algorithm for checking whether all $k$-convex extensions agree in a point outside the given data set. We estimate the number of $k$-convex Boolean functions, and show that for small $k$ this number is doubly exponential. On the other hand, we also show that for large $k$ the class of $k$-convex Boolean functions is PAC-learnable. %A Goldwasser, Michael H. %T Patience is a Virtue: The Effect of Delay on Competitiveness for Admission Control %D February 4, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-10.ps.gz %X

We consider the problem of scheduling a single resource non-preemptively in order to maximize its utilization. The delay of a job is equal to the gap between its arrival time and the last possible time at which it may be started while still meeting its deadline. We introduce an additional restriction that each job must be willing to accept a delay proportional to its job length. That is, we assume there is some constant w, such that any job of length |J| allows a delay of at least w|J|. This restriction is quite natural for admission control, as it seems reasonable that a job requesting 5 minutes of a resource might be willing to wait at least 10 seconds before beginning if necessary, whereas a job requesting 5 hours of time should be willing to handle a wait of 10 minutes instead.
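
The delay definition and the proportional-delay restriction above can be written out directly. This is a small sketch with names of our own choosing (`delay`, `is_patient`); times are in seconds and the two jobs are the illustrative ones from the paragraph, with arrival times set to 0 for convenience.

```python
from fractions import Fraction


def delay(arrival, deadline, length):
    """Gap between the arrival time and the latest start that still meets the deadline."""
    return (deadline - length) - arrival


def is_patient(arrival, deadline, length, w):
    """True if the job tolerates a delay of at least w times its length."""
    return delay(arrival, deadline, length) >= w * length


# A 5-minute job willing to wait 10 seconds, and a 5-hour job willing to wait 10 minutes.
short_job = dict(arrival=0, deadline=300 + 10, length=300)
long_job = dict(arrival=0, deadline=18000 + 600, length=18000)

for job in (short_job, long_job):
    print(Fraction(delay(**job), job["length"]))
```

Both example jobs have a delay-to-length ratio of 1/30, so they satisfy the restriction for any w up to 1/30.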

Our results show that this additional requirement has dramatic effects on the competitiveness of online algorithms. Without this restriction, previous lower bounds show that no algorithm, deterministic or randomized, can achieve any bounded competitiveness for the problem when arbitrary job lengths are allowed. We show that for any w>0 a simple greedy algorithm is (2 + 1/w)-competitive, and we give lower bounds showing that this is the best possible result for a deterministic algorithm, even when all jobs have one of three distinct lengths. In the special case where all jobs have the same length, previous results give a tight bound of 2 on the competitiveness for deterministic algorithms without a minimum delay. We generalize these results, showing that the competitiveness for any w>0 is exactly 1 + 1/(floor(w)+1). We also give tight bounds for the case where jobs have one of two distinct lengths. %A Abrahams, Julia %T Nonexhaustive Generalized Fibonacci Trees in Unequal Costs Coding Problems %D February 13, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-11.ps.gz %X In a sequence of generalized Fibonacci trees, the k-th tree has the (k-c(i))-th tree as its i-th subtree for a nondecreasing sequence of positive integers c(i), i=1,...,r. For particular initializations, each tree in the generalized Fibonacci sequence solves a minimax coding problem related to Varn coding. Specifically, each symbol from a uniformly distributed source is to be encoded by a string of code symbols associated with the path through the tree from the root to the leaf associated with the source symbol, the i-th code symbol costs c(i), and the goal is to minimize maximum codeword cost. 
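
The recursive definition of generalized Fibonacci trees in the abstract above gives a recurrence for the number of leaves: the k-th tree has as many leaves as its subtrees combined. The sketch below assumes one particular initialization, namely that every tree with index k <= 0 is a single leaf; the abstract only says "particular initializations", so this choice and the name `leaf_counts` are ours.

```python
from functools import lru_cache


def leaf_counts(costs, k_max):
    """Leaf counts of trees 1..k_max when the i-th subtree of tree k is tree k - c(i)."""
    costs = tuple(costs)

    @lru_cache(maxsize=None)
    def leaves(k):
        if k <= 0:          # assumed initialization: a single leaf
            return 1
        return sum(leaves(k - c) for c in costs)

    return [leaves(k) for k in range(1, k_max + 1)]


print(leaf_counts((1, 2), 5))   # costs c(1)=1, c(2)=2
print(leaf_counts((1, 1), 3))   # equal costs
```

With costs (1, 2) the counts follow the ordinary Fibonacci recurrence, and with equal costs (1, 1) they double at each step, as for complete binary trees.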
%A Sheng, Li %A Wang, Chi %A Zhang, Peisen %T Tagged Probe Interval Graphs %D February 13, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-12.ps.gz %X A generalization of interval graphs is introduced for cosmid contig mapping of DNA. A graph is a tagged probe interval graph if its vertex set can be partitioned into two subsets of probes and nonprobes, and a closed interval can be assigned to each vertex such that two vertices are adjacent if and only if at least one of them is a probe and one end of its corresponding interval is contained in the interval corresponding to the other vertex. We show that tagged probe interval graphs are weakly triangulated graphs, hence are perfect graphs. For a tagged probe interval graph with a given partition, we give a chordal completion that is consistent with any tagged interval completions with respect to the same vertex partition. Forbidden induced subgraph lists are given for trees with or without a given vertex partition. A heuristic that constructs map candidates is given for cosmid contig mapping. %A Thomas, Simon %A Velickovic, Boban %T On the Complexity of the Isomorphism Relation for Finitely Generated Groups %D February 18, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-13.ps.gz %X Working within the framework of descriptive set theory, we show that the isomorphism relation for finitely generated groups is a universal essentially countable Borel equivalence relation. We also prove the corresponding result for the conjugacy relation for subgroups of the free group on two generators. The proofs are group-theoretic, and we refer to descriptive set theory only for the relevant definitions and for motivation for the results. 
%A Awerbuch, Baruch %A Shavitt, Yuval %T Topology Aggregation for Directed Graph %D February 23, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-14.ps.gz %X This paper addresses the problem of aggregating the topology of a sub-network in a compact way with minimum distortion. The problem arises from networks that have a hierarchical structure, where each sub-network must advertise the cost of routing between each pair of its border nodes. The straight-forward solution of advertising the exact cost for each pair has a quadratic cost which is not practical. We look at the realistic scenario of networks where all links are bidirectional, but their cost (or distance) in the opposite directions might differ significantly.

The paper presents a solution with distortion that is bounded by the logarithm of the number of border nodes and the square root of the asymmetry in the cost of a link. This is the first time that a theoretical bound is given to an undirected graph. We show how to apply our solution to the ATM PNNI standard. %A Muchnik, Ilya %A Mottl, Vadim %T Bellman Functions on Trees for Segmentation, Generalized Smoothing, Matching Multi-Alignment in Massive Data Sets %D February 25, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-15 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-15.ps.gz %X A massive data set is considered as a set of experimentally acquired values of a number of variables each of which is associated with the respective node of an undirected adjacency graph that presets the fixed structure of the data set. The class of data analysis problems under consideration is outlined by the assumption that the ultimate aim of processing can be represented as a transformation of the original data array into a secondary array of the same structure but with node variables of, generally speaking, different nature, i.e. different ranges. Such a generalized problem is set as the formal problem of optimization (minimization or maximization) of a real-valued objective function of all the node variables. The objective function is assumed to consist of additive constituents of one or two arguments, respectively, node and edge functions. The former of them carry the data-dependent information on the sought-for values of the secondary variables, whereas the latter ones are meant to express the a priori model constraints. 
For the case when the graph of the pair-wise adjacency of the data set elements has the form of a tree, an effective global optimization procedure is proposed which is based on a recurrent decomposition of the initial optimization problem over all the node variables into a succession of partial problems each of which consists in optimization of an intervening function of only one variable like Bellman functions in the classical dynamic programming. Two kinds of numerical realization of the basic optimization procedure are considered on the basis of parametric representation of the Bellman functions, respectively, for discretely defined and quadratic node and edge functions. The proposed theoretical approach to the analysis of massive data sets is illustrated with its applications to the problems of segmentation, smoothing, fine texture analysis and matching of visual images and geophysical explorative data, as well as to the problem of multi-alignment of long molecular sequences. %A Awerbuch, Baruch %A Du, Yi %A Khan, Bilal %A Shavitt, Yuval %T Routing Through Networks with Hierarchical Topology Aggregation %D March 5, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-16.ps.gz %X In the future, global networks will consist of a hierarchy of subnetworks called domains. For reasons of both scalability and security, domains will not reveal details of their internal structure to outside nodes. Instead, these domains will advertise only a summary, or aggregated view, of their internal structure, e.g., as proposed by the ATM PNNI standard.

This work compares, by simulation, the performance of several different aggregation schemes in terms of network throughput (the fraction of attempted connections that are realized) and network control load (the average number of crankbacks per realized connection).

Our main results are: Minimum spanning tree is a good aggregation scheme; Exponential link cost functions perform better than min-hop routing; Our suggested logarithmic update scheme that determines when re-aggregation should be computed can significantly reduce the computational overhead due to re-aggregation with a negligible decrease in performance. %A Feder, Tomas %A Shende, Sunil M. %T Online Channel Allocation in FDMA Networks with Reuse Constraints %D March 16, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-17 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-17.ps.gz %X The topology of an FDMA network is modeled by a subdivision of a planar region into hexagonal cells; this defines a graph where every node has at most six neighbors. For channel allocation, every node has a weight that indicates the number of channels that must be assigned to it; the reuse distance defines the minimum distance between nodes that can use the same channel without radio interference. It is known that with reuse distance 2, the number of channels needed and their allocation can be approximated within a factor of $4/3$ both offline and online. We describe an online algorithm for the case of reuse distance 3 that achieves an approximation factor of $7/3$. %A Ibaraki, Toshihide %A Kogan, Alexander %A Makino, Kazuhisa %T Functional Dependencies in Horn Theories %D March 12, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-18.ps.gz %X This paper studies functional dependencies in Horn theories, both when the theory is represented by its clausal form and when it is defined as the Horn envelope of a set of models. 
We provide polynomial algorithms for the recognition of whether a given functional dependency holds in a given Horn theory, as well as polynomial algorithms for the generation of some representative sets of functional dependencies that hold in a given Horn theory. We show that some functional dependencies inference problems are computationally difficult. We also study the structure of functional dependencies that hold in a Horn theory, show that every such functional dependency is in fact a single positive term Boolean function, and prove that for any Horn theory the set of its minimal functional dependencies is quasi-acyclic. Finally, we consider the problem of condensing a Horn theory, prove that any Horn theory has a unique condensation, and develop an efficient polynomial algorithm for condensing Horn theories. %A Abrego, Bernardo M. %A Fernandez-Merchant, Silvia %T On the Number of Equilateral Triangles in Euclidean Space I %D April 1, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-19.ps.gz %X The following problem was posed by Erd\H{o}s and Purdy: ``What is the maximum number of equilateral triangles determined by a set of $n$ points in ${\Bbb R}^{d}$?'' New bounds for this problem are obtained for dimensions 2, 4 and 5. In addition it is shown that for $d=2$ the maximum is attained by subsets of the regular triangle lattice. 
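
Since the abstract above singles out subsets of the regular triangle lattice for $d=2$, equilateral triangles in such subsets can be counted exactly with integer arithmetic. The sketch below is a brute-force counter, not a method from the report; the coordinate convention (a point $(i,j)$ stands for $i\,a + j\,b$ with unit vectors $a, b$ at 60 degrees) and the function names are our assumptions.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared distance between lattice points i*a + j*b, |a| = |b| = 1, angle 60 degrees."""
    di, dj = p[0] - q[0], p[1] - q[1]
    return di * di + di * dj + dj * dj


def count_equilateral(points):
    """Number of point triples whose three pairwise distances are equal."""
    return sum(
        1
        for p, q, r in combinations(points, 3)
        if sq_dist(p, q) == sq_dist(q, r) == sq_dist(p, r)
    )


# A lattice point together with its six nearest neighbours.
hexagon = [(0, 0), (1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
print(count_equilateral(hexagon))
```

For the seven-point hexagon the counter finds six unit triangles through the centre and two larger triangles on alternating outer points. The cubic running time limits this to small point sets.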
%A Saniee, Iraj %A Bienstock, Daniel %T ATM Network Design: Traffic Models and Optimization-Based Heuristics %D April 3, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-20.ps.gz %X We consider the design and capacity expansion of ATM networks as an optimization problem in which flows representing end-to-end variable bit-rate services of different classes are to be multiplexed and routed over ATM trunks and switches so as to minimize the costs of additional switches and transport pipes while meeting service quality and survivability constraints. After discussing the underlying fractional Brownian motion model for aggregate flows, a non-linear multicommodity optimization problem is formulated and heuristics for its approximate solutions are described. Finally, computational results are produced that demonstrate realistic size problems can be solved with the proposed method to shed light on key economic characteristics of ATM traffic, such as safe levels of statistical multiplexing, as well as robust and efficient design alternatives. %A Abbasi, Sarmad %T How tight is the Bollobas-Komlos Conjecture? %D May 6, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-21.ps.gz %X

The bipartite case of the Bollob\'as and Koml\'os conjecture states that for every $\Delta_0, \gamma >0$ there is an $\alpha = \alpha(\Delta_0, \gamma) >0$ such that the following statement holds:

If $G$ is any graph with minimum degree at least

$${ n \over 2} + \gamma n,$$

then $G$ contains as subgraphs all $n$ vertex bipartite graphs, $H,$ satisfying

$$ \Delta(H) \leq \Delta_0 \mbox{ and } b(H) \leq \alpha n.$$

Here $b(H),$ the bandwidth of $H$, is the smallest $b$ such that the vertices of $H$ can be ordered as $v_1, \ldots, v_n$ such that $ v_i \sim_H v_j$ implies $|i-j| \leq b$.
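
The bandwidth definition above can be made concrete with a brute-force computation over all vertex orderings. This sketch is only for very small graphs (it tries every permutation), and the function names are ours.

```python
from itertools import permutations


def ordering_bandwidth(edges, order):
    """Largest |i - j| over edges v_i ~ v_j for the given vertex ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u, v in edges), default=0)


def bandwidth(vertices, edges):
    """Smallest ordering bandwidth over all orderings of the vertices."""
    return min(ordering_bandwidth(edges, order) for order in permutations(vertices))


path = [(0, 1), (1, 2), (2, 3)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3)]
for edges in (path, four_cycle, star):
    print(bandwidth(range(4), edges))
```

A path on four vertices has bandwidth 1, while the 4-cycle and the star with three leaves both have bandwidth 2.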

This conjecture has been proved in [1]. In this note we show that this conjecture is tight in the sense that as $\gamma \rightarrow 0$ then $\alpha \rightarrow 0$. More precisely, we show that for any $\gamma \leq { 1 \over 100}$ there is a $\Delta_0$ such that $\alpha(\Delta_0, \gamma) \leq 4 \gamma $. %A Toth, Geza %A Valtr, Pavel %T Geometric Graphs with Few Disjoint Edges %D May 7, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-22.ps.gz %X

A *geometric graph* is a graph drawn in the plane so that the vertices are represented by points in general position and the edges are represented by straight line segments connecting the corresponding points.

Improving a result of Pach and T\"or\H ocsik, we show that a geometric graph on $n$ vertices with no $k+1$ pairwise disjoint edges has at most $k^3(n+1)$ edges. On the other hand, we construct geometric graphs with $n$ vertices and approximately ${3\over 2}(k-1)n$ edges, containing no $k+1$ pairwise disjoint edges.

We also improve both the lower and upper bounds of Goddard, Katchalski and Kleitman on the maximum number of edges in a geometric graph with no four pairwise disjoint edges. %A Allender, Eric %A Loui, Michael C. %A Regan, Kenneth W. %T Complexity Classes %D May 11, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-23 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-23.ps.gz %X This material was written for Chapter 27 of the CRC Handbook of Algorithms and Theory of Computation, edited by Mikhail Atallah. %A Allender, Eric %A Loui, Michael C. %A Regan, Kenneth W. %T Reducibility and Completeness %D May 11, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-24 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-24.ps.gz %X This material was written for Chapter 28 of the CRC Handbook of Algorithms and Theory of Computation, edited by Mikhail Atallah. %A Allender, Eric %A Loui, Michael C. %A Regan, Kenneth W. %T Other Complexity Classes and Measures %D May 11, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-25.ps.gz %X This material was written for Chapter 29 of the CRC Handbook of Algorithms and Theory of Computation, edited by Mikhail Atallah. %A Pach, Janos %A Toth, Geza %T Erdos-Szekeres-Type Theorems for Segments and Non-crossing Convex Sets %D May 11, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-26.ps.gz %X A family $\cal F$ of convex sets is said to be in convex position, if none of its members is contained in the convex hull of the others. It is proved that there is a function $N(n)$ with the following property. 
If $\cal F$ is a family of at least $N(n)$ plane convex sets with non-empty interiors, such that any two members of $\cal F$ have at most two boundary points in common and any three are in convex position, then $\cal F$ has $n$ members in convex position. This result generalizes a theorem of T. Bisztriczky and G. Fejes Tóth \cite{BF1}. The statement does not remain true, if two members of $\cal F$ may share four boundary points. This follows from the fact that there exist infinitely many straight-line segments such that any three are in convex position, but no four are. However, there is a function $M(n)$ such that every family of at least $M(n)$ segments, any four of which are in convex position, has $n$ members in convex position. %A Toth, Geza %T Finding Convex Sets in Convex Position %D May 22, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-27.ps.gz %X Let ${\cal F}$ denote a family of pairwise disjoint convex sets in the plane. ${\cal F}$ is said to be in {\em convex position}, if none of its members is contained in the convex hull of the union of the others. For any fixed $k\geq 5$, we give a linear upper bound on $P_k(n)$, the maximum size of a family ${\cal F}$ with the property that any $k$ members of ${\cal F}$ are in convex position, but no $n$ are. %A Galitsky, Boris %T A Formal Scenario and Metalanguage Support Means to Reason about It %D June 3, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-28.ps.gz %X

We build the reasoning tool of metalanguage support (MS) with the embedded mechanism of the invention of predicates and metapredicates for the comprehension and synthesis tasks. This tool, based on metalanguage representation with weakened soundness, gains the inference flexibility to represent the formal scenarios (for example, a limited class of "logical" anecdotes, derived from NL). We introduce the concept of formal scenario as an alternative to the traditional axiomatic method (logical program) for the practical applications and develop specific MS means to perform the reasoning within a formal scenario. The anecdote scenario is an appealing object of study in the logical programming because it reflects the top level of human intellectual activity on one hand and is rather compact on the other.

We present such applications as scenario comprehension and generation, genetic algorithm of MS formula modifications, NL aspects of scenarios and modeling of the psychological disorder with corrupted pretending and believing concepts (autism). Such MS capabilities for scenario representation as reasoning about action, change and belief, temporal and spatial reasoning, are illustrated.

MS is based on the rules of metalanguage transfer (inconsistency shift, variability shift and assumption enforcement), which connect the object level with the metalevel and introduce a number of ways to construct the scenario explanation predicates. %A Pach, Janos %A Toth, Geza %T Which Crossing Number Is It Anyway? %D June 11, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-29.ps.gz %X A drawing of a graph $G$ is a mapping which assigns to each vertex a point of the plane and to each edge a simple continuous arc connecting the corresponding two points. The crossing number of $G$ is the minimum number of crossing points in any drawing of $G$. We define two new parameters, as follows. The pairwise crossing number (resp. the odd-crossing number) of $G$ is the minimum number of pairs of edges that cross (resp. cross an odd number of times) over all drawings of $G$. We prove that the determination of each of these parameters is an NP-complete problem. We also prove that the largest of these numbers (the crossing number) cannot exceed twice the square of the smallest (the odd-crossing number). Our proof is based on the following generalization of an old result of Hanani, which is of independent interest. Let $G$ be a graph and let $E_0$ be a subset of its edges such that there is a drawing of $G$, in which every edge belonging $E_0$ crosses any other edge an even number of times. Then $G$ can be redrawn so that the elements of $E_0$ are not involved in any crossing. %A Korupolu, Madhukar R. %A Plaxton, C. Greg %A Rajaraman, Rajmohan %T Analysis of a Local Search Heuristic for Facility Location Problems %D June 24, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-30.ps.gz %X In this paper, we study approximation algorithms for several NP-hard facility location problems. 
We prove that a simple local search heuristic yields polynomial-time constant-factor approximation bounds for the metric versions of the uncapacitated k-median problem and the uncapacitated facility location problem. (For the k-median problem, our algorithms require a constant-factor blowup in the parameter k.) This local search heuristic was first proposed several decades ago, and has been shown to exhibit good practical performance in empirical studies. We also extend the above results to obtain constant-factor approximation bounds for the metric versions of capacitated k-median and facility location problems. %A Krivelevich, Michael %A Sudakov, Benny %T Approximate Coloring of Uniform Hypergraphs %D July 9, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-31.ps.gz %X We consider an algorithmic problem of coloring r-uniform hypergraphs. The problem of finding the exact value of the chromatic number of a hypergraph is known to be NP-hard, so we discuss approximate solutions to it. Using a simple construction and known results on hardness of graph coloring, we show that for any r >= 3 it is impossible to approximate in polynomial time the chromatic number of r-uniform hypergraphs on n vertices within a factor n^{1-epsilon} for any epsilon>0, unless NP is a subset of ZPP. On the positive side, we present an approximation algorithm for coloring r-uniform hypergraphs on n vertices, whose performance ratio is O(n(\log\log n)^2/(\log n)^2). We also describe an algorithm for coloring 3-uniform 2-colorable hypergraphs on n vertices in O(n^{9/41}) colors, thus improving previous results of Chen and Frieze and of Kelsen, Mahajan and Ramesh. 
%A Alon, Noga %A Krivelevich, Michael %A Sudakov, Benny %T Coloring Graphs with Sparse Neighborhoods %D July 9, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-32.ps.gz %X It is shown that the chromatic number of any graph with maximum degree d in which the number of edges in the induced subgraph on the set of all neighbors of any vertex does not exceed d^2/f is at most O( d/ \log f). This is tight (up to a constant factor) for all admissible values of d and f. %A Hegyvari, Norbert %T Thin Complete Subsequence %D July 13, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-33.ps.gz %X

$A$ is said to be {\em complete} if every sufficiently large integer belongs to the sumset of $A$. $A'$ is a {\em thin complete subsequence} of $A$ if $A'$ is complete and $A'(x)=(1+o(1))\log_2x$.

It is proved that $\lim_{n\to \infty}a_{n+1}/a_n=1$ implies the existence of thin complete subsequence. %A Hegyvari, Norbert %T On the Dimension of the Hilbert-Cubes %D July 13, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-34.ps.gz %X Let $A$ be a sequence of positive integers with positive density. Then $A\cap\{1, 2, \ldots , n\}$ contains a Hilbert (or combinatorial) cube of dimension $c\log\log n$. We prove that this bound can not be replaced by $c'\sqrt{\log n\log\log n}$. %A Abbasi, Sarmad %T Do Answers Help in Posing Questions? %D July 14, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-35.ps.gz %X Let $T_n$ denote a complete binary tree of depth $n$. Each internal node, $v$, of $T_n$ has two children denoted by $\lf(v)$ and $\rt(v)$. Consider the following game between two Players, Paul and Carole. For each internal node, $v$, Carole chooses $X(v) \in \{ \lf(v) , \rt(v) \}$. This naturally defines a path from the root, $\lambda$, of $T_n$ to one of its leaves as follows: $$\lambda, X(\lambda), X^2(\lambda ), \ldots, X^n(\lambda).$$ Paul has to find this path by asking questions of the following form:

\smallskip \centerline{$Q_v$:``Is $X(v) = \lf(v)$?''} \smallskip

\noindent The game proceeds in $r$ rounds. In every round Paul can ask
$k$ questions. Carole supplies the answers to these $k$ questions
in parallel.
We give necessary and sufficient conditions for Paul to win this game.
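
To make the game concrete, here is a minimal Python sketch; it is not taken from the report. It assumes that nodes of $T_n$ are encoded as strings over L/R with the root as the empty string, and the names `carole_path` and `paul_sequential` are hypothetical. It covers only the trivial strategy with $k=1$ and $r=n$, in which Paul asks $Q_v$ at the current node and follows the answer. The report's conditions for general $r$ and $k$ are not reproduced here.

```python
def carole_path(n, choose):
    """Return the path lambda, X(lambda), ..., X^n(lambda) in T_n.

    Nodes are encoded as strings over {'L', 'R'}; the root is ''.
    choose(v) returns 'L' or 'R', i.e. Carole's choice X(v).
    """
    v = ""
    path = [v]
    for _ in range(n):
        v += choose(v)
        path.append(v)
    return path


def paul_sequential(n, answer):
    """Trivial strategy for k = 1: one question Q_v per round.

    answer(v) is Carole's reply to "Is X(v) = lf(v)?".
    Returns the leaf reached and the number of rounds used.
    """
    v = ""
    rounds = 0
    for _ in range(n):
        v += "L" if answer(v) else "R"
        rounds += 1
    return v, rounds


# Hypothetical choice function for Carole: go left at even depth.
choose = lambda v: "L" if len(v) % 2 == 0 else "R"
leaf, rounds = paul_sequential(3, lambda v: choose(v) == "L")
```

At the other extreme, a single round suffices if Paul may ask about every internal node at once, which takes $k = 2^n - 1$ questions.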
%A Liberatore, Vincenzo
%T Broadcast Disk Paging with a Small Cache
%D July 16, 1998
%Z Sat, 19 Sep 1998 20:00:00 GMT
%I DIMACS
%R 98-36
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-36.ps.gz
%X We consider the broadcast disk paging problem when the cache size is 2 and n pages are transmitted. We show an algorithm
that is (n - 2 / (n - 2))-competitive and prove that no better competitive ratio is possible.
%A Kalantari, Bahman
%T Scaling Dualities and Self-Concordant Homogeneous Programming in Finite Dimensional Spaces
%D July 20, 1998
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 98-37
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-37.ps.gz
%X In this paper we first prove four fundamental theorems of the
alternative, called scaling dualities, characterizing exact and
approximate solvability of four significant conic problems in finite
dimensional spaces, defined as: homogeneous programming (HP), scaling
problem (SP), homogeneous scaling problem (HSP), and algebraic scaling
problem (ASP), the problem of testing the solvability of the scaling
equation (SE), a fundamental equation inherited from properties of
homogeneity. These four problems together with the scaling dualities
offer a new point of view into the theory and practice of convex and
nonconvex programming. Nontrivial special cases over the nonnegative
orthant include: testing if the convex-hull of a set of points
contains the origin (equivalently, testing the solvability of
Karmarkar's canonical LP), computing the minimum of the
arithmetic-geometric mean ratio over a subspace, testing the
solvability of the diagonal matrix scaling equation (DADe=e), as well
as NP-complete problems. The scaling dualities closely relate these
seemingly unrelated problems. Via known conic LP dualities, convex
programs can be formulated as HP. Scaling dualities go one step
further, allowing us to view HP as a problem dual to the corresponding
SP, HSP, or ASP. This duality is in the sense that HP is solvable if
and only if the other three are not. Our scaling dualities give
nontrivial generalizations of the arithmetic-geometric mean,
trace-determinant, and Hadamard inequalities; matrix scaling theorems;
and the classic duality of Gordan. We describe potential-reduction
and path-following algorithms for these four problems, resulting in
novel and conceptually simple polynomial-time algorithms for linear,
quadratic, semidefinite, and self-concordant programming.
Furthermore, the algorithms are more powerful than their existing
counterparts, since they also establish the polynomial-time
solvability of the corresponding SP, HSP, as well as many cases of
ASP. The scaling problems either have not been addressed in the
literature, or have been treated in very special cases. The
algorithms are based on the scaling dualities, significant bounds
obtained in this paper, properties of homogeneity, as well as Nesterov
and Nemirovskii's machinery of self-concordance. In particular, our
results extend the polynomial-time solvability of matrix scaling
equation (even in the presence of a subspace) to general cases of SE
over the nonnegative cone, or the semidefinite cone, or the
second-order cone.
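
As an illustration of the diagonal matrix scaling equation DADe=e mentioned above, the following Python sketch looks for a positive diagonal D numerically. It is not the potential-reduction or path-following algorithm of the report; it is a simple damped fixed-point iteration, and it assumes that A is symmetric with strictly positive entries. The names `scale_diagonal` and `residual` are hypothetical.

```python
import math


def scale_diagonal(A, iters=200):
    """Seek d > 0 with d_i * (A d)_i = 1 for all i, i.e. D A D e = e.

    Uses the damped fixed-point update d_i <- sqrt(d_i / (A d)_i).
    Assumes A is symmetric with strictly positive entries.
    """
    n = len(A)
    d = [1.0] * n
    for _ in range(iters):
        Ad = [sum(A[i][j] * d[j] for j in range(n)) for i in range(n)]
        d = [math.sqrt(d[i] / Ad[i]) for i in range(n)]
    return d


def residual(A, d):
    """Largest deviation of the row sums of D A D from 1."""
    n = len(A)
    return max(abs(d[i] * sum(A[i][j] * d[j] for j in range(n)) - 1.0)
               for i in range(n))


A = [[2.0, 1.0], [1.0, 3.0]]
d = scale_diagonal(A)
```

For the symmetric example A = [[2, 1], [1, 2]] the solution can be checked by hand: d_1 = d_2 = d with 3d^2 = 1.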
%A Kalantari, Bahman
%T On the Arithmetic-Geometric Mean Inequality and Its Relationship to Linear Programming, Matrix Scaling, and Gordan's Theorem
%D July 20, 1998
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 98-38
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-38.ps.gz
%X It is a classical inequality that the minimum of the ratio of the
(weighted) arithmetic mean to the geometric mean of a set of positive
variables is equal to one, and is attained at the center of the
positivity cone. While there are numerous proofs of this fundamental
homogeneous inequality, in the presence of an arbitrary subspace,
and/or the replacement of the arithmetic mean with an arbitrary linear
form, the new minimization is a nontrivial problem. We prove a
generalization of this inequality, also relating it to linear
programming, to the diagonal matrix scaling problem, as well as to
Gordan's theorem. Linear programming is equivalent to the search for
a nontrivial zero of a linear or positive semidefinite quadratic form
over the nonnegative points of a given subspace. The goal of this
paper is to present these intricate, surprising, and significant
relationships, called {\it scaling dualities}, via an elementary
proof, and to introduce two conceptually simple polynomial-time
algorithms that are based on the scaling dualities, significant
bounds, as well as Nesterov and Nemirovskii's machinery of
self-concordance. The algorithms are simultaneously applicable to
linear programming, to the computation of a separating hyperplane, to
the diagonal matrix scaling problem, and to the minimization of the
arithmetic-geometric mean ratio over the positive points of an
arbitrary subspace. The scaling dualities, the bounds, and the
algorithms are special cases of a much more general theory on convex
programming, developed by the author. For instance, via the scaling
dualities semidefinite programming is a problem dual to a
generalization of the classical trace-determinant ratio minimization
over the positive definite points of a given subspace of the Hilbert
space of symmetric matrices.
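
The classical inequality in the first sentence can be checked numerically. The sketch below is only an illustration, with the hypothetical helper name `am_gm_ratio`; it evaluates the weighted arithmetic-geometric mean ratio at the center of the positivity cone and at one other point. The subspace-constrained minimization studied in the report is not attempted.

```python
import math


def am_gm_ratio(x, w):
    """Weighted arithmetic mean divided by weighted geometric mean.

    x: positive variables; w: positive weights summing to 1.
    """
    am = sum(wi * xi for wi, xi in zip(w, x))
    gm = math.exp(sum(wi * math.log(xi) for wi, xi in zip(w, x)))
    return am / gm


w = [1 / 3, 1 / 3, 1 / 3]
center = am_gm_ratio([1.0, 1.0, 1.0], w)   # ratio at the center e
other = am_gm_ratio([1.0, 2.0, 4.0], w)    # (7/3) / 2
```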
%A Alexander K. Kelmans
%T Crossing Properties of Graph Reliability Functions
%D August 10, 1998
%Z Thu, 19 Nov 1998 20:00:00 GMT
%I DIMACS
%R 98-39
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-39.ps.gz
%X We consider undirected graphs, and assume that each edge of $G$ exists
with probability $p \in (0,1)$.
The *all--terminal reliability function* of such a random graph $G$ is the
probability that the spanning subgraph formed by the existing edges is connected.
A *two--graph* is a graph
with two distinguished vertices called *terminals*.
The *two--terminal reliability function* of a two--graph $G$ is the
probability that the subgraph of $G$ induced by the existing edges contains
a path connecting the terminals.
Until recently, no
examples of pairs of graphs were known whose all--terminal or two--terminal
reliabilities cross more than twice, and only one example with exactly two
crossings was known. In [8] we proved that for every integer $N$ there exist
pairs of graphs (two--graphs) whose all--terminal (respectively, two--terminal)
reliabilities cross more than
$N$ times. In this paper we give simple constructions that provide,
for every set $\{(n_1,m_1), \ldots, (n_k,m_k)\}$ of ordered pairs of integers
(where all $m_i$ are distinct), a pair of graphs (two--graphs)
of the same size whose all--terminal (respectively, two--terminal) reliability
functions have exactly
$n_1 + \ldots + n_k$ crossings and exactly $n_i$ crossings
of multiplicity $m_i$ for every $i = 1, \ldots , k$.
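
The two reliability functions defined above can be evaluated by brute force on small graphs. The following Python sketch is an illustration and not part of the report; it sums over all edge subsets, so it is exponential in the number of edges, and the names `connects` and `reliability` are hypothetical.

```python
from itertools import product


def connects(n, edges, targets):
    """True if all vertices in `targets` lie in one component of (range(n), edges)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(t) for t in targets}) == 1


def reliability(n, edges, p, terminals=None):
    """Brute-force reliability: sum over all 2^m edge subsets.

    terminals=None gives the all-terminal reliability; a pair (s, t)
    gives the two-terminal reliability of the two-graph.
    """
    targets = range(n) if terminals is None else terminals
    total = 0.0
    for present in product([False, True], repeat=len(edges)):
        kept = [e for e, on in zip(edges, present) if on]
        if connects(n, kept, targets):
            k = len(kept)
            total += p ** k * (1 - p) ** (len(edges) - k)
    return total


triangle = [(0, 1), (1, 2), (0, 2)]
all_term = reliability(3, triangle, 0.5)          # 3p^2 - 2p^3 at p = 0.5
two_term = reliability(3, triangle, 0.5, (0, 1))  # p + (1 - p)p^2 at p = 0.5
```

Crossings of two reliability functions could then be located by evaluating both on a grid of values of $p$ and looking for sign changes of the difference.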
%A Muthukrishnan, S.
%A Rajaraman, R.
%T An Adversarial Model for Distributed Dynamic Load Balancing
%D August 10, 1998
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 98-40
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-40.ps.gz
%X

We study the problem of balancing the load on processors of an arbitrary network. If jobs arrive or depart during the process of load balancing, we have the dynamic load balancing problem; otherwise, we have the static load balancing problem. While static load balancing on arbitrary and special networks has been well studied, very little is known about dynamic load balancing. The difficulty lies in modeling the arrivals and departures of jobs in a clean manner.

In this paper, we initiate the study of dynamic load balancing by modeling job traffic using an adversary. Our main result is that a simple, local control distributed load balancing algorithm maintains the load of the network within a stable level against this powerful adversary. Our results hold for different models of traffic patterns and processor communication. %A Barg, Alexander %A Guritman, Sugi %A Simonis, Juriaan %T Strengthening the Varshamov-Gilbert Bound %D August 13, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-41 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-41.ps.gz %X The paper discusses some ways to strengthen (nonasymptotically) the Gilbert-Varshamov bound for linear codes. The unifying idea is to study a certain graph constructed on vectors of low weight in the cosets of the code, which we call the Varshamov graph. Various simple estimates of the number of its connected components account for better lower bounds on the minimum distance of codes, some of them known in the literature. %A Yu, Yang %T Note on the Permanent Rank of a Matrix %D September 1, 1998 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 98-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-42.ps.gz %X Define the perrank of a matrix $A$ to be the size of a largest square submatrix of $A$ with nonzero permanent. Motivated in part by the Alon-Jaeger-Tarsi Conjecture \cite{at}, we prove several results on perranks. %A Ahmed Belal %A Magdy A.Ahmed %A Shymaa M.Arafat %D September 13, 1998 %Z Wed, 25 Nov 1998 20:00:00 GMT %I DIMACS %R 98-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-43.ps.gz %X

Two-dimensional alphabetic trees have many applications in a
wide variety of fields. Despite the existence of a
relatively fast algorithm that finds an approximate 2-dimensional
optimal alphabetic tree (OAT), a dynamic programming approach must still
be used to determine the exact solution. To apply dynamic programming,
the OAT can be found by examining all the nodes
in the 2-d array of weights as possible roots for the optimal tree. In
this paper we introduce the concept of *goodness* for each cut to limit
the search for an optimal solution. The measure of goodness is a value we
call *the expense of the cut*; only cuts with expense less than a given
limit L
are considered *good cuts*. At every level of the tree only the good cuts
are tested as possible candidates for an optimal solution.

**Key Words :** alphabetic tree, optimal tree, dynamic programming, two
dimensional alphabetic trees, branch and bound.

*Key words and phrases:*
Distribution-free tests; Kolmogorov-Smirnov test;
Kuiper's test; Cram\'er-von Mises test;
Watson $U_n^{\,2}$ test.

Quasi-conformal mappings were first introduced by L. Ahlfors in 1935, in connection with open Riemann surfaces and value distribution theory, and then studied by O. Teichm\"uller in 1938. In 1956 A. Beurling and Ahlfors proved an extension theorem which states that a quasi-symmetric map $\phi:{\bold R\/} \to {\bold R\/}$ can be extended to a quasiconformal map of the upper half plane. In 1964 Ahlfors proved a similar theorem for quasi-symmetric mappings of ${\bold R\/}^2$. Then in 1979 P. Tukia and J. V\"ais\"al\"a proved quasiconformal extension to dimensions $\geq 5$. The case of extension from 3 to 4 dimensions is still unknown.

In this paper we prove three main quasi-conformal extension results. First we prove the Tukia-V\"ais\"al\"a theorem, which states that a quasiconformal map of upper half space ${\bold R\/}_+^{n+1} \to {\bold R\/}_+^{n+1}$ provided $n \geq 5$. The second result is a similar extension theorem for the strongly uniform domains introduced by Heinonen and Yang in 1993. Finally, we solve a problem posed by Pansu in 1989 regarding the existence of a quasiconformal extension theorem for complex hyperbolic spaces. We can in fact prove this result for manifolds of pinched negative curvature.

%A Oliver Attie %A Jonathan Block %T Poincaré Duality for LAbstract: We use stochastic population models to study the evolution of Ultraselfish Gene Complexes (USGC's). USGC's are chromosomal regions characterized by segregation distortion: a heterozygote bearing the USGC passes it to more than 50 of offspring. USGC-bearing homozygotes are sterile. USGC's promote themselves at the expense of other genes in the same genome. They have been observed in animal, plant and fungal species and may lie, undetected, in many others. While the molecular drive mechanisms differ, USGC's exhibit similar genetic features, suggesting that the fundamental evolutionary mechanisms that allow their emergence may be shared.

We present Markov models and Monte Carlo simulations of genetic drift in populations of USGC's. Genetic drift, the stochastic behavior of allele frequencies in small populations, is a fundamental force in the process of evolution, yet the role of genetic drift in the evolution of USGC's is not well understood. Our analysis shows how genetic drift causes USGC's to evolve different characteristics in species typified by small and large populations, respectively. We apply our results to two well studied USGC's: the $t$-haplotype in Mus musculus and Segregation Distorter in Drosophila melanogaster.

%A Shahrokhi, Farhad %A Ondrej Sykora %A Laszlo A. Szekely %A Imrich Vrto %T A New Lower Bound for the Bipartite Crossing Number with Applications %D November 19, 1998 %Z Thu, 19 Nov 1998 20:00:00 GMT %I DIMACS %R 98-52 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-52.ps.gz %X Let $G$ be a connected bipartite graph.% on $n$-vertices. We give a short proof, using a variation of Menger's Theorem, for a new lower bound which relates theIn addition to other support, Aravind Srinivasan's research was supported in part by NSF grant NSF-STC91-19999 to DIMACS and by support to DIMACS from the New Jersey Commission on Science and Technology. %A Carpenter, Tamra %A Cosares, Steven %A Saniee, Iraj %T Demand Routing and Slotting on Ring Networks %D January 28, 1997 %Z Thu, 13 Feb 1997 21:46:48 GMT %I DIMACS %R 97-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-02.ps.gz %X We describe an important class of problems that arise in the economic design of "survivable" networks. Such networks are capable of accommodating all of the traffic between pairs of locations, even if some arbitrary link or node in the network is rendered unusable. Cycles play an important role in the design of survivable networks because they represent two-connected subnetworks of minimal size. To cost-effectively utilize cycles, we must determine the minimum capacity required for the links in the cycle, subject to constraints on how traffic must be routed and how capacity must be utilized. Depending upon the situation being modeled, different versions of the problem arise. The least restrictive versions are solvable in polynomial time, while the more restrictive (and more realistic) versions are NP-hard. This paper focuses on variants of the problem in which time-slot assignment constraints are enforced to model the operation of the equipment placed at the nodes of a SONET ring. 
We present several simple heuristic methods for addressing the problem, and we show that they are 2-approximation algorithms. %A Todorcevic, Stevo %A Vaananen, Jouko %T Trees and Ehrenfeucht-Fraisse games %D February 5, 1997 %Z Wed, 5 Feb 1997 18:47:25 GMT %I DIMACS %R 97-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-03.ps.gz %X We study trees T of height at most omega-1 with no uncountable branches, and their applications in the study of pairs (A,B) of non-isomorphic structures over a fixed vocabulary. There is a natural quasi-ordering of such trees in terms of the existence of a strictly increasing mapping from one tree to another. We investigate in depth the structure of this quasi-ordering and relate its properties to properties of pairs (A,B) of structures. Many new constructions of pairs of highly equivalent non-isomorphic structures are given. %A Fang, Weiwu %T On a Global Optimization Problem in the Study of Information Discrepancy %D February 5, 1997 %Z Wed, 5 Feb 1997 18:50:20 GMT %I DIMACS %R 97-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-04.ps.gz %X A nonlinear function has been introduced for indexing the disagreement degree of a group of judgment matrices (in FW, 1994). It has many good properties and may be applied in decision making and information process. In this paper, we will discuss a global optimization problem concerned with the global maximum of this function which is constrained on some sets of matrices. Because the size of matrix groups in the problem is arbitrary and the number of local maximum solutions increases exponentially, numerical methods are not suitable and formalized results are desired for the problem. By an approach somewhat similar to the branch and bounded method, we have obtained some formulae on global maximums, a sufficient and necessary condition of the function's taking the maximums, and its some maximum solution sets. 
%A Lisitsa, Alexei %A Sazonov, Vladimir %T On linear ordering finitely-branching graphs and non-well-founded sets %D February 6, 1997 %Z Tue, 25 Feb 1997 19:16:50 GMT %I DIMACS %R 97-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-05.ps.gz %X Definability of a linear order in finite structures of some given class by logical means is an important issue of finite model theory and descriptive complexity theory. For example, first-order logic extended by least fixed point operator (FO+LFP) describes exactly PTIME-computability over finite linear ordered structures \cite{Imm82,Var82,Imm86}. If the structures considered are not linearly ordered in advance, but some order may be uniformly definable in FO+LFP then the same result holds for this class of structures, as well.

We will show in this paper that this is the case for the class of finite strongly extensional (SE) graphs (i.e. arbitrary graphs considered up to bisimulation equivalence relation). The vertices of such graphs serve as a faithful representation of hereditarily-finite non-well-founded sets HFA (with finite transitive closure). Actually, we define in FO+IFP a linear ordering on arbitrary SE finitely branching graphs which may be infinite and so represent a more general class of hereditarily-finite non-well-founded sets HFA^{\infty} (with infinite transitive closure).

Our interest in these questions arose from our work on characterizing the complexity classes of computable set-theoretic operations by some bounded set-theory (BST) languages \cite{Saz87,Saz93,Saz95,SaLi95, LiSa97}. So, definability in the language $\Delta_{D}$ considered in \cite{Saz95} over HFA-sets was characterised in terms of definability in FO+LFP over finite graphs of the abovementioned kind.

Now, having a definable linear ordering, we answer affirmatively the question in op. cit. on the coincidence of $\Delta_{D}$-definability with PTIME-computability over HFA.

We propose two different definitions of linear ordering, one of which is an application of the method of A.Dawar, S.Lindell and S.Weinstein to the case of SE graphs and give a comparison of both approaches in terms of coherence of the orders on different graphs.

Cf. also related papers in ftp://ftp.botik.ru/rented/logic/papers/ http://www.botik.ru/PSI/AIReC/logic/

-------------------

Both authors are from the Program Systems Institute of the Russian Acad. of Sci., Pereslavl-Zalessky, 152140, Russia.

e-mail: sazonov@logic.botik.ru, lisitsa@logic.botik.ru

phones: +7-08535-98945 and 98942, fax: +7-08535-20566.

Supported by RBRF (project 96-01-01717).

The work on this paper was started when the second author visited Princeton and Rutgers Universities (DIMACS) in 1996. %A Shelah, Saharon %T Erdos and Renyi Conjecture %D February 11, 1997 %Z Tue, 4 Mar 1997 16:06:41 GMT %I DIMACS %R 97-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-06.ps.gz %X Affirming a conjecture of P. Erdos and R\'enyi we prove that for any $c_1 > 0$ for some $c_2 > 0$, if a graph $G$ has no $c_1\ \ \log n$ nodes on which the graph is complete or edgeless (i.e. $G$ exemplifies $|G| \nrightarrow (c_1 \text{ log } n)^2_2$) \underbar{then} $G$ has at least $2^{c_2n}$ non-isomorphic (induced) subgraphs. %A Valtr, Pavel %T On geometric graphs with no k pairwise parallel edges %D March 6, 1997 %Z Fri, 15 Mar 1997 00:17:08 GMT %I DIMACS %R 97-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-07.ps.gz %X A {\em geometric graph\/} is a graph $G=(V,E)$ drawn in the plane so that the vertex set $V$ consists of points in general position and the edge set $E$ consists of straight line segments between points of $V$. Two edges of a geometric graph are said to be {\em parallel\/}, if they are opposite sides of a convex quadrilateral.
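
The definition of parallel edges can be tested with orientation predicates. The sketch below is an illustration rather than anything from the report, and the function names are hypothetical. It relies on the observation that two segments are opposite sides of a convex quadrilateral exactly when each one is an edge of the convex hull of the four endpoints, assuming the endpoints are distinct and in general position.

```python
def orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if counterclockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def are_parallel(a, b, c, d):
    """True if segments ab and cd are opposite sides of a convex quadrilateral.

    Assumes the four endpoints are distinct and in general position.
    Each segment must be an edge of the convex hull of {a, b, c, d},
    i.e. the other two points lie strictly on one side of its line.
    """
    return (orient(a, b, c) * orient(a, b, d) > 0
            and orient(c, d, a) * orient(c, d, b) > 0)
```

For example, the bottom and top sides of the unit square are parallel in this sense, while its two diagonals are not.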

In this paper we show that, for any fixed $k\ge3$, any geometric graph on $n$ vertices with no $k$ pairwise parallel edges contains at most $O(n)$ edges, and any geometric graph on $n$ vertices with no $k$ pairwise crossing edges contains at most $O(n\log n)$ edges. We also prove a conjecture of Kupitz that any geometric graph on $n$ vertices with no pair of parallel edges contains at most $2n-2$ edges. %A Derrida, B. %A Lebowitz, J. L. %A Speer, E. R. %T Shock Profiles for the Partially Asymmetric Simple Exclusion Process %D March 7, 1997 %Z Fri, 7 Mar 1997 19:47:32 GMT %R 97-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-08.ps.gz %X The asymmetric simple exclusion process (ASEP) on a one-dimensional lattice is a system of particles which jump at rates $p$ and $1-p$ (here $p>1/2$) to adjacent empty sites on their right and left respectively. The system is described on suitable macroscopic spatial and temporal scales by the inviscid Burgers' equation; the latter has shock solutions with a discontinuous jump from left density $\rho_-$ to right density $\rho_+$, $\rho_-<\rho_+$, which travel with velocity $(2p-1)(1-\rho_+-\rho_-)$. In the microscopic system we may track the shock position by introducing a second class particle, which is attracted to and travels with the shock. In this paper we obtain the time invariant measure for this shock solution in the ASEP, as seen from such a particle. The mean density at lattice site $n$, measured from this particle, approaches $\rho_{\pm}$ at an exponential rate as $n\to\pm\infty$, with a characteristic length which becomes independent of $p$ when $p/(1-p)>\sqrt{\rho_+(1-\rho_-)/\rho_-(1-\rho_+)}$. When $p/(1-p)=\rho_+(1-\rho_-)/\rho_-(1-\rho_+)$ the measure is Bernoulli, with density $\rho_-$ on the left and $\rho_+$ on the right. In the weakly asymmetric limit, $2p-1\to0$, the microscopic width of the shock diverges as $(2p-1)^{-1}$. 
The stationary measure is then essentially a superposition of Bernoulli measures, corresponding to a convolution of a space-dependent density profile described by the viscous Burgers equation with a well-defined distribution for the location of the second class particle. %A Nielaba, P. %A Lebowitz, J.L. %T Phase Transitions in the Multicomponent Widom--Rowlinson Model and in Hard Cubes on the BCC--Lattice %D March 14, 1997 %Z Fri, 14 Mar 1997 23:48:30 GMT %I DIMACS %R 97-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-09.ps.gz %X We use Monte Carlo techniques and analytical methods to study the phase diagram of the $M$--component Widom--Rowlinson model on the bcc--lattice: there are $M$ species all with the same fugacity $z$ and a nearest neighbor hard core exclusion between unlike particles. Simulations show that for $M \geq 3$ there is a ``crystal phase'' for $z$ lying between $z_c(M)$ and $z_d(M)$ while for $z > z_d(M)$ there are $M$ demixed phases each consisting mostly of one species. For $M=2$ there is a direct second order transition from the gas phase to the demixed phase while for $M \geq 3$ the transition at $z_d(M)$ appears to be first order putting it in the Potts model universality class. For $M$ large, Pirogov-Sinai theory gives $z_d(M) \sim M-2+2/(3M^2) + ... $. In the crystal phase the particles preferentially occupy one of the sublattices, independent of species, i.e.\ spatial symmetry but not particle symmetry is broken. For $M \to \infty$ this transition approaches that of the one component hard cube gas with fugacity $y = zM$. We find by direct simulations of such a system a transition at $y_c \simeq 0.71$ which is consistent with the simulation $z_c(M)$ for large $M$. This transition appears to be always of the Ising type. 
%A Malkhi, Dahlia %A Reiter, Michael %A Wool, Avishai %T Optimal Byzantine Quorum Systems %D March 17, 1997 %Z Wed, 9 Apr 1997 18:02:30 GMT %I DIMACS %R 97-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-10.ps.gz %X Replicated services accessed via quorums enable each access to be performed at only a subset (quorum) of the servers, and achieve consistency across accesses by requiring any two quorums to intersect. Recently, b-masking quorum systems, whose intersections contain at least 2b+1 servers, have been proposed to construct replicated services tolerant of b arbitrary (Byzantine) server failures. In this paper we consider a hybrid fault model allowing benign failures in addition to the Byzantine ones. We present four novel constructions for b-masking quorum systems, each of which has optimal load (the probability of access of the busiest server) or optimal availability (probability of some quorum surviving failures). To show optimality we also prove lower bounds on the load and availability of any b-masking quorum system in this model. %A Valtr, Pavel %T On the density of subgraphs in a graph with bounded independence number %D March 18, 1997 %Z Fri, 21 Mar 1997 16:40:05 GMT %I DIMACS %R 97-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-11.ps.gz %X Let $\sigma(n,m,k)$ be the largest number $\sigma\in[0,1]$ such that any graph on $n$ vertices with independence number at most $m$ has a subgraph on $k$ vertices with at least $\sigma\cdot{k\choose 2}$ edges. Up to a constant multiplicative factor, we determine $\sigma(n,m,k)$ for all $n,m,k$. For $\log n\le m=k\le n$, our result gives $\sigma(n,m,m)=\Theta({\log(n/m)\over m})$, which was conjectured by Alon. 
%A Chen, Hui %T Threshold of Broadcast in Random Graphs %D April 2, 1997 %Z Tue, 2 Apr 1997 00:10:03 GMT %R 97-12 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-12.ps.gz %X Broadcasting is the process of sending a piece of information that resides at one node of a graph to all the remaining nodes. At each time step, a node that knows the information can send it to one of its neighbors. The broadcast problem is to find the minimum number of time steps needed. The problem is NP-hard in general. For a random graph $G_{n,p}$, we are interested in the value of $p$ at which there exists a broadcast tree of depth exactly $\lceil \log_2 n \rceil$. Frieze and Molloy \cite{FriMo} show that $p$ is of magnitude $\Theta(\ln n/n)$. In this paper, we give the exact threshold of the edge probability for the existence of such a tree. %A Karolyi, Gyula %A Pach, Janos %A Toth, Geza %A Valtr, Pavel %T Ramsey-Type Results for Geometric Graphs. II %D April 4, 1997 %Z Mon, 7 Apr 1997 19:57:59 GMT %R 97-13 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-13.ps.gz %X We show that for any 2--coloring of the ${n \choose 2}$ segments determined by $n$ points in the plane, one of the color classes contains non-crossing cycles of lengths $3,4,\ldots,\lfloor\sqrt{n/2}\rfloor$. This result is tight up to a multiplicative constant. Under the same assumptions, we also prove that there is a non-crossing path of length $\Omega(n^{2/3})$, all of whose edges are of the same color. In the special case when the $n$ points are in convex position, we find longer monochromatic non-crossing paths, of length $\lceil\frac{n+1}{2}\rceil$. This bound cannot be improved. We also discuss some related problems and generalizations. In particular, we give sharp estimates for the largest number of disjoint monochromatic triangles that can always be selected from our segments. 
%A Khachiyan, Leonid %A Porkolab, Lorant %T Computing Integral Points in Convex Semi-algebraic Sets %D April 10, 1997 %Z Tue, 15 Apr 1997 19:46:12 GMT %I DIMACS %R 97-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-14.ps.gz %X Let $Y$ be a convex set in $R^k$ defined by polynomial inequalities and equations of degree $d \ge 2$ with integer coefficients of binary length $l$. We show that if $Y \cap Z^k \ne \emptyset$, then $Y$ contains an integral point of binary length $ld^{O(k^4)}$. For fixed $k$, our bound implies a polynomial-time algorithm for computing an integral point $y \in Y$. In particular, we extend Lenstra's theorem on the polynomial-time solvability of linear integer programming in fixed dimension to semidefinite integer programming. %A Reiter, Michael K. %A Rubin, Aviel D. %T Crowds: Anonymity for Web Transactions %D April 15, 1997 %Z Thu, 16 Jun 1997 19:20:23 GMT %I DIMACS %R 97-15 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-15new.ps.gz %X In this announcement we introduce a system called Crowds for protecting users' anonymity on the world-wide-web. Crowds, named for the notion of ``blending into a crowd'', operates by grouping users into a large and geographically diverse group (crowd) that collectively issues requests on behalf of its members. Web servers are unable to learn the true source of a request because it is equally likely to have originated from any member of the crowd, and indeed collaborating crowd members cannot distinguish the originator of a request from a member who is merely forwarding the request on behalf of another. Our security analysis introduces {\em degrees of anonymity} as an important tool for describing and proving anonymity properties. 
%A Muchnik, Ilya %A Mottl, Vadim %T Optimization algorithms for separable functions with tree-like adjacency of variables and their application to the analysis of massive data sets %D April 17, 1997 %Z Thu, 24 Apr 1997 23:45:56 GMT %I DIMACS %R 97-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-16.ps.gz %X A massive data set is considered as a set of experimentally acquired values of a number of variables each of which is associated with the respective node of an unoriented conjugacy graph that presets the fixed structure of the data set. The class of data analysis problems under consideration is outlined by the assumption that the ultimate aim of processing can be represented as a transformation of the original data array into a secondary array of the same structure but with node variables of, generally speaking, different nature, i.e. different ranges. Such a generalized problem is set as the formal problem of optimization (minimization or maximization) of a real-valued objective function of all the node variables. The objective function is assumed to consist of additive constituents of one or two arguments, respectively, node and edge functions. The former of them carry the data-dependent information on the sought-for values of the secondary variables, whereas the latter ones are meant to express the a priori model constraints. For the case when the graph of the pair-wise adjacency of the data set elements has the form of a tree, an effective global optimization procedure is proposed which is based on a recurrent decomposition of the initial optimization problem over all the node variables into a succession of partial problems each of which consists in optimization of a function of only one variable like Bellman functions in the classical dynamic programming. 
Two kinds of numerical realization of the basic optimization procedure are considered on the basis of parametrical representation of Bellman functions, respectively, for discretely defined and quadratic node and edge functions. The proposed theoretical approach to the analysis of massive data sets is illustrated with its applications to the problems of the analysis of long molecular sequences and large visual images. %A Erdos, Peter L. %A Steel, Michael A. %A Szekely, Laszlo A. %A Warnow, Tandy J. %T Constructing big trees from short sequences %D May 6, 1997 %Z Tue, 6 May 1997 19:20:21 GMT %R 97-17 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-17.ps.gz %X The construction of evolutionary trees is a fundamental problem in biology, and yet methods for reconstructing evolutionary trees are not reliable when it comes to inferring accurate topologies of large divergent evolutionary trees from realistic length sequences. We address this problem and present a new polynomial time algorithm for reconstructing evolutionary trees called the Short Quartets Method which is consistent and which has greater statistical power than other polynomial time methods, such as Neighbor-Joining and the 3-approximation algorithm by Agarwala et al. (and the "Double Pivot" variant of the Agarwala et al. algorithm by Cohen and Farach) for the $L_{\infty}$-nearest tree problem. Our study indicates that our method will produce the correct topology from shorter sequences than can be guaranteed using these other methods. 
%A DasGupta, Bhaskar %A He, Xin %A Jiang, Tao %A Li, Ming %A Tromp, John %T On the Linear-Cost Subtree-Transfer Distance between Phylogenetic Trees %D May 9, 1997 %Z Mon, 12 May 1997 17:23:13 GMT %R 97-18 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-18.ps.gz %X Different phylogenetic trees for the same group of species are often produced either by procedures that use diverse optimality criteria [14] or from different genes [10] in the study of molecular evolution. Comparing these trees to find their similarities and dissimilarities (i.e. distance) is thus an important issue in computational molecular biology. Several distance metrics including the nearest neighbor interchange (nni) distance and the subtree-transfer distance have been proposed and extensively studied in the literature. This article considers a natural extension of the subtree-transfer distance, called the linear-cost subtree-transfer distance, and studies the complexity and efficient approximation algorithms for this distance as well as its relationship to the nni distance. The linear-cost subtree-transfer model seems more logical than the (unit-cost) subtree-transfer model in some applications. The following is a list of our results.

1. The linear-cost subtree-transfer distance is in fact identical to the nni distance on unweighted phylogenies.

2. There is an algorithm to compute an optimal linear-cost subtree-transfer sequence between unweighted phylogenies in O(n^2 log n + n 2^{O(d)}) time, where d denotes the linear-cost subtree-transfer distance. Such an algorithm is useful when d is small.

3. Computing the linear-cost subtree-transfer distance between two weighted phylogenetic trees is NP-hard, provided we allow multiple leaves of a tree to share the same label (i.e. the trees are not necessarily uniquely labeled).

4. There is an efficient approximation algorithm for computing the linear-cost subtree-transfer distance between weighted phylogenies with performance ratio 2. %A Kayll, P. Mark %T Asymptotics of the total chromatic number for multigraphs %D May 14, 1997 %Z Thu, 22 May 1997 21:15:51 GMT %R 97-19 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-19.ps.gz %X For loopless multigraphs, the total chromatic number is asymptotically its fractional counterpart as the latter invariant increases without bound. This extends DIMACS Technical Report 96-20, where the same assertion, restricted to simple graphs, was proved. The proof for multigraphs --- presented in this note --- is based on a recent theorem, due to Kahn, establishing the analogous asymptotic behaviour of the list-chromatic index.

FOOTNOTE

This note is a companion article to DIMACS Technical Report 96-20. %A Blass, Andreas %A Gurevich, Yuri %A Shelah, Saharon %T Choiceless polynomial time %D June 2, 1997 %Z Mon, 2 Jun 1997 19:09:53 GMT %I DIMACS %R 97-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-20.ps.gz %X Turing machines define polynomial time (PTime) on strings but cannot deal with structures like graphs directly, and there is no known, easily computable string encoding of isomorphism classes of structures. Is there a computation model whose machines do not distinguish between isomorphic structures and compute exactly PTime properties? This question can be recast as follows: Does there exist a logic that captures polynomial time (without presuming the presence of a linear order)? Earlier, one of us conjectured the negative answer. The problem motivated a quest for stronger and stronger PTime logics. All these logics avoid arbitrary choice. Here we attempt to capture the choiceless fragment of PTime. Our computation model is a version of abstract state machines (formerly called evolving algebras). The idea is to replace arbitrary choice with parallel execution. The resulting logic is more expressive than other PTime logics in the literature. A more difficult theorem shows that the logic does not capture all PTime. %A Lisitsa, Alexei %A Sazonov, Vladimir %T Bounded Hyperset Theory and Web-like Data Bases %D June 6, 1997 %Z Mon, 9 Jun 1997 20:58:01 GMT %I DIMACS %R 97-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-21.ps.gz %X It is demonstrated how a Hyperset Theory (satisfying P. Aczel's Anti-Foundation Axiom) naturally arises from the idea of the World-Wide Web (WWW). Alternatively, the Web serves as an illustration and possible application of the abstract notion of antifounded sets. A $\Delta$-language of Bounded Hyperset Theory is presented as a query language to the Web or, more generally, to Web-like Data Bases (WDB). 
It is shown that it describes exactly all abstract (or generic, up to bisimulation) PTIME-computable queries with respect to (possibly cyclic) graph encoding of hypersets. This result is based (i) on a reduction of the $\Delta$-language for hypersets to the language FO+LFP (= First-Order Logic with the Least Fixed-Point operator) over graphs considered up to the bisimulation relation and (ii) on the definability in FO+LFP of a linear ordering on any strongly extensional finite graph by a single formula (cf.\ also DIMACS TR-97-05). The case of finitely branching, possibly infinite graphs and corresponding hypersets is also discussed. It corresponds to a finitely branching but infinite Web. However, it deserves special further investigation. %A Valtr, Pavel %T Graph drawings with no k pairwise crossing edges %D June 10, 1997 %Z Wed, 11 Jun 1997 22:39:24 GMT %I DIMACS %R 97-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-22.ps.gz %X A geometric graph is a graph $G=(V,E)$ drawn in the plane so that the vertex set $V$ consists of points in general position and the edge set $E$ consists of straight line segments between points of $V$. It is known that, for any fixed $k$, any geometric graph $G$ on $n$ vertices with no $k$ pairwise crossing edges contains at most $O(n \log n)$ edges. In this paper we give a new, simpler proof of this bound, and show that the same bound holds also when the edges of $G$ are represented by $x$-monotone curves (Jordan arcs). %A Vanderbei, Robert J. 
%T Extension of Piyavskii's Algorithm to Continuous Global Optimization %D June 24, 1997 %Z Mon, 30 Jun 1997 22:50:12 GMT %I DIMACS %R 97-23 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-23.ps.gz %X We use the simple, but little-known, result that a uniformly continuous function on a convex set is $\epsilon$-Lipschitz (as defined below) to extend Piyavskii's algorithm for Lipschitz global optimization to the larger domain of continuous (not-necessarily-Lipschitz) global optimization. %A Kagan, Abram %A Mallows, Colin %A Shepp, Larry %A Vanderbei, Robert J. %A Vardi, Yehuda %T Symmetrization of Binary Random Variables %D June 24, 1997 %Z Mon, 30 Jun 1997 23:30:12 GMT %I DIMACS %R 97-24 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-24.ps.gz %X A random variable $Y$ is called an {\em independent symmetrizer} of a given random variable $X$ if (a) it is independent of $X$ and (b) the distribution of $X+Y$ is symmetric about $0$. In cases where the distribution of $X$ is symmetric about its mean, it is easy to see that the constant random variable $Y = - \Exp X$ is a minimum-variance independent symmetrizer. Taking $Y$ to have the same distribution as $-X$ clearly produces a symmetric sum but it may not be of minimum variance. We say that a random variable $X$ is {\em symmetry resistant} if the variance of any symmetrizer, $Y$, is never smaller than the variance of $X$. Let $X$ be a binary random variable: $\Prob \{ X = a \} = p$ and $\Prob \{ X = b \} = q$ where $a \ne b$, $0 < p < 1$, and $q = 1-p$. We prove that such a binary random variable is symmetry resistant if (and only if) $p \ne 1/2$. Note that the minimum variance as a function of $p$ is discontinuous at $p = 1/2$. Dropping the independence assumption, we show that the minimum-variance reduces to $pq - \min (p,q)/2$, which is a continuous function of $p$. 
%A Alon, Noga %A Dietzfelbinger, Martin %A Miltersen, Peter Bro %A Petrank, Erez %A Tardos, Gabor %T Linear Hashing %D June 27, 1997 %Z Mon, 3 Jul 1997 17:34:00 GMT %I DIMACS %R 97-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-25.ps.gz %X

Consider the set ${\cal H}$ of all linear (or affine) transformations between two vector spaces over a finite field $F$. We study how good $\cal H$ is as a class of hash functions, namely we consider hashing a set $S$ of size $n$ into a range having the same cardinality $n$ by a randomly chosen function from ${\cal H}$ and look at the expected size of the largest hash bucket. $\cal H$ is a universal class of hash functions for any finite field, but with respect to our measure different fields behave differently.

If the finite field $F$ has $n$ elements then there is a bad set $S\subset F^2$ of size $n$ with expected maximal bucket size $\Omega(n^{1/3})$. If $n$ is a perfect square then there is even a bad set with largest bucket size {\em always} at least $\sqrt n$. (This is worst possible, since with respect to a universal class of hash functions every set of size $n$ has expected largest bucket size below $\sqrt n+1/2$.)

If, however, we consider the field of two elements then we get much better bounds. The best previously known upper bound on the expected size of the largest bucket for this class was $O( 2^{\sqrt{\log n}})$. We reduce this upper bound to $O(\log n\log\log n)$. Note that this is not far from the guarantee for a random function. There, the average largest bucket would be $\Theta(\log n/\log \log n)$.
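
As a rough illustration of the setting above, the following Python sketch hashes a random set of $n$ keys with a random linear map over the field of two elements and reports the largest bucket. The function names, the bit widths, and the choice of a uniformly random key set are assumptions made for this illustration; the bounds quoted above concern expected bucket sizes for arbitrary (including worst-case) sets, which this small experiment does not exercise.

```python
import random
from collections import Counter


def random_linear_map(domain_bits, range_bits, rng):
    """Return a random range_bits x domain_bits matrix over GF(2).

    Each row is stored as an integer bitmask over the domain coordinates.
    """
    return [rng.getrandbits(domain_bits) for _ in range(range_bits)]


def apply_map(rows, x):
    """Compute the matrix-vector product rows * x over GF(2)."""
    h = 0
    for i, row in enumerate(rows):
        # Inner product over GF(2): parity of the bitwise AND.
        parity = bin(row & x).count("1") & 1
        h |= parity << i
    return h


def largest_bucket(rows, keys):
    """Size of the fullest bucket when keys are hashed with the map."""
    counts = Counter(apply_map(rows, x) for x in keys)
    return max(counts.values())


if __name__ == "__main__":
    rng = random.Random(0)
    domain_bits, range_bits = 32, 10      # hypothetical sizes
    n = 1 << range_bits                   # |S| equals the range size
    keys = rng.sample(range(1 << domain_bits), n)
    rows = random_linear_map(domain_bits, range_bits, rng)
    print("largest bucket:", largest_bucket(rows, keys))
```

Storing each matrix row as a bitmask keeps the map linear by construction, so `apply_map(rows, x ^ y)` equals `apply_map(rows, x) ^ apply_map(rows, y)` for all inputs.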

In the course of our proof we develop a tool which may be of independent interest. Suppose we have a subset $S$ of a vector space $D$ over ${\bf Z}_2$, and consider a random linear mapping of $D$ to a smaller vector space $R$. If the cardinality of $S$ is larger than $c_\epsilon|R|\log|R|$ then with probability $1-\epsilon$, the image of $S$ will cover all elements in the range. %A Petrank, Erez %A Rackoff, Charles %T CBC MAC for Real-Time Data Sources %D June 27, 1997 %Z Mon, 3 Jul 1997 17:34:00 GMT %I DIMACS %R 97-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-26.ps.gz %X The Cipher Block Chaining (CBC) Message Authentication Code (MAC) is an authentication method which is widely used in practice. It is well known that the naive use of CBC MAC for variable length messages is not secure, and a few rules of thumb for the correct use of CBC MAC are known by ``folklore''. The first rigorous proof of the security of CBC MAC, when used on fixed length messages, was given only recently by Bellare, Kilian and Rogaway \cite{bkr}. They also suggested variants of CBC MAC that handle variable length messages but in these variants the length of the message has to be known in advance (i.e., before the message is processed). \par We study CBC authentication of real time applications in which the length of the message is not known until the message ends, and furthermore, since the application is real-time, it is not possible to start processing the authentication only after the message ends. Providing authentication for real time communication is an important task, which involves authenticating real time speech transmissions, real time camera source of video transmission, and other human-driven multimedia interaction. This also involves fax transmissions, in which the number of pages is not known in advance, and we would like to send the authentication as soon as the last page has been fed into the machine. 
\par We first present a variant of CBC MAC, called {\em double MAC} (DMAC) which handles messages of variable unknown lengths. Computing DMAC on a message is virtually as simple and as efficient as computing the standard CBC MAC on the message. We provide a rigorous proof that its security is implied by the security of the underlying block cipher. Next, we argue that the basic CBC MAC is secure when applied to prefix free message space. A message space can be made prefix free by authenticating also the (usually hidden) last character which marks the end of the message. %A Kilian, Joe %A Petrank, Erez %A Tardos, Gabor %T Probabilistically Checkable Proofs with Zero Knowledge %D June 27, 1997 %Z Mon, 3 Jul 1997 17:34:00 GMT %I DIMACS %R 97-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-27.ps.gz %X We construct PCPs with strong zero-knowledge properties. First, we construct polynomially bounded (in size) PCP's for {\sf NP} which can be checked using poly-logarithmic queries, with polynomially low error, yet are statistical zero-knowledge against an adversary that makes $U$ arbitrary queries, where $U$ can be set to any polynomial. Second, we construct PCPs for {\sf{NEXPTIME}} that can be checked using polynomially many queries, yet are statistically zero-knowledge against any polynomially bounded adversary. These PCPs are exponential in size and have exponentially low error. Previously, it was only known how to construct zero-knowledge PCPs with a constant error probability. \par In the course of constructing these PCP's we abstract a tool we call {\em locking systems}. We provide the definition and also a locking system with very efficient parameters. This mechanism may be useful in other settings as well. 
%A Kilian, Joe %A Petrank, Erez %T Identity Escrow %D June 27, 1997 %Z Mon, 3 Jul 1997 17:34:00 GMT %I DIMACS %R 97-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-28.ps.gz %X We introduce the notion of {\em escrowed identity}, an application of key-escrow ideas to the problem of identification. In escrowed identity, one party $A$ does {\em not} give his identity to another party $B$, but rather gives him information that would allow an authorized third party $E$ to determine $A$'s identity. However, $B$ receives a guarantee that $E$ can indeed determine $A$'s identity. We give protocols for escrowed identity based on the El-Gamal (signature and encryption) schemes and on the RSA function. A useful feature of our protocol is that after setting up $A$ to use the system, $E$ is only involved when it is actually needed to determine $A$'s identity. %A Krivelevich, Michael %A Sudakov, Benny %T The chromatic numbers of random hypergraphs %D June 29, 1997 %Z Mon, 2 Jul 1997 21:03:00 GMT %I DIMACS %R 97-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-29.ps.gz %X For a pair of integers 0 < t < r, the t-chromatic number of an r-uniform hypergraph H=(V,E) is the minimal k, for which there exists a partition of V into color classes U1,...,Uk such that every edge of H intersects every color class in at most t vertices. In this paper we determine the asymptotic behavior of the t-chromatic number of the random r-uniform hypergraph on n vertices with edge probability p=p(n) for all possible values of t and for a wide range of values of p. %A Derrida, B. %A Lebowitz, J. L. %A Speer, E. R. 
%T Shock Profiles for the Asymmetric Simple Exclusion Process in One Dimension %D July 2, 1997 %Z Mon, 7 Jul 1997 16:27:00 GMT %I DIMACS %R 97-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-30.ps.gz %X The asymmetric simple exclusion process (ASEP) on a one-dimensional lattice is a system of particles which jump at rates $p$ and $1-p$ (here $p>1/2$) to adjacent empty sites on their right and left respectively. The system is described on suitable macroscopic spatial and temporal scales by the inviscid Burgers' equation; the latter has shock solutions with a discontinuous jump from left density $\rho_-$ to right density $\rho_+$, $\rho_-<\rho_+$, which travel with velocity $(2p-1)(1-\rho_+-\rho_-)$. In the microscopic system we may track the shock position by introducing a second class particle, which is attracted to and travels with the shock. In this paper we obtain the time invariant measure for this shock solution in the ASEP, as seen from such a particle. The mean density at lattice site $n$, measured from this particle, approaches $\rho_{\pm}$ at an exponential rate as $n\to\pm\infty$, with a characteristic length which becomes independent of $p$ when $p/(1-p)>\sqrt{\rho_+(1-\rho_-)/\rho_-(1-\rho_+)}$. For a special value of the asymmetry, given by $p/(1-p)=\rho_+(1-\rho_-)/\rho_-(1-\rho_+)$, the measure is Bernoulli, with density $\rho_-$ on the left and $\rho_+$ on the right. In the weakly asymmetric limit, $2p-1\to0$, the microscopic width of the shock diverges as $(2p-1)^{-1}$. The stationary measure is then essentially a superposition of Bernoulli measures, corresponding to a convolution of a density profile described by the viscous Burgers equation with a well-defined distribution for the location of the second class particle. 
%A Toth, Geza %A Valtr, Pavel %T Note on the Erdos-Szekeres theorem %D July 4, 1997 %Z Fri, 11 Jul 1997 19:20:15 GMT %I DIMACS %R 97-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-31.ps.gz %X Let $g(n)$ denote the least integer such that among any $g(n)$ points in general position in the plane there are always $n$ in convex position. In 1935 P. Erd\H os and G. Szekeres showed that $g(n)$ exists and $2^{n-2}+1\le g(n)\le {2n-4\choose n-2}+1$. Recently, the upper bound has been slightly improved by Chung and Graham and by Kleitman and Pachter. In this note we further improve the upper bound to $$g(n)\le {2n-5\choose n-2}+2.$$ %A Beimel, Amos %A Kushilevitz, Eyal %T Learning Boxes in High Dimension %D June 27, 1997 %Z Fri, 11 Jul 1997 19:20:15 GMT %I DIMACS %R 97-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-32.ps.gz %X We present exact learning algorithms that learn several classes of (discrete) boxes in $\{0,\ldots,\ell-1\}^n$. In particular we learn: (1) The class of unions of $O(\log n)$ boxes in time $\poly(n,\log\ell)$ (solving an open problem of \cite{GGM94,BGGM94}; in~\cite{BBBKV97} this class is shown to be learnable in time $\poly(n,\ell)$). (2) The class of unions of disjoint boxes in time $\poly(n,t,\log\ell)$, where $t$ is the number of boxes. (Previously this was known only in the case where all boxes are disjoint in one of the dimensions; in~\cite{BBBKV97} this class is shown to be learnable in time $\poly(n,t,\ell)$). In particular our algorithm learns the class of decision trees over $n$ variables, that take values in $\{0,\ldots,\ell-1\}$, with comparison nodes in time $\poly(n,t,\log\ell)$, where $t$ is the number of leaves (this was an open problem in \cite{Bsh93} which was shown in \cite{BBBKV96} to be learnable in time $\poly(n,t,\ell)$). 
(3) The class of unions of $O(1)$-degenerate boxes (that is, boxes that depend only on $O(1)$ variables) in time $\poly(n,t,\log\ell)$ (generalizing the learnability of $O(1)$-DNF and of boxes in $O(1)$ dimensions). The algorithm for this class uses only equivalence queries and it can also be used to learn the class of unions of $O(1)$ boxes (from equivalence queries only). %A Farach, Martin %A Liberatore, Vincenzo %T On Local Register Allocation %D July 11, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-33.ps.gz %X

In this paper, we consider the problem of Local Register Allocation (LRA): given a sequence of instructions (basic block) and a number of general purpose registers, find the schedule of variables in registers that minimizes the total traffic between CPU and the memory system. Local register allocation has been studied for more than thirty years in the theory and compiler communities.

It was not known whether LRA is NP-hard, yet no subexponential time algorithm was known. Furthermore, the most popular heuristics in use in compilers can perform arbitrarily poorly in the worst case. In this paper, we present the following results:

- We show that the Local Register Allocation problem is NP-hard.
- We show that a variant of the furthest-first heuristic achieves a good approximation ratio.
- We give a 2-approximation algorithm for LRA.
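
For orientation, the classic furthest-first eviction rule mentioned in the list above can be sketched in Python as follows. This is the textbook rule only, not the variant analyzed in the report: it counts loads from memory and ignores the cost of storing modified values back. The function name and the representation of a basic block as a list of variable names are assumptions made for this illustration.

```python
def furthest_first_loads(accesses, num_registers):
    """Count loads when evicting the variable whose next use is furthest away.

    accesses: sequence of variable names, in the order they are used.
    num_registers: number of available registers (must be at least 1).
    """
    registers = set()
    loads = 0
    for i, var in enumerate(accesses):
        if var in registers:
            continue  # already in a register, no memory traffic
        loads += 1
        if len(registers) >= num_registers:

            def next_use(v):
                # Position of the next access to v after step i,
                # or infinity if v is never used again.
                for j in range(i + 1, len(accesses)):
                    if accesses[j] == v:
                        return j
                return float("inf")

            victim = max(registers, key=next_use)
            registers.remove(victim)
        registers.add(var)
    return loads


if __name__ == "__main__":
    block = ["a", "b", "c", "a", "b", "d", "a"]
    print(furthest_first_loads(block, 2))
```

The quadratic scan in `next_use` keeps the sketch short; precomputing next-use positions in one backward pass would be the usual refinement.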

We report the experimental performance of a branch-and-bound algorithm and both approximation algorithms on standard benchmarks. %A Boros, Endre %A Cepek, Ondrej %A Kogan, Alexander %T Horn Minimization by Iterative Decomposition %D July 11, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-34.ps.gz %X The problem of Horn minimization can be stated as follows: given a Horn CNF representing a Boolean function $f$, find a CNF representation of $f$ which consists of a minimum possible number of clauses. This problem is the formalization of the problem of knowledge compression for speeding up queries to propositional Horn expert systems, and it is known to be NP-hard. In this paper we present a linear time algorithm which takes a Horn CNF as an input, and through a series of decompositions reduces the minimization of the input CNF to the minimization problem on a ``shorter" CNF. The correctness of this decomposition algorithm rests on several interesting properties of Horn functions which, as we prove here, turn out to be independent of the particular CNF representations. %A Spencer, Joel %A Thoma, Lubos %T On the Limit Values of Probabilities for the First Order Properties of Graphs %D July 21, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-35.ps.gz %X Consider the random graph ${\cal G}(n,p),$ where $p=p(n)$ is any threshold function satisfying $p(n) = \Theta(\ln n / n).$ We give a full characterization of the limit values of probabilities of ${\cal G}(n,p)$ having a property $\psi,$ where $\psi$ is any sentence of the first order theory of graphs. 
%A Fang, Weiwu %T FDOD Function and the Information Discrepancy Contained in Multiple Probability Distributions %D July 23, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-36 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-36.ps.gz %X

The concept of Shannon information has played a significant role in a variety of scientific and engineering areas. The question naturally arises: how can we measure the information discrepancy contained in two or more probability distributions? The answer to this question will be very interesting in both theory and practice. Some measures for the cases of two or three distributions have been presented by the pioneers, but these measures have some disadvantages; moreover, no measure for $n$ distributions exists so far.

A FDOD function with many good properties has been introduced in the study of information discrepancy of judgments of multiple experts (FW 1994). In this paper, based on the ideas concerned with Shannon information and measures of difference, we propose an axiom set for measuring the information discrepancy contained in a group of distributions, and prove that the only function satisfying the axiom set is of the FDOD form. The final results and even the intermediate results indeed show the close connection of the FDOD function with Shannon information and the measures of difference in statistics. %A Muramatsu, Masakazu %A Vanderbei, Robert J. %T Affine Scaling Algorithms Fail for Semidefinite Programming %D July 25, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-37 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-37.ps.gz %X In this paper, we give an example of a semidefinite programming problem in which primal-dual affine-scaling algorithms using the XZ, MT, and AHO directions fail. We prove that each of these algorithms can generate a sequence converging to a non-optimal solution, and that, for the AHO direction, even its associated continuous trajectory can converge to a non-optimal point. In contrast with these directions, we show that the primal-dual affine-scaling algorithm using the NT direction for the same semidefinite programming problem always generates a sequence converging to the optimal solution. Both primal and dual problems have interior feasible solutions, unique optimal solutions which satisfy strict complementarity, and are nondegenerate everywhere. %A Lo, Eddie H. 
%A Ostheimer, Gretchen %T A Practical Algorithm for Finding Matrix Representations for Polycyclic Groups %D August 10, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-38 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-38.ps.gz %X We describe a new algorithm for finding matrix representations for polycyclic groups given by finite presentations. In contrast to previous algorithms, our algorithm is efficient enough to construct representations for some interesting examples. The examples which we studied included a collection of free nilpotent groups, and our results here led us to a theoretical result concerning such groups. %A Grunewald, Stefan %A Steffen, Eckhard %T Chromatic-Index Critical Graphs of Even Order %D August 21, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-39 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-39.ps.gz %X

A $k$-critical graph $G$ has maximum degree $k \geq 0$, chromatic index $\chi'(G) = k+1$ and $\chi'(G-e) < k+1$ for each edge $e$ of $G$.

The Critical Graph Conjecture independently stated by I.T. Jakobsen (On critical graphs with chromatic index 4, Disc. Math. 9 (1974) 265-276), and L. W. Beineke, R. J. Wilson (On the edge chromatic number of a graph, Disc. Math. 5 (1973) 15-20) claims that every $k$-critical graph is of odd order. S. Fiorini, R. J. Wilson (Edge-colourings of graphs, in L. W. Beineke, R. J. Wilson (eds.) Selected Topics in Graph Theory, Academic Press London (1978) 103-126) conjectured that every $k$-critical graph of even order has a 1-factor. A. G. Chetwynd and H. P. Yap (Chromatic index critical graphs of order 9, Disc. Math. 47 (1983) 23-33) stated the problem whether it is true that if $G$ is a $k$-critical graph of odd order, then $G-v$ has a 1-factor for every vertex $v$ of minimum degree. These conjectures are disproved and the problem is answered in the negative for $k \in \{3,4\}$.

We disprove these conjectures and answer the problem in the negative for all $k \geq 3$.

We also construct $k$-critical graphs on $n$ vertices with degree sequence $23^24^{n-3}$, answering a question of Yap (Some topics in graph theory, London Math.~Soc.~LNS 108, Cambridge University Press (1986)). %A Allender, Eric %A Beals, Robert %A Ogihara, Mitsunori %T The Complexity of Matrix Rank and Feasible Systems of Linear Equations %D August 27, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-40 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-40.ps.gz %X

We characterize the complexity of some natural and important problems in linear algebra. In particular, we identify natural complexity classes for which the problems of (a) determining if a system of linear equations is feasible and (b) computing the rank of an integer matrix, (as well as other problems), are complete under logspace reductions.

As an important part of presenting this classification, we show that the ``exact counting logspace hierarchy'' collapses to near the bottom level. (We review the definition of this hierarchy below.) We further show that this class is closed under NC^1-reducibility, and that it consists of exactly those languages that have logspace uniform span programs (introduced by Karchmer and Wigderson) over the rationals.

In addition, we contrast the complexity of these problems with the complexity of determining if a system of linear equations has an integer solution. %A Beimel, Amos %A Franklin, Matthew %T Reliable Communication over Partially Authenticated Networks %D August 28, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-41 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-41.ps.gz %X Reliable communication between parties in a network is a basic requirement for executing any protocol. In this work, we consider the effect on reliable communication when some pairs of parties have common authentication keys. The distribution of pairs sharing keys defines a natural ``authentication graph'', which may be quite different from the ``communication graph'' of the network. We characterize when reliable communication is possible in terms of these two graphs, focusing on the very strong setting of a Byzantine adversary with unlimited computational resources. %A DIMACS %T Proceedings of Third DIMACS Workshop on DNA Based Computers %D June 23, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-42 %U http://dimacs.rutgers.edu/Workshops/DNA3/index.html %X Proceedings of Third DIMACS Workshop on DNA Based Computers %A DIMACS %T DIMACS Workshop on Design and Formal Verification of Security Protocols %D September 3, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-43 %U http://dimacs.rutgers.edu/Workshops/Security/index.html %X DIMACS Workshop on Design and Formal Verification of Security %A Ostheimer, Gretchen %T Practical Algorithms for Polycyclic Matrix Groups %D September 7, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-44 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-44.ps.gz %X Many fundamental problems are undecidable for infinite matrix groups. Polycyclic matrix groups represent a large class of groups for which these same problems are known to be decidable. 
In this paper we describe a suite of new algorithms for studying polycyclic matrix groups --- algorithms for testing membership and for uncovering the polycyclic structure of the group. We also describe an algorithm for deciding whether or not a group is solvable, which, in the important context of subgroups of GL(n,Z), is equivalent to deciding whether or not a group is polycyclic. In contrast to previous algorithms, the algorithms in this paper are practical: experiments show that they are efficient enough to be useful in studying some reasonably complex examples using current technology. %A Allender, Eric %T Some Pointed Questions Concerning Asymptotic Lower Bounds %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-45.ps.gz %X This column was originally written to appear in the Bulletin of the EATCS. In this column, I survey some aspects of complexity theory that can hardly be considered ``recent''. Instead, I will focus on some fundamental truths relating complexity theory to practical computing. These truths deserve repeating, lest they become forgotten. %A Reinhardt, Klaus %A Allender, Eric %T Making Nondeterminism Unambiguous %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-46.ps.gz %X We show that in the context of nonuniform complexity, nondeterministic logarithmic space bounded computation can be made unambiguous. An analogous result holds for the class of problems reducible to context-free languages. 
In terms of complexity classes, this can be stated as: NL/poly = UL/poly LogCFL/poly = UAuxPDA(log n, poly)/poly %A Mundhenk, Martin %A Goldsmith, Judy %A Lusena, Christopher %A Allender, Eric %T Encyclopaedia of Complexity Results for Finite-Horizon Markov Decision Process Problems %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-47.ps.gz %X The computational complexity of finite horizon policy evaluation and policy existence problems are studied for several policy types and representations of Markov decision processes. In almost all cases, the problems are shown to be complete for their complexity classes; classes range from nondeterministic logarithmic space and probabilistic logarithmic space (highly parallelizable classes) to exponential space. In many cases, this work shows that problems that already were widely believed to be hard to compute are probably intractable (complete for NP, NP^PP , or PSPACE), or provably intractable (EXPTIME-complete or worse). The major contributions of the paper are to pinpoint the complexity of these problems; to isolate the factors that make these problems computationally complex; to show that even problems such as median-policy or average-policy evaluation may be intractable; and the introduction of natural NP^PP-complete problems. %A Agrawal, Manindra %A Allender, Eric %A Datta, Samir %T On TC^0, AC^0, and Arithmetic Circuits %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-48 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-48.ps.gz %X Continuing a line of investigation that has studied the function classes #P, #SAC^1, #L, and #NC^1, we study the class of functions #AC^0. One way to define #AC^0 is as the class of functions computed by constant-depth polynomial-size arithmetic circuits of unbounded fan-in addition and multiplication gates. 
In contrast to the preceding function classes, for which we know no nontrivial lower bounds, lower bounds for #AC^0 follow easily from established circuit lower bounds.

One of our main results is a characterization of TC^0 in terms of #AC^0: A language A is in TC^0 if and only if there is a #AC^0 function f and a number k such that x is in A iff f(x) = 2^|x|^k. Using the naming conventions of this area of research, this yields: TC^0 = PAC^0 = C_=AC^0

Another restatement of this characterization is that TC^0 can be simulated by constant-depth arithmetic circuits, with a single threshold gate. We hope that perhaps this characterization of TC^0 in terms of AC^0 circuits might provide a new avenue of attack for proving lower bounds.

Our characterization differs markedly from earlier characterizations of TC^0 in terms of arithmetic circuits over finite fields. Using our model of arithmetic circuits, computation over finite fields yields ACC.

We also prove a number of closure properties and normal forms for #AC^0. %A Allender, Eric %T Circuit Complexity before the Dawn of the New Millennium %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-49 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-49.ps.gz %X The 1980's saw rapid and exciting development of techniques for proving lower bounds in circuit complexity. This pace has slowed recently, and there has even been work indicating that quite different proof techniques must be employed to advance beyond the current frontier of circuit lower bounds. Although this has engendered pessimism in some quarters, there have in fact been many positive developments in the past few years showing that significant progress is possible on many fronts. This paper is a (necessarily incomplete) survey of the state of circuit complexity as we await the dawn of the new millennium. %A Allender, Eric %A Lange, Klaus-Joern %T RUSPACE(log n) is contained in DSPACE(log^2 n / loglog n) %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-50 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-50.ps.gz %X We present a deterministic algorithm running in space O(log^2 n/ loglog n) solving the connectivity problem on strongly unambiguous graphs. In addition, we present an O(log n) time-bounded algorithm for this problem running on a parallel pointer machine. %A Allender, Eric %T The Permanent Requires Large Uniform Threshold Circuits %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-51 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-51.ps.gz %X A recent paper by Caussinus, McKenzie, Therien, and Vollmer shows that there are problems in the counting hierarchy that require superpolynomial-size uniform TC^0 circuits. 
Their proof uses ``leaf languages'' as a tool in obtaining their separations, and their proof does not immediately yield larger lower bounds for the complexity of these problems, and it also does not yield a lower bound for any particular problem at any fixed level of the counting hierarchy. (It only shows that hard problems must exist at some level.) In this paper, we give a simple direct proof, showing that any problem that is hard for the complexity class C_=P requires more than size T(n), if for all k, T^{(k)}(n) = o(2^n). Thus, in particular, the permanent, and any problem hard for PP or #P require circuits of this size. Related and somewhat weaker lower bounds are also presented, extending the theorem of (Caussinus et al.) showing that ACC^0 is properly contained in ModPH. %A Fredman, Michael L. %T Pairing Heaps Are Suboptimal %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-52 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-52.ps.gz %X Pairing heaps were introduced as a self-adjusting alternative to Fibonacci heaps. They provably enjoy log n amortized costs for the standard heap operations. Although it has not been verified that pairing heaps perform the decrease key operation in constant amortized time, this has been conjectured and extensive experimental evidence supports this conjecture. Moreover, pairing heaps have been observed to be superior to Fibonacci heaps in practice. However, as demonstrated in this paper, pairing heaps do not accommodate decrease key operations in constant amortized time. %A Dreyer, Paul A. Jr. %A Biedl, Therese C. %T Rectangle Breaking in Grids %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-53 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-53.ps.gz %X Given an $n \times n$ square grid, there are the outlines of $n^2$ $1 \times 1$ squares, $(n-1)^2$ $2 \times 2$ squares, and so on. 
What is the minimum number of edges that can be removed from the grid such that there is no complete outline of a square remaining? Using techniques from tiling problems and other graph theoretic methods, we solve the problem for all $n$ and prove similar results for rectangular grids for which the side lengths differ by no more than six. We also introduce another version of the problem where we ask for the minimum number of edges to ``break'' all of the rectangles in a grid. %A Kolaitis, Phokion %T Special Year on Logic and Algorithms Tutorial Notes: Expressive Powers of Logics (Tutorial Lectures by Phokion Kolaitis) %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-54 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-54.ps.gz %X

The 1995-1996 DIMACS Special Year on Logic and Algorithms began with three week-long tutorial sessions on the topics of the special year:

- Finite Model Theory
- Proof Complexity
- Computer-Aided Verification

These notes on the lectures given by Phokion Kolaitis were compiled by Eric Allender from the notes supplied by Randall J. Pruim, Brenda J. Latka, Jim Rogers, Maria-Luisa Bonet, and Kousha Etessami. %A Immerman, Neil %T Special Year on Logic and Algorithms Tutorial Notes: Descriptive Complexity (Tutorial Lectures by Neil Immerman) %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-55 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-55.ps.gz %X

The 1995-1996 DIMACS Special Year on Logic and Algorithms began with three week-long tutorial sessions on the topics of the special year:

- Finite Model Theory
- Proof Complexity
- Computer-Aided Verification

These notes on the lectures given by Neil Immerman were compiled by Eric Allender from the notes supplied by A. Arratia, Sandeep K. Shukla, J. Avigad, and D. Sivakumar. %A Lynch, James %T Special Year on Logic and Algorithms Tutorial Notes: Random Finite Models (Tutorial Lectures by James Lynch) %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-56 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-56.ps.gz %X

The 1995-1996 DIMACS Special Year on Logic and Algorithms began with three week-long tutorial sessions on the topics of the special year:

- Finite Model Theory
- Proof Complexity
- Computer-Aided Verification

These notes on the lectures given by James Lynch were compiled by Eric Allender from the notes supplied by Thomas Wilke, James Lynch, Sandeep K. Shukla, Helmut Veith, and Ida Pu. %A Kurshan, Robert %A Shukla, Sandeep %T Special Year on Logic and Algorithms Tutorial Notes: Complexity Issues in Automata-Theoretic Verification: The COSPAN Approach to Deal with These Issues(Tutorial Lectures by Robert Kurshan) %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-57 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-57.ps.gz %X The 1995-1996 DIMACS Special Year on Logic and Algorithms began with three week-long tutorial sessions on the topics of the special year: Finite Model Theory Proof Complexity Computer-Aided Verification These notes on the lectures given by Robert Kurshan were compiled by Sandeep K. Shukla. %A Bacso, Gabor %A Boros, Endre %A Gurvich, Vladimir %A Maffray, Frederic %A Preissmann, Myriam %T On Minimally Imperfect Graphs with Circular Symmetry %D September 8, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-58 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-58.ps.gz %X Results of Lovász and Padberg entail that the class of so-called partitionable graphs contains all the potential counterexamples to Berge's famous Strong Perfect Graph Conjecture (which asserts that the only minimally imperfect graphs are the odd chordless cycles with at least five vertices and their complements). Only two constructions (due to Chvátal, Graham, Perold and Whitesides) are known for making partitionable graphs. The first one does not produce any counterexample to Berge's Conjecture, as shown by Sebö. Here we prove that the second construction does not produce any counterexample either. We conjecture that every partitionable graph with circular symmetry can be generated by this second construction, and give results in this direction. 
We also show that, regardless of this conjecture, every minimally imperfect graph with circular symmetry must have an odd number of vertices. %A Brinkmann, Gunnar %A Steffen, Eckhard %T Chromatic Index Critical Graphs of Orders 11 and 12 %D September 9, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-59 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-59.ps.gz %X A chromatic-index-critical graph $G$ on $n$ vertices is non-trivial if it has at most $\Delta \lfloor \frac{n}{2} \rfloor$ edges. We prove that there is no chromatic-index-critical graph of order 12, and that there are precisely two non-trivial chromatic index critical graphs on 11 vertices. Together with known results this implies that there are precisely three non-trivial chromatic-index-critical graphs of order $\leq 12$. %A Agrawal, Manindra %A Allender, Eric %A Rudich, Steven %T Reductions in Circuit Complexity: An Isomorphism Theorem and a Gap Theorem %D September 11, 1997 %Z Fri, 23 Jul 1999 20:00:00 GMT %I DIMACS %R 97-60 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-60.ps.gz %X We show that all sets that are complete for NP under non-uniform AC^0 reductions are isomorphic under non-uniform AC^0-computable isomorphisms. Furthermore, these sets remain NP-complete even under non-uniform NC^0 reductions. More generally, we show two theorems that hold for any complexity class C closed under (uniform) NC^1-computable many-one reductions. Gap: The sets that are complete for C under AC^0 and NC^0 reducibility coincide. Isomorphism: The sets complete for C under AC^0 reductions are all isomorphic under isomorphisms computable and invertible by AC^0 circuits of depth three. Our Gap Theorem does not hold for strongly uniform reductions: we show that there are Dlogtime-uniform AC^0-complete sets for NC^1 that are not Dlogtime-uniform NC^0-complete. (This technical report replaces DIMACS TR 96-04.) 
%A Allender, Eric %A Arvind, V %A Mahajan, Meena %T Arithmetic Complexity, Kleene Closure, and Formal Power Series %D September 12, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-61 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-61.ps.gz %X

In this paper we show how two fundamental operations used in formal language theory provide useful tools for the investigation of arithmetic complexity classes. More precisely, we use Kleene closure of languages and inversion of formal power series to investigate subclasses of the complexity class GapL. (GapL is the complexity class that characterizes the complexity of computing the determinant; it corresponds to counting the number of accepting and rejecting paths of nondeterministic logspace-bounded Turing machines.) We define a counting version of Kleene closure and show that it is intimately related to inversion within the complexity classes GapL and GapNC^1. In particular, we prove that Kleene closure and inversion are both hard operations in the following sense:

- There is a set in AC^0 for which Kleene closure is NL-complete and inversion is GapL-complete.
- There is a finite set for which Kleene closure is NC^1-complete and inversion is GapNC^1-complete.

Furthermore, we classify the complexity of the Kleene closure of finite languages. We formulate the problem in terms of finite monoids and relate its complexity to the internal structure of the monoid. %A Brightwell, Graham R. %A Winkler, Peter %T Graph Homomorphisms and Phase Transitions %D September 17, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-62 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-62.ps.gz %X We model physical systems with "hard constraints" by the space Hom(G,H) of homomorphisms from a locally finite graph G to a fixed finite constraint graph H. For any assignment $\lambda$ of positive real activities to the nodes of H, there is at least one Gibbs measure on Hom(G,H); when G is infinite, there may be more than one.

When G is a regular tree, the simple, invariant Gibbs measures on Hom(G,H) correspond to node-weighted branching random walks on H. We show that such walks exist for every H and $\lambda$, and characterize those H which, by admitting more than one such construction, exhibit phase transition behavior. %A Durand, D. %A Farach, M. %A Ravi, R. %A Singh, M. %T A Short Course in Computational Molecular Biology %D September 17, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-63 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-63.ps.gz %X The advent of recombinant DNA technology during the 1970s has led to an inundation of biological sequence data. The compilation and analysis of DNA and protein sequences is now a fundamental task in molecular biology requiring. Computational Molecular Biology is the field of computer science that has emerged to solve algorithmic problems in determining sequences and analyzing them. Specific research efforts in this area include sequencing and mapping, pairwise and multiple sequence comparison, protein structure determination and evolutionary tree reconstruction. Solutions to these problems contribute both to basic scientific research and product development in the biotechnology industry. We have designed a course to give a basic introduction to the major algorithmic research areas in computational biology. %A Malkhi, Dahlia %A Reiter, Michael %A Rubin, Avi %T The Design and Implementation of a Java Playground %D September 19, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-64 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-64.ps.gz %X Mobile code presents a number of threats to the machines that execute it. In this paper we introduce an approach for protecting machines and the resources they hold from mobile code, and describe a system based on our approach for protecting host machines from Java applets. 
In our approach, each Java applet downloaded to the protected domain is rerouted to a dedicated machine (or set of machines), the playground, at which it is executed. Prior to execution the applet is transformed to use the downloading user's web browser as a graphics terminal for its input and output, and so the user has the illusion that the applet is running on her own machine. In reality, however, mobile code runs only in the sanitized environment of the playground, where user files cannot be mounted and from which only limited network connections are accepted by machines in the protected domain. We describe the design, implementation, performance and limitations of our system, and directions for future work. %A Abbasi, Sarmad %T Longest Common Consecutive Substring in Two Random Strings %D October 7, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-65 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-65.ps.gz %X Let $\Sigma$ be a finite alphabet with $C$ letters. For any two strings $x$ and $y$ of length $n$, we let $S(x,y)$ denote the size of the longest common consecutive substring between $x$ and $y$; that is, $S(x,y)$ is the largest $k$ such that $$ x_i \cdots x_{i+k} = y_j \cdots y_{j+k}$$ for some $i$ and $j$. We show that for $x$ and $y$ chosen uniformly among all possible strings of length $n$, $S(x,y)$ is highly concentrated around $2 \log_C n$. More precisely, for any $a \geq 1$ $$ \Pr [ |S(x,y) - 2 \log_C n | > a] \leq {18 C^{-a} \over n^2}.$$ This enables us to show that the expected value of $S(x,y)$ is within a constant additive factor of $2 \log n$. %A Pach, Janos %A Toth, Geza %T A Generalization of the Erdos-Szekeres Theorem to Disjoint Convex Sets %D October 13, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-66 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-66.ps.gz %X Let ${\cal F}$ denote a family of pairwise disjoint convex sets in the plane. 
${\cal F}$ is said to be in {\em convex position}, if none of its members is contained in the convex hull of the union of the others. For any fixed $k\geq 3$, we estimate $P_k(n)$, the maximum size of a family ${\cal F}$ with the property that any $k$ members of ${\cal F}$ are in convex position, but no $n$ are. In particular, for $k=3$, we improve the triply exponential upper bound of T. Bisztriczky and G. Fejes T\'oth by showing that $P_3(n)<16^n$. %A Shepp, Larry %T Linear Programming in Tomography, Probability, and Finance %D October 15, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-67 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-67.ps.gz %X

At the first conference (at DIMACS), it became clear that linear, or convex, programming is going to play a key role in the algorithmics of "discrete tomography", where one attempts to reconstruct a finite subset, $S$, of the integer lattice, $Z^2$, or $Z^3$, from its line sums in a few directions obtained via transmission electron microscopy. The usual convolution-back-projection algorithms for continuous tomography are useless for this new application to crystal microscopy because only very few directions are available for which the line sums are measured. Since the problem of determining a subset $S$ with the given line sums is an integer programming problem, the standard convexification of the problem reduces it to linear programming (Peter Fishburn, Peter Schwander, Robert Vanderbei, The Discrete Radon Transform and its Approximate Inversion via Linear programming, Discrete Applied Math., 1997, to appear). This showed that $S$ may be "effectively" reconstructed in the sense that the method of finding a feasible interior point in the convex hull of what is called "fuzzy sets", i.e., a function, $0 \le f(z) \le 1$ for all $z \in Z^2$, or $Z^3$, "typically" produces a reasonably satisfactory reconstruction on simulated phantoms. Thus the situation is more or less analogous to conventional medical CT scanning, where even though one cannot hope to uniquely reconstruct a function, $f(x,y)$, from its line integrals in a finite number of fixed directions (even if the line integrals are noiseless), the usual algorithms are seen to work well, as demonstrated via simulations.

In the above paper written over a year ago, available by request at fish@research.att.com, the method is made clear. Here however I have a different goal: several disparate problems in applied mathematics can be solved via non-trivial use of infinite linear programming and I thought that I would share these with the audience in order to illustrate the scope and the power of linear programming and to bring the ideas to wider view, after first trying to set into perspective the application of linear programming to discrete tomography via a review. %A Odlyzko, Andrew %T A Modest Proposal for Preventing Internet Congestion %D October 17, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-68 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-68.ps.gz %X A simple approach, called PMP (Paris Metro Pricing), is suggested for dealing with congestion in packet networks such as the Internet. It is to partition a network into several logical networks, each of which would treat all packets equally on a best effort basis, just as the current Internet does. There would be no formal guarantees of quality of service. The separate networks would differ only in the prices paid for using them. Networks with higher prices would attract less traffic, and thereby provide better service. Price would be the primary tool of traffic management. %A Liberatore, Vincenzo %T Uniform Multipaging Reduces to Paging %D October 15, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-69 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-69.ps.gz %X Multipaging is the paging problem when more than one page can be requested at each step. In the uniform cost model, a paging algorithm is charged for the number of pages loaded from disk to fast memory. In this note, we establish a general reduction from uniform multipaging to paging. As a consequence, we obtain the first strongly competitive randomized algorithm for uniform multipaging. %A Coffman, E. 
G. %A Flajolet, Philippe Jr. %A Leopold Flatto %A Micha Hofri %T The Maximum of a Random Walk and Its Application to Rectangle Packing %D October 15, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-70 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X Let $S_0, \ldots ,S_n$ be a symmetric random walk that starts at the origin ($S_0=0$), and takes steps uniformly distributed on $[-1,+1]$. We study the large-$n$ behavior of the expected maximum excursion and prove the estimate \[ \exd \max_{0 \leq k \leq n} S_k = \sqrt{\frac{2n}{3\pi}} - c + \frac{1}{5}\sqrt{\frac{2}{3\pi}} n^{-1/2} + O(n^{-3/2}), \] where $c=0.297952\ldots$. This estimate applies to the problem of packing $n$ rectangles into a unit-width strip; in particular, it makes much more precise the known upper bound on the expected minimum height, $\frac{n}{4} + \frac{1}{2} \exd \max_{0 \leq j \leq n} S_j +\frac{1}{2} = \frac{n}{4} + O(n^{1/2}),$ when the rectangle sides are $2n$ independent uniform random draws from $[0,1]$. %A Erdos, P. L. %A Steel, M. A. %A Szekely, L. A. %A Warnow, T. J. %T A few logs suffice to build (almost) all trees (I) %D October 17, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-71 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X A phylogenetic tree (also called an ``evolutionary tree'') is a leaf-labelled tree which represents the evolutionary history for a set of species, and the construction of such trees is a fundamental problem in biology. Here we address the issue of how many sequence sites are required in order to recover the tree with high probability when the sites evolve under standard Markov- style (i.i.d.) mutation models. We provide analytic upper and lower bounds for the required sequence length, by developing a new (and polynomial time) algorithm. 
In particular we show that when the mutation probabilities are bounded the required sequence length can grow surprisingly slowly (a power of log n) in the number n of sequences, for almost all trees. %A Erdos, P. L. %A Steel, M. A. %A Szekely, L. A. %A Warnow, T. J. %T A few logs suffice to build (almost) all trees (II) %D October 17, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-72 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X Inferring evolutionary trees is an interesting and important problem in biology that is very difficult from a computational point of view as most associated optimization problems are NP-hard. Although it is known that many methods are provably statistically consistent (i.e. the probability of recovering the correct tree converges on 1 as the sequence length increases), the actual rate of convergence for different methods has not been well understood. In a recent paper we introduced a new method for reconstructing evolutionary trees called the Dyadic Closure Method (DCM), and we showed that DCM has a very fast convergence rate. DCM runs in O(n^5 log n) time, where n is the number of sequences, so although it is polynomial it has computational requirements that are potentially too large to be of use in practice. In this paper we present another tree reconstruction method, the Witness-Antiwitness Method, or WAM. WAM is significantly faster than DCM, especially on random trees, and converges at the same rate as DCM. We also compare WAM to other methods used to reconstruct trees, including Neighbor Joining (possibly the most popular method among molecular biologists), and new methods introduced in the computer science literature. 
%A Pach, Janos %A Thiele, Torsten %A Toth, Geza %T Three-dimensional Grid Drawings of Graphs %D October 28, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-73 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X A three-dimensional {\em grid drawing} of a graph $G$ is a placement of the vertices at distinct integer points so that the straight-line segments representing the edges of $G$ are pairwise non-crossing. It is shown that for any fixed $r\geq 2$, every $r$-colorable graph of $n$ vertices has a three-dimensional grid drawing that fits into a box of volume $O(n^2)$. The order of magnitude of this bound cannot be improved. %A Muchnik, Andrei %A Romashchenko, Andrei %A Shen, Alexander %A Vereshagin, Nikolai %T Upper semi-lattice of binary strings with the relation ``x is simple conditional to y'' %D November 5, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-74 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X We study the properties of the set of binary strings with the relation ``the Kolmogorov complexity of x conditional to y is small''. We prove that there are pairs of strings which have no greatest common lower bound with respect to this pre-order. We present several examples when the greatest common lower bound exists but its complexity is much less than mutual information (extending the Gacs and Koerner result). %A Foldes, Stephan %T On Automorphism Groups of Graphs and Distributive Lattices %D November 12, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-75 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X Birkhoff's theorem that every group is isomorphic to the automorphism group of a distributive lattice is extended in a direction that parallels similar results in graph theory. 
%A Gruenewald, Stefan %A Steffen, Eckhard %T Cyclically 5-edge Connected Non-bicritical Critical Snarks %D November 17, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-76 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X Snarks are bridgeless cubic graphs with chromatic index $\chi' = 4$. A snark $G$ is called critical if $\chi'(G-\{v,w\}) = 3$, for any two adjacent vertices $v$ and $w$. For any $k \geq 2$ we construct cyclically 5-edge connected critical snarks $G$ having an independent set $I$ of at least $k$ vertices such that $\chi'(G-I)=4$. For $k=2$ this solves a problem of Nedela and \v{S}koviera stated in "Decompositions and Reductions of Snarks" J. of Graph Th. 22 253-279 (1996). %A Jia, Ning %A Jia, Xiaofeng %T Some Results Concerning the Ends of Minimal Cuts of Simple Graphs %D November 21, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-77 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X Let S be a cut of a simple connected graph G. If S has no proper subset that is a cut, we say S is a minimal cut of G. For a minimal cut S, a connected component of G-S is called a fragment, and a fragment with no proper subset that is also a fragment is called an end. We characterize ends and prove that, for a connected graph G=(V,E), the number of its ends is at most |V(G)|. %A DasGupta, Bhaskar %T Remarks on the equivalence problem for PL-sets %D November 29, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-78 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X The following problem arises in connection with checking the equivalence of piecewise-linear (PL) sets. Consider two-variable polynomials with non-negative integer coefficients: given two such polynomials, decide if they are equivalent to each other modulo the following equalities $x=2x+1$, $y^2=2y^2+y$ and $y=x+y+1$. 
In this paper, we show that this problem can be solved in polynomial time in both the unit-cost and the logarithmic-cost model. %A Ekin, Oya %A Foldes, Stephan %A Hammer, Peter L. %A Hellerstein, Lisa %T Equational Theories of Boolean Functions %D December 10, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-79 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X Several noteworthy classes of Boolean functions are characterized by algebraic identities. For a given DNF-specified class, such characteristic identities exist if and only if the class is closed under the operation of forming Boolean minors by variable identification. A single identity suffices to characterize a class if and only if the number of forbidden identification minors minimal in a specified sense is finite. If general first-order sentences are allowed instead of identities only, then essentially all classes can be described by an appropriate set of sentences. %A Feigenbaum, Joan %T DIMACS Research and Education Institute (DREI '97) Cryptography and Network Security (July 28 - August 15, 1997) Abstracts of Talks Presented %D December 12, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-80 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X During the summer of 1997 we held a three-week program of research workshops aimed at scientists from academia and industry, and open to graduate students and postdoctoral fellows, along with an educational program for high school and college teachers, undergraduates and others. In addition, there were plenary sessions and evening lectures aimed at meshing the two groups in discussion and problem-solving and creating partnerships in the educational enterprise. 
%A Beimel, Amos %A Gal, Anna %T On Arithmetic Branching Programs %D December 12, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-81 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X We consider the model of arithmetic branching programs, which is a generalization of modular branching programs. We show that, up to a polynomial factor in size, arithmetic branching programs are equivalent to complements of dependency programs, a model introduced by Pudlak and Sgall in 1996. Using this equivalence we prove that dependency programs are closed under conjunction, answering an open problem of Pudlak and Sgall. Furthermore, we show that span programs, an algebraic model of computation introduced by Karchmer and Wigderson in 1993, are at least as strong as arithmetic programs; every arithmetic program can be simulated by a span program of size not more than twice the size of the arithmetic program. Using the above results we give a new proof that the class NL/poly is contained in the class parity-L/poly, first proved by Wigderson in 1994. Our simulation of NL/poly is more efficient, and it holds for logspace counting classes over every field. %A Boros, Endre %A Unluyurt, Tonguc %T Diagnosing Double Regular Systems %D December 16, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-82 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X We consider the problem of testing sequentially the components of a double regular system, when the testing of each component is costly. Generalizing earlier results about $k$-out-of-$n$ systems, we provide a polynomial time algorithm for the most cost-efficient sequential testing of double regular systems. The algorithm can be implemented to work efficiently both for explicitly given systems, and for systems given by an oracle. %A Landweber, Laura F. 
%T RNA Based Computing %D December 20, 1997 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 97-83 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-70.ps.gz %X

This report will focus on a combined treatment of the fields of RNA evolution and DNA computers. Using examples of in vitro selection of functional RNA molecules from large pools of random sequences, I show that these experiments can be viewed as an "RNA computer", in the language of DNA computation. By applying a strict set of criteria to a pool of heterogeneous DNA or RNA sequences, both types of experiments find sequences that contain solutions to hard molecular computational problems. These can even be nondigital problems that an electronic computer cannot solve. For example, a problem I addressed using a pool of RNA molecules was one of catalysis, or the ability to ligate a small substrate RNA molecule to itself. In vitro evolution searches for the best solution or class of solutions contained within a large library of approximately 10^15 unique sequences. The advantage of these "molecular computers" is their ability to examine and eliminate billions of possible solutions in parallel. We also describe an example of a cellular paradigm for RNA-based computing: RNA editing.

From the Second Annual DIMACS Meeting on DNA Based Computers, June 10-12, 1996 and the DIMACS Special Focus on DNA Computing. %A Belegradek, Oleg V. %A Stolboushkin, Alexei P. %A Taitslin, Michael A. %T On Order-Generic Queries %D January 31, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-01 %X We consider relational databases organized over an ordered domain with some additional relations---a typical example is the ordered domain of rational numbers together with the ternary relation $+$ of addition. In the focus of our study are the first order queries that are invariant under order-preserving ``permutations''---such queries are called order-generic. In fact, we consider two formalizations of this notion: ``generic'', and ``locally generic'', queries. For several domains order-generic queries fail to express more than pure order queries, for example, every order-generic query over rational numbers with $+$ can be rewritten without $+$. Our goal is to find general conditions on the domain that allow for such a simplification of order-generic queries. An important difference of this paper from a recent series of related papers (see, for example,~\cite{PBG95,BDLW95}) is that we generalize all notions to the case of finitely representable database states---as opposed to finite states---and develop a general lifting technique that, essentially, allows us to extend any result of the kind we are interested in, from finite to finitely-representable states, and thus all the results in this paper are proved for the general case of {\em constraint databases.} On the simplification of order-generic queries, we offer two types of results. The results of the first type address {\em equivalence\/} of order-generic queries that use the additional predicates (extended queries) to pure order queries. 
Here we establish the necessary and sufficient condition for an extended query to be equivalent to a pure order query, in terms of behavior of this extended query over the ultra-product of this domain by a non-principal ultra-filter over $\omega$. This characterization relies upon the Continuum Hypothesis, and is used as instrumental in establishing equivalence of generic, as well as locally generic, extended queries to pure order queries for several classes of domains, including in particular the so-called $o$-minimal domains, integers with $+$, and such. Although, by Shoenfield's Absoluteness Theorem, the Continuum Hypothesis can be eliminated from these latter results, the fact remains, no translation algorithm can be extracted from arguments of this type. That is why we also offer {\em effective algorithms\/} of translating extended to pure order queries for divisible ordered Abelian groups. Such groups---including real or rational numbers with $+$---are all $o$-minimal. %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-01.ps.gz %A Blaze, Matt %A Feigenbaum, Joan %A Leighton, F. T. %T Master-Key Cryptosystems %D February 4, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-02 %I DIMACS %X We initiate the study of a new class of secret-key cryptosystems, called "master-key cryptosystems," in which an authorized third party possesses a "master key" that allows efficient recovery of the cleartext without knowledge of the session key. One motivation for this study is that master-key cryptosystems could provide a less cumbersome alternative to "key escrow" in situations in which third-party access is required. We demonstrate that designing a master-key cryptosystem with acceptable performance is roughly equivalent to designing a public-key cryptosystem in which encryption is much faster than is possible with current public-key techniques. 
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-02.ps.gz %A Allender, Eric %A Jiao, Jia %A Mahajan, Meena %A Vinay, V. %T Non-Commutative Arithmetic Circuits: Depth Reduction and Size Lower Bounds %D February 7, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-03 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-03.ps.gz %X We investigate the phenomenon of depth-reduction in commutative and non-commutative arithmetic circuits. We prove that in the commutative setting, uniform semi-unbounded arithmetic circuits of logarithmic depth are as powerful as uniform arithmetic circuits of polynomial degree; earlier proofs did not work in the uniform setting. This also provides a unified proof of the circuit characterizations of LOGCFL and #LOGCFL. We show that AC^1 has no more power than arithmetic circuits of polynomial size and degree n^{O(log log n)} (improving the trivial bound of n^{O(log n)}). Connections are drawn between TC^1 and arithmetic circuits of polynomial size and degree. Then we consider non-commutative computation, and show that some depth reduction is possible over the algebra (Sigma^*, max, concat), thus establishing that OptLOGCFL is in AC^1. This is the first depth-reduction result for arithmetic circuits over a noncommutative semiring, and it complements the lower bounds of Kosaraju and Nisan showing that depth reduction cannot be done in the general noncommutative setting. We define new notions called ``short-left-paths'' and ``short-right-paths'' and we show that these notions provide a characterization of the classes of arithmetic circuits for which optimal depth-reduction is possible. This class also can be characterized using the AuxPDA model. Finally, we characterize the languages generated by efficient circuits over the (union, concat) semiring in terms of simple one-way machines, and we investigate and extend earlier lower bounds on non-commutative circuits. 
%A Agrawal, Manindra %A Allender, Eric %T An Isomorphism Theorem for Circuit Complexity %D February 26, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-04 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-04.ps.gz %X We show that all sets complete for NC^1 under AC^0 reductions are isomorphic under AC^0-computable isomorphisms. Although our proof does not generalize directly to other complexity classes, we do show that, for all complexity classes C closed under NC^1-computable many-one reductions, the sets complete for C under NC^0 reductions are all isomorphic under AC^0-computable isomorphisms. Our result showing that the complete degree for NC^1 collapses to an isomorphism type follows from a theorem showing that in NC^1, the complete degrees for AC^0 and NC^0 reducibility coincide. This theorem does not hold for strongly uniform reductions: we show that there are Dlogtime-uniform AC^0-complete sets for NC^1 that are not Dlogtime-uniform NC^0-complete. %A Bisztriczky, Tibor %A Karolyi, Gyula %T Subpolytopes of Cyclic Polytopes %D February 26, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-05 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-05.ps.gz %X A remarkable result of I. Shemer [4] states that the combinatorial structure of a neighbourly $2m$-polytope determines the combinatorial structure of each of its subpolytopes. From this, it follows that every subpolytope of a cyclic $2m$-polytope is cyclic. In this note, we present a direct proof of this consequence that also yields that certain subpolytopes of a cyclic $(2m+1)$-polytope are cyclic. %A Kaplan, Haim %A Tarjan, Robert E. 
%T Purely Functional Representations of Catenable Sorted Lists %D February 8, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-06 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-06.ps.gz %X The power of purely functional programming in the construction of data structures has received much attention, not only because functional languages have many desirable properties, but because structures built purely functionally are automatically {\em fully persistent}: any and all versions of a structure can coexist indefinitely. Recent results illustrate the surprising power of pure functionality. One such result was the development of a representation of double-ended queues with catenation that supports all operations, including catenation, in worst-case constant time~\cite{KaTar}. This paper is a continuation of our study of pure functionality, especially as it relates to persistence. For our purposes, a purely functional data structure is one built only with the LISP functions car, cons, cdr. We explore purely functional representations of sorted lists, implemented as finger search trees. We describe three implementations. The most efficient of these achieves logarithmic access, insertion, and deletion time, and double-logarithmic catenation time. It uses one level of structural bootstrapping to obtain its efficiency. The bounds for find, insert, and delete are the same as the best known bounds for an ephemeral implementation of these operations using finger search trees. The representations we present are the first that address the issues of persistence and pure functionality, and the first for which fast implementations of catenation and split are presented. They are simple to implement and could be efficient in practice, especially for applications that require worst-case time bounds or persistence. 
%A Karolyi, Gyula %A Pach, Janos %A Tardos, Gabor %A Toth, Geza %T An algorithm for finding many disjoint monochromatic edges in a complete 2-colored geometric graph %D April 9, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-07 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-07.ps.gz %X We present an $O(n^{\log\log n+2})$-time algorithm for finding $n$ disjoint monochromatic edges in a complete geometric graph of $3n-1$ vertices, where the edges are colored by two colors. %A Sazonov, Vladimir Yu %A Sviridenko, Dmitri I. %T Abstract Deducibility and Domain Theory %D April 9, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-08 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-08.ps.gz %X Following the thesis ``computability = deducibility'' [D.Scott, LNCS 140], we investigate intensional aspects of domain theory as a mathematical theory of computability.

A logistic system is any pair of sets <A,R>, where R \subseteq Conf(A) := Powerset(A) x (A union {#}), # \notin A. The intended interpretation: A is a set of sentences, R is a rule of inference, and # is a contradiction sign. As usual, R induces a relation |-_R \subseteq Conf(A) of (reflexive) deductive inference and also the classes Cl(<A,R>) \subseteq Powerset(A) of the closed sets under |-_R and Th(<A,R>) \subseteq Cl(<A,R>) of consistent closed sets (theories) partially ordered by the inclusion relation. The following more general notion of deducibility ||-_R, which may be non-reflexive, plays an important role. Let G ||-_R f iff there exists a (well-founded) tree of inference G ||-_R f which contains at least one configuration in R (i.e. is non-trivial). By imposing, if necessary, suitable finitarity conditions (and others) on the notion of deducibility, it is possible to characterise rather naturally, from the point of view of the above-mentioned thesis, various classes of domains, e.g. classes of all complete lattices with a base, conditionally complete partially ordered sets with a base, complete f_0-spaces (defined in [Ju.L.Ershov, Algebra and Logic, 11, N4], the same as Scott's algebraic domains; cf. also [D.Scott, LNCS 140] where only finitary reflexive deducibility is considered), Ershov's complete A_0-spaces [Algebra and Logic, 12, N4] = Scott's continuous domains, and Scott's continuous lattices. For example, Th(< A,|- >) is an (arbitrary) complete A_0-space under \subseteq if for some R \subseteq Conf(A) there holds

(1) G/f \in R => G is finite (i.e. R is finitary), (2) G ||-_R f => G^ ||-_R f, where G^ := union {g^ : g \in G} and g^ := {h : g ||-_R h}, and (3) G |- f <=> G ||-_R f^ and G |- # <=> G ||-_R #.

The goal of this paper is to give an extended English version of the above text, previously published only in Russian [V.Yu.Sazonov and D.I.Sviridenko, Abstract Deducibility and Domain Theory, Seventh All Union Conference on Mathematical Logic, Abstracts, Novosibirsk, 1984, p. 158], in connection with a related recent paper [R.Hoofman, Continuous Information Systems, Information and Computation 105, 42--71 (1993)]. It also contains an Appendix to this Abstract (written by the first author) with additional details, proofs and some comparisons with Hoofman's approach. %A Gibbons, Luana E. %A Hearn, Donald W. %A Pardalos, Panos M. %A Ramana, Motakuri V. %T Continuous Characterizations of the Maximum Clique Problem %D April 9, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-09 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-09.ps.gz %X Given a graph $G$ whose adjacency matrix is $A$, the Motzkin-Straus formulation of the Maximum-Clique Problem is the quadratic program $\max\{x^TAx|x^Te=1,x\geq 0\}$. It is well known that the global optimum value of this QP is $(1-1/\omega(G))$, where $\omega(G)$ is the clique number of $G$. Here, we characterize the following: 1) first order optimality 2) second order optimality 3) local optimality 4) strict local optimality. These characterizations reveal interesting underlying discrete structures, and are polynomial time verifiable. A parametrization of the Motzkin-Straus QP is then introduced and its properties are investigated. Finally, an extension of the Motzkin-Straus formulation is provided for the weighted clique number of a graph and this is used to derive a maximin characterization of perfect graphs. 
%A Komlos, Janos %A Simonovits, Miklos %T Szemeredi's Regularity Lemma and its applications in graph theory %D April 7, 1995 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-10.ps.gz %X Szemer\'edi's Regularity Lemma is an important tool in discrete mathematics. It says that, in some sense, all graphs can be approximated by random-looking graphs. Therefore the lemma helps in proving theorems for arbitrary graphs whenever the corresponding result is easy for random graphs. Recently quite a few new results were obtained by using the Regularity Lemma, and also some new variants and generalizations appeared. In this survey we describe some typical applications and some generalizations. %A Cherlin, Gregory L. %A Latka, Brenda J. %T A Decision Problem Involving Tournaments %D May 13, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-11.ps.gz %R 96-11 %X A class of finite tournaments determined by a set of ``forbidden subtournaments'' is well-quasi-ordered if and only if it contains no infinite antichain (a set of incomparable elements). It is not known if there is an algorithm which decides whether or not a class of finite tournaments determined by a finite set of forbidden subtournaments has this property. We prove a noneffective finiteness theorem bearing on the problem. We show that for each fixed $k$, there is a finite set of infinite antichains, $\Lambda_k$, with the following property: if any class defined by $k$ forbidden subtournaments contains an infinite antichain, then a cofinite subset of an element of $\Lambda_k$ must be such an antichain. By refining this analysis and using an earlier result giving an explicit algorithm for the case $k=1$, we show that there exists an algorithm which decides whether or not a class of finite tournaments determined by two forbidden subtournaments is well-quasi-ordered.

footnote:

This work was the result of the authors' collaboration during the 95/96 Special Year on Logic and Algorithms. %A Trenk, Ann N. %T k-Weak Orders: Recognition and a Tolerance Result %D June 24, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-12 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-12.ps.gz %X In this paper we introduce a family of ordered sets we call {\em k-weak orders\/} which generalize weak orders, semi-orders, and bipartite orders. For each $k$, we give a polynomial-time recognition algorithm for $k$-weak orders and a partial characterization. In addition, we prove that among 1-weak orders, the classes of bounded bitolerance orders and totally bounded bitolerance orders are equal. This enables us to recognize the class of totally bounded bitolerance orders for 1-weak orders. %A Annexstein, Fred %A Berman, Ken %A Swaminathan, Ram %T Independent Spanning Trees with Small Stretch Factors %D June 28, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-13.ps.gz %X A pair of spanning trees rooted at a vertex "r" are independent if for every vertex "v" the pair of unique tree paths from "v" to the root "r" are disjoint. This paper presents the first analysis of the path lengths involved in independent spanning trees in 2-edge-connected and 2-vertex-connected graphs. We present upper and lower bounds on the stretch factors of pairs of independent spanning trees, where the stretch factor of a spanning tree is defined to be the maximum ratio between the length of paths in the tree to the root to the length of the shortest path in the graph to the root. We prove that if "G" is a 2-edge-connected graph with the property that every edge lies on a cycle of size at most "h", then we can construct in linear time a pair of edge-independent spanning trees whose stretch factors are bounded by O(h). 
In fact, we prove a more general result, namely that the stretch factor of both independent trees can be bounded by a minimax length of ears with respect to a certain class of ear decompositions of the graph. We demonstrate analogous constructions of vertex-independent spanning trees with bounded stretch factors for the class of 2-vertex-connected graphs. We show that our upper bounds are existentially optimal, that is, there are classes of graphs for which our bounds are tight.

The last author would like to acknowledge the hospitality of DIMACS Center. %A Thorup, Mikkel %T Sorting in O(n log log n) time and linear space using addition, shift, and bit-wise boolean operations %D June 28, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-14 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-14.ps.gz %X We present a randomized sorting algorithm doing as described in the title. %A Brayton, R. %A Emerson, A. %A Feigenbaum, J. %T Workshop Summary: Computational and Complexity Issues in Automated Verification %D June 28, 1996 %Z Mon, 2 Dec 1996 15:27:22 GMT %R 96-15 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-15.ps.gz %X The correctness of computer hardware and software is an area of growing theoretical interest and practical importance. It is now widely acknowledged that effective reasoning about program correctness requires: (i) the use of appropriate formalisms, such as temporal logic or automata, for rigorously specifying correct behavior, and (ii) the use of mechanical reasoning algorithms, such as model checkers, to permit proofs of correctness to be constructed automatically that could not reasonably be constructed by hand owing to the intractable amount of tedious detail. 
A principal limiting factor in automated verification is the computational complexity of various associated mechanical reasoning problems; this complexity might be prohibitively high because of the combinatorial state explosion problem, which appears in many guises.

The DIMACS Workshop on Computational and Complexity Issues in Automated Verification, held at Rutgers University on March 25-28, 1996, and organized by Bob Brayton (University of California at Berkeley), Allen Emerson (University of Texas), Joan Feigenbaum (AT&T Research), featured both theoretical and practical work in this important area of computer science. This report contains short abstracts of all talks given at the workshop, as well as pointers to sources of more information. %A Feigenbaum, Joan %A Fortnow, Lance %A Laplante, Sophie %A Naik, Ashish %T On Coherence, Random-Self-Reducibility, and Self-Correction %D June 28, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-16 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-16.ps.gz %X We address two questions about self-reducibility -- the power of adaptiveness in examiners that take advice and the relationship between random-self-reducibility and self-correctability. We first show that adaptive examiners are more powerful than nonadaptive examiners, even if the nonadaptive ones are nonuniform. Blum et al. [Blum, Luby and Rubinfeld, Journal of Computer and System Sciences, 59:549--595, 1993] showed that every random-self-reducible function is self-correctable. However, whether self-correctability implies random-self-reducibility is unknown. We show that, under a reasonable complexity hypothesis, there exists a self-correctable function that is not random-self-reducible. For P-sampleable distributions, however, we show that constructing a self-correctable function that is not random-self-reducible is as hard as proving that P is not equal to PP.

This work was presented in preliminary form at the IEEE Conference on Computational Complexity, Philadelphia PA, May 1996. %A Blaze, Matt %A Feigenbaum, Joan %A Lacy, Jack %T Decentralized Trust Management %D June 28, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-17 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-17.ps.gz %X We identify the "trust management problem" as a distinct and important component of security in network services. Aspects of the trust management problem include formulating security policies and security credentials, determining whether particular sets of credentials satisfy the relevant policies, and deferring trust to third parties. Existing systems that support security in networked applications, including X.509 and PGP, address only narrow subsets of the overall trust management problem and often do so in a manner that is appropriate to only one application. We present a comprehensive approach to trust management, based on a simple language for specifying trusted actions and trust relationships. We also describe a prototype implementation of a "trust management system," called PolicyMaker, that can facilitate the development of security features in a wide range of network services.

This paper was presented at the IEEE Symposium on Security and Privacy, Oakland CA, May 1996. %A Porkolab, Lorant %A Khachiyan, Leonid %T On the Complexity of Semidefinite Programs %D June 28, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-18 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-18r.ps.gz %X We show that the feasibility of a system of $m$ linear inequalities over the cone of symmetric positive semidefinite matrices of order $n$ can be tested in $mn^{O(\min\{m,n^2\})}$ arithmetic operations over $ln^{O(\min\{m,n^2\})}$-bit numbers, where $l$ is the maximum binary size of the input coefficients. We also show that any feasible system of dimension $(n,m)$ has a solution $\M{X}$ such that $\log \|\M{X}\| \le ln^{O(\min\{m,n^2\})}$. %A Steel, Mike %A Szekely, Laszlo A. %A Erdos, Peter L. %T The number of nucleotide sites needed to accurately reconstruct large evolutionary trees %D June 28, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %R 96-19 %I DIMACS %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-19.ps.gz %X Biologists seek to reconstruct evolutionary trees for an increasing number of species, $n$, from aligned genetic sequences. How fast must the sequence length $N$ grow, as a function of $n$, in order to accurately recover the underlying tree with probability $1-\epsilon$, if the sequences evolve according to simple stochastic models of nucleotide substitution? We show that for a certain model, a reconstruction method exists for which the sequence length $N$ can grow surprisingly slowly with $n$ (sublinearly for a wide range of parameters, and even as a power of $\log n$ in a narrow range, which roughly meets the lower bound from information theory). By contrast a more traditional technique (maximum compatibility) provably requires $N$ to grow faster than linearly in $n$. 
Our approach is based on a new, and computationally efficient approach for reconstructing phylogenetic trees from aligned DNA sequences. %A Kayll, P. Mark %T Asymptotics of the total chromatic number for simple graphs %D July 2, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-20.ps.gz %X For simple graphs, the ratio of the total chromatic number to its linear analogue, the fractional total chromatic number, tends to one as the latter parameter grows large. Two short proofs of this fact, first observed here, are presented. One is based on a theorem of Kahn ({\em J.\ Combin.\ Theory Ser.\ A} {\bf 73} (1996), 1--59) concerning the asymptotics of the list-chromatic index, the other on recent progress of Molloy and Reed in the direction of the ``Behzad-Vizing Conjecture''. The main result has the flavour of several recent others establishing asymptotic agreement of certain integral (hyper)graph parameters with their fractional analogues. %A Kupferman, Orna %A Vardi, Moshe Y. %T Verification of Fair Transition Systems %D July 8, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-21.ps.gz %X In {\em program verification\/}, we check that an {\em implementation\/} meets its {\em specification}. Both the specification and the implementation describe the possible behaviors of the program, though at different levels of abstraction. We distinguish between two approaches to implementation of specifications. The first approach is {\em trace-based implementation}, where we require every computation of the implementation to correlate to some computation of the specification. The second approach is {\em tree-based implementation}, where we require every computation tree embodied in the implementation to correlate to some computation tree embodied in the specification. 
The two approaches to implementation are strongly related to the linear-time versus branching-time dichotomy in temporal logic.

In this work we examine the trace-based and the tree-based approaches from a {\em complexity-theoretic\/} point of view. We consider and compare the complexity of verification of {\em fair transition systems}, modeling both the implementation and the specification, in the two approaches. We consider {\em unconditional, weak\/}, and {\em strong\/} fairness. For the trace-based approach, the corresponding problem is {\em fair containment}. For the tree-based approach, the corresponding problem is {\em fair simulation}. We show that while both problems are PSPACE-complete, their complexities in terms of the size of the implementation do not coincide and the trace-based approach is easier. As the implementation is normally much bigger than the specification, we see this as an advantage of the trace-based approach. Our results are at variance with the known results for the case of transition systems with no fairness, where no approach is evidently advantageous. %A Boerger, E. %A Mazzanti, S. %T A Correctness Proof for Pipelining in RISC Architecture %D July 8, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-22.ps.gz %X We describe a technique for specifying and verifying the control of pipelined microprocessors which can be used where traditional or purely automatic methods do not scale up to complex commercial microprocessor design. We illustrate our approach through a formal specification of Hennessy's and Patterson's RISC processor DLX for which we prove the correctness of its pipelined model with respect to the sequential model.

First we concentrate our attention on the provably correct refinement of the sequential ground model DLX to the pipelined parallel version DLX_p in which structural hazards (resource conflicts) are eliminated. Then we extend the result to the model DLX_data in which data hazards for non-jump instructions are also treated. The next step consists of building the model DLX_control in which control hazards are eliminated. In the last step we define DLX_pipe and prove that it refines DLX_control correctly and also takes care of data hazards relative to jump instructions.

Due to the systematic use of successive refinements, which are organized around the different pipelining problems and the methods for their solution, our approach can be applied for the design-driven verification as well as for the verification-driven design of RISC cores at any level of abstraction. The proof method supports the designer's intuitive reasoning; ``local'' argumentations, which are typical for incremental design and for optimizations, are supported by the semantical modularity of the design and proof method. The specification method supports the descriptive part of the designer's actual work; it is general and can be applied to other microprocessors and pipelining techniques as well. Since our models come in the form of evolving algebras, they can be made executable by evolving algebra interpreters and can thereby be used for (prototypical) simulations.

NOTE: The first author wants to express his thanks to $DIMACS$ for the hospitality during the Fall of 1995 when part of this work was done. %A Bloch, Stephen %T Integer NC^1 is equal to Boolean NC^1 %D July 8, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-23 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-23.ps.gz %X We show that the product of $n$ $3 \times 3$ matrices of $n$-bit integers can be computed in $P$-uniform $FNC^1$. Since this problem is complete~\cite{BOC:const_registers} for formul\ae\ in $\{+, \cdot\}$ on $n$-bit integers, we conclude that ``algebraic $NC^1$'' on integers is equal to the usual Boolean notion of $NC^1$ functions. %A Koiran, Pascal %T Elimination of Constants from Machines over Algebraically Closed Fields %D July 8, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-24 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-24.ps.gz %X Let $\k$ be an algebraically closed field of characteristic 0. We show that constants can be removed efficiently from any machine over $\k$ solving a problem which is definable without constants. This gives a new proof of the transfer theorem of Blum, Cucker, Shub \& Smale for the problem $\p \stackrel{?}{=}\np$. We have similar results in positive characteristic for non-uniform complexity classes. We also construct explicit and correct test sequences (in the sense of Heintz and Schnorr) for the class of polynomials which are easy to compute.

An earlier version of this paper appeared as NeuroCOLT Technical Report 96-43. The present paper contains in particular a new bound for the size of explicit correct test sequences. %A Koiran, Pascal %A Moore, Cristopher %T Closed-form Analytic Maps in One and Two Dimensions Can Simulate Turing Machines %D July 15, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-25.ps.gz %X We show closed-form analytic functions consisting of a finite number of trigonometric terms can simulate Turing machines, with exponential slowdown in one dimension or in real time in two or more. %A Saks, Michael %A Srinivasan, Aravind %A Zhou, Shiyu %T Explicit OR-Dispersers with Polylogarithmic Degree %D July 30, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-26.ps.gz %X An $(N,M,T)$-OR-disperser is a bipartite multigraph $G=(V,W,E)$ with $|V|=N$, and $|W|=M$, having the following expansion property: any subset of $V$ having at least $T$ vertices has a neighbor set of size at least $M/2$. For any pair of constants $\xi,\lam, 1\ge\xi>\lam\ge 0$, any sufficiently large $N$, and for any $T\ge 2^{(\log N)^{\xi}}$, $M \leq 2^{(\log N)^{\lam}}$, we give an explicit elementary construction of an $(N,M,T)$-OR-disperser such that the out-degree of any vertex in $V$ is at most polylogarithmic in $N$. Using this with known applications of OR-dispersers yields several results. First, our construction implies that the complexity class Strong-RP defined by Sipser, equals RP. Second, for any fixed $\eta > 0$, we give the first polynomial-time simulation of RP algorithms using the output of any ``$\eta$-minimally random'' source. For any integral $R > 0$, such a source accepts a single request for an $R$-bit string and generates the string according to a distribution that assigns probability at most $2^{-R^\eta}$ to any string. 
It is minimally random in the sense that any weaker source is insufficient to do a black-box polynomial-time simulation of RP algorithms. %A Koiran, Pascal %T Hilbert's Nullstellensatz is in the Polynomial Hierarchy %D July 30, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-27.ps.gz %X We show that if the Generalized Riemann Hypothesis is true, the problem of deciding whether a system of polynomial equations in several complex variables has a solution is in the second level of the polynomial hierarchy. In fact, this problem is in AM, the ``Arthur-Merlin'' class (recall that $\np \subseteq \am \subseteq \rp^{\tiny \np} \subseteq \Pi_2$). The best previous bound was PSPACE.

An earlier version of this paper was distributed as NeuroCOLT Technical Report~96-44. The present paper includes in particular a new lower bound for unsatisfiable systems, and remarks on the Arthur-Merlin class. %A Therien, Denis %A Wilke, Thomas %T Temporal Logic and Semidirect Products: An Effective Characterization of the Until Hierarchy %D July 30, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-28.ps.gz %X We reveal an intimate connection between semidirect products of finite semigroups and substitution of formulas in linear temporal logic. We use this connection to obtain an algebraic characterization of the `until' hierarchy of linear temporal logic; the k-th level of that hierarchy is comprised of all temporal properties that are expressible by a formula of nesting depth k in the `until' operator. Applying deep results from finite semigroup theory we are able to prove that each level of the `until' hierarchy is decidable. %A Bafna, Vineet %A Berman, Piotr %A Fujito, Toshihiro %T Constant Ratio Approximations of Feedback Vertex Sets in Weighted Undirected Graphs %D July 30, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-29.ps.gz %X A feedback vertex set of a graph is a subset of vertices that contains at least one vertex from every cycle in the graph. We show that a feedback vertex set approximating a minimum one within a constant factor can be efficiently found in undirected graphs. In fact the derived approximation ratio matches the best constant ratio known today for the vertex cover problem, improving the previous best results for both weighted and unweighted cases. The existence of an approximation preserving reduction of the vertex cover problem to the feedback vertex set problem suggests that further improvement of the obtained ratio will require entirely new ideas.
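
The definition of a feedback vertex set given above can be checked mechanically. The following is a minimal sketch, not the approximation algorithm of the report: it only verifies that deleting a candidate set leaves an acyclic graph, using a union-find structure. The function name `is_feedback_vertex_set` and the edge-list representation are assumptions made for this illustration.

```python
def is_feedback_vertex_set(n, edges, removed):
    """Return True if deleting `removed` leaves the undirected graph acyclic.

    n       -- number of vertices, labelled 0..n-1 (illustrative convention)
    edges   -- iterable of (u, v) pairs
    removed -- candidate feedback vertex set
    """
    removed = set(removed)
    parent = list(range(n))

    def find(x):
        # Find the representative of x, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if u in removed or v in removed:
            continue  # this edge disappears together with its endpoint
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # the edge closes a cycle among surviving vertices
        parent[ru] = rv
    return True


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
triangle_with_tail = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(is_feedback_vertex_set(4, triangle_with_tail, {0}))
print(is_feedback_vertex_set(4, triangle_with_tail, {3}))
```

Removing any one triangle vertex breaks the only cycle, whereas removing the pendant vertex does not.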

We extend our approach to handle graphs of bounded vertex degree, and show an improved performance ratio for this case. These approximation bounds are obtained by using a non--trivial generalization of the classical local ratio principle of Bar--Yehuda and Even. %A Bafna, V. %A Muthukrishnan, S. %A Ravi, R. %T Computing similarity between RNA strings %D July 30, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-30.ps.gz %X Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A,C,G,U} with a secondary structure of base-pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of the secondary structure of RNA strings.
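
As an illustration of the structure just described, the sketch below checks whether a set of base pairs forms a valid noncrossing secondary structure under the A-U and C-G pairing rule stated in the abstract. It is not one of the report's matching or alignment algorithms; the function name and the representation of pairs as 0-based index tuples are assumptions made for this example.

```python
# Pairings permitted by the abstract: A-U and C-G, in either order.
ALLOWED = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_noncrossing_structure(seq, pairs):
    """Return True if `pairs` is a valid noncrossing base-pairing of `seq`.

    seq   -- string over the alphabet {A, C, G, U}
    pairs -- iterable of (i, j) index pairs, 0-based (illustrative convention)
    """
    used = set()
    normalized = []
    for i, j in pairs:
        i, j = min(i, j), max(i, j)
        if i == j or i < 0 or j >= len(seq):
            return False  # a base cannot pair with itself or out of range
        if (seq[i], seq[j]) not in ALLOWED:
            return False  # only A-U and C-G pairs are allowed
        if i in used or j in used:
            return False  # each base takes part in at most one pair
        used.update((i, j))
        normalized.append((i, j))

    # After sorting by left endpoint, pairs (i, j) and (k, l) with i < k
    # cross exactly when k < j < l.
    normalized.sort()
    for a in range(len(normalized)):
        i, j = normalized[a]
        for b in range(a + 1, len(normalized)):
            k, l = normalized[b]
            if k < j < l:
                return False
    return True


print(is_noncrossing_structure("GAUC", [(0, 3), (1, 2)]))  # nested pairs
print(is_noncrossing_structure("ACUG", [(0, 2), (1, 3)]))  # crossing pairs
```

Nested pairs correspond to parent-child relations in the tree-like representation, which is why the noncrossing condition matters.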

In this paper, we address several notions of similarity between two RNA strings that take into account both the primary sequence and secondary base-pairing structure of the strings. We present efficient algorithms for exact matching and approximate matching between two RNA strings. We define a notion of alignment between two RNA strings and devise algorithms based on dynamic programming. We then present a method for optimally aligning a given RNA sequence with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known. The techniques employed to prove our results include reductions to well-known string matching problems, allowing wild cards and ranges, and speeding up dynamic programming by using the tree structures implicit in the secondary structure of RNA strings. %A Kelmans, Alexander K. %T Packing of induced stars in a graph %D July 31, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-31.ps.gz %X We consider simple undirected graphs. An edge subset $A$ of $G$ is called an {\em induced $n$--star packing} of $G$ if every component of the subgraph $G[A]$ induced by $A$ is a star with at most $n$ edges and is an induced subgraph of $G$. We consider the problem of finding an induced $n$--star packing of $G$ that covers the maximum number of vertices. This problem is a natural generalization of the classical matching problem. We show that many classical results on matchings (such as the Tutte 1-Factor Theorem, the Berge Duality Theorem, the Gallai--Edmonds Structure Theorem, the Matching Matroid Theorem) can be extended to induced $n$--star packings in a graph. %A Apartsin, A. %A Ferapontova, E. %A Gurvich, V. 
%T A Circular Graph - Counterexample to the Duchet Kernel Conjecture %D August 5, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-32.ps.gz %X We construct a directed graph G such that a) G is strongly connected, b) G has the circular symmetry, c) G is not a directed odd cycle, d) G has no kernel but e) after removing any edge from G the obtained graph has a kernel . %A Boros, E. %A Gurvich, V. %A Vasin, A. %T Stable Families of Coalitions and Normal Hypergraphs %D August 5, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-33.ps.gz %X The core of a game is defined as the set of outcomes acceptable for {\em all} coalitions. This is probably the simplest and most natural concept of cooperative game theory. However, the core can be empty because there are too many coalitions. Yet, some players may not like or know each other, so they cannot form a coalition. Let $\cK$ be a fixed family of coalitions. The $\cK$-core is defined as the set of outcomes acceptable for all the coalitions from $\cK$. The family $\cK$ is called {\em stable} if the $\cK$-core is not empty for any normal form game (or equivalently, for any game in generalized characteristic function form). \\*[\parskip] \hspace*{1.5em}{\em Normal hypergraphs} can be characterized by several equivalent properties, e.g. they are {\em dual} to the {\em clique hypergraphs} of {\em perfect graphs}. We prove that a family $\cK$ of coalitions is stable iff $\cK$ as a hypergraph is normal. %A Boros, E. %A Gurvich, V. %T Stable Effectivity Functions and Perfect Graphs %D August 5, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-34.ps.gz %X An effectivity function $\cE$ is called {\em stable} if the core $C(\cE,u)$ is not empty for any payoff $u$. 
The problem of characterizing stable effectivity functions seems, in general, very difficult. In this paper we apply a graph theoretic approach to this problem. Using a graph based model we obtain some necessary and some sufficient conditions for stability in terms of perfect graphs. We also demonstrate that the conjecture by Berge and Duchet (1983) related to perfect graphs and kernels is, in fact, a special case of the considered problem of stability of effectivity functions.
%A Gurvich, Vladimir
%T Dual Graphs on Surfaces
%D August 5, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-35
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-35.ps.gz
%X Consider an embedding of a graph $G$ in a surface $S$ ({\it map}). Assume that the difference splits into connected components ({\it countries}), each one homeomorphic to an open disk. (It follows from this assumption that graph $G$ must be connected.) Introduce a graph $G^*$ dual to $G$ realizing the neighbor relations among countries. The graphs $G$ and $G^*$ have the same set of edges. More precisely, there is a natural one-to-one correspondence between their edge-sets. An arbitrary pair of graphs with a common set of edges is called a {\it plan}. Every map induces a plan. A plan is called {\it \bf geographic} if it is induced by a map. In terms of Eulerian graphs we obtain criteria for a plan to be geographic. We also give an algorithm for reconstructing a map from a geographic plan. A case when this map is unique is singled out. These results were partially announced by Gurvich and Shabat in 1989.
%A Coffman, E. G., Jr.
%A Kahale, Nabil
%A Leighton, F. T.
%T Processor-Ring Communication: A Tight Asymptotic Bound on Packet Waiting Times %D August 14, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-36 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-36.ps.gz %X We consider $N$ processors communicating unidirectionally over a closed transmission channel, or ring. Each message is assembled into a fixed-length packet. Packets to be sent are generated at random times by the processors, and the transit times spent by packets on the ring are also random. Packets being forwarded, i.e., packets already on the ring, have priority over waiting packets. The objective of this paper is to analyze packet waiting times under a greedy policy, within a discrete Markov model that retains the over-all structure of a practical system, but is simple enough so that explicit results can be proved. Independent, identical Bernoulli processes model message generation at the processors, and i.i.d. geometric random variables model the transit times. Our emphasis is on asymptotic behavior for large ring sizes, $N$, when the respective rate parameters have the scaling $\la /N$ and $\mu /N$. Our main result shows that, if the traffic intensity is fixed at $\rho = \la / \mu < 1$, then as $N \to \infty$ the expected time a message waits to be put on the ring is bounded by a constant. This result verifies that the expected waiting time under the greedy policy is within a constant factor of that under an optimal policy. %A Babai, Laszlo %A Gal, Anna %A Wigderson, Avi %T Superpolynomial Lower Bounds for Monotone Span Programs %D September 8, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-37 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-37.ps.gz %X In this paper we obtain the first superpolynomial lower bounds for {\it monotone span programs} computing explicit functions. 
The best previous lower bound was $\Omega(n^{5/2})$ by Beimel, G\'al, Paterson \cite{BGP}; our proof exploits a general combinatorial lower bound criterion from that paper. Our lower bounds are based on an analysis of Paley-type bipartite graphs via Weil's character sum estimates. We prove an $n^{\Omega ( \log n / \log\log n)}$ lower bound for an explicit family of monotone Boolean functions in $n$ variables, which implies the same lower bound for the size of monotone span programs for the clique problem. Our results give the first superpolynomial lower bounds for linear secret sharing schemes.

We demonstrate the surprising power of monotone span programs by exhibiting a function computable in this model in linear size while requiring superpolynomial size monotone circuits and exponential size monotone formulae. We also show that the perfect matching function can be computed by polynomial size (non-monotone) span programs over arbitrary fields. %A Ouyang, Ming %T How good are branching rules in DPLL? %D September 9, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-38 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-38.ps.gz %X The Davis-Putnam-Logemann-Loveland algorithm is one of the most popular algorithms for solving the satisfiability problem. Its efficiency depends on its choice of a branching rule. We construct a sequence of instances of the satisfiability problem that fools a variety of ``sensible'' branching rules in the following sense: when the instance has n variables, each of the ``sensible'' branching rules brings about Omega(2^(n/5)) recursive calls of the Davis-Putnam-Logemann-Loveland algorithm, even though only O(1) such calls are necessary. %A van Melkebeek, Dieter %T Deterministic and Randomized Bounded Truth-table Reductions of P, NL, and L to Sparse Sets %D September 11, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-39 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-39.ps.gz %X We prove that there is no sparse hard set for P under logspace computable bounded truth-table reductions unless P = L. In case of reductions computable in NC1, the collapse goes down to P = NC1.

We parameterize this result and obtain a generic theorem that allows one to vary the sparseness condition, the space bound and the number of queries of the truth-table reduction. Another instantiation yields that there is no quasipolynomially dense hard set for P under polylog-space computable truth-table reductions using polylogarithmically many queries unless P is in polylog-space.

We also apply the proof technique to NL and L. We establish that there is no sparse hard set for NL under logspace computable bounded truth-table reductions unless NL = L, and that there is no sparse hard set for L under NC1-computable bounded truth-table reductions unless L = NC1.

We show that all these results carry over to the randomized setting: If we allow two-sided error randomized reductions with confidence at least inversely polynomial, we obtain collapses to the corresponding randomized classes in the multiple access model. In addition, we prove that there is no sparse hard set for NP under two-sided error randomized polynomial-time bounded truth-table reductions with confidence at least inversely polynomial unless NP = RP.
%A van Melkebeek, Dieter
%A Ogihara, Mitsunori
%T Sparse Hard Sets for P
%D September 11, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-40
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-40.ps.gz
%X Sparse hard sets for complexity classes have been a central topic for two decades. The area is motivated by the desire to clarify relationships between completeness/hardness and density of languages and studies the existence of sparse complete/hard sets for various complexity classes under various reducibilities. Very recently, we have seen remarkable progress in this area for low-level complexity classes. In particular, Hartmanis' sparseness conjectures for P and NL have been resolved. This article overviews the history of sparse hard set problems and exposes some of the recent results.
%A Boros, E.
%A Gurvich, V.
%T A corrected version of the Duchet Kernel Conjecture
%D September 13, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-41
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-41.ps.gz
%X In 1980 Pierre Duchet conjectured that odd directed cycles are the only edge minimal kernel-less connected digraphs, i.e., those in which a kernel appears after the removal of any edge. Although this conjecture was disproved recently by Apartsin, Ferapontova and Gurvich (1996), the following modification of Duchet's conjecture still holds: odd holes (i.e.
odd non-directed chordless cycles of length 5 or more) are the only connected graphs which are not kernel-solvable but after the removal of any edge the resulting graph is kernel-solvable. %A Subramanian, Balakrishna %A Barkema, G.T. %A Lebowitz, J.L. %A Speer, E.R. %T Numerical study of a non-equilibrium interface model %D September 20, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-42.ps.gz %X We have carried out extensive computer simulations of one-dimensional models related to the low noise (solid-on-solid) non-equilibrium interface of a two dimensional anchored Toom model with unbiased and biased noise. For the unbiased case the computed fluctuations of the interface in this limit provide new numerical evidence for the logarithmic correction to the subnormal $L^{\frac{1}{2}}$ variance which was predicted by the dynamic renormalization group calculations on the modified Edwards-Wilkinson equation. In the biased case the simulations are in close quantitative agreement with the predictions of the Collective Variable Approximation (CVA), which gives the same $L^{\frac{2}{3}}$ behavior of the variance as the KPZ equation. %A Erdos, Peter L. %A Steel, Michael A. %A Szekely, Laszlo A. %A Warnow, Tandy J. %T Local quartet splits of a binary tree infer all quartet splits via one dyadic inference rule %D September 24, 1996 %Z Mon, 11 Nov 1996 15:27:22 GMT %I DIMACS %R 96-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-43.ps.gz %X A significant problem in phylogeny is to reconstruct a \slb\ tree from few valid quartet splits of it. It is well-known that every \slb\ tree is determined by its set of all valid quartet splits. Here we strengthen this result by showing that its local (i.e. small diameter) quartet splits infer by a dyadic inference rule all valid quartet splits, and hence determine the tree. 
The results of the paper also present a polynomial time algorithm to recover the tree.
%A Karpinski, Marek
%A Macintyre, Angus
%T Polynomial Bounds for VC Dimension of Sigmoidal and General Pfaffian Neural Networks
%D October 15, 1996
%Z Mon, 25 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-44
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-44.ps.gz
%X We introduce a new method for proving explicit upper bounds on the VC Dimension of general functional basis networks, and prove as an application, for the first time, that the VC Dimension of analog neural networks with the sigmoidal activation function $\sigma(y) = 1/(1+e^{-y})$ is bounded by a quadratic polynomial $O((lm)^2)$ in both the number $l$ of programmable parameters, and the number $m$ of nodes. The proof method of this paper generalizes to a much wider class of Pfaffian activation functions and formulas, and also gives, for the first time, polynomial bounds on their VC Dimension. We also present some other applications of our method.
%A Grigoriev, Dima
%A Karpinski, Marek
%A Heide, Friedhelm Meyer auf der
%T A Lower Bound for Randomized Algebraic Decision Trees
%D October 15, 1996
%Z Mon, 25 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-45
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-45.ps.gz
%X We extend the lower bounds on the depth of algebraic decision trees to the case of {\em randomized} algebraic decision trees (with two-sided error) for languages that are finite unions of hyperplanes and intersections of halfspaces, solving a long standing open problem. As an application, among other things, we derive, for the first time, an $\Omega(n^2)$ {\em randomized} lower bound for the {\em Knapsack Problem} which was previously only known for deterministic algebraic decision trees.
It is worth noting that for the languages being finite unions of hyperplanes our proof method yields also a new elementary technique for deterministic algebraic decision trees without making use of Milnor's bound on Betti number of algebraic varieties. %A Ablayev, Farid %A Karpinski, Marek %T On the Power of Randomized Branching Programs %D October 15, 1996 %Z Mon, 25 Nov 1996 15:27:22 GMT %I DIMACS %R 96-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-46.ps.gz %X We define the notion of a randomized branching program in the natural way similar to the definition of a randomized circuit. We exhibit an explicit function $f_{n}$ for which we prove that:

1) $f_{n}$ can be computed by a polynomial size randomized read-once ordered branching program with a small one-sided error;

2) $f_{n}$ cannot be computed in polynomial size by deterministic read-once branching programs;

3) $f_{n}$ cannot be computed in polynomial size by deterministic
read-$k$-times ordered branching programs for $k=o(n/\log n)$
(the required deterministic size is
$\exp\left(\Omega\left(\frac{n}{k}\right)\right)$).
%A Glebov, N. I.
%A Kostochka, A. V.
%T On independent domination number of graphs with given minimum degree
%D October 22, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-47
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-47.ps.gz
%X We prove a new upper bound on the independent domination
number of graphs in terms of the number of vertices and the
minimum degree. This bound is slightly better than that by
J.~Haviland~\cite{H} and settles Case $\de =2$ of the
corresponding conjecture by O.~Favaron~\cite{F}.
%A Farach, Martin
%T Optimal Suffix Tree Construction with Large Alphabets
%D October 28, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-48
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-48.ps.gz
%X The suffix tree of a string is the fundamental data structure of
combinatorial pattern matching. In this paper, we present a novel,
deterministic algorithm for the construction of suffix trees.
We settle the main open problem in the construction of suffix trees:
we build suffix trees in linear time for integer alphabets.
%A Bellantoni, Stephen J.
%T Ranking Arithmetic Proofs by Implicit Ramification
%D November 7, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-49
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-49.ps.gz
%X Proofs in an arithmetic system are ranked according to a ramification
hierarchy based on occurrences of induction. It is shown that this
ranking of proofs corresponds exactly to a natural ranking of the
primitive recursive functions based on occurrences of recursion.
A function is provably convergent using a rank $r$ proof, if and
only if it is a rank $r$ function. The result is of interest to complexity
theorists, since rank one corresponds to polynomial time. Remarkably, this
characterization of polynomial-time provability admits induction over
formulas having arbitrary quantifier complexity.
%A Bellantoni, Stephen J.
%T Ranking Primitive Recursions: the Low Grzegorczyk Classes Revisited
%D November 7, 1996
%Z Mon, 11 Nov 1996 15:27:22 GMT
%I DIMACS
%R 96-50
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-50.ps.gz
%X Traditional results in subrecursion theory are integrated with
the recent work in ``predicative recursion'' by defining a simple
ranking $\rho$ of all primitive recursive functions. The hierarchy
defined by this ranking coincides with the Grzegorczyk hierarchy at
and above the linear-space level. Thus, the result is like an
extension of the Schwichtenberg/M\"uller theorems. A natural series
of classes is also obtained down to the first level when primitive
recursion is replaced by recursion on notation.
This paper is a companion to the author's work showing how to
use a suitable definition of proof rank to
define weak subsystems of arithmetic containing unbounded inductions.
%A Loewenstern, David
%A Yianilos, Peter N.
%T Significantly Lower Entropy Estimates for Natural DNA Sequences
%D December 2, 1996
%Z Fri, 6 Dec 1996 15:21:23 GMT
%I DIMACS
%R 96-51
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-51.ps.gz
%X If DNA were a random string over its alphabet *{A,C,G,T}*,
an optimal code would assign 2 bits to each nucleotide. We
imagine DNA to be a highly ordered, purposeful molecule, and
might therefore reasonably expect statistical models of its
string representation to produce much lower entropy estimates.
Surprisingly this has not been the case for many natural DNA
sequences, including portions of the human genome. We
introduce a new statistical model (compression algorithm), the
strongest reported to date, for naturally occurring DNA
sequences. Conventional techniques code a nucleotide using
only slightly fewer bits (1.90) than one obtains by relying only on
the frequency statistics of individual nucleotides (1.95). Our
method in some cases increases this gap by more than
five-fold (1.66) and may lead to better performance in
microbiological pattern recognition applications.

One of our main contributions, and the principal source of these improvements, is the formal inclusion of inexact match information in the model. The existence of matches at various distances forms a panel of experts which are then combined into a single prediction. The structure of this combination is novel and its parameters are learned using Expectation Maximization (EM).

Experiments are reported using a wide variety of DNA sequences and compared whenever possible with earlier work. Four reasonable notions for the string distance function used to identify near matches are implemented and experimentally compared.

We also report lower entropy estimates for coding regions
extracted from a large collection of non-redundant human genes.
The conventional estimate is 1.92 bits. Our model produces
only slightly better results (1.91 bits) when considering
nucleotides, but achieves 1.84-1.87 bits when the prediction
problem is divided into two stages: i) predict the next amino
acid based on inexact polypeptide matches, and ii) predict the
particular codon. Our results suggest that matches at
the amino acid level play some role, but a small one, in determining
the statistical structure of non-redundant coding sequences.
%A Grigoriev, Dima
%A Karpinski, Marek
%T Randomized Omega (n^2) Lower Bound for Knapsack
%D December 3, 1996
%Z Fri, 6 Dec 1996 15:20:19 GMT
%I DIMACS
%R 96-52
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-52.ps.gz
%X We prove an *Omega (n^2)* complexity *lower bound* for the
general model of *randomized computation trees* solving
the *Knapsack Problem*, and more generally *Restricted
Integer Programming*. This is the first nontrivial lower
bound proven for this model of computation. The method of the proof
depends crucially on a new technique for proving lower bounds on
the *border complexity* of a polynomial, which could be of
independent interest.
%A Durand, Dannie
%A Farach, Martin
%T On the Design of Optimization Criteria for Multiple Sequence Alignment
%D December 10, 1996
%Z Fri, 3 Jan 1997 15:12:03 GMT
%I DIMACS
%R 96-53
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-53.ps.gz
%X Multiple sequence alignment (MSA) is important in functional,
structural and evolutionary studies of sequence data. Much research
has focussed on posing MSA as an optimization problem, and several
optimization criteria have been explored. In this paper, we discuss
biological and mathematical problems that arise in cost function
design for the multiple sequence alignment problem. In particular, we
focus on tree alignment, which is often viewed as the most
``biological'' of the rigorous approaches to MSA. We point out
several important pitfalls in current optimization approaches to MSA
and identify characteristics for good cost function design. We
address some extra design issues specific to approximation algorithms.
We hope these ideas will lead to future research on a biologically
realistic and mathematically rigorous approach to MSA.
%A Karpinski, Marek
%A Larmore, Lawrence L.
%A Rytter, Wojciech
%T Correctness of Constructing Optimal Alphabetic Trees Revisited
%D December 10, 1996
%Z Fri, 31 Jan 1997 11:20:15 GMT
%I DIMACS
%R 96-54
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-54.ps.gz
%X Several new observations which lead to new correctness proofs of
two known algorithms (Hu-Tucker and Garsia-Wachs) for construction
of optimal alphabetic trees are presented. A generalized version
of the Garsia-Wachs algorithm is given. The proof of this generalized
version works in a structured and illustrative way and clarifies
the usually poorly-understood behavior of both the Hu-Tucker and
Garsia-Wachs algorithms. The generalized version permits any
non-negative weights, as opposed to strictly positive weights
required in the original Garsia-Wachs algorithm. New local
structural properties of optimal alphabetic trees are given. The
concept of {\em well-shaped segment\/} (a part of an optimal tree)
is introduced. It is shown that some parts of the optimal tree are
known in advance to be well-shaped, and this implies correctness of
the algorithms rather easily. The crucial part of the correctness
proof of the Garsia-Wachs algorithm, namely the {\em structural
theorem}, is identified. The correctness proof of the Hu-Tucker
algorithm consists of showing a very simple mutual simulation
between this algorithm and the Garsia-Wachs algorithm. For this
proof, it is essential to use the generalized version of the
Garsia-Wachs algorithm, in which an arbitrary locally minimal pair
is processed, not necessarily the rightmost minimal pair. Such a
generalized version is also needed for parallel implementations.
Another result presented in this paper is the clarification of the
problem of resolving ties (equalities between weights of items)
in the Hu-Tucker algorithm. This is related to the proof, by
simulation, of correctness of the Hu-Tucker algorithm. It is shown
that the condition that there are no ties may generally be assumed
without harm and that, essentially, the Hu-Tucker algorithm avoids
ties automatically.
%A Valtr, Pavel
%T On galleries with no bad points
%D December 17, 1996
%Z Thu, 19 Dec 1996 16:27:22 GMT
%I DIMACS
%R 96-55
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-55.ps.gz
%X For any $k$ we construct a simply connected compact set (art gallery)
in $R^3$ whose every point sees a positive fraction (in fact,
more than $5/9$) of the gallery,
but the whole gallery cannot be guarded by $k$ guards.
This disproves a conjecture of Kavraki, Latombe, Motwani, and
Raghavan.
%A Koiran, Pascal
%A Sontag, Eduardo D.
%T Vapnik-Chervonenkis Dimension of Recurrent Neural Networks
%D December 19, 1996
%Z Thu, 19 Dec 1996 16:27:18 GMT
%I DIMACS
%R 96-56
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-56.ps.gz
%X Most of the work on the Vapnik-Chervonenkis dimension of neural
networks has been focused on feedforward networks. However,
recurrent networks are also widely used in learning applications,
in particular when time is a relevant parameter.
This paper provides lower and upper bounds for the VC dimension
of such networks. Several types of activation functions are discussed,
including threshold, polynomial, piecewise-polynomial and sigmoidal functions.
The bounds depend on two independent parameters: the number $w$ of weights in
the network, and the length $k$ of the input sequence. In contrast,
for feedforward networks, VC dimension bounds can be expressed as a
function of $w$ only. An important difference between recurrent and
feedforward nets is that a fixed recurrent net can receive inputs
of arbitrary length. Therefore we are particularly interested in the
case~$k \gg w$.
Ignoring multiplicative constants,
the main results say roughly the following:
\begin{itemize}
\item
For architectures with activation $\sigma $ = any fixed nonlinear polynomial,
the VC dimension is $\approx wk$.
\item
For architectures with activation $\sigma $ = any fixed {\it piecewise\/}
polynomial,
the VC dimension is between $wk$ and $w^2k$.
\item
For architectures with activation $\sigma = {\cal H}$ (threshold nets),
the VC dimension is between $w\log(k/w)$ and
$\min\{wk\log wk,w^2+w\log wk\}$.
\item
For the standard sigmoid $\sigma(x)=1/(1+e^{-x})$,
the VC dimension is between $wk$ and
$w^4k^2$.
\end{itemize}
%A Rodl, Vojtech
%A Thoma, Lubos
%T On the Size of Set Systems on [n] Not Containing Weak (r, \Delta)-Systems
%D December 19, 1996
%Z Mon, 24 Feb 1997 20:41:55 GMT
%I DIMACS
%R 96-57
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-57.ps.gz
%X Let $r\ge 3$ be an integer. A weak $(r, \Delta)$-system is a family
of $r$ sets such that all pairwise intersections among the members have
the same cardinality.

We show that for $n$ large enough, there exists a family ${\cal F}$ of subsets of $[n]$ such that ${\cal F}$ does not contain a weak $(r, \Delta)$-system and $|{\cal F}| \ge 2^{ {1\over 3} \cdot n^{1/5}\log^{4/5}(r-1)}.$

This improves an earlier result of P. Erd\H{o}s and E. Szemer\'edi.
%A Frieze, Alan
%A Karonski, Michal
%A Thoma, Lubos
%T On Perfect Matchings and Hamiltonian Cycles in Sums of Random Trees
%D December 19, 1996
%Z Mon, 24 Feb 1997 20:46:05 GMT
%I DIMACS
%R 96-58
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-58.ps.gz
%X We prove that the sum of two random trees almost surely possesses a perfect matching and the sum of five random trees almost surely possesses a Hamiltonian cycle.
%A Karpinski, Marek
%A Zelikovsky, Alexander
%T Approximating Dense Cases of Covering Problems
%D December 20, 1996
%Z Tue, 11 Feb 1997 14:17:25 GMT
%I DIMACS
%R 96-59
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-59.ps.gz
%X We study dense cases of several covering problems. An instance of the set cover problem with $m$ sets is dense if there is $\epsilon>0$ such that any element belongs to at least $\epsilon m$ sets. We show that the dense set cover problem can be approximated with the performance ratio $c\log n$ for any $c>0$ and it is unlikely to be NP-hard. We construct a polynomial-time approximation scheme for the dense Steiner tree problem in $n$-vertex graphs, i.e. for the case when each terminal is adjacent to at least $\epsilon n$ vertices. We also study the vertex cover problem in $\epsilon$-dense graphs. Though this problem is shown to be still MAX-SNP-hard as in general graphs, we find a better approximation algorithm with the performance ratio $2\over{1+\epsilon}$. The {\em superdense} cases of all these problems are shown to be solvable in polynomial time.
%A DIMACS
%T DIMACS SPECIAL YEAR IN MATHEMATICAL SUPPORT FOR MOLECULAR BIOLOGY ANNUAL REPORT JANUARY 1996
%D February 6, 1997
%Z Thu, 6 Feb 1997 19:02:40 GMT
%I DIMACS
%R 96-60
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-60.ps.gz
%X No abstract.
%A DIMACS
%T DIMACS ANNUAL REPORT - DECEMBER 1996
%D February 27, 1997
%Z Wed, 27 Feb 1997 20:38:55 GMT
%I DIMACS
%R 96-61
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-61.ps.gz
%X No abstract.
%A Jain, R.
%A Werth, J.
%T Analysis of approximate algorithms for constrained and unconstrained edge-coloring of bipartite graphs
%D April 7, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-01
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-01.ps.gz
%X The problem of edge-coloring a bipartite graph is to color the edges so that adjacent edges receive different colors. An optimal algorithm uses the minimum number of colors to color the edges. We consider several approximation algorithms for edge-coloring bipartite graphs and show tight bounds on the number of colors they use in the worst case. We also briefly consider the constrained edge-coloring problem where each color may be used to color at most $k$ edges, and obtain bounds on the number of colors used by the approximation algorithms in the worst case.
%A Ramana, Motakuri
%T An Exact Duality Theory for Semidefinite Programming and its Complexity Implications
%D April 10, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-02
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-02R.ps.gz
%X In this paper, we present a new and more complete duality for Semidefinite Programming (SDP), with the following features:
\begin{itemize}
\item This dual is an explicit semidefinite program, whose number of variables and the coefficient bitlengths are polynomial in those of the primal.
\item If the Primal is feasible, then it is bounded if and only if the dual is feasible.
\item The duality gap, i.e., the difference between the primal and the dual objective function values, is zero whenever the primal is feasible and bounded.
Also, in this case, the dual attains its optimum.
\item It yields a precise Farkas Lemma for semidefinite feasibility systems, i.e., a characterization of the {\it infeasibility} of a semidefinite inequality in terms of the {\it feasibility} of another polynomial size semidefinite inequality.
\end{itemize}
Note that the standard duality for Linear Programming satisfies all of the above features, but no such duality theory was previously known for SDP, without Slater-like conditions being assumed. Then we apply the dual to derive certain complexity results for Semidefinite Programming Problems. The decision problem of Semidefinite Feasibility (SDFP), i.e., that of determining if a given semidefinite inequality system is feasible, is the central problem of interest. The complexity of SDFP is unknown, but we show the following: 1) In the Turing machine model, SDFP is not NP-Complete unless NP=Co-NP; 2) In the real number model of Blum, Shub and Smale\cite{bss}, SDFP is in NP$\cap$Co-NP. We then give polynomial reductions from the following problems to SDFP: 1) Checking whether an SDP is bounded; 2) Checking whether a feasible and bounded SDP attains the optimum; 3) Checking the optimality of a feasible solution.
%A Correa, Ricardo
%A Ferreira, Afonso
%T Parallel Best-First Branch-and-Bound in Discrete Optimization
%D April 10, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-03
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-03.ps.gz.ps
%X In this report, parallelism is used as a way to solve {\em discrete optimization problems}. We search for an {\em optimal solution} $x^{*}\in S^{*}$, where $S^{*}$ is the set of all best solutions in a {\em domain} $S$, defined as the discrete set of all vectors $x$ in the solution space that satisfy a set of constraints. Improving the search efficiency is of considerable importance since exhaustive search is often impracticable.
The method called {\em branch-and-bound} (denoted B\&B) is a tree search algorithm often used as an intelligent search in this context. Its principle lies in successive decompositions of the original problem into smaller disjoint subproblems until an optimal solution is found. The algorithm consists of a heuristic iterative search that avoids visiting some subproblems which are known not to contain an optimal solution. Parallel processing has been widely studied as an additional source of improvement in search efficiency in discrete optimization. We review in this report the literature pertinent to {\em parallel branch-and-bound algorithms}. The focus is on {\em distributed memory} parallel systems, which are composed of a set of processors connected by a physical network, each one with its own local memory. The communications between two different processors are implemented through the exchange of messages over links of the network connecting the two processors. Given that an attractive feature is that disjoint subproblems can be decomposed simultaneously and independently, the challenge is how to use the set of processors to improve the search efficiency of B\&B algorithms, concurrently decomposing several subproblems at each iteration. In general terms, the potential to be explored consists of a reduction in the number of iterations that is linear in the number of processors. However, the reduction in the number of iterations can deviate considerably from linear due to possible {\em speedup anomalies}. Parallel B\&B is traditionally considered as an irregular parallel algorithm due to the fact that the structure of the search tree is not known beforehand.
It may result in unnecessary work if a subproblem that does not contain an optimal solution is chosen and assigned to a processor to be decomposed. Therefore, the use of a distributed memory parallel system incurs a number of overheads, including communication overheads and idle time due to workload imbalance and contention on common data structures. Several special techniques have been developed to address these problems, essentially related to the amount of ``necessary'' work assigned to each processor. This report reviews these techniques, attempting to tie the area of parallel B\&B under a common, uniform terminology. It can be profitable both to the parallel processing and to the operations research community members.
%A Lowenstern, D.
%A Hirsh, H.
%A Noordiwier, M.
%A Yianilos, P.
%T DNA Sequence Classification Using Compression-Based Induction
%D May 1, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-04
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-04.ps.gz
%X Inductive learning methods, such as neural networks and decision trees, have become a popular approach to developing DNA sequence identification tools. Such methods attempt to form models of a collection of training data that can be used to predict future data accurately. The common approach to using such methods on DNA sequence identification problems forms models that depend on the {\em absolute locations} of nucleotides and assume {\em independence} of consecutive nucleotide locations. This paper describes a new class of learning methods, called {\em compression-based induction} (CBI), that is geared towards sequence learning problems such as those that arise when learning DNA sequences. The central idea is to use text compression techniques on DNA sequences as the means for generalizing from sample sequences.
The resulting methods form models that are based on the more important {\em relative locations} of nucleotides and on the {\em dependence} of consecutive locations. They also provide a suitable framework into which biological domain knowledge can be injected into the learning process. We present initial explorations of a range of CBI methods that demonstrate the potential of our methods for DNA sequence identification tasks.
%A Applegate, D.
%A Bixby, R.
%A Chvatal, V.
%A Cook, B.
%T Finding Cuts in the TSP (A preliminary report)
%D April 10, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-05
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-05.ps.gz
%X TSPLIB is Gerhard Reinelt's library of some hundred instances of the traveling salesman problem. Some of these instances arise from drilling holes in printed circuit boards; others arise from X-ray crystallography; yet others have been constructed artificially. None of them (with a single exception) is contrived to be hard and none of them is contrived to be easy; their sizes range from 17 to 85,900 cities; some of them have been solved and others have not. We have solved twenty previously unsolved problems from the TSPLIB. One of them is the problem with 225 cities that was contrived to be hard; the sizes of the remaining nineteen range from 1,000 to 7,397 cities. Like all the successful computer programs for solving the TSP, our computer program follows the scheme designed by George Dantzig, Ray Fulkerson, and Selmer Johnson in the early nineteen-fifties. The purpose of this preliminary report is to describe *some* of our innovations in implementing the Dantzig-Fulkerson-Johnson scheme; we are planning to write up a more comprehensive account of our work soon.
%A Dubchack, I.
%A Mayoraz, E.
%A Muchnik, I.
%T Relation between Protein Structure, Sequence Homology and Composition of Amino Acids
%D April 10, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-06
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-06.ps.gz
%X A method of quantitative comparison of two classification rules applied to the protein folding problem is presented. Classification of proteins based on sequence homology and based on amino acid composition were compared and analyzed according to this approach. The coefficient of correlation between these classification methods and the procedure of estimation of robustness of the coefficient are discussed.
%A Romanik, K.
%T Directed Rectangle-Visibility Graphs have Unbounded Dimension
%D April 17, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-07
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-07.ps.gz
%X Visibility representations of graphs map vertices to sets in Euclidean space and express edges as visibility relations between these sets. One visibility representation in the plane that has been studied is one in which the vertices of the graph map to closed isothetic rectangles and the edges are expressed by horizontal or vertical visibility between the rectangles. Two rectangles are only considered to be visible to one another if there is a non-zero width horizontal or vertical band of sight between them. A graph that can be represented in this way is called a rectangle-visibility graph. A rectangle-visibility graph can be directed by directing all edges towards the positive x and y directions, which yields a directed acyclic graph. A directed acyclic graph G has dimension d if d is the minimum integer such that the vertices of G can be ordered by d linear orderings, <_1, ..., <_d, and for vertices u and v there is a directed path from u to v if and only if u <_i v for all 1 <= i <= d.
In this note we show that the dimension of the class of directed rectangle-visibility graphs is unbounded.
%A Pekec, A.
%T Optimization under Ordinal Scales: When is a Greedy Solution Optimal?
%D April 18, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-08
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-08.ps.gz
%X Mathematical formulation of an optimization problem often depends on data which can be measured in more than one acceptable way. If the conclusion of optimality depends on the choice of measure, then we should question whether it is meaningful to ask for an optimal solution. If a meaningful optimal solution exists and the objective function depends on data measured on an ordinal scale of measurement, then the greedy algorithm will give such a solution for a wide range of objective functions.
%A Rothkopf, M.H.
%A Pekec, A.
%A Harstad, R.M.
%T Computationally Manageable Combinatorial Auctions
%D April 19, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-09
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-09.ps.gz
%X There is interest in designing simultaneous auctions for situations in which the value of assets to a bidder depends upon which other assets he or she wins. In such cases, bidders may well wish to submit bids for combinations of assets. When this is allowed, the problem of determining the revenue maximizing set of nonconflicting bids can be a difficult one. We analyze this problem, identifying several different structures of combinatorial bids for which computational tractability is constructively demonstrated and some structures for which computational tractability cannot be guaranteed.
%A Pekec, A.
%T A Winning Strategy for the Ramsey Graph Game
%D April 21, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-10
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-10.ps.gz
%X We consider a "Maker-Breaker" version of the Ramsey Graph Game, RG(n), and present a winning strategy for maker requiring less than n(2^n) moves. This is the fastest winning strategy known so far. We also demonstrate how the ideas presented can be used to develop winning strategies for some related combinatorial games.
%A Jain, R.
%A Werth, J.
%T Airdisks and AirRAID: Modeling and scheduling periodic wireless data broadcast
%D June 15, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-11
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-11.ps.gz
%X We introduce a simple model, called the airdisk, for modeling the access of data transmitted periodically over wireless media as being analogous to the access of data from a standard magnetic disk. We consider several issues related to airdisks, such as their mean rotational latency under certain assumptions. The problem of scheduling the order in which data items are broadcast is analogous to that of determining how data should be laid out on the disk. Two problems of laying out data so as to minimize read time, given information about which data items are of most interest to the clients, are defined; both are shown to be NP-complete. We discuss ways in which the information about which items are of interest to clients can be obtained. Finally we consider how to increase the performance and storage capacity of airdisks, using the magnetic disk analogy as a guide. We suggest using multiple-track airdisks or borrowing the idea of Redundant Arrays of Inexpensive Disks (RAID) which is used for magnetic disks; for the wireless data broadcast environment we call the latter approach airRAID.
%A Thorup, M.
%T Equivalence between sorting and priority queues
%D May 30, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-12
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-12.ps.gz
%X For a RAM with arbitrary word size, it is shown that if we can sort n integers, each contained in one word, in time n*s(n), then (and only then) there is a priority queue with capacity for n integers, supporting `find-min' in constant time and `insert' and `delete' in $s(n)+O(1)$ amortized time. Here it is required that when we insert a key, it is not smaller than the current smallest key. The equivalence holds even if n is limited in terms of the word size w. One application is an O(n(log n)^{1/2+e} + m), e>0, algorithm for the single source shortest path problem on a graph with n nodes and m edges.
%A Grigoriadis, M.D.
%A Khachiyan, L.G.
%T Approximate Minimum-cost Multicommodity Flows in $\tilde O(\eps^{-2}KNM)$ Time
%D June 1, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-13
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-13.ps.gz
%X We show that an $\eps$-approximate solution of the cost-constrained $K$-commodity flow problem on an $N$-node $M$-arc network $G$ can be computed by sequentially solving $O(K(\eps^{-2}+\log K)\log M\log(\eps^{-1}K))$ single-commodity minimum-cost flow problems on the same network. In particular, an approximate minimum-cost multicommodity flow can be computed in $\tilde O(\eps^{-2}KNM)$ running time, where the notation $\tilde O(\cdot)$ means ``up to logarithmic factors''. This result improves the time bound mentioned in Grigoriadis and Khachiyan (1994) by a factor of $M/N$ and that developed recently in Karger and Plotkin (1995) by a factor of $\eps^{-1}$. We also provide a simple $\tilde O(NM)$-time algorithm for single-commodity budget-constrained minimum-cost flows which is $\tilde O(\eps^{-3})$ times faster than the algorithm of Karger and Plotkin (1995).
%A Chvatal, V.
%T Resolution Search
%D June 23, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-14
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-14.ps.gz
%X Branch-and-bound is one of the most popular ways of solving difficult problems such as integer and mixed-integer linear programming problems; when the branching is done on zero-one valued variables, the resulting variant of branch-and-bound is called implicit enumeration. Efficiency of branch-and-bound and implicit enumeration algorithms depends heavily on the branching strategy used to select the next variable to branch on and its value. We propose an alternative to implicit enumeration. Our algorithm, which we call resolution search, seems to suffer less from inappropriate branching strategies than implicit enumeration does.
%A Berry, J.
%A Dean, N.
%A Fasel, P.
%A Goldberg, M.
%A Johnson, E.
%A MacCuish, J.
%A Shannon, G.
%A Skiena, S.
%T LINK: A Combinatorics and Graph Theory Workbench for Applications and Research
%D June 14, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-15
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-15.ps.gz
%X LINK is a set of C++ class libraries that supports applications in discrete mathematics. The libraries include a command-line interpreter and a graphical user interface that allow access to basic data structures such as \Sets\ and \Lists, and a graph hierarchy that includes undirected, directed, and ``mixed'' graphs, which may contain both directed and undirected edges. Many standard data structures including arrays, lists, heaps and binary search trees are within a \Container\ hierarchy. \Sets\ and \Sequences\ are supported within a \Collection\ hierarchy. The data structure hierarchies enable the user to experiment with competing data structure implementations, and with more complex and sophisticated data structures.
If an algorithm has several possible choices of a data structure to be used, a single object can be created that is templated with the particular data structure desired. LINK also contains a set of graph generators, layout algorithms for hypergraphs and binary graphs, and numerous graph algorithms. Interactive visualization of hypergraphs, graphs, and subgraphs is included in LINKGUI, the research tool application for combinatorics and graph theory built on top of the LINK libraries. LINKGUI is a collection of libraries that includes a Motif-based graphical user interface and Tcl-based command-line interface. The ability to select, contract and expand subgraphs and nested subgraphs in either hypergraphs or binary graphs is included in LINKGUI. LINKGUI enables a user to perform unary operations, such as complement, on a particular graph or to combine graphs via binary operations such as graph union or intersection. The resulting graphs can be displayed in multiple windows. Algorithms can be run on graphs where subgraphs have been contracted into single vertices. Other LINK applications can be developed by using various parts of the library code and adding code specific to the application. This new code can include an entirely different interface from that supplied by LINKGUI.
%A DasGupta, B.
%A Furer, M.
%T Bipartite Steinhaus Graphs
%D June 12, 1995
%Z Mon, 14 Sep 1998 20:00:00 GMT
%I DIMACS
%R 95-16
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-16.ps.gz
%X Steinhaus graphs are simple undirected graphs in which the first row of the adjacency matrix A=(a_{r,s}) (excluding the very first entry which is always 0) is an arbitrary sequence of zeros and ones and the remaining entries in the upper triangular part of A are defined by a_{r,s} = (a_{r-1,s-1} + a_{r-1,s}) mod 2 (for 2 <= r < s <= n). Such graphs have already been studied for their various properties.
In this paper we characterize bipartite Steinhaus graphs, and use this characterization to give an exact count as well as linear upper and lower bounds for the number of such graphs on n vertices. These results answer affirmatively some questions posed by W.M. Dymacek (Discrete Mathematics, 59 (1986) pp. 9-20). %A DasGupta, B. %A Sontag, E. %T Sample Complexity for Learning Recurrent Perceptron Mappings %D June 19, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-17 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-17.ps.gz %X Recurrent perceptron classifiers generalize the classical perceptron model. They take into account those correlations and dependences among input coordinates which arise from linear digital filtering. This paper provides tight bounds on sample complexity associated with the fitting of such models to experimental data. (This research was supported in part by US Air Force Grant AFOSR-94-0293) %A Chen, T. %A Skiena, Steven S. %T Sorting with Fixed-Length Reversals %D July 11, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-18.ps.gz %X We investigate the problem of sorting circular permutations of length $n$ using reversals of length $k$. Of interest are the connectivity and diameter of the underlying group. For all values of $n$ and $k$, we characterize the number of connected components of the underlying group. To bound the diameter of the group, we give an algorithm to sort all sortable circular permutations in $O(n^{2}/k+nk)$ reversals, and show that $\Omega(n^{2}/k^{2}+n)$ reversals are necessary. %A Hannenhalli, S. %A Pevzner, Pavel P. %A Lewis, Herbert F. %A Skiena, Steven S. 
%T Positional Sequencing by Hybridization %D July 11, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-19.ps.gz %X Sequencing By Hybridization (SBH) is a promising alternative to the classical DNA sequencing approaches. However, the resolving power of SBH is rather low: with 64 Kb sequencing chips unknown DNA fragments only as long as 200 bp can be reconstructed in a single SBH experiment. To improve the resolving power of SBH, Broude {\em et al.}, 1994\nocite{Broude94} recently suggested {\em positional SBH} (PSBH), which allows (with additional experimental work) measuring approximate positions of every $l$-tuple in a target DNA fragment. We study the {\em positional eulerian path} problem motivated by PSBH. The input to the positional eulerian path problem is an eulerian graph $G(V,E)$ in which every edge has an associated range of integers and the problem is to find an eulerian path $e_1, \ldots, e_{|E|}$ in $G$ such that the range of $e_i$ contains $i$. We show that the positional eulerian path problem is NP-complete even when the maximum out-degree (in-degree) of any vertex in the graph is 2. On a positive note, we present polynomial algorithms to solve a special case of PSBH (bounded PSBH), where the range of the allowed positions for any edge is bounded by a constant (it corresponds to accurate experimental measurements of positions in PSBH). Moreover, if the positions of every $l$-tuple in an unknown DNA fragment of length $n$ are measured with $O(\log n)$ error, then our algorithm runs in polynomial time. We also present an estimate of the resolving power of PSBH for the more realistic case when positions are measured with $\Theta(n)$ error. %A Mahadev, N.V.R. %A Pekec, A. %A Roberts, F.S. %T Single Machine Scheduling with Earliness and Tardiness Penalties: When is an Optimal Solution Not Optimal? 
%D July 7, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-20.ps.gz %X We consider the problem of finding the optimal schedule for jobs on a single machine when there are penalties for both late and early arrivals. We point out that if attention is paid to how certain parameters are measured, then a change of scale of measurement might lead to the anomalous situation where a schedule is optimal if these parameters are measured in one way, but not if they are measured in a different way that seems equally acceptable. We discuss conditions under which this anomaly is avoided. %A Feigenbaum, J. %A Koller, D. %A Shor, P. %T A Game-Theoretic Classification of Interactive Complexity Classes %D July 6, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-21.ps.gz %X Game-theoretic characterizations of complexity classes have often proved useful in understanding the power and limitations of these classes. One well-known example tells us that PSPACE can be characterized by two-person, perfect-information games in which the length of a played game is polynomial in the length of the description of the initial position [Chandra et al., Journal of the ACM, 28 (1981), pp.~114-133].

In this paper, we investigate the connection between game theory and interactive computation. We formalize the notion of a "polynomially definable game system" for the language L, which, informally, consists of two arbitrarily powerful players P1 and P2 and a polynomial-time referee V with a common input w. Player P1 claims that w is in L, and player P2 claims that w is not in L; the referee's job is to decide which of these two claims is true. In general, we wish to study the following question:

What is the effect of varying the system's game-theoretic properties on the class of languages recognizable by polynomially definable game systems?

There are many possible game-theoretic properties that we could investigate in this context. The focus of this paper is the question of what happens when one or both of the players P1 and P2 have imperfect information or imperfect recall.

We use polynomially definable game systems to derive new characterizations of the complexity classes NEXP and coNEXP. We also derive partial results about other exponential complexity classes and isolate some intriguing open questions about the effects of imperfect information and imperfect recall. These results make use of recent work on complexity-theoretic aspects of games, e.g., [Koller et al., Proc. 26th ACM Symposium on Theory of Computing, 1994, pp. 750-759] and [Newman, Information Processing Letters, 39 (1991), pp.~67-71].

These results were presented in preliminary form at the 10th IEEE Structure in Complexity Theory Conference, Minneapolis MN, June 1995.

%A Feigenbaum, J. %T The Use of Coding Theory in Computational Complexity %D July 6, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-22.ps.gz %X The interplay of coding theory and computational complexity theory is a rich source of results and problems. We survey three of the major themes in this area:

1. the use of codes to improve algorithmic efficiency
2. the theory of program testing and correcting, which is a complexity theoretic analogue of error detection and correction
3. the use of codes to obtain characterizations of traditional complexity classes such as NP and PSPACE; these new characterizations are in turn used to show that certain combinatorial optimization problems are as hard to approximate closely as they are to solve exactly.

This article is based on a talk given at the Short Course on Coding Theory at the American Mathematical Society meeting in San Francisco, CA in January, 1995.

%A Penrice, S. %T Balanced Graphs and Network Flows %D July 21, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-23 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-23.ps.gz %X A graph $G$ is {\em balanced\/} if the maximum ratio of edges to vertices, taken over all subgraphs of $G$, occurs at $G$ itself. This note uses the max-flow/min-cut theorem to prove a good characterization of balanced graphs. This characterization is then applied to some results on how balanced graphs may be combined to form a larger balanced graph. In particular, we show that edge-transitive graphs and complete $m$-partite graphs are balanced, that a product or lexicographic product of balanced graphs is balanced, and that the normal product of a balanced graph and a regular graph is balanced. %A Penrice, Stephen %T Clique-like Dominating Sets %D July 24, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-24 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-24.ps.gz %X Continuing work by B\'{a}cso and Tuza and Cozzens and Kelleher, we investigate dominating sets which induce subgraphs with small clique covering number, or small independence number. We show that if a graph $G$ is connected and contains no induced subgraph isomorphic to $P_6$ or $H_t$ (the graph obtained by subdividing each edge of $K_{1,t}, t \geq 3$), then $G$ has a dominating set which induces a connected graph with clique covering number at most $t-1$. We then show that if $G$ has no isolated vertices and contains no induced subgraph isomorphic to $mK_2$, then $G$ contains a dominating set with independence number at most $2m-2$. For both of these results, the bounds are best possible. 
Finally, we prove a theorem which implies similar bounds for a variety of classes of graphs, and we show that if $G$ is connected and contains no induced subgraph isomorphic to $P_k$ or $H_t$, then the domination number of $G$ is bounded above by a constant that depends only on $k$, $t$, and the clique number of $G$. %A Cowen, Lenore %A Jesurum, Esther %T Coloring with Defects %D July 24, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-25.ps.gz %X This paper is concerned with algorithms and complexity results for defective coloring, where a defective (k,d)-coloring is a k-coloring of the vertices of a graph such that each vertex is adjacent to at most d self-colored neighbors. First, (2,d)-coloring is shown to be NP-complete for d >= 1, even for planar graphs, and (3,1)-coloring is also shown to be NP-complete for planar graphs (while there exists a quadratic algorithm to (3,2)-color any planar graph). A reduction from ordinary vertex coloring then shows that (X,d)-coloring is NP-complete for any X >= 3, d >= 0, and also yields hardness of approximation results.

Second, a generalization of Delta + 1 coloring to defects is explored for graphs of maximum degree Delta. Based on a theorem of Lovasz, we obtain an O(Delta E) algorithm to (k, \lfloor Delta/k \rfloor)-color any graph; this yields an O(E) algorithm to (2,1)-color 3-regular graphs, and (3,2)-color 6-regular graphs.

The generalization of Delta + 1 coloring is used in turn to generalize the polynomial-time approximate 3- and k-coloring algorithms of Wigderson and Karger-Motwani-Sudan to allow defects. For approximate 3-coloring, we obtain an O(Delta E) time algorithm to $(\lceil({8n \over d})^{.5}\rceil,d)$ color, and a polynomial time algorithm to $(O((\frac{n} {d})^{.387}), d)$ color any 3-colorable graph. %A Cowen, Lenore %A Goddard, Wayne %T Defective Coloring of Toroidal Graphs and Graphs of Bounded Genus %D July 24, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-26.ps.gz %X A graph is (k,d)-colorable if one can color the vertices with k colors such that no vertex is adjacent to more than d vertices of the same color. In this paper we investigate the existence of such colorings in surfaces.

It is shown that a toroidal graph is (3,2)- and (5,1)-colorable, and that a graph of genus g is (X(g) /(d+1) + 4,d)-colorable, where X(g) is the maximum chromatic number of a graph embeddable on the surface of genus g. %A Cowen, Lenore %A Feigenbaum, Joan %A Kannan, Sampath %T A Formal Framework for Evaluating Heuristic Programs %D July 26, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-27.ps.gz %X We address the question of how one evaluates the usefulness of a heuristic program on a particular input. If theoretical tools do not allow us to decide for every instance whether a particular heuristic is fast enough, might we at least write a simple, fast companion program that makes this decision on some inputs of interest? We call such a companion program a timer for the heuristic. Timers are related to program checkers, as defined by Blum, in the following sense: Checkers are companion programs that check the correctness of the output produced by (unproven but bounded-time) programs on particular instances; timers, on the other hand, are companion programs that attempt to bound the running time on particular instances of correct programs whose running times have not been fully analyzed. This paper provides a family of definitions that formalize the notion of a timer and some preliminary results that demonstrate the utility of these definitions. %A Mirkin, Boris %A Muchnik, Ilya %A Smith, Temple F. %T A Biologically Meaningful Model for Comparing Molecular Phylogenies %D August 9, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-28.ps.gz %X In the framework of the problem of combining different gene trees into a unique species phylogeny, a model for duplication/speciation/loss events along the evolutionary tree is introduced. 
The model is employed for embedding a phylogeny tree into another one via the so-called Duplication/Speciation principle, which requires that a duplicated gene evolves in such a way that any of the contemporary species involved bears only one of the diverged gene copies. The number of biologically meaningful elements in the embedding result (duplications, losses, information gaps) is considered an (asymmetric) dissimilarity measure between the trees. The model's duplication concept is compared with the one defined previously in terms of a mapping procedure for the trees. A graph-theoretic reformulation of the measure is derived. %A Penrice, Stephen %T Some New Graph Labeling Problems: A Preliminary Report %D July 25, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-29.ps.gz %X All "labelings" discussed in this paper are vertex labelings. We call a labeling $l$ of $G$ an $N$-labeling if there exists a connected induced subgraph $H$ of $G$ with $\sum_{x \in V(H)}l(x) = i$ for every integer $i$, $1 \leq i \leq N$. Then $\sigma_c(G)$ is the largest integer $N$ such that $G$ has an $N$-labeling. A labeling $l$ of $G$ is called {\em irredundant\/} if, for all connected induced subgraphs $H_1$ and $H_2$, $\sum_{x \in V(H_1)}l(x) = \sum_{x \in V(H_2)}l(x)$ if and only if $H_1 = H_2$. Then $\sigma_p(G)$ is the smallest integer $N$ such that there exists an irredundant labeling $l$ of $G$ with $\sum_{x \in V(G)}l(x) = N$. In this report, we study $\sigma_c(G)$ and $\sigma_p(G)$ in the cases where $G$ is a complete graph minus one edge, a star, a path, or a cycle. We obtain some preliminary results, but there are many interesting questions left open. 
%A Beals, Robert %T Algorithms for matrix groups and the Tits alternative %D August 7, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-30.ps.gz %X Tits has shown that a finitely generated linear group either contains a nonabelian free group or has a solvable subgroup of finite index. We give a polynomial time algorithm for deciding which of these two conditions holds for a given finitely generated matrix group over an algebraic number field. Noting that many computational problems are undecidable for groups with nonabelian free subgroups, we investigate the complexity of problems relating to linear groups with solvable subgroups of finite index.

For such a group G, we are able in polynomial time to compute a homomorphism phi such that phi(G) is a finite matrix group and the kernel of phi is solvable. If in addition G has a nilpotent subgroup of finite index, we obtain much stronger results. These include an effective encoding of elements of G such that the encoding length of an element obtained as a product of length L over the generators is O(log L) times a polynomial in the input length. This result is the best possible; it has been shown by Tits and Wolf that if a finitely generated matrix group does not have a nilpotent subgroup of finite index, then the number of group elements expressible as words of length L over the generators grows as c^L for some constant c>1 depending on G.

For groups with abelian subgroups of finite index, we obtain a Las Vegas algorithm for several basic computational tasks including membership testing and computing a presentation. This generalizes recent work of Beals and Babai, who give a Las Vegas algorithm for the case of finite groups, as well as recent work of Babai, Beals, Cai, Ivanyos, and Luks, who give a deterministic algorithm for the case of abelian groups.

%A Beals, Robert %T Improved construction of negation-limited circuits %D August 7, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-31.ps.gz %X A theorem of Markov states that any system of boolean functions on n variables may be computed by a boolean circuit containing at most log(n+1) negation gates. We call such a circuit negation limited. A circuit with inputs x_1, ..., x_n and outputs neg(x_1), ..., neg(x_n) is called an inverter. Fischer has constructed negation-limited inverters of size O(n^2 log n) and depth O(log n). Recently, Tanaka and Nishino have reduced the circuit size to O(n log^2 n) at the expense of increasing the depth to log^2 n. We construct negation-limited inverters of size O(n log n), with depth only O(log n). We also improve a technique of Valiant for constructing monotone circuits for slice functions (introduced by Berkowitz). %A Babai, L. %A Beals, R. %A Cai, J-y. %A Ivanyos, G. %A Luks, E.M. %T Multiplicative equations over commuting matrices %D August 8, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-32.ps.gz %X We consider the solvability of the equation A_1^x_1 * ... * A_k^x_k = B and generalizations, where the A_i and B are given commuting matrices over an algebraic number field F. In the semigroup membership problem, the variables x_i are constrained to be nonnegative integers. While this problem is NP-complete for variable k, we give a polynomial time algorithm if k is fixed. In the group membership problem, the matrices are assumed to be invertible, and the variables x_i may take on negative values. In this case we give a polynomial time algorithm for variable k and give an explicit description of the set of all solutions (as an affine lattice).

The results generalize recent work of Cai, Lipton, and Zalcstein [CLZ] where the case k=2 is solved using Jordan Normal Forms (JNF). We achieve greater clarity, simplicity, and generality by eliminating the use of JNFs and referring to elementary concepts of the structure theory of algebras instead (notably, the radical and the local decomposition).

Partial solutions are combined using algorithms for affine lattices. The special case of 1*1 matrices was recently solved by G. Ge and we heavily rely on his results. %A Ostheimer, Gretchen %T Algorithms for Polycyclic-by-finite Matrix Groups: Preliminary Report %D August 8, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-33.ps.gz %X Let R be the ring of integers or a number field. We present several algorithms for working with polycyclic-by-finite subgroups of GL(n,R). Let G be a subgroup of GL(n,R) given by a finite generating set of matrices. We describe an algorithm for deciding whether or not G is polycyclic-by-finite. For polycyclic-by-finite G, we describe an algorithm for deciding whether or not a given matrix is an element of G. We also describe an algorithm for deciding whether or not G is solvable-by-finite, providing an alternative to the algorithm proposed by Beals ([Be1]) for this problem.

Baumslag, Cannonito, Robinson and Segal prove that the problem of determining whether or not a finitely generated subgroup of GL(n,Z) is polycyclic-by-finite is decidable and that the problem of testing membership in a polycyclic-by-finite subgroup of GL(n,Z) is also decidable ([BCRS]). In this report we extend these results by describing algorithms which appear to be suitable for computer implementation. Experimentation is needed to determine the range of input for which they are practical.

Our method is to first reduce each problem to the corresponding problem for triangularizable matrix groups. The reduction is an easy consequence of the result of Dixon ([Di1]) that subgroups of the p-congruence subgroup of GL(n,Z_p) are connected in the Zariski topology, where Z_p is the ring of p-adic integers for a prime p.

We then prove a structure theorem for triangularizable matrix groups that allows us to decide whether or not a matrix group is triangularizable and to reduce the problem of testing membership in a polycyclic, triangularizable matrix group to the corresponding problems for finitely generated abelian matrix groups and for finitely generated unitriangular matrix groups. In the case of an abelian matrix group we can find a presentation for the group, and our membership test can be made constructive. For these results we rely heavily on the work of Ge ([Ge]) concerning algorithms for multiplicative subgroups of a number field. We also rely on an algorithm of Beals ([Be2]) to decide whether or not a triangularizable matrix group is polycyclic. %A Berry, Jonathan W. %A Goldberg, Mark K. %T Path Optimization for Graph Partitioning Problems %D August 7, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-34.ps.gz %X This paper presents a new heuristic for graph partitioning called Path Optimization (PO), and the results of an extensive set of empirical comparisons of the new algorithm with two very well known algorithms for partitioning: the Kernighan-Lin algorithm and simulated annealing.

Our experiments are described in detail, and the results are presented in such a way as to reveal performance trends based on several variables. Sufficient trials are run to obtain 99% confidence intervals small enough to lead to a statistical ranking of the implementations for various circumstances.

The results for geometric graphs, which have become a frequently-used benchmark in the evaluation of partitioning algorithms, show that PO holds an advantage over the others.

In addition to the main test suite described above, comparisons of PO to more recent partitioning approaches are also given. We present the results of comparisons of PO with a parallelized implementation of Goemans' and Williamson's 0.878 approximation algorithm, a flow-based heuristic due to Lang and Rao, and the multilevel algorithm of Hendrickson and Leland.

%A Mayoraz, Eddy %A Aviolat, Frederic %T Constructive Training Methods for Feedforward Neural Networks with Binary Weights %D August 7, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-35.ps.gz %X Quantization of the parameters of a Perceptron is a central problem in hardware implementation of neural networks using a numerical technology. A neural model with each weight limited to a small integer range will require little silicon surface. Moreover, according to Ockham's razor principle, better generalization abilities can be expected from a simpler computational model. The price to pay for these benefits lies in the difficulty of training this kind of network. This paper proposes essentially two new ideas for constructive training algorithms, and demonstrates their efficiency for the generation of feedforward networks composed of Boolean threshold gates with discrete weights. A proof of the convergence of these algorithms is given. Some numerical experiments have been carried out and the results are presented in terms of the size of the generated networks and of their generalization abilities. %A Bafna, Vineet %A Narayanan, Babu %A Ravi, R. %T Nonoverlapping Local Alignments, Weighted Independent Sets of Axis Parallel Rectangles %D August 9, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-36 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-36.ps.gz %X We consider the following problem motivated by an application in computational molecular biology. We are given a set of weighted axis-parallel rectangles such that for any pair of rectangles and either axis, the projection of one rectangle does not enclose that of the other. Define a pair to be independent if their projections in both axes are disjoint. The problem is to find a maximum-weight independent subset of rectangles.

We show that the problem is NP-hard even in the uniform case when all the weights are the same. We analyze the performance of a natural local-improvement heuristic for the general problem and prove a performance ratio of 3.25. We extend the heuristic to the problem of finding a maximum-weight independent set in (d+1)-claw free graphs, and show a tight performance ratio of (d - 1 + (1/d)). A performance ratio of d/2 was known for the heuristic when applied to the uniform case. Our contributions are proving the hardness of the problem and providing a tight analysis of the local-improvement algorithm for the general weighted case. %A Srinivasan, Aravind %T Improved Approximation Guarantees for Packing and Covering Integer Programs %D September 18, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-37 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-37.ps.gz %X Several important NP-hard combinatorial optimization problems can be posed as {\em packing/covering integer programs}; the {\em randomized rounding} technique of Raghavan \& Thompson is a powerful tool to approximate them well. We present one elementary unifying property of all these integer programs (IPs), and use the FKG correlation inequality to derive an improved analysis of randomized rounding on them. This also yields a {\em pessimistic estimator}, thus presenting deterministic polynomial-time algorithms for them with approximation guarantees significantly better than those known. %A Rudek, G. %A Romanik, K. %A Whitesides, S. %T Localizing a Robot with Minimum Travel %D September 18, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-40 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-40.ps.gz %X We consider the problem of localizing a robot in a known environment modeled by a simple polygon P. We assume that the robot has a map of P but is placed at an unknown location inside P. 
From its initial location, the robot sees a set of points called the visibility polygon V of its location. In general, sensing at a single point will not suffice to uniquely localize the robot, since the set H of points in P with visibility polygon V may have more than one element. Hence, the robot must move around and use range sensing and a compass to determine its position (i.e. localize itself). We seek a strategy that minimizes the distance the robot travels to determine its exact location.

We show that the problem of localizing a robot with minimum travel is NP-hard. We then give a polynomial time approximation scheme that causes the robot to travel a distance of at most (k-1)d, where k = |H| and d is the length of a minimum length tour that would allow the robot to verify its true initial location by sensing. We also show that this bound is the best possible. %A Kahale, Nabil %T A semidefinite bound for mixing rates of Markov chains %D October 10, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-41 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-41.ps.gz %X We study the method of bounding the spectral gap of a reversible Markov chain by establishing canonical paths between the states. We provide examples where improved bounds can be obtained by allowing variable length functions on the edges. We give a simple heuristic for computing good length functions. Further generalization using multicommodity flow yields a bound which is an invariant of the Markov chain, and which can be computed at an arbitrary precision in polynomial time via semidefinite programming. We show that, for any reversible Markov chain on n states, this bound is off by a factor of at most O(log^2 n), and that this is tight. %A Romanik, Kathleen %T Geometric Probing and Testing - A Survey %D September 5, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-42.ps.gz %X Geometric probing is the area of computational geometry that studies how to identify, verify, or determine some property of an unknown geometric object using a measuring device known as a probe. It has applications in the areas of robotics, automated manufacturing, computer vision, optical character recognition and tomography. 
Geometric testing is the subarea of geometric probing that studies the verification problem - given a target object from a class of objects, it looks at how to find a set of probes that distinguishes the target object from all other objects in the class. In this paper we survey results in the field of geometric testing. %A Winter, Pawel %T Steiner Minimal Trees in Simple Polygons %D September 26, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-43.ps.gz %X An O(n log n) time and O(n) space algorithm for the Euclidean Steiner tree problem with four terminals in a simple polygon with n vertices is given. Its applicability to the problem of determining good quality solutions for any number of terminals is discussed. %A Kempner, Yulia %A Mirkin, Boris %T Monotone Linkage Clustering and Quasi-Convex Set Functions %D October 2, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-44 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-44.ps.gz %X Selecting subsets by adding elements one by one is implicitly employed in many heuristic clustering procedures. Such a procedure, seriation, can be described generally in terms of a linkage function measuring entity-to-set similarities.

A well-known clustering technique, Single linkage, can be considered an example of the seriation procedure (actually, based on the minimum spanning tree construction) leading to the global maximum of a corresponding "minimum split" set function. The purpose of this work is to extend this property to a class of linkage functions called monotone linkages. It is proved that the monotone linkage functions are minimum split functions for the quasi-convex set functions (and only for them). This allows us to prove that the global optima of the quasi-convex set functions can be found with the multiple seriation algorithm based on the corresponding linkage function. %A Karolyi, Gyula %A Tardos, Gabor %T On Point Covers of Multiple Intervals and Axis-Parallel Rectangles %D September 21, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-45.ps.gz %X In certain families of hypergraphs the transversal number is bounded by some function of the packing number. In this paper we study hypergraphs related to multiple intervals and axis-parallel rectangles, respectively. Essential improvements of previously established upper bounds are presented here. We explore the close connection between the two problems at issue. %A Agarwala, Richa %A Bafna, Vineet %A Farach, Martin %A Narayanan, Babu %A Paterson, Michael %A Thorup, Mikkel %T On the Approximability of Numerical Taxonomy (Fitting Distances by Tree Matrices) %D September 18, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-46.ps.gz %X We consider the problem of fitting an nxn distance matrix D by a tree metric T. Let e be the distance to the closest tree metric, that is, e=min_T |T,D|_infinity. First we present an O(n^2) algorithm for finding an additive tree T such that |T,D|_infinity <= 3e. 
Second, we show that it is NP-hard to find a tree T such that |T,D|_infinity < 9/8 e. This paper presents the first algorithm for this problem with a performance guarantee. %A Buja, Andreas %A Dean, Nathaniel %A Littman, Michael L. %A Swayne, Deborah %T Higher Dimensional Representations of Graphs %D October 10, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-47file.ps.gz %X Graphs are often used to model complex systems and to visualize relationships, and this often involves drawing a graph in the plane. For this, a variety of algorithms and mathematical tools have been used with varying success. We demonstrate why it is often more natural and more meaningful to view higher dimensional representations of graphs. We present some of the theory and problems associated with constructing such representations, and we briefly describe some visualization tools which are now available for experimental research in this area. %A Tavare, Simon (Host) %T Proceedings of Phylogeny Workshop %D October 18, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-48 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-48.ps.gz %X Phylogeny reconstruction remains a dynamic field with many challenging problems, both theoretical and applied. This report attempts to capture some of the recent developments, as represented by the speakers at the DIMACS workshop.

The organizing committee included Michael Bulmer, Martin Farach, Peter Smouse and Bob Vrijenhoek from Rutgers, Simon Tavare from USC, and Tandy Warnow from the University of Pennsylvania. We are indebted to DIMACS for their help in organizing the workshop and for arranging for publication of the proceedings. We would like to thank the participants, approximately 100 phylogeny enthusiasts, for making the workshop such a success.

The work of the organizing committee, and the success of the workshop, was greatly enhanced by the support of Pat Toci and Sandy Barbu who managed all the local arrangements. Sandy Barbu converted a wide variety of write-ups into the coherent format of the final Technical Report.

Finally, we thank Walter Fitch (UC Irvine), a pioneer in the field of phylogenetics, for getting the workshop off to a great start. We gave him the difficult task of providing "An introduction to molecular evolution for mathematicians, statisticians and computer scientists," a challenge he met with great skill and energy. %A Karolyi, Gyula %A Pach, Janos %A Toth, Geza %T Ramsey-Type Results for Geometric Graphs %D October 20, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-49 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-49.ps.gz %X For any 2--coloring of the ${n \choose 2}$ segments determined by $n$ points in general position in the plane, at least one of the color classes contains a non-selfintersecting spanning tree. Under the same assumptions, we also prove that there exist $[{{n+1} \over 3}]$ pairwise disjoint segments of the same color, and this bound is tight. The above theorems were conjectured by Bialostocki and Dierker. Furthermore, improving an earlier result of Larman et al., we construct a family of $m$ segments in the plane, which has no more than $m^{\log 4/\log 27}$ members that are either pairwise disjoint or pairwise crossing. Finally, we discuss some related problems and generalizations. %A Shelah, Saharon %T Zero one laws for graphs with edge probabilities decaying with distance %D December 13, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-50 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-50.ps.gz %X Let G(n) be the random graph on [n]={1,...,n} with the possible edge {i,j} having probability p(|i-j|)=1/(|i-j|^a), 0 < a < 1 irrational. We prove that the zero one law (for first order logic) holds.
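
The random graph model in report 95-50 can be illustrated with a short sampling sketch. This is only an illustration under stated assumptions: the function name and the seed parameter are hypothetical, and a floating-point exponent can only approximate an irrational a.

```python
import random


def sample_distance_decay_graph(n, a, seed=None):
    """Sample a graph on {1,...,n}: edge {i,j} appears with probability 1/|i-j|^a.

    Hypothetical helper for illustration; it is not taken from the report.
    Edges are drawn independently and returned as a set of pairs (i, j), i < j.
    """
    rng = random.Random(seed)
    edges = set()
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            # Edge probability decays with the distance j - i.
            if rng.random() < (j - i) ** (-a):
                edges.add((i, j))
    return edges


if __name__ == "__main__":
    # 2 ** -0.5 is a floating-point stand-in for the irrational 1/sqrt(2).
    graph = sample_distance_decay_graph(10, 2 ** -0.5, seed=0)
    print(sorted(graph))
```

Since p(1)=1, consecutive vertices are always adjacent in this model, so every sample contains the path 1, 2, ..., n.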
%A Shelah, Saharon %T Finite Canonization %D December 13, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-51 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-51.ps.gz %X The canonization theorem says that for given m,n for some m(*) (the first one is called ER(n,m)) we have: for every function f with domain [{1,...,m(*)}]^n, for some A in [{1,...,m(*)}]^m, the question of when the equality f({i(1),...,i(n)}) = f({j(1),...,j(n)})

(where i(1) < ... < i(n) and j(1) < ... < j(n) are from A) holds has the simplest answer: for some subset v of {1,...,n} the equality holds if and only if

i(l)=j(l) for all l in v.

We improve the bound on ER(n,m) so that, for fixed n, the number of exponentiations needed to calculate ER(n,m) is the best possible. %A Shelah, Saharon %T In the random Graph G(n,p),p=n^{-\alpha}: If \psi has probability O(n^{-\epsilon}) for every \epsilon>0, then it has probability O(e^{-n^{\epsilon}}) for some \epsilon>0 %D October 24, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-52 %X Shelah and Spencer proved the 0-1 law for the random graphs G(n,p(n)), p(n)=n^{-a}, 0 < a < 1 irrational (the set of nodes is [n]={1,...,n}, the edges are drawn independently, and the probability of an edge is p(n)). One may wonder what can be said about sentences F for which

Prob[G(n,p(n)) satisfies F] converges to zero.

Lynch asked the question and did the analysis, getting (for every F):

(i) Prob[G(n,p(n)) satisfies F] = cn^{-b}+ O(n^{-b-v}) for some v such that b>v>0

or

(ii) Prob[G(n,p(n)) satisfies F] = O(n^{-v}) for every v>0.

Lynch conjectured that in case (ii) we have

(++) Prob[G(n,p(n)) satisfies F] = O(e^{-n^v}) for some v>0.

We prove it here. %A Shelah, Saharon %T Very weak zero one law for random graphs with order and random binary functions %D October 24, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-53 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-53.ps.gz %X Let G(<,n,p) denote the usual random graph G(n,p) on a totally ordered set of n vertices. (We naturally think of the vertex set as 1,...,n with the usual <). We will fix p=1/2 for definiteness. Let L(<) denote the first order language with predicates equality (x=y), adjacency (x~y) and less than (x < y). For any sentence A in L(<) let f(n)=f(A,n) denote the probability that the random G(<,n,p) has property A. It is known that there are A for which f(n) does not converge. Here we show what is called "a very weak zero-one law"

Theorem: For every A in language L(<) lim [f(A,n+1)-f(A,n)] = 0. Note, as an extreme example, that this implies the nonexistence of a sentence A holding with probability 1-o(1) when n is even and with probability o(1) when n is odd.

In Section 2 we give the proof, based on a circuit complexity result. In Section 3 we prove that result, which is very close to the now classic theorem that parity cannot be given by an AC^0 circuit. In Section 4 we give a very weak zero-one law for random two-place functions. The proof is very similar, the random function theorem being perhaps of more interest to logicians, the random graph theorem to discrete mathematicians. %A Erdos, Pal %A Hajnal, Andras %A Pach, Janos %T On a Metric Generalization of Ramsey's Theorem %D December 13, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-54 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-54.ps.gz %X An increasing sequence of reals $x=\langle x_i: i < \omega\rangle$ is simple if all ``gaps'' $x_{i+1}-x_i$ are different. Two simple sequences $x$ and $y$ are distance similar if $x_{i+1}-x_i < x_{j+1}-x_j$ if and only if $y_{i+1}-y_i < y_{j+1}-y_j$ for all $i$ and $j$. Given any bounded simple sequence $x$ and any coloring of the pairs of rational numbers by a finite number of colors, we prove that there is a sequence $y$ distance similar to $x$ all of whose pairs are of the same color. We also consider many related problems and generalizations. %A Kayll, P. Mark %T Asymptotically Good Covers in Hypergraphs: Extended Abstract of the Dissertation %D December 13, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-55 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-55.ps.gz %X In the early 1980's, V. R\"{o}dl proved the Erd\H{o}s-Hanani Conjecture, sparking a remarkable sequence of developments in the theory of packing and covering in hypergraphs with bounded edge size. Generalizations were given by P. Frankl and R\"{o}dl, by N. Pippenger, and by others. In each case, an appropriate {\em semirandom} method was used to ``construct'' the desired optimal object (covering, matching, coloring) in several random stages, followed by a greedy stage.

The current work, which further generalizes some of the above results, is again probabilistic, and uses, in addition to earlier ideas, connections with so-called {\em normal} distributions on the set of matchings of a graph. For fixed $k\geq 2$, ${\cal H}$ a $k$-bounded hypergraph, and $t:{\cal H}\rightarrow \mbox{{\bf R}}^+$ a fractional cover, a sufficient condition is given to ensure that the edge cover number $\rho({\cal H})$, {\em i.e.}, the size of a smallest set of edges of ${\cal H}$ with union $V({\cal H})$, is asymptotically at most $t({\cal H}) = \sum_{A\in {\cal H}}t(A)$. This settles a conjecture first publicized in Visegr\'{a}d, June 1991. %A Belegradek, Oleg V. %A Stolboushkin, Alexei P. %A Taitslin, Michael A. %T Relational expressive power of local generic queries %D December 13, 1995 %Z Mon, 14 Sep 1998 20:00:00 GMT %I DIMACS %R 95-56 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1995/95-56.ps.gz %X Consider a scheme of databases Q and two signatures: L_0={<}, L={<, Omega}, where Omega is a finite relational signature. FO is the first-order language. FO in {L_0,Q} is called restricted. FO in {L,Q} is called extended. Consider a countable universe U in L. Let V be a saturated elementary extension of U in power aleph_1. There is the only such V up to elementary isomorphisms over U. Th.1. An extended query Phi is equal for all the finite database states over U to a restricted query iff Phi is generic for all the pseudo-finite database states over V. Th.2. If U is o-minimal, every locally generic for all the finite database states over U extended query is generic for all the pseudo-finite database states over V. Divisible ordered Abelian groups are o-minimal structures. So, for example, groups of rational and real numbers are o-minimal structures. But the additive group of the integer numbers and the additive semigroup of the natural numbers are not. Th.3. 
If U is the additive group of the integer numbers or is the additive semigroup of the natural numbers, every locally generic for all the finite database states over U extended query is generic for all the pseudo-finite database states over V. We propose a notion of local genericity for extended queries over finitely representable order database states. We show that every finitely representable order database state can be represented by a finite state (of another scheme) such that these two states are uniformly FO-translatable to one another. Using this result, we show that, for any divisible ordered Abelian group, the additive group of the integer numbers, the additive semigroup of the natural numbers, and any o-minimal structure, every locally generic extended query over finitely representable order database states can be translated into a restricted query. Over all rational database states, however, these two query languages differ. %A Michael A. Burr %A Eynat Raflin %A Diane L. %T Simplicial Depth: An Improved Definition, Analysis, and Efficiency for the Finite Sample Case %D Mon Feb 2 12:55:28 2004 %Z Wed Feb 4 13:06:52 EST 2004 %I DIMACS %R 2003-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-28.ps.gz %X As proposed by Liu 1990 the simplicial depth of a point $x$ with respect to a probability distribution $F$ on $R^d$ is the probability that $x$ belongs to a random simplex in $R^d$. The simplicial depth of $x$ with respect to a data set $S$ in $R^d$ is the fraction of the closed simplices given by $d+1$ of the data points containing the point $x$. We propose an alternative definition for simplicial depth which continues to remain valid over a continuous probability field, but also fixes some of the problems for the finite sample case, including those discussed by Zuo and Serfling 2000. 
Additionally, we discuss the effect of the revised definition on the efficiency of previously developed algorithms and prove tight bounds on the value of the simplicial depth based on the half-space depth. %A Amr Elmasry %T Deterministic Jumplists %D Mon Sep 22 11:37:51 2003 %Z Wed Feb 4 13:07:15 EST 2004 %I DIMACS %R 2003-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-29.ps.gz %X A new dictionary data structure is introduced. This structure provides the usual dictionary operations, i.e., CONTAINS, INSERT, DELETE, in amortized logarithmic time, and the SUCCESSOR operation in constant time. The new structure does not involve duplicate indexing information or extra pointers; instead it relies on a linked list whose nodes are endowed with a jump pointer to speed up the search, for a total of only two pointers per node. Different dictionary operations are implemented in an easy, efficient and compact way. Moreover, the jumplists can be easily extended to be used as a higher dimensional search structure. %A Graham Cormode %A S. Muthukrishnan %T What's New: Finding Significant Differences in Network Data Streams %D Thu Oct 30 11:39:26 2003 %Z Wed Feb 4 13:07:55 EST 2004 %I DIMACS %R 2003-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-31.ps.gz %X Monitoring and analyzing network traffic usage patterns is vital for managing IP Networks. An important problem is to provide network managers with information about changes in traffic, informing them about "what's new". Specifically, we focus on the challenge of finding significant differences in traffic: over time, between interfaces and between routers. We introduce the idea of a "deltoid": an item that has a large difference, whether the difference is absolute, relative or variational.

We design algorithms for finding the most significant deltoids in high speed data, and prove that they use small space, small time per update, and are guaranteed to find significant deltoids with pre-specified accuracy. In experimental evaluation, our algorithms perform well and recover almost all deltoids. This is the first work to provide solutions capable of working with one pass, at network traffic speeds. %A Xiaomin Chen %T The Sylvester-Chvatal Theorem %D Fri Oct 17 14:16:55 2003 %Z Wed Feb 4 13:08:07 EST 2004 %I DIMACS %R 2003-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-32.ps.gz %X The Sylvester-Gallai theorem asserts that every finite set $S$ of points in two-dimensional Euclidean space includes two points, $a$ and $b$, such that either no other point of $S$ is on the line $ab$, or the line $ab$ contains all the points in $S$. V. Chv\'{a}tal extended the notion of lines to arbitrary metric spaces and made a conjecture that generalizes the Sylvester-Gallai theorem.

In the present article we prove this conjecture to be true. %A Christopher Schwarz %A Denise Sakai Troxell %T L(2,1)-Labelings of Products of Two Cycles %D Thu Oct 30 17:36:13 2003 %Z Wed Feb 4 13:08:19 EST 2004 %I DIMACS %R 2003-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-33.ps.gz %X An L(2,1)-labeling of a graph is an assignment of nonnegative integers to its vertices so that adjacent vertices get labels at least two apart and vertices at distance two get distinct labels. The $\lambda$-number of a graph G, denoted by $\lambda(G)$, is the minimum range of labels taken over all of its L(2,1)-labelings. We show that the $\lambda$-number of the Cartesian product of any two cycles is 6, 7 or 8. In addition, we provide complete characterizations for the products of two cycles with $\lambda$-number exactly equal to each one of these values. %A Meltem Ozturk %A Alexis Tsoukias %A Philippe Vincke %T Preference Modelling %D Fri Oct 31 07:56:49 2003 %Z Wed Feb 4 13:08:42 EST 2004 %I DIMACS %R 2003-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-34.ps.gz %X This paper provides the reader with a presentation of preference modelling fundamental notions as well as some recent results in this field. Preference modelling is an inevitable step in a variety of fields: economy, sociology, psychology, mathematical programming, even medicine, archaeology, and obviously decision analysis. Our notation and some basic definitions, such as those of binary relation, properties and ordered sets, are presented at the beginning of the paper. We start by discussing different reasons for constructing a model or preference. We then go through a number of issues that influence the construction of preference models. Different formalisations besides classical logic such as fuzzy sets and non-classical logics become necessary. 
We then present different types of preference structures reflecting the behavior of a decision-maker: classical, extended and valued ones. It is relevant to have a numerical representation of preferences: functional representations, value functions. The concepts of thresholds and minimal representation are also introduced in this section. In section 7, we briefly explore the concept of deontic logic (logic of preference) and other formalisms associated with "compact representation of preferences" introduced for special purposes. We end the paper with some concluding remarks. %A Endre Boros %A Khaled Elbassioni %A Vladimir Gurvich %A Leonid Khachiyan %T On Enumerating Minimal Dicuts and Strongly Connected Subgraphs %D Fri Nov 7 13:40:19 2003 %Z Wed Feb 4 13:08:56 EST 2004 %I DIMACS %R 2003-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-35.ps.gz %X We consider the problems of enumerating all minimal strongly connected subgraphs and all minimal dicuts of a given directed graph $\G$. We show that the first of these problems can be solved in incremental polynomial time, while the second problem is NP-hard: given a collection of minimal dicuts for $G$, it is NP-complete to tell whether it can be extended. The latter result implies, in particular, that for a given set of points $\cA\subseteq\RR^n$, it is NP-hard to generate all maximal subsets of $\cA$ contained in a closed half-space through the origin. We also discuss the enumeration of all minimal subsets of $\cA$ whose convex hull contains the origin as an interior point, and show that this problem includes as a special case the well-known hypergraph transversal problem. %A I. E. Zverovich %T A solution to a problem of Le %D Sat Nov 8 09:09:13 2003 %Z Wed Feb 4 13:09:17 EST 2004 %I DIMACS %R 2003-36 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-36.ps.gz %X replacing each $S_i$ by a new vertex $q_i$,

joining each $q_i$ and $q_j$, $1 \le i \neq j \le t$, and

joining $q_i$ to all vertices in $H - (S_1 \cup S_2 \cup\cdots \cup S_t)$ which were adjacent to some vertex of $S_i$.

A {\em cograph} is a $P_4$-free graph. A graph $G$ is called a {\em cograph contraction} if there exist a cograph $H$ and pairwise disjoint non-empty stable sets in $H$ for which $G \simeq H^*$. Solving a problem proposed by Le~\cite{Le99}, we give a finite forbidden induced subgraph characterization of cograph contractions.

{\bf Keywords:} {Cograph contractions, perfect graphs, weakly chordal graphs, forbidden induced subgraphs}.

{\bf 2000 Mathematics Subject Classification:} 05C17 %A Fanica Gavril %T Perfect interval filament graphs %D Fri Nov 14 12:00:36 2003 %Z Wed Feb 4 13:09:34 EST 2004 %I DIMACS %R 2003-37 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-37.ps.gz %X construct to each interval i \in I a curve f_i connecting its two endpoints, such that if two intervals are disjoint, their curves do not intersect; FI={f_i | i \in I} is a family of interval filaments and its intersection graph is an interval filament graph. The interval filament graphs contain the polygon-circle graphs, the circle graphs, the chordal graphs and the cocomparability graphs. Similar families of intersection graphs of filaments can be defined using families of circular-arcs of a circle and families of subtrees of a tree or of cactus.

In the present paper we describe polynomial time algorithms to find holes and antiholes of a given parity in interval filament graphs, circular-arc filament graphs and subtree filament graphs on a tree and on a cactus; assuming that Berge's SPGC is true, these algorithms give polynomial time algorithms to test for perfectness. In addition we describe polynomial time algorithms to find minimum coverings by cliques for their subfamilies of perfect graphs. %A Alexis Tsoukias %T On the concept of decision aiding process %D Fri Nov 14 13:05:59 2003 %Z Wed Feb 4 13:09:44 EST 2004 %I DIMACS %R 2003-38 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-38.ps.gz %X extension of the decision process one. The aim of the paper is to analyse the type of activities occurring between a ``client'' and an ``analyst'' both engaged in a decision process. The decision aiding process is analysed from both a cognitive and an operational point of view: the ``products'', or cognitive artifacts the process will deliver at the end. Finally the decision aiding process is considered as a more general reasoning process to which recent artificial intelligence findings in argumentation could apply. %A S. Muthukrishnan %A Gopal Pandurangan %T The Bin-Covering Technique for Thresholding Random Geometric Graph Properties %D Fri Nov 21 13:35:13 2003 %Z Wed Feb 4 13:09:59 EST 2004 %I DIMACS %R 2003-39 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-39.ps.gz %X communication networks. The communication is modeled by the geometric random graph model $G(n,r,\ell)$ where $n$ points randomly placed within $[0,\ell]^d$ form the nodes, and any two nodes that correspond to points at most distance $r$ away from each other are connected. 
The more widely studied $G(n,r)$ model where $\ell$ is held fixed at unity and $n\rightarrow \infty$ applies for dense ad hoc networks, but the $G(n,r,\ell)$ model is more detailed, and more generally applicable to modeling communication networks, sparse or dense.

We study fundamental properties of $G(n,r,\ell)$ of interest: connectivity, coverage, and routing-stretch. Our main contribution is a simple analysis technique we call {\em bin-covering} that we apply uniformly to get {\em (asymptotically) tight} thresholds for each of these properties. Typically, in the past, geometric random graph analyses involved sophisticated methods from continuum percolation theory; in contrast, our bin-covering approach is discrete and very simple, yet it gives us tight threshold bounds. Our specific results should also prove interesting to the networking community that has seen a recent increase in the study of geometric random graphs motivated by engineering ad hoc networks. %A Patrick De Leenheer %A Simon A. Levin %A Eduardo D. Sontag %A Christopher A. Klausmeier %T Global stability in a chemostat with multiple nutrients %D Tue Nov 25 19:14:29 2003 %Z Wed Feb 4 13:10:15 EST 2004 %I DIMACS %R 2003-40 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-40.ps.gz %X and separate nutrient uptake from growth. For a broad class of uptake and growth functions it is proved that a nontrivial equilibrium may exist. Moreover, if it exists it is unique and globally stable, generalizing a previous result by Legovi{\'c} and Cruzado. %A Michal Koucky %T On Traversal and Exploration Sequences %D Fri Nov 28 21:23:24 2003 %Z Wed Feb 4 13:10:25 EST 2004 %I DIMACS %R 2003-41 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-41.ps.gz %X the study of undirected s-t-connectivity. Koucky (2001) defines a new variant of traversal sequences, exploration sequences, with certain advantages over the earlier notion of traversal sequences. An exact relationship of these two notions was not known.

In this paper we establish a relationship between these two concepts; in particular, we show that universal traversal sequences can be efficiently converted into universal exploration sequences. We also study conversion of universal exploration sequences for d-regular graphs into universal exploration sequences for d'-regular graphs. Further, we also show certain self-correcting properties of traversal and exploration sequences and we propose a candidate for a universal exploration sequence. %A Stephen Hartke %T The Elimination Procedure for the Competition Number is Not Optimal %D Sun Dec 28 14:46:03 2003 %Z Wed Feb 4 13:10:34 EST 2004 %I DIMACS %R 2003-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-42.ps.gz %X undirected graph with V(D) as its vertex set and where vertices x and y are adjacent if there exists another vertex z such that the arcs (x,z) and (y,z) are both present in D. The competition number k(G) for an undirected graph G is the least number r such that there exists an acyclic digraph F on |V(G)|+r vertices where C(F) is G along with r isolated vertices. Kim and Roberts introduced an elimination procedure for the competition number, and asked whether the procedure calculated the competition number for all graphs. We answer this question in the negative by demonstrating a graph where the elimination procedure does not calculate the competition number. This graph also provides a negative answer to a similar question about the related elimination procedure for the phylogeny number. %A S. Muthukrishnan %A Rahul Shah %A Jeffrey Scott Vitter %T Mining Deviants in Time Series Data Streams %D Thu Dec 11 11:43:09 2003 %Z Wed Feb 4 13:10:49 EST 2004 %I DIMACS %R 2003-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-43.ps.gz %X data streams is that of identifying outliers. 
There is a long history of study of various outliers in statistics and databases, and a recent focus on mining outliers in data streams. Here, we adopt the notion of ``deviants'' from Jagadish et al \cite{jkm} as outliers. Formally, deviants are defined based on a representation sparsity metric, i.e., deviants are values whose removal from the dataset leads to an improved compressed representation of the remaining items. Thus, deviants are not global maxima/minima, but rather these are appropriate local aberrations. Deviants are known to be of great mining value in time series databases.

We present first-known algorithms for identifying deviants on massive data streams. Our algorithms monitor streams using very small space (polylogarithmic in data size) and are able to quickly find deviants at any instant, as the data stream evolves over time. For all versions of this problem---uni- vs multivariate time series, optimal vs near-optimal vs heuristic solutions, offline vs streaming---our algorithms have the same framework of maintaining a hierarchical set of candidate deviants that are updated as the time series data gets progressively revealed.

We show experimentally, using real network traffic data (SNMP aggregate time series) as well as synthetic data, that our algorithm is not only remarkably accurate in determining the deviants, but also that the evolution of deviants over time reveals interesting artifacts in data streams. %A Patrick De Leenheer %A David Angeli %A Eduardo Sontag %T Crowding effects promote coexistence in the chemostat %D Wed Feb 4 13:12:02 2004 %Z Wed Feb 4 13:12:13 EST 2004 %I DIMACS %R 2003-44 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-44.ps.gz %X particular chemostat model. It deviates from the classical chemostat because crowding effects are taken into consideration. This model can be rewritten as a negative feedback interconnection of two systems which are monotone (as input-output systems). Moreover, these subsystems behave nicely when subject to constant inputs. This allows the use of a particular small-gain theorem which has recently been developed for feedback interconnections of monotone systems. Application of this theorem requires (at least approximate) knowledge of two gain functions associated with the subsystems. It turns out that for the chemostat model proposed here, these approximations can be obtained explicitly and this leads to a sufficient condition for almost-global stability. In addition, we show that coexistence occurs in this model if the crowding effects are large enough. %A Stephen Hartke %T The Voter Model with Confidence Levels %D Wed Feb 4 13:12:33 2004 %Z Wed Feb 4 13:12:42 EST 2004 %I DIMACS %R 2003-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-45.ps.gz %X where each vertex has an opinion, 0 or 1. As time progresses, each voter's opinion is influenced by its neighbors. We introduce a modification of the voter model that changes how quickly a voter will change its opinion based on its confidence in its opinion. 
We show that the voter model with confidence levels always results in a uniform opinion, and we determine the probability of each outcome (uniform 1 or 0) based on the initial opinions and the structure of the graph.

Keywords: voter model, interacting particle system

Mathematics Subject Classification: Primary 60K35, Secondary 82C20, 82C22 %A Amr Elmasry %T Adaptive Sorting with AVL Trees %D Wed Feb 4 13:13:00 2004 %Z Wed Feb 4 13:13:12 EST 2004 %I DIMACS %R 2003-46 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-46.ps.gz %X on using the traditional AVL trees, and has the same performance limitations. More precisely, the number of comparisons performed by our algorithm, on an input sequence of length $n$ that has $I$ inversions, is at most 1.44nlg(I/n) + O(n). Empirical results imply an average performance of about 1.02nlg(I/n) comparisons for random sequences. Our algorithm runs in time O(nlg(I/n)) and is practically efficient and easy to implement. %A Piotr Berman %A Bhaskar DasGupta %A Eduardo Sontag %T On the Complexities of Some Combinatorial Problems in Reverse Engineering of Protein and Gene Networks %D Wed Feb 4 13:03:45 2004 %Z Mon Apr 5 12:42:30 EDT 2004 %I DIMACS %R 2004-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-01.ps.gz %X In this report we investigate the computational complexities of a combinatorial problem that arises in the reverse engineering of protein and gene networks. %A Dmitriy Fradkin %A Ilya Muchnik %T A Study of K-Means Clustering for Improving Classification Accuracy of Multi-Class SVM %D Thu Mar 4 11:42:24 2004 %Z Mon Apr 5 12:43:25 EDT 2004 %I DIMACS %R 2004-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-02.ps.gz %X This work discusses how clustering methods, in particular K-Means, can be used to improve classification accuracy. We discuss two approaches to constructing hierarchical classifiers using cluster analysis and suggest new methods and improvements in each of these approaches. We also suggest a new method for constructing features that improve classification accuracy. 
All the methods are evaluated in the context of multi-class classification problems using multi-class SVM and some standard datasets from UCI. %A I. E. Zverovich %A I. I. Zverovich %T Recognizing $k$-complete bipartite bihypergraphs %D Thu Mar 4 11:52:50 2004 %Z Mon Apr 5 12:43:56 EDT 2004 %I DIMACS %R 2004-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-03.ps.gz %X Let $H^0$ and $H^1$ be hypergraphs with the same vertex-set $V$. The ordered pair $H = (H^0, H^1)$ is called a {\em bihypergraph}. A set $S \subseteq V(H)$ is {\em stable} in $H^i$ if $S$ contains no hyperedges of $H^i$, $i = 0, 1$. A bihypergraph $H = (H^0, H^1)$ is called {\em bipartite} if there exists an ordered partition $S^0 \cup S^1 = V(H)$ such that the set $S^i$ is stable in $H^i$ for $i = 0, 1$.

In Section 1, we give a survey of numerous applications of bipartite bihypergraphs.

In Section 2, we show that recognizing bipartite bihypergraphs within classes of $k$-complete bihypergraphs can be done in polynomial time. A bihypergraph $H = (H^0, H^1)$ is called {\em $k$-complete}, $k \ge 0$, if each $k$-subset of $V(H)$ contains a hyperedge of $H$, i.e., a hyperedge of $H^0$ or $H^1$. Moreover, we can construct all bipartitions of a $k$-complete bihypergraph, if any, in polynomial time.

A bihypergraph $H = (H^0, H^1)$ is called {\em strongly bipartite} if each maximal stable set of $H^0$ is a transversal of $H^1$. We show that recognizing strongly bipartite bihypergraphs $(H^0, H^1)$ is a co-NP-complete problem even in the case where $H^0$ is a graph and $H^1$ has exactly one hyperedge. Some examples of strongly bipartite bihypergraphs are given.
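The definitions of stable set and bipartite bihypergraph given above can be checked directly on small instances. The following is a brute-force sketch only, not the polynomial-time method of the report: it enumerates all ordered partitions, so it is exponential in $|V|$. The function names are hypothetical, and it assumes that a part of the partition may be empty.

```python
from itertools import combinations


def is_stable(subset, hyperedges):
    """Return True if `subset` contains no hyperedge of `hyperedges`."""
    return not any(edge <= subset for edge in hyperedges)


def bipartitions(vertices, h0, h1):
    """Yield every ordered partition (S0, S1) of `vertices` such that
    S0 is stable in h0 and S1 is stable in h1 (empty parts allowed)."""
    vertices = frozenset(vertices)
    h0 = [frozenset(e) for e in h0]
    h1 = [frozenset(e) for e in h1]
    ordered = sorted(vertices)
    for size in range(len(ordered) + 1):
        for chosen in combinations(ordered, size):
            s0 = frozenset(chosen)
            s1 = vertices - s0
            if is_stable(s0, h0) and is_stable(s1, h1):
                yield s0, s1


def is_bipartite(vertices, h0, h1):
    """Return True if the bihypergraph (h0, h1) on `vertices` is bipartite."""
    return next(bipartitions(vertices, h0, h1), None) is not None
```

For example, with $V = \{1,2,3\}$, $H^0 = \{\{1,2\}\}$ and $H^1 = \{\{2,3\}\}$, the partition $S^0 = \{2\}$, $S^1 = \{1,3\}$ is one valid bipartition, whereas a single vertex that is a hyperedge of both $H^0$ and $H^1$ admits none.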

AMS Subject Classification: 05C65, 03E20, 05C75, 05B35.

Keywords: Bipartite bihypergraphs, bipartite hypergraphs, $k$-complete bihypergraphs, satisfiability problem. %A Fanica Gavril %T Generalizations of interval-filament graphs %D Mon Mar 8 10:28:02 2004 %Z Mon Apr 5 12:44:34 EDT 2004 %I DIMACS %R 2004-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-04.ps.gz %X Let $I$ be a family of intervals on a line $L$ in a plane $PL$ such that every two intersecting intervals have a common segment. Let $V=\{v| i(v)\in I\}$ be a vertex set. In $PL$, above L, construct to each interval $i(v)\in I$ a filament (curve) $a(v)$ connecting its two endpoints and bounded in $PL$ by the endpoints of $i(v)$; $FI=\{a(v)| i(v)\in I\}$ is a family of $2D-interval-filaments$ and its intersection graph is a $2D-interval-filament graph$ [GA4]. Let $PP$ be a plane perpendicular to $PL$ whose intersection with $PL$ is exactly the family of $2D-interval-filaments$ $FI$. In $PP$, above $PL$, for every $a(v)\in FI$, connect its endpoints by a filament $v$ such that if $a(u)$,$a(v)$ overlap the two filaments $u$,$v$ intersect and if $a(u)$,$a(v)$ are disjoint, the two filaments $u,v$ do not intersect; $FF=\{v| a(v)\in FI\}$ is a family of $3D-interval-filaments$ and its intersection graph is a $3D-interval-filament graph$. A graph $G(V,E)$ is $G$ $mixed$ if its edge set can be partitioned into two disjoint subsets $E1$, $E2$ such that $G(V,E1)\in G$, $G(V,E2)$ is transitive and for every three vertices $u,v,w$ if $u\rightarrow v\in E2$ and $(v,w)\in E1$ then $(u,w)\in E1$.

We prove that {\em the family of complements of 3D-interval-filament graphs is exactly the family of co(2D-interval-filament) mixed graphs} and define various subfamilies of $3D$-interval-filament graphs characterizing them as complements of families of G mixed graphs. We present another generalization of the $2D$-interval-filament graphs, namely the $k-filament graphs$. We describe polynomial time algorithms for holes in co(G mixed) graphs. %A Henrik Bjorklund %A Sven Sandberg %A Sergei Vorobyov %T A Combinatorial Strongly Subexponential Strategy Improvement Algorithm for Mean Payoff Games %D Tue Mar 23 12:20:38 2004 %Z Mon Apr 5 12:45:20 EDT 2004 %I DIMACS %R 2004-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-05.ps.gz %X We suggest the first strongly subexponential and purely combinatorial algorithm for solving the mean payoff games problem based on iteratively improving the longest shortest distances to a sink in a possibly cyclic graph.

We identify a new ``controlled'' version of the shortest paths problem. By selecting exactly one outgoing edge in each of the controlled vertices we want to make the shortest distances from all vertices to the unique sink as long as possible. Under reasonable assumptions the problem belongs to the complexity class \textsc{NP}$\cap$\textsc{coNP}. Mean payoff games are easily reducible to this problem. We suggest an algorithm for computing longest shortest paths. Player Max selects a strategy (one edge in each controlled vertex) and player Min responds by evaluating shortest paths to the sink in the remaining graph. Then Max locally changes choices in controlled vertices looking at attractive switches that seem to increase shortest paths (under the current evaluation). We show that this is a monotonic strategy improvement, and any locally optimal strategy is globally optimal. This allows us to construct a randomized algorithm of complexity \[\min(poly\cdot W,\;2^{O(\sqrt{n\log n})}),\] which is simultaneously pseudopolynomial ($W$ is the maximal absolute edge weight) and subexponential in the number of vertices $n$. All previous algorithms for mean payoff games were either exponential or pseudopolynomial (which is purely exponential for exponentially large edge weights). %A Vladimir Gurvich %T War and Peace in Veto Voting %D Mon Mar 29 10:47:11 2004 %Z Mon Apr 5 12:45:47 EDT 2004 %I DIMACS %R 2004-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-06.ps.gz %X Let $I = \{i_1, \ldots, i_n\}$ be a set of voters (players) and $A = \{a_1, \ldots, a_p\}$ be a set of candidates (outcomes). Each voter $i \in I$ has a preference $P_i$ over the candidates. We assume that $P_i$ is a complete order on $A$. The preference profile $P = \{P_i, i \in I\}$ is called a {\em situation}. 
A situation is called {\em war} if the set of all voters $I$ is partitioned in two coalitions $K_1$ and $K_2$ such that all voters of $K_i$ have the same preference, $i = 1,2,$ and these two preferences are opposite. For a simple class of veto voting schemes we prove that the results of elections in all war situations uniquely define the results for all other ({\em peace}) situations.

Key words: veto, voting scheme, voting by veto, veto power, veto resistance, voter, candidate, player, outcome, coalition, block, effectivity function, veto function, social choice function, social choice correspondence %A Jonathan W. Berry %A Daniel Hrozencik %A Shrisha Rao %A Zhizhang Shen %T Finding Central Sets of Tree Structures in Synchronous Distributed Systems %D Sat Apr 3 16:54:15 2004 %Z Mon Apr 5 12:46:27 EDT 2004 %I DIMACS %R 2004-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-07.ps.gz %X Finding the central sets, such as center and median sets, of a network topology is a fundamental step in the design and analysis of complex distributed systems. This paper presents distributed synchronous algorithms for finding central sets in general tree structures.

Our algorithms are distinguished from previous work in that they take only qualitative information, thus reducing the constants hidden in the asymptotic notation, and all vertices of the topology know the central sets upon their termination. %A Yuri Goncharov %A Ilya Muchnik %A Leonid Shvartser %T Simultaneous Feature Selection And Margin Maximization Using Saddle Point Approach %D Mon Apr 5 12:32:47 2004 %Z Mon Apr 5 12:46:47 EDT 2004 %I DIMACS %R 2004-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-08.ps.gz %X A new SVM wrapper method, which simultaneously maximizes margin and minimizes feature space, is introduced. For these purposes we modify the standard criterion by adding to the basic objective function a third term, which directly penalizes a chosen set of variables. The new criterion divides the set of all variables into three subsets: deleted, selected and weighted features. We show that the problem can be formulated as a particular min-max problem for convex-concave functions, which in turn can be solved by saddle point polynomial algorithms. We analyzed a set of such algorithms and implemented one that takes into account the specifics of our problem. The algorithm is examined on a classification benchmark and its ability to improve the recognition results is shown. We also show that the developed method can be easily transferred to the Support Vector Regression case. %A Piotr Berman %A Bhaskar DasGupta %A Eduardo Sontag %T On the Complexities of Some Combinatorial Problems in Reverse Engineering of Protein and Gene Networks %D Wed Feb 4 13:03:45 2004 %Z Sat Oct 30 17:04:39 EDT 2004 %I DIMACS %R 2004-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-01.ps.gz %X In this report we investigate the computational complexities of a combinatorial problem that arises in the reverse engineering of protein and gene networks.
%A Dmitriy Fradkin %A Ilya Muchnik %T A Study of K-Means Clustering for Improving Classification Accuracy of Multi-Class SVM %D Thu Mar 4 11:42:24 2004 %Z Sat Oct 30 17:04:49 EDT 2004 %I DIMACS %R 2004-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-02.ps.gz %X This work discusses how clustering methods, in particular K-Means, can be used to improve classification accuracy. We discuss two approaches to constructing hierarchical classifiers using cluster analysis and suggest new methods and improvements in each of these approaches. We also suggest a new method for constructing features that improve classification accuracy. All the methods are evaluated in the context of multi-class classification problems using multi-class SVM and some standard datasets from UCI. %A I. E. Zverovich %A I. I. Zverovich %T Recognizing $k$-complete bipartite bihypergraphs %D Thu Mar 4 11:52:50 2004 %Z Sat Oct 30 17:04:57 EDT 2004 %I DIMACS %R 2004-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-03.ps.gz %X Let $H^0$ and $H^1$ be hypergraphs with the same vertex-set $V$. The ordered pair $H = (H^0, H^1)$ is called a {\em bihypergraph}. A set $S \subseteq V(H)$ is {\em stable} in $H^i$ if $S$ contains no hyperedges of $H^i$, $i = 0, 1$. A bihypergraph $H = (H^0, H^1)$ is called {\em bipartite} if there exists an ordered partition $S^0 \cup S^1 = V(H)$ such that the set $S^i$ is stable in $H^i$ for $i = 0, 1$.

In Section 1, we give a survey of numerous applications of bipartite bihypergraphs.

In Section 2, we show that recognizing bipartite bihypergraphs within classes of $k$-complete bihypergraphs can be done in polynomial time. A bihypergraph $H = (H^0, H^1)$ is called {\em $k$-complete}, $k \ge 0$, if each $k$-subset of $V(H)$ contains a hyperedge of $H$, i.e., a hyperedge of $H^0$ or $H^1$. Moreover, we can construct all bipartitions of a $k$-complete bihypergraph, if any, in polynomial time.

A bihypergraph $H = (H^0, H^1)$ is called {\em strongly bipartite} if each maximal stable set of $H^0$ is a transversal of $H^1$. We show that recognizing strongly bipartite bihypergraphs $(H^0, H^1)$ is a co-NP-complete problem even in the case where $H^0$ is a graph and $H^1$ has exactly one hyperedge. Some examples of strongly bipartite bihypergraphs are given.

AMS Subject Classification: 05C65, 03E20, 05C75, 05B35.

Keywords: Bipartite bihypergraphs, bipartite hypergraphs, $k$-complete bihypergraphs, satisfiability problem. %A Fanica Gavril %T Generalizations of interval-filament graphs %D Mon Mar 8 10:28:02 2004 %Z Sat Oct 30 17:05:07 EDT 2004 %I DIMACS %R 2004-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-04.ps.gz %X Let $I$ be a family of intervals on a line $L$ in a plane $PL$ such that every two intersecting intervals have a common segment. Let $V=\{v| i(v)\in I\}$ be a vertex set. In $PL$, above L, construct to each interval $i(v)\in I$ a filament (curve) $a(v)$ connecting its two endpoints and bounded in $PL$ by the endpoints of $i(v)$; $FI=\{a(v)| i(v)\in I\}$ is a family of $2D-interval-filaments$ and its intersection graph is a $2D-interval-filament graph$ [GA4]. Let $PP$ be a plane perpendicular to $PL$ whose intersection with $PL$ is exactly the family of $2D-interval-filaments$ $FI$. In $PP$, above $PL$, for every $a(v)\in FI$, connect its endpoints by a filament $v$ such that if $a(u)$,$a(v)$ overlap the two filaments $u$,$v$ intersect and if $a(u)$,$a(v)$ are disjoint, the two filaments $u,v$ do not intersect; $FF=\{v| a(v)\in FI\}$ is a family of $3D-interval-filaments$ and its intersection graph is a $3D-interval-filament graph$. A graph $G(V,E)$ is $G$ $mixed$ if its edge set can be partitioned into two disjoint subsets $E1$, $E2$ such that $G(V,E1)\in G$, $G(V,E2)$ is transitive and for every three vertices $u,v,w$ if $u\rightarrow v\in E2$ and $(v,w)\in E1$ then $(u,w)\in E1$.

We prove that {\em the family of complements of 3D-interval-filament graphs is exactly the family of co(2D-interval-filament) mixed graphs} and define various subfamilies of $3D$-interval-filament graphs characterizing them as complements of families of G mixed graphs. We present another generalization of the $2D$-interval-filament graphs, namely the $k-filament graphs$. We describe polynomial time algorithms for holes in co(G mixed) graphs. %A Henrik Bjorklund %A Sven Sandberg %A Sergei Vorobyov %T A Combinatorial Strongly Subexponential Strategy Improvement Algorithm for Mean Payoff Games %D Tue Mar 23 12:20:38 2004 %Z Sat Oct 30 17:05:13 EDT 2004 %I DIMACS %R 2004-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-05.ps.gz %X We suggest the first strongly subexponential and purely combinatorial algorithm for solving the mean payoff games problem based on iteratively improving the longest shortest distances to a sink in a possibly cyclic graph.

We identify a new ``controlled'' version of the shortest paths problem. By selecting exactly one outgoing edge in each of the controlled vertices we want to make the shortest distances from all vertices to the unique sink as long as possible. Under reasonable assumptions the problem belongs to the complexity class \textsc{NP}$\cap$\textsc{coNP}. Mean payoff games are easily reducible to this problem. We suggest an algorithm for computing longest shortest paths. Player Max selects a strategy (one edge in each controlled vertex) and player Min responds by evaluating shortest paths to the sink in the remaining graph. Then Max locally changes choices in controlled vertices looking at attractive switches that seem to increase shortest paths (under the current evaluation). We show that this is a monotonic strategy improvement, and any locally optimal strategy is globally optimal. This allows us to construct a randomized algorithm of complexity \[\min(poly\cdot W,\;2^{O(\sqrt{n\log n})}),\] which is simultaneously pseudopolynomial ($W$ is the maximal absolute edge weight) and subexponential in the number of vertices $n$. All previous algorithms for mean payoff games were either exponential or pseudopolynomial (which is purely exponential for exponentially large edge weights). %A Vladimir Gurvich %T War and Peace in Veto Voting %D Mon Mar 29 10:47:11 2004 %Z Sat Oct 30 17:05:21 EDT 2004 %I DIMACS %R 2004-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-06.ps.gz %X Let $I = \{i_1, \ldots, i_n\}$ be a set of voters (players) and $A = \{a_1, \ldots, a_p\}$ be a set of candidates (outcomes). Each voter $i \in I$ has a preference $P_i$ over the candidates. We assume that $P_i$ is a complete order on $A$. The preference profile $P = \{P_i, i \in I\}$ is called a {\em situation}. 
A situation is called {\em war} if the set of all voters $I$ is partitioned in two coalitions $K_1$ and $K_2$ such that all voters of $K_i$ have the same preference, $i = 1,2,$ and these two preferences are opposite. For a simple class of veto voting schemes we prove that the results of elections in all war situations uniquely define the results for all other ({\em peace}) situations.

Key words: veto, voting scheme, voting by veto, veto power, veto resistance, voter, candidate, player, outcome, coalition, block, effectivity function, veto function, social choice function, social choice correspondence %A Jonathan W. Berry %A Daniel Hrozencik %A Shrisha Rao %A Zhizhang Shen %T Finding Central Sets of Tree Structures in Synchronous Distributed Systems %D Sat Apr 3 16:54:15 2004 %Z Sat Oct 30 17:05:29 EDT 2004 %I DIMACS %R 2004-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-07.ps.gz %X Finding the central sets, such as center and median sets, of a network topology is a fundamental step in the design and analysis of complex distributed systems. This paper presents distributed synchronous algorithms for finding central sets in general tree structures.

Our algorithms are distinguished from previous work in that they take only qualitative information, thus reducing the constants hidden in the asymptotic notation, and all vertices of the topology know the central sets upon their termination. %A Yuri Goncharov %A Ilya Muchnik %A Leonid Shvartser %T Simultaneous Feature Selection And Margin Maximization Using Saddle Point Approach %D Mon Apr 5 12:32:47 2004 %Z Sat Oct 30 17:05:36 EDT 2004 %I DIMACS %R 2004-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-08.ps.gz %X A new SVM wrapper method, which simultaneously maximizes margin and minimizes feature space is introduced. For these purposes we modify the standard criterion by adding to the basic objective function a third term, which directly penalizes a chosen set of variables. The new criterion divided the set of all variables into three subsets: deleted, selected and weighted features. We are showing that the question can be formulated as a particular min-max problem for convex-concave functions, which in turn can be solved by saddle point polynomial algorithms. We analyzed a set of such algorithms and realized one, which is taking to account specificity of our problem. The algorithm is examined on a classification Benchmark and its ability to improve the recognition results is shown. We also show that the developed method can be easily transfered to the Support Vector Regression case. %A Henrik Bjorklund %A Sven Sandberg %A Sergei Vorobyov %T Randomized Subexponential Algorithms for Infinite Games %D Fri Apr 16 11:28:36 2004 %Z Sat Oct 30 17:05:42 EDT 2004 %I DIMACS %R 2004-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-09.ps.gz %X The complexity of solving infinite games, including parity, mean payoff, and simple stochastic games, is an important open problem in verification, automata theory, and complexity theory. 
In this paper we develop an abstract setting for studying and solving such games, as well as related problems, based on function optimization over certain discrete structures. We introduce new classes of completely local-global (CLG) and recursively local-global (RLG) functions, and show that strategy evaluation functions for parity and simple stochastic games belong to these classes. We also establish a relation to the previously well-studied completely unimodal (CU) and local-global functions. A number of nice properties of CLG-functions are proved.

In this setting, we survey several randomized optimization algorithms appropriate for CU-, CLG-, and RLG-functions. We show that the subexponential algorithms for linear programming by Kalai and Matousek, Sharir, and Welzl, can be adapted to optimizing the functions we study, with preserved subexponential expected running time. We examine the relations to two other abstract frameworks for subexponential optimization, the LP-type problems of Matousek, Sharir, Welzl, and the abstract optimization problems of G\"artner. The applicability of our abstract optimization approach to parity games builds upon a discrete strategy evaluation measure.

We also consider local search type algorithms, and settle two nontrivial, but still exponential, upper bounds. As applications we address some complexity-theoretic issues including non-PLS-completeness of the problems studied. %A Piotr Berman %A Bhaskar DasGupta %A Eduardo Sontag %T Randomized Approximation Algorithms for Set Multicover Problems with Applications to Reverse Engineering of Protein and Gene Networks %D Sat Oct 30 17:10:22 2004 %Z Sat Oct 30 17:10:31 EDT 2004 %I DIMACS %R 2004-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-10.ps.gz %X (This paper improves the results in DIMACS TR 2004-01)

In this paper we investigate the computational complexities of a combinatorial problem that arises in the reverse engineering of protein and gene networks. Our contributions are as follows:

(1) We abstract a combinatorial version of the problem and observe that this is ``equivalent'' to the set multicover problem when the ``coverage'' factor k is a function of the number of elements n of the universe. An important special case for our application is the case in which k=n-1.

(2) We observe that the standard greedy algorithm produces an approximation ratio of \Omega(log n) even if k is ``large'', i.e., k=n-c for some constant c>0.

(3) Let a<n denote the maximum number of elements in any given set in our set multicover problem. Then, we show that a non-trivial analysis of a simple randomized polynomial-time approximation algorithm for this problem yields an expected approximation ratio E[r(a,k)] that is an increasing function of a/k. The behavior of E[r(a,k)] is ``roughly'' as follows: it is about 1+ln(a/k) when a/k is at least about $e^2$, and for smaller values of a/k it decreases towards 2 linearly with decreasing a/k with lim_{a/k-->0} E[r(a,k)] <= 2. Our randomized algorithm is a cascade of a deterministic and a randomized rounding step parameterized by a quantity $\beta$ followed by a greedy solution for the remaining problem. %A Graham Cormode %T The Hardness of the Lemmings Game %D Sat Oct 30 17:11:01 2004 %Z Sat Oct 30 17:11:05 EDT 2004 %I DIMACS %R 2004-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-11.ps.gz %X green-haired lemming creatures to safety, and save them from an untimely demise. We formulate the decision problem which is, given a level of the game, to decide whether it is possible to complete the level (and hence to find a solution to that level). Under certain limitations, this can be decided in polynomial time, but in general the problem is shown to be NP-Hard. This can hold even if there is only a single lemming to save, thus showing that it is hard to approximate the number of saved lemmings to any factor. %A Graham Cormode %A Artur Czumaj %A S. Muthukrishnan %T How to Increase the Acceptance Ratios of Top Conferences %D Sat Oct 30 17:11:34 2004 %Z Sat Oct 30 17:11:38 EDT 2004 %I DIMACS %R 2004-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-12.pdf %X authors observed that many resumes list acceptance ratios of conferences where their papers appear, boasting the low acceptance ratio. The lower the ratio, the better your paper looks. 
The list might look equally impressive if one listed the rejection ratio of conferences where one’s paper was submitted and rejected. We decided to lampoon rather than lament the effort the PC typically put in: wouldn’t the world be better if we could encourage only high quality submissions and so run top conferences with very high acceptance ratios? This paper captures our thoughts, and it is best consumed in a pub (and in color). %A Jaewook Joo %A Joel L. Lebowitz %T Pair Approximation of the stochastic susceptible-infected-recovered-susceptible epidemic model on the hypercubic lattice %D Sat Oct 30 17:11:58 2004 %Z Sat Oct 30 17:12:03 EDT 2004 %I DIMACS %R 2004-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-13.ps.gz %X susceptible-infected-recovered-susceptible~(SIRS) epidemic model on one- and two-dimensional lattices. We compare the behavior of this system, obtained from computer simulations, with those obtained from the mean-field approximation~(MFA) and pair-approximation~(PA). The former~(latter) approximates higher order moments in terms of first~(second) order ones. We find that the PA gives consistently better results than the MFA. In one dimension the improvement is even qualitative. %A Jaewook Joo %A Joel Lebowitz %T Behavior of $SIS$ epidemics on heterogeneous networks with saturation %D Sat Oct 30 17:12:20 2004 %Z Sat Oct 30 17:12:33 EDT 2004 %I DIMACS %R 2004-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-14.ps.gz %X ls of the spread of epidemics in heterogeneous populations. The structure of interactions in the population is represented by networks with connectivity distribution $P(k)$, including scale-free (SF) networks with power law distributions $P(k)\sim k^{-\gamma}$. 
Considering cases where the transmission of infection between nodes depends on their connectivity, we introduce a saturation function $C(k)$ which reduces the infection transmission rate $\lambda$ across an edge going from a node with high connectivity $k$. A mean field approximation with the neglect of degree-degree correlation then leads to a finite threshold $\lambda_{c}>0$ for SF networks with $2<\gamma \leq 3$.

We also find, in this approximation, the fraction of infected individuals among those with degree $k$ for $\lambda$ close to $\lambda_{c}$.

We investigate via computer simulation the contact process on a heterogeneous regular lattice and compare the results with those obtained from mean field theory with and without neglect of degree-degree correlations. %A P. De Leenheer %A D. Angeli %A E.D. Sontag %T A tutorial on monotone systems- with an application to chemical reaction networks %D Sat Oct 30 17:12:51 2004 %Z Sat Oct 30 17:15:16 EDT 2004 %I DIMACS %R 2004-15 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-15.ps.gz %X partial order. Some applications will be briefly reviewed in this paper. Much of the appeal of the class of monotone systems stems from the fact that roughly, most solutions converge to the set of equilibria. However, this usually requires a stronger monotonicity property which is not always satisfied or easy to check in applications. Following a result by J.F. Jiang, we show that monotonicity is enough to conclude global attractivity if there is a unique equilibrium and if the state space satisfies a particular condition. The proof given here is self-contained and does not require the use of any of the results from the theory of monotone systems. We will illustrate it on a class of chemical reaction networks with monotone, but otherwise arbitrary, reaction kinetics. %A P. De Leenheer %A D. Angeli %A E.D. Sontag %T Monotone chemical reaction networks %D Sat Oct 30 17:13:24 2004 %Z Sat Oct 30 17:15:23 EDT 2004 %I DIMACS %R 2004-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-16.ps.gz %X converges to some steady state. The reaction kinetics are assumed to be monotone but otherwise arbitrary. When diffusion effects are taken into account, the conclusions remain unchanged. The main tools used in our analysis come from the theory of monotone dynamical systems. We review some of the features of this theory and provide a self-contained proof of a particular attractivity result which is used in proving our main result. 
%A Talip Atajan %A Xuerong Yong %A Hiroshi Inaba %T Further Analysis of the Number of Spanning Trees in Circulant Graphs %D Sat Oct 30 17:13:39 2004 %Z Sat Oct 30 17:15:28 EDT 2004 %I DIMACS %R

In this paper we obtain further properties of the numbers $a_n$ by considering their combinatorial structures. Using these properties we answer the open problem posed in the Conclusion of [Y. P. Zhang, X. Yong, M. J. Golin.]. As examples, we describe our technique and asymptotic properties of the numbers.

Key Words: Circulant graphs; number of spanning trees %A Yuanping Zhang %A Zhiyong Zhang %A Xuerong Yong %T The number of spanning trees in circulant graphs with non-fixed jumps %D Sat Oct 30 17:13:58 2004 %Z Sat Oct 30 17:15:32 EDT 2004 %I DIMACS %R 2004-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-18.ps.gz %X An {\em undirected circulant graph}, $C_n^{s_1,s_2,\cdots,s_k}, ~$ has $n$ vertices labelled $0,\ 1,\ 2,\ \cdots$, $n-1$, and for each $s_i\ (1\leq i\leq k)$ and $j$ ($0\leq j \leq n-1$) there is an edge between $j$ and $j + s_i\ \bmod n$. Let $T(C_n^{s_1,s_2,\cdots,s_k})$ be the number of spanning trees of $C_n^{s_1,s_2,\cdots,s_k}$. We proved in \cite{kn:ZHANG00} that $T(C_n^{s_1,s_2,\cdots,s_k})=na_n^2$ where $a_n$ satisfies a linear recurrence relation of order $2^{s_k-1}$. In this paper we show that $T(C_{pn}^{s_1,s_2,\cdots,s_k,qn})$ can be split into the product of $T(C_n^{s_1,s_2,\cdots,s_k})$ and $\frac{1}{p}c_{n,p,q}^2$, where $c_{n,p,q}$ satisfies a linear recurrence relation for some $p,\ q$. As examples of its use, we describe the method and, for certain $p,q$, we provide the recurrence relations and the asymptotics of $T(C_{pn}^{s_1,s_2,\cdots,s_k,qn})$. %A Vadim V. Lozin %A Dieter Rautenbach %T The Relative Clique-Width of a Graph %D Sat Oct 30 17:14:12 2004 %Z Sat Oct 30 17:15:37 EDT 2004 %I DIMACS %R 2004-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-19.ps.gz %X which is partly due to the fact that many hard algorithmic problems can be solved efficiently when restricted to graphs of bounded tree-width. The same is true for the clique-width which is a relatively young notion generalizing tree-width in the sense that graphs of bounded tree-width have bounded clique-width.

Whereas tree-decompositions that are used to define tree-width are a very intuitive and easily visualizable way to represent the global structure of a graph, the clique-width is much harder to grasp intuitively. To better understand the nature of the clique-width, we introduce the notion of relative clique-width and study two algorithmic problems related to it. In conjunction, these problems would allow one to determine the clique-width.

For one of the problems, which is to determine the relative clique-width, we propose a polynomial-time factor 2 approximation algorithm and also show an exact solution in a natural special case. The study of the other problem, which is left open in the paper, has brought us to an alternative and transparent proof of the known fact that graphs of bounded tree-width have bounded clique-width. %A Donald R. Hoover %T Subject Allocation and Study Curtailment for Fixed Event Comparative Poisson Trials %D Sat Oct 30 17:14:28 2004 %Z Sat Oct 30 17:15:42 EDT 2004 %I DIMACS %R 2004-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-20.ps.gz %X can be lengthy and costly. We evaluate two easily implemented approaches to reduce numbers of disease cases and person years of follow up (N_{u+t}) for comparative Poisson trials with fixed numbers of cases (T); i) altering k the portion of N_{u+t} allocated to treatment, and ii) curtailed stopping before T cases if numbers of cases in the treatment or control group indicate that H_o has already been rejected or will not be rejected at T cases. Normal and arcsine approximations as well as discrete exact tests are evaluated. For studies not stopped early, allocating about 1/(1+\sqrt(r_a)) of person years to treatment roughly minimizes T needed for given size and power (where r_a is the alternative hypothesized relative disease incidence in treated subjects used to power the study). This reduces T moderately vs. equal allocation (k=0.5), by 2-3 cases in our examples with exact tests. However, the common practice of allocating k=0.5 of the person years to treatment may be the overall best strategy to minimize N_{u+t} for studies that are not stopped early. 
For studies analyzed by exact test and planned with a one sided \alpha ranging from 0.005 to 0.025, \beta from 0.1 to 0.2 and r_a from 0.2 to 0.5, curtailed stopping reduces both number of disease cases and N_{u+t} by 6% to 40% depending on true treatment benefit. With curtailed stopping, allocating k=0.5 person years to treatment approximately minimizes numbers of cases and person years under most conditions, although k as large as 0.6 often performs comparably well. If a specific localized k is selected to minimize disease cases for a study analyzed by exact test, the study may be underpowered should the final allocation deviate even slightly from that k when it is conducted.
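As a small numerical illustration of the allocation rule 1/(1+\sqrt(r_a)) quoted in this abstract, the sketch below evaluates it for a few values of r_a in the range 0.2 to 0.5 mentioned above. The function name is hypothetical and the code only evaluates the stated formula; it does not reproduce the report's power or exact-test calculations.

```python
import math


def treatment_fraction(r_a: float) -> float:
    """Fraction k of person years allocated to treatment under the
    rule k = 1 / (1 + sqrt(r_a)), where r_a is the hypothesized
    relative disease incidence in treated subjects."""
    if r_a <= 0:
        raise ValueError("r_a must be positive")
    return 1.0 / (1.0 + math.sqrt(r_a))


if __name__ == "__main__":
    # Evaluate the rule at a few illustrative values of r_a.
    for r_a in (0.2, 0.25, 0.5):
        print(f"r_a = {r_a}: k is about {treatment_fraction(r_a):.3f}")
```

For r_a = 0.25 the rule gives k = 2/3, and for r_a = 1 (no treatment effect) it reduces to the equal allocation k = 0.5.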

Key Words: Comparative Poisson, Curtailed Stopping, Power, Sample Size, Subject Allocation %A Amr Elmasry %T A Framework for Reducing Comparisons in Heap Operations %D Sat Oct 30 17:14:42 2004 %Z Sat Oct 30 17:15:47 EDT 2004 %I DIMACS %R 2004-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-21.ps.gz %X minimum deletion operations for priority queues. In particular, we give a priority queue with constant cost per insertion and minimum finding, and logarithmic cost with at most $\log{n}+O(\log{\log{n}})$ \footnote{$\log{x}$ equals $\max(\log_2{x},1)$.} comparisons per deletion and minimum deletion, improving over the bound of $2 \log{n}+O(1)$ comparisons for the binomial queues and the pairing heaps. We further improve this bound to at most $\log{n}+O(1)$ comparisons. We also give a priority queue that supports, in addition to the above operations, the decrease-key operation. This latter priority queue achieves, in the amortized sense, constant cost per insertion, minimum finding and decrease-key operations, and logarithmic cost with at most $1.44\log{n}+O(\log{\log{n}})$ comparisons per deletion and minimum deletion. %A Sarmad Abbasi %A Asif Jamshed %T A Degree Constraint for Uniquely Hamiltonian Graphs %D Sat Oct 30 17:16:38 2004 %Z Sat Oct 30 17:38:40 EDT 2004 %I DIMACS %R 2004-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-22.ps.gz %X Hamilton cycle. We show that if $G=(V,E)$ is uniquely Hamiltonian then $$ \sum_{v \in V} \left( 2 \over 3 \right) ^{d(v)-\#(G)} \geq 1.$$ Where $\#(G) = 1$ if $G$ has even number of vertices and $2$ if $G$ has odd number of vertices. It follows that every $n$-vertex uniquely Hamiltonian graph contains a vertex whose degree is at most $c \log_2 n + 2$ where $c=\left(\log_2 3 -1 \right)^{-1} \approx 1.71$ thereby improving a bound given by Bondy and Jackson. 
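
The degree inequality for uniquely Hamiltonian graphs stated above can be used as a quick necessary-condition test on a degree sequence. The sketch below assumes only the inequality and the constant $c$ from the abstract; the function names are hypothetical, and passing the test does not show that a graph is uniquely Hamiltonian.

```python
import math


def degree_sum(degrees):
    """Left-hand side of the inequality: sum over v of (2/3)^(d(v) - #(G)).

    #(G) is 1 for an even number of vertices and 2 for an odd number.
    """
    offset = 1 if len(degrees) % 2 == 0 else 2
    return sum((2.0 / 3.0) ** (d - offset) for d in degrees)


def may_be_uniquely_hamiltonian(degrees):
    """Necessary condition only: False rules the graph out, True does not decide."""
    return degree_sum(degrees) >= 1.0


def min_degree_bound(n):
    """Upper bound c * log2(n) + 2 on the minimum degree, with c = 1/(log2(3) - 1)."""
    c = 1.0 / (math.log2(3) - 1.0)
    return c * math.log2(n) + 2.0


if __name__ == "__main__":
    print(may_be_uniquely_hamiltonian([2] * 6))    # degree sequence of a 6-cycle
    print(may_be_uniquely_hamiltonian([10] * 20))  # a 10-regular graph on 20 vertices
    print(round(min_degree_bound(20), 2))
```

A cycle satisfies the condition, as expected since it has exactly one Hamilton cycle, whereas a 10-regular graph on 20 vertices violates it and so cannot be uniquely Hamiltonian.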
%A Amir Herzberg %A Ahmad Gbara %T Protecting (even) Naïve Web Users, or: Preventing Spoofing and Establishing Credentials of Web Sites %D Sat Oct 30 17:16:52 2004 %Z Sat Oct 30 17:38:50 EDT 2004 %I DIMACS %R 2004-23 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-23.pdf %X sites and/or present false credentials, causing substantial damage to individuals and corporations. Several papers presented such web spoofing attacks, and suggested countermeasures, mostly through improved browser user interfaces. However, we argue that these countermeasures are inappropriate for most non-expert web users; indeed, they are irrelevant to most practical web-spoofing attacks, which focus on non-expert users. In fact, even expert users could fall victim to these practical, simple spoofing attacks.

We present the trusted credentials area, a simple and practical browser UI enhancement, which allows secure identification of sites and validation of their credentials, thereby preventing web-spoofing, even for naïve users. The trusted credentials area is a fixed part of the browser window, which displays only authenticated credentials, and in particular logos, icons and seals. In fact, we recommend that web sites always provide credentials (e.g. logo) securely, and present them in the trusted credentials area; this will help users to notice the absence of a secure logo in spoofed sites.

Existing web security mechanisms (SSL/TLS) may cause substantial overhead if applied to most web pages, as required for securing credentials (e.g. logo) of each page; we present a simple alternative mechanism to secure web pages and credentials, with acceptable overhead. Finally, we suggest additional anti-spoofing measures for site owners and web users, mainly until deployment of the trusted credentials area. %A Randeep Bhatia %A Nicole Immorlica %A Tracy Kimbrel %A Vahab S. Mirrokni %A Joseph (Seffi) Naor %A Baruch Schieber %T Network Augmentation for Confluent Flow in Data Networks %D Sat Oct 30 17:17:05 2004 %Z Sat Oct 30 17:38:55 EDT 2004 %I DIMACS %R 2004-24 %U 2000 Mathematics Subject Classification: 68Q15 (68Q17, 05C69, 05C70).

KEY WORDS: End of life, elderly, health care expenditures, Medicare, Medicai

%A M. Hoffman %A S. Muthukrishnan %A Rajeev Raman %T Location Streams: Models and Algorithms %D Sat Oct 30 17:18:59 2004 %Z Sat Oct 30 17:42:11 EDT 2004 %I DIMACS %R 2004-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-28.ps.gz %X These are suitable for processing data streams comprising locations of moving objects. We present algorithms for tracking the ``extent'' of such points in both these models that fit the stream constraint of polylogarithmic space and time. Our work adds to the growing knowledge of data stream algorithms in general by initiating the study of new models and algorithms for spatial data such as location streams. %A German A. Enciso %A Eduardo D. Sontag %T A Note on a Monotone Small Gain Theorem, with Applications to Delay Systems %D Sat Oct 30 17:19:08 2004 %Z Sat Oct 30 17:42:17 EDT 2004 %I DIMACS %R 2004-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-29.ps.gz %X system by writing it as the negative feedback closed loop of a controlled monotone system. We associate to this continuous system a discrete one in a lower dimensional state space, whose global attractivity to an equilibrium implies that of the continuous system to some state. The proofs are given in an abstract setting, and an application is given to a lac operon model with time delay. %A Fanica Gavril %T k-Interval-filament graphs %D Sat Oct 30 17:53:19 2004 %Z Sat Oct 30 17:53:24 EDT 2004 %I DIMACS %R 2004-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-30.pdf %X if it is acyclic and for every directed path $p=u1-> u2-> ...->uk+2$ with $k+2$ vertices, $p$ induces a clique if each of the two subpaths $u1->u2->...->uk+1$ and $u2->...->uk+2$ induces a clique. We describe an algorithm to find a maximum weight clique in a $k$-transitive graph. Consider a hereditary family $G$ of graphs. 
A graph $H(V,E)$ is called $G-k-mixed$ if its edge set can be partitioned into two disjoint subsets $E1$, $E2-E3$ such that $H(V,E1) \in G$, $H(V,E2)$ is transitive, $H(V,E2-E3)$ is $k$-transitive and for every three distinct vertices $u,v,w$ if $u->v \in E2$ and $(v,w) \in E1$ then $(u,w) \in E1$. The letter $G$ is generic and can be replaced by names of specific families. We show that if the family $G$ has a polynomial time algorithm to find a maximum clique, then, under certain restrictions, there exists a polynomial time algorithm to find a maximum clique in a family of $G-k$-mixed graphs.

Let $I$ be a family of intervals on a line $L$ in a plane $PL$ such that every two intersecting intervals have a common segment. In $PL$, above $L$, construct to each interval $i(v) \in I$ a filament $v $connecting its two endpoints, such that for every two filaments $u,v$ having $u \bigcap v \not = empty_set $ and disjoint intervals $i(u)<i(v)$, there are no $k$ filaments $w$ with $i(u)<i(w)<i(v)$ which intersect neither $u$ nor $v$, are mutually disjoint and have mutually disjoint intervals. This is a $family of k-interval-filaments$ and its intersection graph is a $k-interval-filament graph$; their complements are ($k$-transitive) mixed graphs and have a polynomial time algorithm for maximum cliques. Now, when two filaments $u,v$ do not intersect because $u \subset v$, and between $u$ and $v$ there are at most $k-1$ non-intersecting filaments $w1,...,wk-1$ such that $wi \subset wi+1$ and intersect neither $u$ nor $v$, we attach to each one of $u,v$ a branch in the space above $PL$ such that the two branches intersect. This is a$ family of general-k-interval-filaments$ and its intersection graph is a $general-k-interval-filament graph$; their complements are ($k$-transitive)-$k$-mixed graphs. %A E. Boros %A K. Elbassioni %A V. Gurvich %A L. Khachiyan %T A Global Parallel Algorithm for Finding All Minimal Transversals of Hypergraphs of Bounded Edge-size %D Sat Oct 30 17:53:39 2004 %Z Sat Oct 30 17:53:45 EDT 2004 %I DIMACS %R 2004-31 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-31.ps.gz %X $\cH\subseteq 2^V$ with maximum edge-size less than $\delta$, outputs all minimal transversals (or equivalently all maximal independent sets) of $\cH$ in time $O(\delta^2\log^2 n\log m)$ using $O(n^{\delta\log\delta+2} m^{\delta})$ processors on a CREW-PRAM, where $m$ is the total number of minimal transversals and $n=|V|$. 
This algorithm can be modified so that, for any integer $k$, it outputs $k$ minimal transversals of $\cH$ in time $O(\delta^2\log^2 n \log k+T)$ using $O(n^{\delta\log\delta+1} k^{\delta}+k\Pi)$ processors, where $T$ and $\Pi$ are respectively the parallel time and the number of processors required to generate a single minimal transversal of $\cH$. The latter problem is known to be in RNC for hypergraphs of bounded edge-size. This strengthens and generalizes a previously known result for graphs. We also obtain a similar result for hypergraphs whose transversal hypergraphs are $\delta$-conformal for some constant $\delta$.

**Key Words**:

Bounded dimension, conformal hypergraph, dualization, global
parallel algorithm, hypergraph, incremental generating,
maximal independent set, minimal transversal.
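
To make the objects in this abstract concrete, here is a small sequential sketch that enumerates the minimal transversals (minimal hitting sets) of a hypergraph by brute force. It is not the parallel algorithm of the report and takes exponential time in $n=|V|$; the function name and the example hypergraph are illustrative assumptions.

```python
from itertools import combinations


def minimal_transversals(vertices, edges):
    """Return all minimal transversals of the hypergraph (vertices, edges).

    A transversal is a vertex set that intersects every edge. Candidates are
    scanned by increasing size, so a transversal is minimal exactly when it
    contains no transversal found earlier.
    """
    edge_sets = [frozenset(e) for e in edges]
    found = []
    for size in range(len(vertices) + 1):
        for candidate in combinations(sorted(vertices), size):
            s = frozenset(candidate)
            hits_all = all(s & e for e in edge_sets)
            if hits_all and not any(t <= s for t in found):
                found.append(s)
    return found


if __name__ == "__main__":
    # Path on four vertices, viewed as a hypergraph with edge-size 2.
    V = {1, 2, 3, 4}
    E = [{1, 2}, {2, 3}, {3, 4}]
    for t in minimal_transversals(V, E):
        # The complement of a minimal transversal is a maximal independent set.
        print(sorted(t), "complement:", sorted(V - t))
```

The complements printed alongside each transversal illustrate the equivalence with maximal independent sets mentioned in the abstract.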
%A Amir Herzberg
%T Controlling Spam by Secure Internet Content Selection
%D Sat Oct 30 17:55:18 2004
%Z Sat Oct 30 17:55:23 EDT 2004
%I DIMACS
%R 2004-32
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-32.pdf
%X users and service providers. We present the Secure Internet Content Selection
(SICS) protocol, an efficient cryptographic mechanism for spam-control, based
on allocation of responsibility (liability). With SICS, e-mail is sent with
a content label, and a cryptographic protocol ensures labels are authentic
and penalizes falsely labeled e-mail (spam). The protocol supports trusted
senders (penalized by loss of trust) and unknown senders (penalized
financially). The recipient can determine the compensation amount for
falsely labeled e-mail (spam). SICS is practical, with negligible overhead,
gradual adoption path, and use of existing relationships; it is also flexible
and appropriate for most scenarios, including deployment by end users and/or
ISPs and support for privacy and legitimate, properly labeled commercial
e-mail. SICS improves on other crypto-based proposals for spam controls,
and complements non-cryptographic spam controls.
%A Akshay Vashist
%A Casimir A. Kulikowski
%A Ilya Muchnik
%T Automatic screening for groups of orthologous genes in comparative genomics using multiple-component clustering
%D Sat Oct 30 17:55:37 2004
%Z Sat Oct 30 17:55:44 EDT 2004
%I DIMACS
%R 2004-33
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-33.ps.gz
%X organisms is a problem in modeling evolutionary history while solving
practical problems related to functional annotation of genes. We have
developed an automatic method for discovering groups of gene sequences
present in different organisms that are functionally related through
evolution.

We have developed a new clustering method, which allows us to build clusters from multi-component types of data. In our case the data is a large set of genomes in which one has to find clusters that are groups of orthologous genes, focusing on hyper-inter-similarities among genes from different genomes more than the intra-similarities among genes from the same genome.

We have found that discovering these groups provides a "strong draft" of the complete picture of orthologous relations among genes in the complete genomes studied. Comparisons of these groups with the well-known semi-automatically extracted clusters of orthologous groups, COG [http://www.ncbi.nlm.nih.gov/COG/] shows strong correlation between these two systems of clusters. For instance, more than 85% of our clusters include genes from at least three different genomes and each of these genes belongs to COGs. These studies demonstrate that the method can be applied for an automatic screening of groups of orthologous genes in analyzing a large collection of genomes from different organisms. %A Alexander Kelmans %T On Hamiltonicity Of Claw- And Net-Free Graphs %D Sat Oct 30 17:56:17 2004 %Z Sat Oct 30 17:58:57 EDT 2004 %I DIMACS %R 2004-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-34.ps.gz %X for a claw- and net-free graph $G$ with $s,t \in V(G)$ and $e \in E(G)$ to have $(a1)$ a Hamiltonian path and Hamiltonian $s$--path, $(a2)$ a Hamiltonian $st$--path and Hamiltonian $s$- and $st$--paths containing $e$ if $G$ has connectivity one, and $(a3)$ a Hamiltonian cycle containing $e$ if $G$ is 2--connected. For example, we show that {\em if $G$ is a 2--connected claw- and net-free graph and $e = xy \in E(G)$ then $e$ belongs to a Hamiltonian cycle of $G$ if and only if $G - \{x,y\}$ is connected.} These results imply that a connected claw- and net-free graph has a Hamiltonian path and a 2--connected claw- and net-free graph has a Hamiltonian cycle [D. Duffus, R.J. Gould, M.S. Jacobson, Forbidden Subgraphs and the Hamiltonian Theme, in The Theory and Application of Graphs $($Kalamazoo, Mich., 1980$)$, Wiley, New York (1981) 297--316], and are used in [A. Kelmans, On the Claw- and Net-Free Graphs, a manuscript (1999)] to obtain more general Hamiltonicity results for $k$--connected graphs. 
Our proofs of $(a1)-(a3)$ are shorter than the proofs of their corollaries in [D. Duffus, R.J. Gould, M.S. Jacobson, Forbidden Subgraphs and the Hamiltonian Theme, in The Theory and Application of Graphs $($Kalamazoo, Mich., 1980$)$, Wiley, New York (1981) 297--316], and provide polynomial time algorithms for solving the corresponding Hamiltonicity problems.

**Keywords:** claw, net, graph, Hamiltonian path, Hamiltonian
cycle, polynomial time algorithm.
%A Alexander Kelmans
%T On Claw- And Net-Free Graphs
%D Sat Oct 30 17:56:31 2004
%Z Sat Oct 30 17:59:10 EDT 2004
%I DIMACS
%R 2004-35
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-35.ps.gz
%X net-free graphs. Using these properties, we find (among other results)
necessary and sufficient conditions for a claw- and net-free graph $G$ with
$s, t \in V(G)$, $s \ne t$, $e \in E(G)$, and disjoint paths $S$ and
$T$ in $G$ to have:

$(c1)$ a Hamiltonian $st$-path,

$(c2)$ a Hamiltonian $s$-path containing $e$,

$(c3)$ a Hamiltonian $st$-path containing $e$ if $G$ is 3-connected,

$(c4)$ a Hamiltonian cycle containing $S$ if $G$ is $k$-connected, $k \ge 2$, and $v(S) \le k$,

$(c5)$ a Hamiltonian path containing $e$ and $S$ if $G$ is $(k+1)$-connected, $k \ge 2$, and $v(S) \le k$,

$(c6)$ a Hamiltonian $st$-path containing $S$ and $T$ if $G$ is $k$-connected, $k \ge 3$, $v(S) + v(T) \le k$, and $s$, $t$ are end-vertices of paths $S$ and $T$, respectively.

From the above mentioned results we obtain, in particular:

Let $G$ be a claw- and net-free graph. If $G$ is 4-connected, then for any two different vertices $s$, $t$ and any edge $e$ in $G$ there is a Hamiltonian $st$-path in $G$ containing $e$ and, in particular, every two edges in $G$ belong to a Hamiltonian cycle. If $G$ is 3-connected then every two non-adjacent edges in $G$ belong to a Hamiltonian cycle.

**Keywords:** claw, net, graph, Hamiltonian path, Hamiltonian
cycle, polynomial time algorithm.
%A Patrick De Leenheer
%A Eduardo Sontag
%T A note on the monotonicity of matrix Riccati equations
%D Sat Oct 30 17:56:46 2004
%Z Sat Oct 30 17:59:16 EDT 2004
%I DIMACS
%R 2004-36
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-36.ps.gz ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-36.pdf
%X equations is monotone. Consequently, many results for Riccati equations
can be obtained easily using standard as well as more recent results from
the theory of monotone systems.
%A James Abello
%A Graham Cormode
%T Report on DIMACS Working Group Meeting: Data Mining and Epidemiology, March 18-19, 2004
%D Sat Oct 30 17:56:55 2004
%Z Sat Oct 30 17:59:22 EDT 2004
%I DIMACS
%R 2004-37
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-37.pdf
%X and explaining patterns of health and disease in populations, usually of
humans, but also populations of animals, insects and plants. Data mining
is an active area of research interested in finding algorithms for
describing latent patterns in often very large data sets. This Working
Group had the objective of fostering collaboration between these two
disciplines. In March of 2004 it organized a two-day meeting at DIMACS to
bring these two groups together in a format designed to initiate such
collaborations.
%A Ahmed Belal
%A Amr Elmasry
%T Verification of Minimum-Redundancy Prefix Codes
%D Sat Oct 30 17:57:21 2004
%Z Sat Oct 30 17:59:30 EDT 2004
%I DIMACS
%R 2004-38
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-38.ps.gz
%X $\Omega(n\log{n})$, indicating that the verification problem is not
asymptotically easier than the construction problem. Alternatively,
we give linear-time verification algorithms for several special cases that
are either typical in practice or theoretically interesting.
%A Amr Elmasry
%T Distribution-Sensitive Priority Queues
%D Sat Oct 30 18:01:57 2004
%Z Sat Oct 30 18:02:02 EDT 2004
%I DIMACS
%R 2004-39
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-39.ps.gz
%X constant and the cost per deletion or minimum-deletion of an item $x$ is
$O(\log{k_x})$, where $k_x$ is the number of items that are inserted during
the lifespan of $x$ and are still in the heap when $x$ is deleted. We achieve
the above bounds in both the amortized case and the worst case. Several
applications of our structure are mentioned.
%A Yangzhe Xiao
%A Haym Hirsh
%A Casimir Kulikowski
%A Michael Littman
%A Ilya Muchnik
%T Category-based feature extraction in supervised categorization of Aviation Safety Report System documents
%D Sat Oct 30 17:58:23 2004
%Z Sat Oct 30 18:02:11 EDT 2004
%I DIMACS
%R 2004-40
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-40.ps.gz
%X for text categorization, which are based on characteristic
descriptions of categories. A new category-based feature is
derived from the relationship between a document and such a
description. A document is then projected onto the category-based
coordinates in the new feature space. We evaluate two different
approaches for extracting category-based features. One is a
category-specific weighting, where a description of a category is
composed of the relative discriminating powers of all terms
w.r.t. the category. The new feature based on this category is
the weighted sum of all terms of a document vector according to
this description. The other is classifier-based, where a
description is a learned classifier of a category and the new
feature for a document is the judgment of this classifier for
this document. We evaluate our new feature extraction methods for
the Aviation Safety Report System documents using Support Vector
Machines. The new category-based feature extraction methods give
comparable results to the best ones obtained using feature
selection with Chi-square-based term ranking.
%A Henrik Bjorklund
%A Olle Nilsson
%A Ola Svensson
%A Sergei Vorobyov
%T The Controlled Linear Programming Problem
%D Sat Oct 30 18:02:31 2004
%Z Sat Oct 30 18:04:26 EDT 2004
%I DIMACS
%R 2004-41
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-41.ps.gz
%X Programming Problem. In a system of linear constraints of the form
$x_i\leq p_i^j(\bar x)+w_i^j$, where $p_i^j$ are linear homogeneous
polynomials with nonnegative coefficients, some variables are
controlled and the controller wants to select exactly one
constraint for each controlled variable in a way that makes $\max\sum
x_i$ subject to the remaining constraints as large as possible (over
all selections). We suggest several iterative strategy improvement
policies (simplex-like and interior-point), prove optimality
conditions in several important cases, describe subexponential
algorithms, including single-, multiple-switch,
and interior point ones, and show that the decision version of the
problem is a member of NP $\cap$ coNP whenever
stability yields optimality, and also when all coefficients are
nonnegative integers.

It turns out that the well known mean payoff, discounted payoff, and simple stochastic games are easily reducible to the Controlled LP-Problem. This suggests a simple unifying view of all these games as particular restricted instances of the problem.

We show that a slight generalization of the problem allowing for negative coefficients in the constraint polynomials $p_i^j$ leads to NP-hardness. %A S. Muthukrishnan %T Nonuniform Sparse Approximation with Haar Wavelet Basis %D Sat Oct 30 18:02:44 2004 %Z Sat Oct 30 18:05:21 EDT 2004 %I DIMACS %R 2004-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-42.ps.gz %X a given signal on $N$ points as a linear combination of at most $B$ ($B\ll N$) elements from a dictionary so that the error of the representation is minimized; traditionally, error is taken {\em uniformly} as the sum of squares of errors at each point.

In this paper, we initiate the study of {\em nonuniform} sparse approximation theory where each point has an {\em importance}, and we want to minimize the sum of individual errors weighted by their importance. In particular, we study this problem with the basic Haar wavelet dictionary that has found many applications since being introduced in 1910. Parseval's theorem from 1799, which is central in solving uniform sparse approximation for Haar wavelets, does not help under nonuniform importance.
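
The contrast between the uniform and nonuniform objectives can be illustrated with a short sketch. It implements the classical uniform-error approach (keep the $B$ largest orthonormal Haar coefficients, which Parseval's theorem justifies) and a separate function for the importance-weighted error. This is not the algorithm of the report; all function names are hypothetical, and the signal length is assumed to be a power of two.

```python
import math


def haar_transform(signal):
    """Orthonormal Haar transform; len(signal) must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(x) for x in signal]
    length = n
    while length > 1:
        half = length // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        dif = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:length] = avg + dif
        length = half
    return coeffs


def inverse_haar(coeffs):
    """Inverse of haar_transform."""
    out = list(coeffs)
    length = 2
    while length <= len(out):
        half = length // 2
        rec = [0.0] * length
        for i in range(half):
            a, d = out[i], out[half + i]
            rec[2 * i] = (a + d) / math.sqrt(2)
            rec[2 * i + 1] = (a - d) / math.sqrt(2)
        out[:length] = rec
        length *= 2
    return out


def top_b_approximation(signal, b):
    """Keep the b largest-magnitude coefficients (optimal for uniform error only)."""
    coeffs = haar_transform(signal)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:b])
    return inverse_haar([c if i in keep else 0.0 for i, c in enumerate(coeffs)])


def weighted_error(signal, approx, importance):
    """Sum of squared pointwise errors, each weighted by its importance."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(signal, approx, importance))


if __name__ == "__main__":
    signal = [4, 4, 2, 2]
    approx = top_b_approximation(signal, 1)
    print(approx)
    print(weighted_error(signal, approx, [1, 1, 1, 1]))  # uniform importance
    print(weighted_error(signal, approx, [1, 0, 0, 0]))  # nonuniform importance
```

Under uniform importance the error equals the energy of the dropped coefficients. Once the weights differ, that identity no longer holds, so the top-$B$ choice need not minimize the weighted error.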

We present the first known polynomial time for the problem of finding $B$ wavelet vectors to represent a signal of length $N$ so that the representation has the smallest error, averaged over the given importance of the points. The algorithm takes time $O(N^2B/\log B)$. When the importance function is well-behaved, we present another algorithm that takes near-linear time. Our methods also give first known, efficient algorithms for a number of related problems with nonuniform importance. %A Denise Sakai Troxell %T On Critical Trees Labeled with a Condition at Distance Two %D Sat Oct 30 18:03:00 2004 %Z Sat Oct 30 18:05:27 EDT 2004 %I DIMACS %R 2004-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-43.ps.gz %X integers to its vertices so that adjacent vertices get labels at least two apart and vertices at distance two get distinct labels. A graph is said to be lambda-critical if lambda is the minimum span taken over all of its L(2,1)-labelings, and every proper subgraph has an L(2,1)-labeling with span strictly smaller than lambda. Georges and Mauro have studied 5-critical trees with maximum degree 3 by examining their path-like substructures. They also presented an infinite family of 5-critical trees of maximum degree 3. We generalize these results for lambda-critical trees with maximum degree greater than 3. %A Endre Boros %A Khaled Elbassioni %A Vladimir Gurvich %A Leonid Khachiyan %T Computing Many Maximal Independent Sets for Hypergraphs in Parallel %D Sat Oct 30 18:03:10 2004 %Z Sat Oct 30 18:05:33 EDT 2004 %I DIMACS %R 2004-44 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-44.ps.gz %X $\delta$-sparse} if for every nonempty subset $X\subseteq V$ of vertices, the average degree of the sub-hypergraph of $\cH$ induced by $X$ is at most $\delta$. 
We show that there is a deterministic algorithm that, given a uniformly $\delta$-sparse hypergraph $\cH$, and a positive integer $k$, outputs $k$ or all minimal transversals for $\cH$ in $O(\delta \log (1+k)\mbox{polylog}(\delta|V|))$-time using $|V|^{O(\log \delta)}k^{O(\delta)}$ processors. Equivalently, the algorithm can be used to compute in parallel $k$ or all maximal independent sets for $\cH$. %A Steven B. Horton %A Claudio N. Meneses %A Arup Mukherjee %A M. Erol Ulucakli %T A Computational Study of the Broadcast Domination Problem %D Sat Oct 30 18:03:24 2004 %Z Sat Oct 30 18:05:39 EDT 2004 %I DIMACS %R 2004-45 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-45.ps.gz %X s \}$ that minimizes $\sum_{v \in V} f(v)$ with the constraint that for every $v $ with $f(v)=0$, there is a vertex $u \in V$ with $f(u)$ at least the distance f rom $u$ to $v$ in $G$. This is a generalization of the standard Dominating Set problem. This note investigates the problem from a computational standpoint. %A Michael Markov %A Vadim Mottl %A Ilya Muchnik %T Principles of nonstationary regression estimation: A new approach to dynamic multi-factor models in finance %D Sat Oct 30 18:03:38 2004 %Z Sat Oct 30 18:05:46 EDT 2004 %I DIMACS %R 2004-47 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-47.pdf %X to detect the hidden dynamics of an investment instrument or a portfolio in respect to certain market or economic factors. Such problems can be naturally formulated as a complex of problems concerned with estimating a nonstationary linear regression model under additional constraints and requirements which have been not considered in the classical methodology of signal analysis. These problems are adequate also to many other engineering and scientific applications.

We review existing financial multi-factor models from the standpoint of their performance in
detecting hidden investment portfolio dynamics. Using practical examples we present and analyze
the shortcomings of these models in detecting both gradual and rapid changes in investment
portfolio structure. We then lay the groundwork for a new approach, which we call Dynamic
Style Analysis (DSA), representing a true time-series multi-factor portfolio analysis model.
At the core of the methodology we present a new dynamic regression model, which we call Constrained
Flexible Least Squares (CFLS). One of the most important features of the DSA model is
that it is fully adaptive, i.e., all model parameters are determined from data. The major concepts
of the new methodology are gradually introduced and applied to analyses of both model portfolios
and well-known public US mutual funds. By comparing publicly available holdings data with
results obtained with DSA, we demonstrate both the superiority of the new model and its remarkable
accuracy in detecting portfolio dynamics. We also address issues such as the computational
complexity of DSA and its practical applications in the areas of risk management, performance
measurement and investment research. One of the major applications of the new methodology
lies in hedge fund due diligence and risk monitoring, where the importance of uncovering and
controlling hidden factor dynamics is especially valuable given the limited transparency of hedge
funds.
%A Zdenek Dvorak
%A Vit Jelinek
%A Daniel Kral
%A Jan Kyncl
%A Michael Saks
%T Three Optimal Algorithms for Balls of Three Colors
%D Sat Oct 30 18:03:47 2004
%Z Sat Oct 30 18:05:52 EDT 2004
%I DIMACS
%R 2004-48
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-48.ps.gz
%X a coloring of n balls with three colors. At each step, Paul chooses
a pair of balls and asks Carol whether the balls have the same color.
Carol truthfully answers yes or no. In the Plurality problem, Paul
wants to find a ball with the most common color. In the Partition
problem, Paul wants to partition the balls according to their colors.
He wants to ask Carol the least number of questions to reach his goal.
We find optimal deterministic and probabilistic strategies for the
Partition problem and an asymptotically optimal probabilistic strategy
for the Plurality problem.
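
The query model for the Partition problem can be sketched as follows. This is a naive baseline, not one of the optimal strategies from the report: Paul compares each new ball against one representative of every color class found so far, and skips the last comparison once all three classes are known. The function name and the oracle interface `same_color(a, b)` are assumptions made for illustration.

```python
def partition_balls(n, same_color):
    """Partition balls 0..n-1 into color classes using pairwise queries.

    same_color(a, b) plays the role of Carol and returns True exactly when
    balls a and b have the same color. Assumes at most three colors.
    Returns (classes, number_of_queries).
    """
    classes = []
    queries = 0
    for ball in range(n):
        placed = False
        for index, cls in enumerate(classes):
            # With three known classes, a ball that differs from the first
            # two must belong to the third, so no query is needed.
            if len(classes) == 3 and index == 2:
                cls.append(ball)
                placed = True
                break
            queries += 1
            if same_color(cls[0], ball):
                cls.append(ball)
                placed = True
                break
        if not placed:
            classes.append([ball])
    return classes, queries


if __name__ == "__main__":
    colors = ["red", "blue", "red", "green", "blue", "green", "red"]
    classes, queries = partition_balls(len(colors), lambda a, b: colors[a] == colors[b])
    print(classes, queries)
```

This baseline asks at most two questions per ball after the first, which gives a simple reference point against which better strategies can be compared.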
%A Anna C. Gilbert
%A S. Muthukrishnan
%A Martin J. Strauss
%T Improved Time Bounds for Near-Optimal Sparse Fourier Representations
%D Sat Oct 30 18:03:57 2004
%Z Sat Oct 30 18:06:00 EDT 2004
%I DIMACS
%R 2004-49
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-49.ps.gz
%X B terms for a given discrete signal A of length N. The
Fast Fourier Transform (FFT) can find the optimal N-term
representation in O(N*log(N)) time, but our goal is to get
sublinear algorithms when B

Our main result is a significantly improved algorithm for this problem and the d-dimensional analog. Our algorithm outputs an R with the same approximation guarantees but it runs in time B*poly(d,log(N),log(M),1/\epsilon).

We need two crucial ideas to achieve this bound: bulk sampling and estimation for multipoint polynomial evaluation using an unevenly-spaced Fourier transform, and construction and use of arithmetic-progression independent random variables. %A Bin Tian %T A $O(n^2)$ lower bound for swaping the order of $n$ input bits in planar boolean circuit %D Sat Oct 30 18:04:11 2004 %Z Sat Oct 30 18:06:08 EDT 2004 %I DIMACS %R 2004-51 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-51.ps.gz %X surprisingly hard. One example is to swap the order of $n$ input bits in a planar boolean circuit. Mario Szegedy conjectured that $O(n^2)$ gates are necessary. This paper gives a short proof. %A S. Muthukrishan %A Martin J. Strauss %T Approximate Histogram and Wavelet Summaries of Streaming Data %D Mon Nov 1 23:05:16 2004 %Z Mon Feb 7 17:11:46 EST 2005 %I DIMACS %R 2004-52 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-52.ps.gz %X optimal or nearly optimal, from streaming data. Specifically, we consider a vector A given either as a stream of aggregate values A[0],A[1],... or a stream of updates of the form "add x to A[i]." We then build a histogram H; that is, a piecewise-constant vector with a given number B of pieces. We also consider the related representation formats of piecewise-linear representations and Haar wavelet representations.

We consider first point queries to the data. The point query i should ideally be answered by A[i]; instead, it will be answered by H[i], leading to square error (A[i]-H[i])^2. We want to minimize or nearly minimize thus sum square error. Next, we consider range queries, of the form (ell,r), which should ideally be answered by the sum of A[i] over i in the range ell to r. Finally, we consider some of these questions with regard to two-dimensional signals A[i,j] indexed by two (or more) integers instead of just a single integer. %A Solomon Borodkin %A Aleksey Borodkin %A Ilya Muchnik %T Optimal Mapping of Deep Gray Scale Images to a Coarser Scale of Gray %D Fri Nov 5 17:09:24 2004 %Z Mon Feb 7 17:12:05 EST 2005 %I DIMACS %R 2004-53 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-53.pdf %X and explored. A distance between images having different intensity ranges is introduced. Minimization of such a distance can be viewed as least squares approximation of a source high range image by the best target image with a given number of levels of gray. Following S.Lloyd [1], we proved that the latter problem is equivalent to optimal partitioning of the source image intensity range into a given number of intervals, provided that the sum of intra-interval variations reaches minimum. An efficient algorithm for optimal partitioning based on dynamic programming is used, which is the same in complexity as the ones known from literature, but better in terms of required memory. The proposed approach is applied to visualization of deep gray scale medical images. Advantages of the method over linear mapping and histogram equalization are demonstrated on the sample images. The other application fields may include image optimization for printing/faxing/copying. %A Natalia L. Komarova %A Liming Wang %T Initiation of colorectal cancer: where do the two hits hit? 
%D Wed Nov 17 20:25:05 2004 %Z Mon Feb 7 17:12:19 EST 2005 %I DIMACS %R 2004-54 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-54.ps.gz %X importance for colorectal cancer initiation. The earliest event being the inactivation of both alleles of the Adenomatous Polyposis Coli (APC) gene, it is thought that the stem cells are the most likely target for these two first hits. Indeed, at the first glance, short-lived differentiated cells cannot sustain a mutation long enough for the second hit to occur, because of the constant apoptosis/renewal process in epithelial tissues. Using a straightforward calculation, we show that this intuitive argument is incorrect. Our model based on the conventional view of colon crypt architecture, suggests that at least one of the two hits may occur in the migrating compartment. We suggest that a possible role of differentiating cells in cancer initiation cannot be discarded simply based on the fact that they are short--lived. More evidence is needed to understand the cellular origins of cancer and to identify whether or not a double hit in a daughter cell can be ``immortalizing''. In this study we discuss several scenarios and propose some experiments which can shed light on these questions. %A Logan Everett %A Endre Boros %T Optimal Protein Encoding %D Sat Nov 20 11:18:02 2004 %Z Mon Feb 7 17:12:45 EST 2005 %I DIMACS %R 2004-55 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-55.pdf %X for matching sequences. In protein sequence databases, searching is hindered by both the increased amount of data and the complexity of sequence similarity metrics. Protein similarity is not simply a matter of character matching, but rather is determined by a matrix of scores assigned to every match and mismatch (Henikoff and Henikoff, 1992). 
One strategy to increase search speed is to map sequences into a binary space where the Hamming distance between strings is comparable to the similarities of the original sequences (Halperin, Buhler,Karp, Krauthgamer and Westover, 2003). Within binary Hamming spaces, statistically proven sampling methods can be used for fast, reliably sensitive searches (Buhler, 2002). However, mapping the protein alphabet to a binary Hamming space often comes with a certain level of distortion. We have employed Linear Programming techniques to model and study the nature of these mapping schemes. Specifically, we have found the theoretically minimum distortion achievable for several biological scoring matrices, as well as corresponding optimal encoding weights. We have also analyzed the use of these encoding weights to generate pseudo-random binary encodings that approach the theoretically minimum distortions. %A Henrik Bjorklund %A Olle Nilsson %A Ola Svensson %A Sergei Vorobyov %T Controlled Linear Programming: Duality and Boundedness %D Sat Dec 18 13:45:12 2004 %Z Mon Feb 7 17:13:06 EST 2005 %I DIMACS %R 2004-56 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-56.ps.gz %X (CLPP) introduced in [DIMACS-TR-2004-41], by defining and studying the Dual CLPP, boundedness, duality, stability, optimality conditions, correctness of subexponential algorithms, conditions for NP$\cap$coNP-membership. %A Liming Wang %T Does Wee1 inhibits the entry to M phase? %D Sat Dec 18 13:46:57 2004 %Z Mon Feb 7 17:13:38 EST 2005 %I DIMACS %R 2004-57 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2004/2004-57.ps.gz %X M phase. At the first glance, it is quite obvious since active Wee1 helps adding a phosphate to the tyrosine-15 (Tyr15) site of the active M phase promoting factor (MPF-P) to make it inactive. 
But on the other hand, active Wee1 can also add a phosphate to the non-phosphorylated MPF (MPF); the product then has a great chance to be phosphorylated at the threonine-161 (Thr161) site. Active cdc25 will then help to remove the phosphate at Tyr15 to make it active. In this way, Wee1 acts positively on the entry to M phase. This paper addresses the question of whether Wee1 inhibits or promotes M phase from a mathematical point of view. Both assumptions, mass action and Michaelis-Menten, are discussed. Under the assumption of mass action, Wee1 always inhibits the M phase, while under Michaelis-Menten it depends on certain parameters. Biological experiments suggest that the parameters are in the range in which Wee1 inhibits M phase. %A Dmitriy Fradkin %A Michael Littman %T Exploration Approaches to Adaptive Filtering %D Wed Jan 19 20:27:59 2005 %Z Thu Oct 13 18:45:47 EDT 2005 %I DIMACS %R 2005-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-01.ps.gz %X The exploration-exploitation trade-off is particularly important in the adaptive filtering setting, where feedback is received only on submitted documents and, therefore, is scarce. We examine the effect of certain exploration techniques on the performance of simple linear classifiers. %A Jamiru Luttamaguzi %A Michael Pelsmajer %A Zhizhang Shen %A Boting Yang %T Integer Programming Methods for Several Optimization Problems in Graph Theory %D Wed Jan 19 20:35:30 2005 %Z Thu Oct 13 18:47:43 EDT 2005 %I DIMACS %R 2005-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-02.ps.gz %X In this paper, we discuss how to solve several graph theory based optimization problems by following an {\em integer programming} approach. In particular, we discuss such problems as the space-filling problem, the bandwidth problem, the cut-width problem, and the feedback edge/vertex problem. 
We present respective IP formulations, and carry out some preliminary output analysis. %A Meltem Ozturk %A Alexis Tsoukias %T Modelling continuous positive and negative reasons in decision aiding %D Tue Jan 25 14:37:11 2005 %Z Thu Oct 13 18:47:53 EDT 2005 %I DIMACS %R 2005-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-03.ps.gz %X The use of positive and negative reasons in inference and decision aiding is a recurrent issue of investigation. A language enabling us to explicitly take into account such reasons is Belnap's logic and the four valued logics derived from it. In this paper, we explore the interpretation of a continuous extension of a four-valued logic as a necessity degree (in possibility theory). It turns out that, in order to take full advantage of the four values, we have to consider "sub-normalized" necessity measures. Under such a hypothesis four-valued logics become the natural logical frame for such an approach. %A Jihui Zhao %A Vladimir Gurvich %A Leonid Khachiyan %T Extending Dijkstra's Algorithm to Shortest Paths with Blocks %D Thu Feb 10 18:23:59 2005 %Z Thu Oct 13 18:47:57 EDT 2005 %I DIMACS %R 2005-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-04.ps.gz %X We consider the problem of computing shortest paths in a directed arc-weighted graph G=(V,A) in the presence of an adversary that can block, for each vertex v, some subsets X(v) of the arcs leaving v. We assume that if X(v) is an admissible blocking set of arcs at v, then so is any subset of X(v). In other words, for each vertex v, the hypergraph B(v) of all admissible blocks at v is an independence system. We show that when the independence systems B(v) are given by an arbitrary membership oracle, and the input arc-weights are non-negative, the single-destination version of the problem can be solved by a natural extension of Dijkstra's algorithm in O(|A| log|V|) time and at most |A| monotonically increasing blocking tests. 
We obtain better bounds for the special case where each blocking system B(v) consists of all arc-sets X(v) of given cardinality p(v), and also show that computing shortest paths with blocks becomes NP-hard when the adversary can block a given number of arcs arbitrarily distributed in the input graph. %A Henrik Bjorklund %A Ola Svensson %A Sergei Vorobyov %T Linear Complementarity Algorithms for Mean Payoff Games %D Thu Mar 24 10:44:30 2005 %Z Thu Oct 13 18:48:02 EDT 2005 %I DIMACS %R 2005-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-05.ps.gz %X We suggest new pseudopolynomial and subexponential algorithms for Mean Payoff Games (MPGs). The algorithms are based on representing the MPGs decision problem in the forms of non-standard and standard LCPs, Linear Complementarity Problems (find $w,z\geq 0$ satisfying $w=Mz+q$ and $w^T\cdot z=0$), and monotonic iterative propagation of slack updates. %A Diogo Andrade %A Endre Boros %A Vladimir Gurvich %T Even-hole-free and Balanced Circulants %D Wed Feb 23 16:57:56 2005 %Z Thu Oct 13 18:48:06 EDT 2005 %I DIMACS %R 2005-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-06.ps.gz %X In this paper some well-known conjectures about the even-hole-free graphs and balanced graphs are verified under the additional assumption of circular symmetry.

Keywords: balanced graph, circulant, circular symmetry, even hole, even-hole-free graph %A Vladimir Gurvich %T On the misere version of game Euclid and miserable games %D Wed Feb 23 17:00:06 2005 %Z Thu Oct 13 18:48:11 EDT 2005 %I DIMACS %R 2005-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-07.ps.gz %X Recently, Gabriel Nivasch got the following remarkable formula for the Sprague-Grundy function of game Euclid: $g^+(x,y) = \lfloor |x/y - y/x| \rfloor$ for all integer $x,y \geq 1$. We consider the corresponding misere game and show that its Sprague-Grundy function $g^-(x,y)$ is equal to $g^+(x,y)$ for all positions $(x,y)$, except for the case when $(x,y)$ or $(y,x)$ equals $(k F_i , k F_{i+1})$, where $F_i$ is the $i$-th Fibonacci number and $k$ is a positive integer. It is easy to see that these exceptional {\em Fibonacci} positions are exactly those in which all further moves in the game are forced (unique) and hence, the results of the normal and misere versions are opposite; in other words, for these positions $g^+$ and $g^-$ take values 0 and 1 so that $g^- = 1 - g^+ = (-1)^i + g^+ $.
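The closed formula quoted above can be checked by brute force on small positions. The following Python sketch is only an illustration under an assumed rule set, which the abstract does not spell out: a move subtracts a positive multiple of the smaller number from the larger one, both numbers must stay positive, and the player with no move loses (normal play). The function names `grundy` and `nivasch_formula` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def grundy(x, y):
    """Normal-play Sprague-Grundy value of position (x, y), by brute force.

    Assumed rules (not stated in the abstract): subtract a positive
    multiple of the smaller number from the larger one, keeping both
    numbers positive; a player who cannot move loses.
    """
    if x < y:
        x, y = y, x
    # All k >= 1 with x - k*y >= 1, i.e. k <= (x - 1) // y.
    reachable = {grundy(x - k * y, y) for k in range(1, (x - 1) // y + 1)}
    g = 0
    while g in reachable:  # minimum excludant (mex)
        g += 1
    return g


def nivasch_formula(x, y):
    """floor(|x/y - y/x|), evaluated exactly as floor(|x^2 - y^2| / (x*y))."""
    return abs(x * x - y * y) // (x * y)


if __name__ == "__main__":
    limit = 40
    mismatches = [
        (x, y)
        for x in range(1, limit + 1)
        for y in range(1, limit + 1)
        if grundy(x, y) != nivasch_formula(x, y)
    ]
    print("positions where brute force and formula disagree:", mismatches)
```

Integer arithmetic is used for the formula so that the floor is exact; the misere function $g^-$ is not computed here.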

Let us note that the good old game of Nim has similar property: if there is at most one bean in each pile then all further moves are forced. Hence, in these {\em forced} positions the results of the normal and misere versions are opposite, while for all other positions they are the same, as it was proved by Charles Bouton in 1901. Respectively, in the forced positions $g^+$ and $g^-$ take values 0 and 1 so that $g^- = 1 - g^+ = (-1)^i + g^+ $, where $i$ is the number of non-empty piles, while in all other positions $g^+ \equiv g^-$ %A Vladimir Gurvich %T Cyclically Orientable Graphs %D Mon Feb 28 14:00:47 2005 %Z Thu Oct 13 18:48:15 EDT 2005 %I DIMACS %R 2005-08 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-08.ps.gz %X Graph $G$ is called cyclically orientable (CO) if it admits an orientation in which every chordless cycle is cyclically oriented. This family of graphs was introduced by Barot, Geiss, and Zelevinsky in their paper ``Cluster algebras of finite type and positive symmetrizable matrices'', ArXiv:math.CO/0411341 v3, 13 Dec. 2004. The authors obtained several nice characterizations of CO-graphs being motivated primarily by their applications in cluster algebras. Here we give two more characterizations of CO-graphs, one of which allows one to recognize CO-graphs and get their cyclic orientation in linear time.

Keywords: graph, cycle, chord, chordless cycle, orientation, cyclic orientation %A Bruno Escoffier %A Peter L. Hammer %T Approximation of the Quadratic Set Covering Problem %D Tue Apr 19 11:03:04 2005 %Z Thu Oct 13 18:48:20 EDT 2005 %I DIMACS %R 2005-09 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-09.ps.gz %X We study in this article polynomial approximation of the Quadratic Set Covering problem. This problem, which arises in many applications, is a natural generalization of the usual Set Covering problem. We show that this problem is very hard to approximate in the general case, and even in classical subcases (when the size of each set or when the frequency of each element is bounded by a constant). Then we focus on the convex case and give both positive and negative approximation results. Finally, we tackle the unweighted version of this problem. %A Donald R. Hoover %A Usha Sambamoorthi %A James T. Walkup %A Stephen Crystal %T Mental Illness and Length of Hospital Stay for Medicaid Inpatients Infected with HIV %D Sat Jun 25 22:52:18 2005 %Z Thu Oct 13 18:48:26 EDT 2005 %I DIMACS %R 2005-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-10.pdf %X OBJECTIVE: Study associations of length of inpatient stay (LOS) for HIV-infected Medicaid recipients with; Severe Mental Illness History (SMI-H), Other (Less Severe) Mental Illness History (OMI-H), and diagnosis with Acute Mental Illness (AMI) during the inpatient visit. DATA SOURCE & COLLECTION / STUDY SETTING: Merged 1992-98 Medicaid claims and HIV/AIDS surveillance data obtained from the State of New Jersey for adults with .1 inpatient stay after an HIV/AIDS diagnosis from 1992-1996. STUDY DESIGN: Observational study of 8,186 HIV patients with 31,515 inpatient visits. 
SMI-H, OMI-H and Primary/Secondary AMI diagnosis at visits were ascertained from ICD-9-CM Codes; 11% of visits had an AMI diagnosis while 25% and 29% of visits, respectively, were from patients with SMI-H and OMI-H histories. PRINCIPAL FINDINGS: HIV patient-stays with Primary or Secondary AMI diagnoses each had mean LOS=11.0 days and stays of patients with SMI-H and OMI-H had mean LOS of 10.4 and 11.8 days, respectively, compared to a mean LOS=12.7 days for stays of patients with no history of mental illness. But after adjusting for measures of HIV disease severity and health care access in multivariate models, patients presenting with primary and secondary AMI diagnoses had ~32% and ~13% longer LOS, respectively, than did similar patients without AMI (P<0.001). In the absence of a diagnosed AMI, SMI-H and OMI-H alone were not related to LOS in adjusted models. However, in adjusted models, SMI-H was associated with ~20% shorter time to readmission for a new visit. CONCLUSIONS: This study concurs with previous findings of greater (adjusted) LOS for HIV patients that have mental comorbidity. But the patterns seen here suggest that the increase may be mediated by extra time required to treat acute mental illnesses occurring at the visit rather than from mental illness interfering with treatment and discharge of HIV conditions.

KEYWORDS: HIV Disease, Hospitalization, Length of Stay, Mental Illness %A Graham Cormode %A S. Muthukrishnan %A Irina Rozenbaum %T Summarizing and Mining Inverse Distributions on Data Streams via Dynamic Inverse Sampling %D Tue Jul 19 13:49:15 2005 %Z Thu Oct 13 18:48:30 EDT 2005 %I DIMACS %R 2005-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-11.ps.gz %X Emerging data stream management systems approach the challenge of massive data distributions which arrive at high speeds while there is only small storage by summarizing and mining the distributions using samples or sketches. However, data distributions can be "viewed" in different ways. A data stream of integer values can be viewed either as the forward distribution f(x), ie., the number of occurrences of x in the stream, or as its inverse, $f^{-1}(i)$, which is the number of items that appear i times. While both such "views" are equivalent in stored data systems, over data streams that entail approximations, they may be significantly different. In other words, samples and sketches developed for the forward distribution may be ineffective for summarizing or mining the inverse distribution. Yet, many applications such as IP traffic monitoring naturally rely on mining inverse distributions.
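To make the two "views" concrete, here is a small Python sketch that computes both exactly on stored data. It only illustrates the definitions of f(x) and $f^{-1}(i)$ given above; it is not the streaming technique of the report, and the function names are hypothetical.

```python
from collections import Counter


def forward_distribution(stream):
    """f(x): the number of occurrences of each value x in the stream."""
    return Counter(stream)


def inverse_distribution(stream):
    """f^{-1}(i): the number of distinct items that appear exactly i times."""
    return Counter(forward_distribution(stream).values())


if __name__ == "__main__":
    stream = [7, 7, 3, 9, 9, 9, 4]
    print(dict(forward_distribution(stream)))  # {7: 2, 3: 1, 9: 3, 4: 1}
    print(dict(inverse_distribution(stream)))  # {2: 1, 1: 2, 3: 1}
```

Both functions store the whole forward distribution, which is exactly what a small-space streaming summary cannot afford.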

We formalize the problems of managing and mining inverse distributions and show provable differences between summarizing the forward distribution vs the inverse distribution. We present methods for summarizing and mining inverse distributions of data streams: they rely on a novel technique to maintain a dynamic sample over the stream with provable guarantees which can be used for a variety of summarization tasks (building quantiles or equidepth histograms) and mining (anomaly detection: finding heavy hitters, and measuring the number of rare items), all with provable guarantees on quality of approximations and time/space used by our streaming methods. We also complement our analytical and algorithmic results by presenting an experimental study of the methods over network data streams. %A Endre Boros %A Khaled Elbassioni %A Vladimir Gurvich %A Leonid Khachiyan %T Generating All Minimal Integral Solutions to AND-OR Systems of Monotone Inequalities: Conjunctions are Easier than Disjunctions %D Wed Mar 16 15:45:35 2005 %Z Thu Oct 13 18:48:34 EDT 2005 %I DIMACS %R 2005-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-12.ps.gz %X We consider monotone $\vee, \wedge$-formulae $\phi$ of $m$ atoms, each of which is a monotone inequality of the form $f_i(x)\geq t_i$ over the integers, where for $i=1,\ldots,m$, $f_i:\mathbb{Z}^n\mapsto\mathbb{R}$ is a given monotone function and $t_i$ is a given threshold. We show that if the $\vee$-degree of $\phi$ is bounded by a constant, then for linear, transversal and polymatroid monotone inequalities all minimal integer vectors satisfying $\phi$ can be generated in incremental quasi-polynomial time. In contrast, the enumeration problem for the disjunction of $m$ inequalities is NP-hard when $m$ is part of the input. We also discuss some applications of the above results in disjunctive programming, data mining, matroid and reliability theory. 
%A Henrik Bjorklund %A Ola Svensson %A Sergei Vorobyov %T Controlled Linear Programming for Infinite Games %D Tue Apr 12 16:40:45 2005 %Z Thu Oct 13 18:48:39 EDT 2005 %I DIMACS %R 2005-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-13.ps.gz %X We investigate the CONTROLLED LINEAR PROGRAMMING PROBLEM (CLPP), a new combinatorial optimization problem nicely merging linear programming with games. In a system of linear constraints of the form $x_i \leq p^j_i(\overline x) + w^j_i$ where $p^j_i$ are linear homogeneous polynomials with nonnegative coefficients and $w^j_i \in R$, some variables are distinguished as \emph{controlled} and we want to select \emph{exactly one} constraint for each controlled variable to make max$\sum x_i$, subject to the remaining constraints, as large as possible (over all possible such selections). We suggest a number of iterative strategy improvement policies, simplex-like and interior-point. For many important cases we prove optimality conditions (when a local optimum is global), describe a number of combinatorial randomized \emph{subexponential} algorithms, and show that the decision version of the problem is in $NP \bigcap coNP$.

It turns out that the well-known mean payoff, discounted payoff, simple stochastic, and parity games, as well as the recent Longest-Shortest Paths (LSP) problem [10] (all in $NP\bigcap coNP$, unknown to be in P) are easily reducible to the CLPP. This suggests a simplifying and unifying view of all these games as particular restricted instances of the CLPP, one of the "most general" problems in $NP \bigcap coNP$ to which all the known games in the class reduce (in a certain sense "complete" in the class). We show that a slight generalization of the CLPP allowing for negative coefficients in the constraint polynomials $p^j_i$ is NP-hard. So are the controlled versions of many other polynomial optimization problems, like MAXIMUM FLOW.

The simple algebraic and combinatorial structure of the CLPP unifies and gives insights into algorithmic and complexity-theoretic properties of infinite games, and is amenable to the powerful tools from combinatorial and polyhedral optimization, as well as linear programming. In this paper we investigate boundedness, CLPP duality, stability, optimality conditions, correctness of subexponential algorithms, and conditions for $NP \bigcap coNP$-membership. In particular, strong duality implies $NP \bigcap coNP$-membership, polynomial optimality conditions, and strongly subexponential algorithms. %A Bhaskar Dasgupta %A Sergio Ferrarini %A Uthra Gopalakrishnan %A Nisha Raj Paryani %T Inapproximability Results for the Lateral Gene Transfer Problem %D Mon May 16 21:33:08 2005 %Z Thu Oct 13 18:48:43 EDT 2005 %I DIMACS %R 2005-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-14.ps.gz %X In this paper we establish some inapproximability results for the \textsc{Lateral Transfer Problem}. This optimization problem, which was defined by Hallet and Lagergren, is that of finding the most parsimonious lateral gene transfer scenario for a given pair of gene and species trees. We prove that the Lateral Transfer Problem is MAX SNP-hard; thus no Polynomial Time Approximation Scheme is possible for it unless P = NP. %A Xiaomin Chen %A Bin Tian %A Lei Wang %T Santa Claus' Towers of Hanoi %D Thu Oct 13 18:54:16 2005 %Z Thu Oct 13 18:54:22 EDT 2005 %I DIMACS %R 2005-15 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-15.ps.gz %X Two new variants of the Towers of Hanoi problem are proposed. In both variations, one is allowed to put a bigger disk directly on the top of a smaller one under some restrictions. We give procedures to solve these two versions, and prove the optimality of our procedures. Our solution also resolves a problem, which is similar to one of our versions, proposed by D. Wood twenty-four years ago. 
%A Donald R. Hoover %T Extending Power and Sample Size Approaches for McNemar's Procedure to General Sign Tests %D Tue Apr 12 16:49:28 2005 %Z Thu Oct 13 18:48:54 EDT 2005 %I DIMACS %R 2005-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-16.pdf %X Current software and textbooks present procedures to estimate power and sample size for sign tests that only apply to settings where positive (i.e., X=1) or negative (i.e., X=-1), but not neutral (i.e., X=0) outcomes occur. However, many studies analyzed by sign tests involve the more general setting where significant amounts of neutral outcomes can occur. This paper illustrates application of existing power / sample size approaches and software that have been developed for matched binary responses (McNemar's discordant pairs) to general sign tests with neutral outcomes occurring. An application is made to a recent study that the author collaborated on.

Key Words: McNemar's Procedure, Power, Sample Size, Sign Test %A Goh Chee Ying %A Koh Khee Meng %A Bruce E. Sagan %A Vincent R. Vatter %T Maximal Independent Sets In Graphs With At Most $r$ Cycles %D Thu May 12 23:54:34 2005 %Z Thu Oct 13 18:48:58 EDT 2005 %I DIMACS %R 2005-17 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-17.ps.gz %X We find the maximum number of maximal independent sets in two families of graphs. The first family consists of all graphs with $n$ vertices and at most $r$ cycles. The second family is all graphs of the first family which are connected and satisfy $n ≥ 3r$. %A Bruce E. Sagan %A Vincent R. Vatter %T Maximal and Maximum Independent Sets In Graphs With At Most $r$ Cycles %D Thu May 12 23:50:34 2005 %Z Thu Oct 13 18:49:03 EDT 2005 %I DIMACS %R 2005-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-18.ps.gz %X Let $m(G)$ denote the number of maximal independent sets of vertices in a graph $G$ and let $c(n,r)$ be the maximum value of $m(G)$ over all connected graphs with $n$ vertices and at most $r$ cycles. A theorem of Griggs, Grinstead, and Guichard gives a formula for $c(n,r)$ when $r$ is large relative to $n$, while a theorem of Goh, Koh, Sagan, and Vatter does the same when $r$ is small relative to $n$. We complete the determination of $c(n,r)$ for all $n$ and $r$ and characterize the extremal graphs. Problems for maximum independent sets are also completely resolved. %A S. Muthukrishnan %A Martin J. Strauss %A Xuan Zheng %T Workload-Optimal Histograms on Streams %D Fri May 27 06:59:30 2005 %Z Thu Oct 13 18:49:07 EDT 2005 %I DIMACS %R 2005-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-19.ps.gz %X Histograms are used in many ways in conventional databases and in data stream processing for summarizing massive data distributions. 
Previous work on constructing histograms on data streams with provable guarantees have not taken into account the workload characteristics of databases which show some parts of the distributions to be more frequently used than the others; on the other hand, previous work for constructing histograms that do make use of the workload characteristics---and have demonstrated the significant advantage of exploiting workload information---have not come with provable guarantees on the accuracy of the histograms or the time and space bounds needed to obtain reasonable accuracy.

We study the algorithmic complexity of constructing workload-optimal histograms on data streams, and expose the effect of the workload on the complexity of histogram construction problem.

Consider the case when the workload is explicitly stored and the input data is streamed in the time series model---the input is a vector of N components, and we are presented with the components, one at a time, in order. We present an algorithm that uses polylogarithmic space, processes each new item in constant time, and, in polylogarithmic post-processing time, determines a (1+epsilon)-approximation to the optimal histogram. This matches the space complexity, up to polylogarithmic factors, of the histogram construction on the stream when the workload is uniform [Guha, Indyk, Muthukrishnan, and Strauss]. To get this result, we need to identify and exploit the notion of linearly robust histograms. This is the first known algorithmic result with provable guarantees for workload-optimal histogram construction on streams.

Now consider the case when workload is not stored fully, but is compressed. As we show, trivial lossy compression can be used (one can drop low-order bits of individual weights) but algorithmic results such as the above are not possible if any nontrivial lossy compression is used. However, we show that our algorithmic results can be extended efficiently to the case when the workload is compressed without loss by using, for example, a universal compression scheme of Lempel-Ziv. This requires supporting a symbol-range-count data structure on compressed data which may be of independent interest. Also, the direction of exploring data stream problems when one of the inputs (workload) is compressed losslessly, is novel. %A Ola Svensson %A Sergei Vorobyov %T A Subexponential Algorithm for a Subclass of P-Matrix Generalized Linear Complementarity Problems %D Sun Jul 3 14:37:11 2005 %Z Thu Oct 13 18:49:12 EDT 2005 %I DIMACS %R 2005-20 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-20.ps.gz %X We define a nontrivial (polynomially time recognizable, but not known to be polynomial time solvable) subclass of P-matrix Generalized (Vertical) Linear Complementarity Problems, which we call D-matrix GLCPs. D-matrices have nonnegative entries and strict row diagonal dominance. In general, P-matrix LCPs and GLCPs are not known to be polynomial, or even subexponential time solvable, and the property ``to be a P-matrix'' is coNP-complete.

We describe the first strongly subexponential randomized algorithm for D-matrix GLCPs. As a lower bound, we show that simple stochastic games, a long standing problem in NP intersect coNP, unknown to be polynomial, is subsumed by D-matrix GLCPs.

As an important application, our D-matrix GLCP algorithm gives the best currently available algorithm for computing values of simple stochastic games. At present, the D-matrix GLCP is the only known nontrivial subclass of the P-matrix GLCP solvable in subexponential time. %A Leonid Khachiyan %A Endre Boros %A Konrad Borys %A Khaled Elbassioni %A Vladimir Gurvich %T Generating all vertices of a polyhedron is hard %D Sun Jul 3 14:48:28 2005 %Z Thu Oct 13 18:49:16 EDT 2005 %I DIMACS %R 2005-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-21.pdf %X We show that generating all negative cycles of a weighted graph is a hard enumeration problem, in both the directed and undirected cases. More precisely, given a family of (directed) negative cycles, it is an NP-complete problem to decide whether this family can be extended or there are no other negative (directed) cycles in the graph, implying that (directed) negative cycles cannot be generated in polynomial output time, unless P=NP. As a corollary, we solve in the negative two well-known generating problems from linear programming: (i) Given an (infeasible) system of linear inequalities, generating all minimal infeasible subsystems is hard. Yet, for generating maximal feasible subsystems the complexity remains open. (ii) Given a (feasible) system of linear inequalities, generating all vertices of the corresponding polyhedron is hard. Yet, in the case of bounded polyhedra the complexity remains open.

Keywords: polytope, polyhedron, polytope-polyhedron problem, vertex, face, facet, enumeration problem, vertex enumeration, facet enumeration, graph, cycle, negative cycle, linear inequalities, feasible system %A Z. Nedev %A S. Muthukrishnan %T The Nagger-Mover Game %D Fri Jul 15 13:10:56 2005 %Z Thu Oct 13 18:49:20 EDT 2005 %I DIMACS %R 2005-22 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-22.ps.gz %X We introduce a new two-person game. 
Nagger starts from position 0 in a round table with positions marked $0,1,\ldots,n-1$ and repeatedly calls number of places to move, and Mover responds to each move with which direction---clockwise or anticlockwise---that the Nagger should move. What is the maximum number of positions that Nagger may occupy? We provide the exact bound for this question, provide algorithmic strategies for the Nagger and the Mover to meet this bound, and pose other variants of the Nagger-Mover game for future study. %A K. L. Ng %A P. Raff %T Fractional Firefighting in the Two Dimensional Grid %D Mon Aug 15 17:57:28 2005 %Z Thu Oct 13 18:49:25 EDT 2005 %I DIMACS %R 2005-23 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-23.ps.gz %X We consider a generalization of the firefighter problem where the number of firefighters available per time step $t$ is not a constant. We show that if the number of firefighters available is periodic in $t$ and the average number per time step exceeds $\frac{3}{2}$, then a fire starting at a finite number of vertices in the two dimensional infinite grid graph can be contained. %A Leonid Khachiyan %A Endre Boros %A Konrad Borys %A Khaled Elbassioni %A Vladimir Gurvich %A Kazuhisa Makino %T Enumerating Cut Conjunctions in Graphs and Related Problems %D Mon Jul 11 13:35:58 2005 %Z Thu Oct 13 18:49:29 EDT 2005 %I DIMACS %R 2005-24 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-24.pdf %X Let G = (V,E) be an undirected graph, and let $B \subseteq V \times V$ be a collection of vertex pairs. We give an incremental polynomial time algorithm to enumerate all minimal edge sets $X \subseteq E$ such that every vertex pair $(s,t) \in B$ is disconnected in $(V,E\setminus X)$, generalizing well-known efficient algorithms for enumerating all minimal s-t cuts, for a given pair $s, t \in V$ of vertices. 
We also present an incremental polynomial time algorithm for enumerating all minimal subsets $X \subseteq E$ such that no $(s, t) \in B$ is a bridge in $(V,X \bigcup B)$. These two enumeration problems are special cases of the more general cut conjunction problem in matroids: given a matroid M on ground set $S = E \bigcup B$, enumerate all minimal subsets $X \subseteq E$ such that no element $b \in B$ is spanned by $E \setminus X$. Unlike the above special cases, corresponding to the cycle and cocycle matroids of the graph $(V,E \bigcup B)$, the enumeration of cut conjunctions for vectorial matroids turns out to be NP-Hard. %A Graham Cormode %A S. Muthukrishnan %T Towards an Algorithmic Theory of Compressed Sensing %D Thu Oct 13 18:55:16 2005 %Z Thu Oct 13 18:55:39 EDT 2005 %I DIMACS %R 2005-25 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-25.pdf %X In Approximation Theory, the fundamental problem is to reconstruct a signal A in R^n from linear measurements with respect to a dictionary Psi for R^n. Recently, there has been tremendous excitement about the novel direction of Compressed Sensing [Donoho 04] where the reconstruction can be done with very few---O(k log n)---linear measurements over a modified dictionary Psi' if the information of the signal is concentrated in k coefficients over an orthonormal basis Psi. These results have reconstruction error on any given signal that is optimal with respect to a broad class of signals. In a series of papers and meetings over the past year, a theory of Compressed Sensing has been developed by mathematicians.

We develop an algorithmic perspective for the Compressed Sensing problem, showing that Compressed Sensing results resonate with prior work in Group Testing, Learning theory and Streaming algorithms. Our main contributions are new algorithms that present the most general results for Compressed Sensing with (1+eps) approximation on every signal, faster algorithms for the reconstruction, as well as succinct transformations of Psi to Psi'. %A Tanya Y Berger-Wolf %A Jared Saia %T Critical Groups in Dynamic Social Networks %D Thu Oct 13 18:56:03 2005 %Z Thu Oct 13 18:58:06 EDT 2005 %I DIMACS %R 2005-26 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-26.ps.gz %X We address the issue of fragility of a network of interactions, such as a social network, in a dynamic setting. We propose a new mathematical and computational framework that allows analysis of dynamic social networks addressing the time component explicitly. In this framework, we pose the question of finding a critical set of groups at various times whose lack of existence would leave no persistent social structure in the network. We formulate this question in terms of a graph optimization problem, prove that it is NP-hard, and provide a polynomial time algorithm for one important special case. %A Tanya Y. Berger-Wolf %A Bhaskar DasGupta %A Wanpracha Chaovalitwongse %A Mary V. Ashley %T Combinatorial Reconstruction of Sibling Relationships in Absence of Parental Data %D Thu Oct 13 18:56:13 2005 %Z Thu Oct 13 18:58:04 EDT 2005 %I DIMACS %R 2005-27 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-27.ps.gz %X We present a new algorithm for reconstructing sibling relationships in a single generation of individuals without parental information, using data from codominant DNA markers such as microsatellites. 
We use the simple genetic constraints on the full-sibling groups, such as the limit of four alleles per single locus and no more than two alleles from each potential parent per locus. We then use combinatorial optimization techniques to extract a minimum number of groups that satisfy these constraints. The results of a simulation study of a relaxed version of the algorithm show that our approach is reasonably accurate and the full version of the algorithm should be pursued. Our algorithm does not require any a priori knowledge about allele frequency, population size, mating system, or family size distributions. %A Tanya Y. Berger-Wolf %A Jared Saia %T A Framework for Analysis of Dynamic Social Networks %D Thu Oct 13 18:56:21 2005 %Z Thu Oct 13 18:58:13 EDT 2005 %I DIMACS %R 2005-28 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-28.ps.gz %X Finding patterns of social interaction within a population has wide-ranging applications, including disease modeling, cultural and information transmission, phylogeography, conservation, and behavioral ecology. Recently, scientists have started to model social interaction with graphs (networks). One of the intrinsic characteristics of societies is their continual change. However, the majority of social network analysis methodologies today are essentially static, in that all information about the time at which social interactions take place is discarded, or long time series are averaged to discern the overall or long-term strength of connections. Such an approach may not only give inaccurate or inexact information about the patterns in the data, but also prevents us from even asking questions about the temporal causes and consequences of social structures. In this paper we propose a new mathematical and computational framework that allows analysis of dynamic social networks addressing the time component explicitly. 
We present several algorithms that explore the social structure in this model and pose many open questions. %A Zhivko Nedev %T Universal Subsets of Zn, Linear Integer Optimization, and Integer Factorization %D Tue Jan 17 15:47:57 2006 %Z Wed Dec 6 16:25:48 EST 2006 %I DIMACS %R 2005-29 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-29.pdf %X We consider two classes of sets in Zn. A non-empty subset U of Zn is universal (the first class) if for all x \in U, and for all 0 < l ≤ n/2, at least one of x +/- l (mod n) lies in U. For each universal U its complement, Zn\U, is from the second class and vice versa. We define \beta(n) to be the minimum cardinality of a universal set modulo n. Completely characterizing all sets in the second class, we derive a formula for \beta(n).
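
The definition of a universal set can be checked directly for small n. The following Python sketch is an illustration only: the function names `is_universal` and `beta` are hypothetical, and the exhaustive search is an assumption made for clarity, not the formula derived in the report.

```python
from itertools import combinations


def is_universal(U, n):
    """Check the definition: U is a non-empty subset of Z_n and, for every
    x in U and every 0 < l <= n/2, x + l or x - l (mod n) lies in U."""
    U = set(U)
    if not U:
        return False
    return all(
        (x + l) % n in U or (x - l) % n in U
        for x in U
        for l in range(1, n // 2 + 1)
    )


def beta(n):
    """Minimum cardinality of a universal set modulo n, by exhaustive search.

    Exponential in n, so only suitable for very small n."""
    for size in range(1, n + 1):
        for U in combinations(range(n), size):
            if is_universal(U, n):
                return size
    raise ValueError("n must be a positive integer")
```

For example, `beta(3)` returns 2 (the set {0, 1} is universal modulo 3), while `beta(4)` returns 4, since no proper subset of Z4 is universal.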

We demonstrate that universal sets arise in the context of a two-player game that was analyzed for the first time in [3] and has interesting connections to the prime factorization of n. Finally, we model our optimization problem, finding \beta(n), as an integer linear program. %A Khaled Elbassioni %A Zvi Lotker %A Raimund Seidel %T Upper Bound on the Number of Vertices of Polyhedra with $0,1$-Constraint Matrices %D Wed Aug 17 21:29:41 2005 %Z Wed Dec 6 16:26:05 EST 2006 %I DIMACS %R 2005-30 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-30.ps.gz %X In this note we show that the maximum number of vertices in any polyhedron $P=\{x\in \mathbb{R}^d~:~Ax\leq b\}$ with $0,1$-constraint matrix $A$ and a real vector $b$ is at most $d!$. %A Alexei Ashikhmin %A Vitaly Skachek %T Decoding of Expander Codes at Rates Close to Capacity %D Wed Aug 17 21:40:16 2005 %Z Wed Dec 6 16:26:32 EST 2006 %I DIMACS %R 2005-32 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-32.ps.gz %X The concatenation of nearly-MDS expander codes of Roth and Skachek, ``On Nearly-MDS Expander Codes,'' \emph{Proc. IEEE ISIT'04,} with `typical' LDPC codes is investigated. It is shown that for the rates $R=(1-\varepsilon)C$ ($C$ is the capacity of the binary symmetric channel (BSC)), under certain conditions on the parameters of LDPC codes, these concatenated codes have decoding time linear in their length and polynomial in $1/\varepsilon$, and the decoding error probability decays exponentially. These codes are compared to the recently presented codes of Barg and Z\'emor, ``Error Exponents of Expander Codes,'' \emph{IEEE Trans. Inform. Theory,} 2002, and ``Concatenated Codes: Serial and Parallel,'' \emph{IEEE Trans. Inform. Theory,} 2005. It is shown that the latter families cannot be tuned to have all the aforementioned properties. %A Amit Chakrabarti %A Khanh Do Ba %A S. 
Muthukrishnan %T Estimating Entropy and Entropy Norm on Data Streams %D Thu Oct 13 18:57:00 2005 %Z Wed Dec 6 16:26:43 EST 2006 %I DIMACS %R 2005-33 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-33.pdf %X We consider the problem of computing information theoretic functions such as entropy on a data stream, using sublinear space. Our first result deals with a measure we call the ``entropy norm'' of an input stream: it is closely related to entropy but is structurally similar to the well-studied notion of frequency moments. We give a polylogarithmic space one-pass algorithm for estimating this norm under certain conditions on the input stream. We also prove a lower bound that rules out such an algorithm if these conditions do not hold. Our second result is a sublinear space one-pass algorithm for estimating the empirical entropy of an input stream. For a stream of $m$ items and a given real parameter $\alpha$, our algorithm uses space $\widetilde{O}(m^{2\alpha})$ and provides an approximation of $O(1/\alpha)$ in the worst case and $(1+\eps)$ in ``most'' cases. All our algorithms are quite simple. %A Xiaoling Hou %A Andras Prekopa %T Monge Property and Bounding Multivariate Probability Distribution Functions with Given Marginals and Covariances %D Thu Oct 13 18:57:08 2005 %Z Wed Dec 6 16:26:59 EST 2006 %I DIMACS %R 2005-34 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-34.ps.gz %X Multivariate probability distributions with given marginals are considered, along with linear functionals, to be minimized or maximized, acting on them. The functionals are supposed to satisfy the Monge or inverse Monge or some higher order convexity property and they may be only partially known. Existing results in connection with Monge arrays are reformulated and extended in terms of LP dual feasible bases. 
Lower and upper bounds are given for the optimum value as well as for unknown coefficients of the objective function based on the knowledge of some dual feasible basis and corresponding objective function coefficients. In the two- and three-dimensional cases dual feasible bases are obtained for the problem, where not only the univariate marginals, but also the covariances of the pairs of random variables are known.

Distributions with given marginals, transportation problem, Monge arrays, bounding expectations under partial information. %A Dmitriy Fradkin %A Dona Schneider %A Ilya Muchnik %T Machine Learning Methods in the Analysis of Lung Cancer Survival Data %D Thu Oct 13 18:57:16 2005 %Z Wed Dec 6 16:27:13 EST 2006 %I DIMACS %R 2005-35 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-35.ps.gz %X Support Vector Machines (SVM) and penalized logistic regression are well known to the machine learning community but are yet to be actively used in an epidemiological application. We apply them to the task of constructing a predictive model for the survival of patients diagnosed with lung cancer and analyzing the importance of features based on model parameters. The methods produce distinct and complementary models, making it advantageous to consider both whenever possible to gain different perspectives into large datasets. After applying the methods to Surveillance, Epidemiology and End Results (SEER) data, we also compute several measures of feature importance in the final models, showing these measures to be strongly correlated. %A Vadim V. Lozin %A Martin Milanic %T A polynomial algorithm to find an independent set of maximum weight in a fork-free graph %D Wed Oct 26 12:56:24 2005 %Z Wed Dec 6 16:27:25 EST 2006 %I DIMACS %R 2005-38 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-38.ps.gz %X The class of fork-free graphs is an extension of claw-free graphs and their subclass of line graphs. The first polynomial-time solution to the maximum weight independent set problem in the class of line graphs, which is equivalent to the maximum matching problem in general graphs, was proposed by Edmonds in 1965 and then extended to the entire class of claw-free graphs by Minty in 1980. 
Recently, Alekseev proposed a solution for the larger class of fork-free graphs, but only for the unweighted version of the problem, i.e., finding an independent set of maximum cardinality. In the present paper, we describe the first polynomial-time algorithm to solve the problem for weighted fork-free graphs. %A Shiri Azenkot %A Tzu-Yi Chen %A Graham Cormode %T An evaluation of the edit-distance-with-moves similarity metric for comparing genetic sequences %D Tue Nov 29 17:00:30 2005 %Z Wed Dec 6 16:27:38 EST 2006 %I DIMACS %R 2005-39 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-39.ps.gz %X We describe the first known implementation of an approximation algorithm for the string edit distance with moves similarity metric. This is the first algorithm to consider nontrivial alignment and run in substantially sub-quadratic time [2]. Extensive experimentation demonstrates that the algorithm produces a good approximation for the edit distance with moves metric, especially on strings of length 500B to 10KB. We also found that the algorithm has high potential for use in computational biology. When comparing texts of genetic sequences, our algorithm outperforms the q-grams heuristic in predicting results of the Smith-Waterman algorithm. Finally, we propose additional application areas for our implementation. %A Graham Cormode %A S. Muthukrishnan %T Combinatorial Algorithms for Compressed Sensing %D Tue Jan 10 12:45:32 2006 %Z Wed Dec 6 16:27:48 EST 2006 %I DIMACS %R 2005-40 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-40.pdf %X In sparse approximation theory, the fundamental problem is to reconstruct a signal A \in R^n from linear measurements (A,psi_i) with respect to a dictionary of psi_i's. 
Recently, there is focus on the novel direction of Compressed Sensing where the reconstruction can be done with very few---O(k log n)---linear measurements over a modified dictionary if the signal is compressible, that is, its information is concentrated in k coefficients. In particular, these results prove that there exists a single O(k log n) x n measurement matrix such that any such signal can be reconstructed from these measurements, with error at most O(1) times the worst case error for the class of such signals. Compressed sensing has generated tremendous excitement both because of the sophisticated underlying Mathematics and because of its potential applications.

In this paper, we address outstanding open problems in Compressed Sensing. Our main result is an explicit construction of a non-adaptive measurement matrix and the corresponding reconstruction algorithm so that, with a number of measurements polynomial in k, log n, 1/epsilon, we can reconstruct any compressible signal up to 1+epsilon error. This is the first known polynomial time explicit construction of any such measurement matrix. Our result not only improves the error guarantee from O(1) to 1 + \epsilon but also improves the reconstruction time from poly(n) to poly(k log n).

Our second result is a randomized construction of a measurement matrix of size O(k polylog n) that works for each signal with high probability and gives a per-instance approximation guarantee rather than one over the class of all signals. Previous work on Compressed Sensing does not provide such a per-instance approximation guarantee; our result improves the best number of measurements known from prior work in other areas, including Learning Theory, Streaming algorithms and Complexity Theory, for this case. Our approach is combinatorial. In particular, we use two parallel sets of group tests, one to filter and the other to certify and estimate; the resulting algorithms are quite simple to implement. %A Reka Albert %A Bhaskar DasGupta %A Riccardo Dondi %A Eduardo Sontag %T Inferring (Biological) Signal Transduction Networks via Binary Transitive Reductions %D Tue Dec 13 17:27:41 2005 %Z Wed Dec 6 16:27:53 EST 2006 %I DIMACS %R 2005-41 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-41.ps.gz %X In this paper we consider the binary transitive reduction (BTR) problem that arises in inferring a sparsest possible (biological) signal transduction network consistent with a set of experimental observations with a goal to minimize false positive inferences even if risking false negatives. Special cases of BTR have been investigated before in different contexts; the best previous results are as follows: (1) the minimum equivalent digraph problem, which corresponds to the special case of BTR with no critical edges and all edge labels being zeroes, is known to be MAX-SNP-hard, admits a polynomial time algorithm with an approximation ratio of 1.617 + \epsilon for any constant \epsilon > 0, and can be solved in linear time for directed acyclic graphs. (2) a 2-approximation algorithm exists for the special case of BTR in which all edge labels are zeroes. 
In this paper, our contributions include: -- observing that the BTR problem can be solved in linear time for directed acyclic graphs, -- providing a 1.78-approximation for the restricted version of BTR when all edge labels are zeroes (the same restricted version as in (2) above), and -- providing a $2+o(1)$-approximation for BTR on general graphs. %A Andrei Anghelescu %A Aynur Dayanik %A Dmitriy Fradkin %A Alex Genkin %A Paul Kantor %A David Lewis %A David Madigan %A Ilya Muchnik %A Fred Roberts %T Simulated Entity Resolution by Diverse Means: DIMACS Work on the KDD Challenge of 2005 %D Sun Apr 16 12:39:57 2006 %Z Wed Dec 6 16:27:58 EST 2006 %I DIMACS %R 2005-42 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-42.ps.gz %X This report describes DIMACS work on two of the groups of entity resolution problems, ER1 and ER2 for the KDD Challenge in 2005. We presume that the situation is intended to mimic, using abstracts and author information from the life sciences, some real world problem, in which it is important to recognize the identity of an individual, even though he may share that name with other individuals (ER1), or may actively seek to hide his identity by removing his own name from a work, or replacing it with an alias (ER2a, and ER2b,c). Thus specific problems investigated include author resolution, finding a missing author of a paper, and detecting a false author of a paper. The methods used to attack these problems include combinatorial cluster analysis, fusion of methods, penalized logistic regression / maximum entropy approaches, and dependency modeling. 
%A James Abello %A Frank van Ham %T Interactive Navigation of Power-Law Graphs %D Thu Jul 20 10:04:25 2006 %Z Wed Dec 6 16:28:02 EST 2006 %I DIMACS %R 2005-43 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2005/2005-43.ps.gz %X We illustrate how iterated deletion of vertices of degree one, followed by biconnected graph decomposition constitute simple but powerful preprocessing steps that we can use to create suitable hierarchies on a graph. These hierarchies can then aid us substantially in the process of graph layout and navigation. The benefits of this approach are more apparent on sparse power-law graphs when a hierarchy tree is computed for each biconnected component via recursive clustering. Our sample data sets include an Autonomous System-level Internet topology with 11461 vertices and a 43,433 vertex graph detailing the static call dependencies of the Linux core. %A Khaled M. Elbassioni %T On the Complexity of Monotone Boolean Duality Testing %D Thu Jan 12 18:54:15 2006 %Z Wed Dec 6 16:29:46 EST 2006 %I DIMACS %R 2006-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-01.ps.gz %X We show that the duality of a pair of monotone Boolean functions in disjunctive normal forms can be tested in polylogarithmic time using a quasi-polynomial number of processors. Our decomposition technique yields stronger bounds on the complexity of the problem than those currently known and also allows for generating all minimal transversals of a given hypergraph using only polynomial space. %A David L. Roberts %A Fred S. 
Roberts %T The Paradoxical Nature of Locating Sensors in Paths and Cycles: The Case of 2-Identifying Codes %D Fri Feb 24 23:35:47 2006 %Z Wed Dec 6 16:29:55 EST 2006 %I DIMACS %R 2006-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-02.ps.gz %X For a graph $G$ and a set $D \subseteq V(G)$, define $N_r[x] = \{x_i \in V(G): d(x,x_i) \leq r\}$ (where $d(x,y)$ is graph theoretic distance) and $D_r(x) = N_r[x] \cap D$. $D$ is known as an $r$-identifying-code if for every vertex $x, D_r(x) \neq \emptyset$, and for every pair of vertices $x$ and $y$, $x \neq y \Rightarrow D_r(x) \neq D_r(y)$. The various applications of these codes include attack sensor placement in networks and fault detection/localization in multiprocessor or distributed systems. In~\cite{BCHL04}~and~\cite{gravier}, partial results about the minimum size of $D$ for $r$-identifying codes are given for paths and cycles and complete closed form solutions are presented for the case $r = 1$, based in part on~\cite{Daniel}. We provide complete solutions for the case $r = 2$ as well as present our own solutions (verifying earlier results) to the $r = 1$ case. We use these closed form solutions to illustrate some surprisingly counterintuitive behavior that arises when the length of the path or cycle or the value of $r$ varies. %A F. Della Croce %A M. J. Kaminski %A V. Th. Paschos %T An exact algorithm for MAX-CUT in sparse graphs %D Fri Apr 21 19:40:20 2006 %Z Wed Dec 6 16:30:00 EST 2006 %I DIMACS %R 2006-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-03.ps.gz %X We study exact algorithms for the MAX-CUT problem. Introducing a new technique, we present an algorithmic scheme that computes a maximum cut in graphs with bounded maximum degree. Our algorithm runs in time $O^*(2^{(1-(2/\Delta))n})$. We also describe a MAX-CUT algorithm for general graphs. Its time complexity is $O^*(2^{mn/(m+n)})$. Both algorithms use polynomial space. 
%A Petros Drineas %A Michael W. Mahoney %A S. Muthukrishnan %T Polynomial Time Algorithm for Column-Row Based Relative-Error Low-Rank Matrix Approximation %D Sun Apr 16 21:25:13 2006 %Z Wed Dec 6 16:30:04 EST 2006 %I DIMACS %R 2006-04 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-04.ps.gz %X Given an $m \times n$ matrix $A$ and an integer $k$ less than the rank of $A$, the best -- with respect to the Frobenius norm -- rank $k$ approximation to $A$ is $A_k$, which is obtained by truncating the Singular Value Decomposition (SVD) of $A$. While $A_k$ is routinely used in data analysis, it is difficult to interpret and understand it in terms of the {\em original data}, namely the rows and columns of $A$ which come from the application domain.

In this paper, we address the problem of obtaining low-rank approximations that are directly expressible in terms of the original rows and columns of $A$. Our main results are as follows. We present a randomized algorithm to determine a set $C$ of columns, whose size is polynomial in $k,\log (1/\delta),1/\varepsilon$, such that the matrix $A'$ expressly written in terms of $C$ satisfies \[ \FNorm{A-A'} \leq (1+\varepsilon) \FNorm{A-A_k} \] with probability at least $1-\delta$. This is the first polynomial time algorithm for low-rank matrix approximation that gives relative error guarantees; all previously known methods including the seminal work of Frieze, Kannan and Vempala~\cite{FKV98} yield approximations with a large additive term of $\varepsilon \FNorm{A}$ and are improved by our result. We further extend this result to obtain a randomized algorithm to determine a set $C$ of columns and a set $R$ of rows, both polynomial in $k,\log (1/\delta),1/\varepsilon$ in size, such that the matrix $A'$ expressly written in terms of $C$ and $R$ satisfies \[ \FNorm{A-A'} \leq (1+\varepsilon) \FNorm{A-A_k} \] with probability at least $1-\delta$. This is again the first polynomial time algorithm of this form with relative error guarantees; previously, even the existence of such $C$ and $R$ was not known.

All our algorithms employ random sampling, but rather than sampling rows and columns of $A$ as in prior work, we carefully use information from the top few singular vectors of $A$. This technique was recently introduced in~\cite{DMM06} for the $l_2$ regression problem, and is significantly extended here. Our algorithms are quite simple, taking time of the order of the time needed to compute the SVD of $A$, and will likely be useful in practical applications. %A Saket Anand %A David Madigan %A Richard Mammone %A Saumitr Pathak %A Fred Roberts %T Experimental Analysis of Sequential Decision Making Algorithms for Port of Entry Inspection Procedures %D Wed May 17 10:25:35 2006 %Z Wed Dec 6 16:30:08 EST 2006 %I DIMACS %R 2006-05 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-05.pdf %X Following work of Stroud and Saeger, we investigate the formulation of the port of entry inspection algorithm problem as a problem of finding an optimal binary decision tree for an appropriate Boolean decision function. We report on an experimental analysis of the robustness of the conclusions of the Stroud-Saeger analysis and show that the optimal inspection strategy is remarkably insensitive to variations in the parameters needed to apply the Stroud-Saeger method. %A Michael Tortorella %T Generalized Traffic Equations %D Wed Dec 6 11:29:13 2006 %Z Wed Dec 6 16:30:13 EST 2006 %I DIMACS %R 2006-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-06.pdf %X This paper considers the problem of computing certain performance parameters of stochastic networks that have Markovian routing but are not otherwise Jackson networks. A scheme using this model for approximating address-based routing is described. The method relies on generalizing the traffic equation to special subsets of the flows in the network.

**KEY WORDS:** Latency, jitter, routing, path-additivity.
%A S. Goldstein
%A J. L. Lebowitz
%A E. R. Speer
%T Large Deviations for a Point Process of Bounded Variability
%D Mon Oct 16 22:13:11 2006
%Z Wed Dec 6 16:30:17 EST 2006
%I DIMACS
%R 2006-07
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-07.ps.gz
%X
We consider a one-dimensional translation invariant point
process of density one with uniformly bounded variance of the number $N_I$
of particles in any interval $I$. Despite this suppression of fluctuations
we obtain a large deviation principle with rate function
$\F(\rho)\simeq-L^{-1}\log\Prob(\rho)$ for observing a macroscopic density
profile $\rho(x)$, $x\in[0,1]$, corresponding to the coarse-grained and
rescaled density of the points of the original process in an interval of
length $L$ in the limit $L\to\infty$. $\F(\rho)$ is not convex and is
discontinuous at $\rho\equiv1$, the typical profile.
%A Adi Ben-Israel
%A Cem Iyigun
%T Probabilistic Distance Clustering
%D Wed Jun 14 22:52:44 2006
%Z Wed Dec 6 16:30:23 EST 2006
%I DIMACS
%R 2006-09
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-09.ps.gz
%X
We present a new iterative method for probabilistic clustering of
data. Given clusters, their **centers**, and the
**distances** of data points from these centers, the
**probability** of cluster membership at any point is
assumed to depend on its distances from all centers. We state this
assumption as a **principle** that probability is inversely
proportional to distance.

The method is based on the above principle, and on the
**joint distance function**, a measure of distance from all
cluster centers, that evolves during the iterations, and captures
the data in its low contours.

At each iteration, the distances (Euclidean, Mahalanobis, etc.) from the cluster centers are computed for all data points, and the centers are updated as stationary points of the joint distance function. The initial centers are arbitrary and computations stop when the centers stop moving.

The method is simple, fast (requiring a small number of cheap iterations) and gives a high percentage of correct classifications. It converges to the true cluster centers for all initial solutions, and is not sensitive to outliers.
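
The iteration described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `memberships` and `pd_clustering` are hypothetical, only Euclidean distances are used, and the center update (a weighted mean with weights p^2/d) is an assumed form that the abstract itself does not spell out.

```python
import numpy as np


def memberships(X, C, eps=1e-12):
    """Distances d[i, k] to each center and probabilities p[i, k] proportional to 1/d[i, k]."""
    # eps keeps the division finite when a point coincides with a center
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + eps
    inv = 1.0 / d
    p = inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1
    return d, p


def pd_clustering(X, centers, max_iter=100, tol=1e-8):
    """Alternate between probabilities and center updates until the centers stop moving."""
    X = np.asarray(X, dtype=float)
    C = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        d, p = memberships(X, C)
        # Assumed update: each center is a weighted mean of the data, weights p^2 / d
        u = p ** 2 / d
        new_C = (u.T @ X) / u.sum(axis=0)[:, None]
        moved = np.linalg.norm(new_C - C)
        C = new_C
        if moved < tol:
            break
    return C, memberships(X, C)[1]
```

On two well-separated groups of points, the returned centers should land close to the group means, and each row of the returned probability matrix sums to one.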

Keywords: Distance clustering, probabilistic clustering, Euclidean distance, Mahalanobis distance %A James Abello %A Michael Capalbo %T Finding max-cliques in power-law graphs with large clique coefficients %D Wed Jun 14 23:26:18 2006 %Z Wed Dec 6 16:30:33 EST 2006 %I DIMACS %R 2006-10 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-10.ps.gz %X Here we show that the Max-Clique problem is hard, even when restricted to sparse low-diameter power-law graphs with a large clustering coefficient. %A James Abello %A Michael Capalbo %T Blocking sequences of sets in infinite graphs %D Thu Nov 16 11:45:45 2006 %Z Wed Dec 6 16:30:38 EST 2006 %I DIMACS %R 2006-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-11.ps.gz %X In this paper, we introduce the graph-theoretic concept of a blocking sequence, and use that concept to prove a conjecture about certain instances of the Firefighter Problem.

More specifically, for any positive integer $d$, let $H^d$ be a $d$-dimensional infinite grid, where at each time-step ($0,1,2, \ldots$), each vertex is in one of three states: {\em on-fire}, {\em fireproofed}, or {\em susceptible}, where a vertex that is susceptible becomes on-fire for the next time step if any of its neighbors are on-fire already, unless it is fireproofed first. Suppose that at time-step 0, some finite set $S_0$ of vertices is on-fire, and that at each time-step $t$, we are allowed to fireproof $f(t)$ vertices of $H^d$ to contain the set of vertices that eventually become on-fire to a finite set. Hartke conjectured in his Ph.D. thesis that $f(t)$ would have to be $\Omega(t^{d-2})$ for us to be successful for all such $S_0$. The main result of this paper is that we prove his conjecture to be correct. %A Endre Boros %A Konrad Borys %A Vladimir Gurvich %A Gabor Rudolf %T Generating 3-Vertex Connected Spanning Subgraphs %D Sun Jul 23 17:57:36 2006 %Z Wed Dec 6 16:30:43 EST 2006 %I DIMACS %R 2006-12 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-12.pdf %X In this paper we present an algorithm to generate all minimal $3$-vertex connected spanning subgraphs of an undirected graph with $n$ vertices and $m$ edges in incremental polynomial time, i.e., for every $K$ we can generate $K$ (or all) minimal $3$-vertex connected spanning subgraphs of a given graph in $O(K^2 \log (K) m^2 + K^2 m^3)$ time, where $n$ and $m$ are the number of vertices and edges of the input graph, respectively. This is an improvement over what was previously available and is the same as the best known running time for generating $2$-vertex connected spanning subgraphs. Our result is obtained by applying the decomposition theory of $2$-vertex connected graphs to the graphs obtained from minimal $3$-vertex connected graphs by removing a single edge. 
%A Endre Boros %A Konrad Borys %A Vladimir Gurvich %A Gabor Rudolf %T Inapproximability Bounds for Shortest-Path Network Interdiction Problems %D Sun Jul 23 18:00:46 2006 %Z Wed Dec 6 16:30:47 EST 2006 %I DIMACS %R 2006-13 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-13.pdf %X We consider two network interdiction problems: one where a network user tries to traverse a network from a starting vertex $s$ to a target vertex $t$ along the shortest path while an interdictor tries to eliminate all short $s$-$t$ paths by destroying as few vertices (arcs) as possible, and one where the network user, as before, tries to traverse the network from $s$ to $t$ along the shortest path while the interdictor tries to destroy a fixed number of vertices (arcs) so as to cause the biggest increase in the shortest $s$-$t$ path. The latter problem is known as the Most Vital Vertices (Arcs) Problem. In this paper we provide inapproximability bounds for several variants of these problems. %A Michael L. Littman %A Nishkam Ravi %A Arjun Talwar %A Martin Zinkevich %T An Efficient Optimal-Equilibrium Algorithm for Two-player Game Trees %D Sun Jul 23 18:03:12 2006 %Z Wed Dec 6 16:30:52 EST 2006 %I DIMACS %R 2006-14 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-14.pdf %X Two-player complete-information game trees are perhaps the simplest possible setting for studying general-sum games and the computational problem of finding equilibria. These games admit a simple bottom-up algorithm for finding subgame perfect Nash equilibria efficiently. However, such an algorithm can fail to identify optimal equilibria, such as those that maximize social welfare. The reason is that, counterintuitively, probabilistic action choices are sometimes needed to achieve maximum payoffs. We provide a novel polynomial-time algorithm for this problem that explicitly reasons about stochastic decisions and demonstrate its use in an example card game. 
%A Marcin Kaminski %A Vadim Lozin %T Polynomial-time algorithm for vertex k-colorability of P5-free graphs %D Sat Aug 26 10:38:02 2006 %Z Wed Dec 6 16:30:56 EST 2006 %I DIMACS %R 2006-15 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-15.ps.gz %X We give the first polynomial-time algorithm for coloring vertices of P5-free graphs with k colors. This settles an open problem and generalizes several previously known results. %A Diogo V. Andrade %A Endre Boros %A Vladimir Gurvich %T On graphs whose maximal cliques and stable sets intersect %D Mon Sep 18 21:49:53 2006 %Z Wed Dec 6 16:31:00 EST 2006 %I DIMACS %R 2006-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-16.pdf %X We say that a graph G has the CIS-property and call G a CIS-graph if each maximal clique and each maximal stable set of G intersect. By definition, G is a CIS-graph if and only if the complementary graph $\bar{G}$ is a CIS-graph too. In this paper we give some necessary and some sufficient conditions for the CIS-property to hold. In general, problems of efficient characterization and recognition of CIS-graphs remain open.

Given an integer $k \geq 2$, a comb (or k-comb) S_k is a graph with 2k vertices k of which, v_1, ..., v_k, form a clique C, while others, v'_1, ..., v'_k, form a stable set S, and (v_i,v'_i) is an edge for all i = 1, ..., k, and there are no other edges. The complementary graph $\bar{S_k}$ is called an anti-comb (or k-anti-comb). Clearly, S and C switch in the complementary graphs. Obviously, the combs and anti-combs are not CIS-graphs, since $C \cap S = \emptyset$. Hence, if a CIS-graph G contains an induced comb (respectively, anti-comb) then it must be settled, that is, G must contain a vertex $v$ connected to all vertices of C and to no vertex of S.
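
The CIS-property and the claim that combs violate it can be verified by brute force on small graphs. The Python sketch below is illustrative only: the helper names `maximal_sets`, `is_cis`, and `comb` are hypothetical, and the exhaustive enumeration of vertex subsets is an assumption chosen for simplicity (it is exponential in the number of vertices).

```python
from itertools import combinations


def maximal_sets(vertices, ok):
    """All inclusion-maximal vertex subsets in which every pair (u, v) satisfies ok(u, v)."""
    vertices = list(vertices)
    good = [
        set(S)
        for r in range(1, len(vertices) + 1)
        for S in combinations(vertices, r)
        if all(ok(u, v) for u, v in combinations(S, 2))
    ]
    return [S for S in good if not any(S < T for T in good)]


def is_cis(vertices, edges):
    """True if every maximal clique intersects every maximal stable set."""
    E = {frozenset(e) for e in edges}

    def adjacent(u, v):
        return frozenset((u, v)) in E

    cliques = maximal_sets(vertices, adjacent)
    stable = maximal_sets(vertices, lambda u, v: not adjacent(u, v))
    return all(C & S for C in cliques for S in stable)


def comb(k):
    """The k-comb: a clique v_1..v_k, a stable set v'_1..v'_k, and edges (v_i, v'_i)."""
    clique = [("v", i) for i in range(k)]
    pendant = [("w", i) for i in range(k)]  # ("w", i) plays the role of v'_i
    edges = list(combinations(clique, 2)) + list(zip(clique, pendant))
    return clique + pendant, edges
```

For instance, `is_cis(*comb(2))` and `is_cis(*comb(3))` are both false, because the clique C and the stable set S of a comb are disjoint, whereas a triangle is a CIS-graph.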

However, these conditions are only necessary but not sufficient for the CIS-property to hold. Our main result is the following theorem: G is a CIS-graph whenever G contains no induced 3-combs and 3-anti-combs, and every induced 2-comb is settled in G.

We also generalize the concept of CIS-graph as follows. Given an integer $d \geq 2$ and a complete graph whose edges are colored by d colors G_d = (V; E_1, ..., E_d), we say that G_d is a CIS-d-graph (has the CIS-d-property) if $\bigcap_{i=1}^d C_i \neq \emptyset$ whenever C_i is a maximal color i-free subset of V, that is, $(v,v') \in E_i$ for no $v, v' \in C_i$. Clearly, in case $d=2$ we return to the concept of CIS-graphs. It seems that every CIS-d-graph is a Gallai graph, that is, it cannot contain a triangle colored by 3 distinct colors. We provide partial results in support of this conjecture. We also show that if this conjecture is true then characterization and recognition of CIS-d-graphs is easily reduced to characterization and recognition of CIS-graphs. %A Martin Milanic %A Jerome Monnot %T On the complexity of the exact weighted independent set problem for various graph classes %D Thu Oct 5 10:36:05 2006 %Z Wed Dec 6 16:31:04 EST 2006 %I DIMACS %R 2006-17 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-17.ps.gz %X The exact weighted independent set problem (EWIS) consists of determining whether a given weighted graph contains an independent set whose weight equals a given integer M. We study the problem of determining the complexity of the exact weighted independent set problem, and its restricted version EWIS_\alpha (where the independent set is required to be of maximum size), for particular graph classes. We prove that these problems are strongly NP-complete for cubic bipartite graphs, and extend this result to a more general setting. On the positive side, we distinguish several graph classes where EWIS and EWIS_\alpha can be solved in pseudo-polynomial time. %A Bruce E.
Sagan %T Proper Partitions of a Polygon and $k$-Catalan Numbers %D Thu Oct 5 10:36:38 2006 %Z Wed Dec 6 16:31:09 EST 2006 %I DIMACS %R 2006-18 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-18.ps.gz %X Let $P$ be a polygon whose vertices have been colored (labeled) cyclically with the numbers $1,2,\ldots,c$. Motivated by conjectures of Propp, we are led to consider partitions of $P$ into $k$-gons which are proper in the sense that each $k$-gon contains all $c$ colors on its vertices. Counting the number of proper partitions involves a generalization of the $k$-Catalan numbers. We also show that in certain cases, any proper partition can be obtained from another by a sequence of moves called flips. %A Bruce E. Sagan %T Pattern Avoidance in Set Partitions %D Sun Dec 3 22:52:16 2006 %Z Wed Dec 6 16:31:13 EST 2006 %I DIMACS %R 2006-19 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-19.ps.gz %X The study of patterns in permutations is a very active area of current research. Klazar defined and studied an analogous notion of pattern for set partitions. We continue this work, finding exact formulas for the number of set partitions which avoid certain specific patterns. In particular, we enumerate and characterize those partitions avoiding any partition of a 3-element set. This allows us to conclude that the corresponding sequences are P-recursive. Finally, we define a second notion of pattern in a set partition, based on its restricted growth function. Related results are obtained for this new definition. %A Alexander Postnikov %A Bruce E. Sagan %T What power of two divides a weighted Catalan number %D Sun Dec 3 22:55:37 2006 %Z Wed Dec 6 16:31:19 EST 2006 %I DIMACS %R 2006-21 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-21.ps.gz %X Given a sequence of integers b=(b_0,b_1,b_2,...) one gives a Dyck path P of length 2n the weight

The processor allocation problem is a variant of the two-dimensional packing problem, where objects have placement constraints and are allowed to be split into smaller objects. This problem also has applications in the design of algorithms for bandwidth allocation in computer networks and memory management in computer systems.

For the $MaxStretch$ and $TotalStretch$ metrics, the offline version of the processor allocation problem is known to be strongly NP-hard. For the online version, it is known that the best achievable competitive ratio, even for a randomized algorithm, is $\Omega(n)$.

In this paper, we analyze the processor allocation problem in a relaxed framework, where our algorithm is given $O(1)$ more processors than the optimal offline algorithm. In this framework, for the MaxStretch and TotalStretch metrics, we present algorithms that obtain an optimal solution for the following two special cases:

- the ratio of maximum to the minimum number of processors required by jobs is bounded by a constant
- the ratio of maximum to the minimum processing time of jobs is bounded by a constant

Seki is a difficult game. We cannot solve it and present only some partial results and conjectures mostly on the CSMs.

The game is closely related to the so-called seki (shared life)
positions in GO. However, Seki is of independent interest as a
combinatorial game. Those readers who do not know how to play GO can
still understand the whole paper, except the Appendix, where we analyze
(seki) positions in GO corresponding to some (seki) matrices. Already
for $3 \times 3$ matrices such positions may be difficult even for
advanced GO players.
%A Endre Boros
%A Vladimir Gurvich
%T On complexity of algorithms for modeling disease transmission
%D Sat May 12 06:56:45 2007
%Z Sat May 12 06:59:54 EDT 2007
%I DIMACS
%R 2007-06
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-06.pdf
%X
We consider simple deterministic models of disease
transmission. Given a set of individuals $I$, we assign a hypergraph
$H_i = (I \setminus \{i\}, E_i)$ to each $i \in I$ and assume that
$i$ will be infected whenever there is a fully infected edge $e \in
E_i$. Along with this general model $M_H$ we also study two special
cases $M_G$ and $M_D$ when for all $i \in I$ the hypergraphs $H_i$
are specified implicitly by a (directed) graph $G = (I,E)$ and
integral positive thresholds $k(i)$ for all $i \in I$. Then we assume
that $i$ will be infected whenever at least $k(i)$ of his
neighbors (predecessors) are infected.
Given a set $S$ of the originally infected individuals (a source) we
generate the closure $T(S) = cl(S)$, that is, the set of all
individuals that will be infected if the above transmission rules are
applied iteratively sufficiently many times. We study all minimal
sources such that
(i) $T(S) = I$, or
(ii) $T(S)$ contains a given individual $q \in I$, or
(iii) $T(S)$ contains an edge of a given ``target'' hypergraph $H$.
We denote these three types of "targets" by $T_I,T_q$, and $T_H$ respectively.
We show that, given a threshold $t$, it is NP-complete to decide
whether there is a source $S$ of size at most $t$. The problem
remains NP-complete
for each of the three models $M_R,M_G$ or $M_D$ and targets $T_I,T_q$
or $T_H$. We also consider enumeration problems and show that if the
transmission rule is given explicitly, $M_R$, then all inclusion
minimal sources can be generated
in incremental polynomial time for all targets $T_I,T_q$, or $T_H$.
On the other hand, generating minimal sources is hard for all targets
if the transmission model is given by a (directed) graph, $M_G$ or
$M_D$, since for these two cases
the input size may be logarithmic in the input size of $M_R$. Indeed,
given $G = (I,E)$ and $k(i)$ for all $i \in I$, a corresponding
hypergraph $H_i$ for some $i \in I$ may be exponential in $|I|$
unless $k(i)$ is bounded by a constant.
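As an illustration of the threshold models $M_G$ and $M_D$ described above, the closure $T(S)$ can be computed by applying the transmission rule until nothing changes. The Python sketch below is a minimal example under assumed names (`closure`, `neighbors`, `threshold`); it is not taken from the report and does not address the hardness or enumeration results.

```python
def closure(neighbors, threshold, source):
    """Return T(S) for a threshold model.

    neighbors[i] -- the neighbors (or predecessors, in the directed case) of i
    threshold[i] -- the positive integer k(i)
    source       -- the set S of originally infected individuals
    """
    infected = set(source)
    changed = True
    while changed:  # iterate the rule until a fixed point is reached
        changed = False
        for i in neighbors:
            if i in infected:
                continue
            # i becomes infected once at least k(i) neighbors are infected.
            if sum(1 for j in neighbors[i] if j in infected) >= threshold[i]:
                infected.add(i)
                changed = True
    return infected

# Hypothetical example: the undirected path 1 - 2 - 3 - 4 with k(2) = 2.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
k = {1: 1, 2: 2, 3: 1, 4: 1}
```

In this hypothetical example the source {1} infects nobody else, because individual 2 needs two infected neighbors, whereas the source {1, 3} infects everyone, so {1, 3} is a source for the target $T_I$.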
%A Sen-Peng Eu
%A Tung-Shan Fu
%T The Cyclic Sieving Phenomenon for Faces of Generalized Cluster Complexes
%D Thu May 17 07:56:19 2007
%Z Thu May 17 07:58:47 EDT 2007
%I DIMACS
%R 2007-07
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-07.ps.gz
%X
The notion of cyclic sieving phenomenon is introduced by Reiner,
Stanton, and White as a generalization of Stembridge's $q=-1$
phenomenon. The generalized cluster complexes associated to root
systems are given by Fomin and Reading as a generalization of the
cluster complexes found by Fomin and Zelevinsky. In this paper, the
faces of various dimensions of the generalized cluster complexes in
type $A_n$, $B_n$, $D_n$, and $I_2(a)$ are shown to exhibit the
cyclic sieving phenomenon under a cyclic group action. For the
cluster complexes of exceptional type $E_6$, $E_7$, $E_8$, $F_4$,
$H_3$, and $H_4$, a verification for such a phenomenon on the
facets is given.
%A Yuri Goncharov
%A Ilya Muchnik
%A Leonid Shvartser
%T Saddle Point Feature Selection In SVM Regression
%D Mon Jul 16 17:29:04 2007
%Z Mon Jul 16 17:36:19 EDT 2007
%I DIMACS
%R 2007-08
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-08.ps.gz
%X
The SVM wrapper method proposed in our previous work, "Simultaneous Feature Selection and Margin Maximization Using Saddle Point Approach", is investigated and examined for SVM regression. The method simultaneously maximizes the margin and minimizes the feature space by modifying the standard criterion: a third term, which directly penalizes a chosen set of variables, is added to the basic objective function. Our examination of the proposed min-max saddle point algorithm for the SVM regression case on a SAR Benchmark demonstrates the ability of the introduced approach both to select small subspaces of features and to improve the regression prediction quality.
%A Graham Cormode
%A Flip Korn
%A Srikanta Tirthapura
%T Time-Decaying Aggregates in Out-of-order Streams
%D Wed Jul 25 11:27:51 2007
%Z Wed Jul 25 12:15:24 EDT 2007
%I DIMACS
%R 2007-10
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-10.pdf
%X
Processing large data streams is now a major topic in data management.
The data involved can be truly massive, and the required analyses
complex. In a stream of sequential events such as stock feeds, sensor
readings, or IP traffic measurements, tuples pertaining to recent
events are typically more important than older ones. This can be
formalized via time decay functions, which assign weights based on
age. Decay functions such as sliding windows and exponential decay
have been well studied under the assumption of well-ordered arrivals,
i.e., data arrives in the order of increasing time stamps. However,
data quality issues are prevalent in massive streams (due to network
asynchrony and delays or possibly due to features inherent to the
measurement process), and correct order of arrival cannot be
guaranteed.
We focus on the computation of decayed aggregates such as range
queries, quantiles, and heavy hitters on out-of-order streams, where
elements do not necessarily arrive in increasing order of timestamps.
We give the first deterministic algorithms for approximating these
aggregates under the sliding window, exponential and polynomial decay
functions. Our techniques are based on extending existing data stream
summaries, such as the q-digest. We show that the overhead for
allowing out-of-order arrivals is small when compared to the case of
well-ordered arrivals. Our experimental study confirms that these
methods can be applied in practice and investigates how the specific
decay function impacts performance.
%A Sukamol Srikwan
%A Markus Jakobsson
%T Using Cartoons to Teach Internet Security
%D Thu Sep 6 19:17:47 2007
%Z Thu Sep 6 19:18:25 EDT 2007
%I DIMACS
%R 2007-11
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-11.pdf
%X
While good user education can hardly secure a system, we believe that poor user education can put it at serious risk. The current problem of online fraud is exacerbated by the fact that most users make security decisions, such as whether to install a given piece of software or not, based on a very rudimentary understanding of risk. We describe the design principles behind SecurityCartoon.com, the first cartoon-based approach aimed at improving the understanding of risk among typical Internet users. We argue why an approach like ours is likely to produce better long-term effects than currently practiced educational efforts with the same general goals. This belief is based on the apparent difference between our approach and currently used alternatives. At the heart of these differences are the four guiding principles of our approach: (1) research-driven content selection, according to which we select educational messages based on user studies; (2) accessibility of the material, to reach and maintain a large readership; (3) user immersion in the material, based on repetitions on a theme; and (4) adaptability to a changing threat.
%A Piotr Berman
%A Bhaskar DasGupta
%T Approximating the Online Set Multicover Problems Via Randomized Winnowing
%D Thu Sep 13 10:12:32 2007
%Z Thu Sep 13 10:23:21 EDT 2007
%I DIMACS
%R 2007-12
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-12.ps.gz
%X
In this paper, we consider the weighted online set k-multicover problem. In this problem, we have a universe V of elements, a family of subsets of V with a positive real cost for every subset, and a coverage factor (positive integer) k. A subset of the elements is presented online in an arbitrary order. When each element is presented, we are also told the collection of all (at least k) sets to which the element belongs, together with their costs, and we need to select additional sets from this collection, if necessary, so that our collection of selected sets contains at least k sets that contain the element. The goal is to minimize the total cost of the selected sets (our algorithm and competitive ratio bounds can be extended to the case when a set can be selected at most a prespecified number of times instead of just once; we do not report these extensions for simplicity and also because they have no relevance to the biological applications that motivated our work).
In this paper, we describe a new randomized algorithm for the online multicover problem based on a randomized version of the winnowing approach. This algorithm generalizes and improves some earlier results. We also discuss lower bounds on competitive ratios for deterministic algorithms for general k based on earlier approaches.
%A Michael Leyton
%T Process Grammar
%D Fri Sep 14 11:28:31 2007
%Z Fri Sep 14 11:33:22 EDT 2007
%I DIMACS
%R 2007-09
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-09.pdf
%X
This report gives an exposition of the Process-Grammar, published originally in the journal Artificial Intelligence in 1988, together with a description of some of the subsequent applications of the grammar in meteorology, biology, computer-aided design, chemical engineering, and geology. The Process-Grammar is a means of recovering the process-history of a smooth shape from its curvature extrema, and expressing that evolution in terms of transitions at those extrema. The inference of history follows from the Symmetry-Curvature Duality Theorem of Leyton (1987), which states that, to each curvature extremum, there is a differential symmetry axis leading to and terminating at that extremum; and from an inference rule that states that the symmetry axis is the record of a process. The Process-Grammar expresses the relationship between any two stages in the shape's history as an extrapolation of the processes inferred by the theorem.
%A Boris Mirkin
%A Susana Nascimento
%A Luis Moniz Pereira
%T ACM Classification Can Be Used for Representing Research Organizations
%D Fri Sep 14 11:43:15 2007
%Z Fri Sep 14 11:45:43 EDT 2007
%I DIMACS
%R 2007-13
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-13.pdf
%X
We present a method for representing a Computer Science research organization by using the ACM Computing Subjects classification tree. The representation comprises head subjects of the upper level as well as their gaps and offshoots found by parsimoniously mapping main subject clusters, extracted from the data on similarity of ACM research topics according to the work in the organization, onto the ACM classification. A robust method for possibly overlapping clustering is described. A real-world example of the representation is provided.
%A Mary Ashley
%A Tanya Berger-Wolf
%A Piotr Berman
%A Wanpracha Chaovalitwongse
%A Bhaskar DasGupta
%A Ming-Yang Kao
%T On Approximating Four Covering/Packing Problems With Applications to Bioinformatics
%D Fri Oct 5 10:25:39 2007
%Z Fri Oct 5 10:30:06 EDT 2007
%I DIMACS
%R 2007-14
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-14.pdf
%X
In this paper, we consider approximability of four covering/packing type problems which have important applications in computational biology. The problems considered in this paper are the triangle packing problem, the full sibling reconstruction problem under two parsimonious assumptions, the maximum profit coverage problem and the 2-coverage problem.
We provide approximation algorithms and inapproximability results
for various values of parameters of interest for these problems.
Our inapproximability constant for the triangle packing problem improves slightly upon the best-known inapproximability
constant that can be achieved from previous results;
this is done by directly transforming the inapproximability gap of Hastad for the problem of maximizing the number
of satisfied equations for a set of equations over GF(2)
and is interesting in its own right. Our inapproximability results on full sibling reconstruction problems answer open questions about the computational complexities of
these problems posed by Berger-Wolf et al.
Our results on the maximum profit coverage problem provide almost matching upper and lower bounds on the approximation ratios for this
problem posed by Hassin and Or.
In spite of the many pessimistic worst-case inapproximability results, we provide consolation to the practitioners by concluding that an implementation of a version of the combinatorial heuristic for the full sibling reconstruction reported in [Berger-Wolf et al, 2007] performs empirically well on two new biological datasets.
%A Endre Boros
%A Khaled Elbassioni
%A Kazuhisa Makino
%T Berge Multiplication for Monotone Boolean Dualization
%D Fri Oct 5 11:45:58 2007
%Z Fri Oct 5 11:50:15 EDT 2007
%I DIMACS
%R 2007-15
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-15.pdf
%X
Given the prime CNF representation $\phi$ of a monotone Boolean function $f:\{0,1\}^n\mapsto\{0,1\}$, the dualization problem calls for finding the corresponding prime DNF representation $\psi$ of $f$. A very simple method (called {\em Berge multiplication} \cite[Page 52--53]{B89}) works by multiplying out the clauses of $\phi$ from left to right in some order, simplifying whenever possible using {\it the absorption law}. We show that for any monotone CNF $\phi$, Berge multiplication can be done in subexponential time, and for many interesting subclasses of monotone CNF's such as CNF's with bounded size, bounded degree, bounded intersection, bounded conformality, and read-once formula, it can be done in polynomial or quasi-polynomial time.
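The multiplication procedure described above can be sketched in a few lines. The following Python example is a naive illustration with assumed names (`minimize`, `berge_multiplication`) and a set-of-frozensets representation chosen here; it makes no attempt to achieve the running-time bounds stated in the report.

```python
def minimize(terms):
    """Absorption law: drop every term that strictly contains another term."""
    terms = set(terms)
    return {t for t in terms if not any(u < t for u in terms)}

def berge_multiplication(clauses):
    """Multiply out a monotone CNF clause by clause, from left to right.

    clauses -- iterable of clauses, each an iterable of variable names
    Returns the terms of the resulting irredundant DNF as frozensets.
    """
    dnf = {frozenset()}  # the empty term represents the constant 1
    for clause in clauses:
        # Distribute the next clause over the current terms, then absorb.
        dnf = minimize({t | {x} for t in dnf for x in clause})
    return dnf
```

For instance, multiplying out (a or b)(b or c) under this sketch gives the two terms b and ac, since the intermediate products ab and bc are absorbed by b.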
%A Yuri Goncharov
%A Ilya Muchnik
%A Leonid Shvartser
%T Saddle Point Feature Selection in SVM Classification
%D Tue Oct 23 18:19:21 2007
%Z Tue Oct 23 18:25:32 EDT 2007
%I DIMACS
%R 2007-16
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-16.pdf
%X
The SVM wrapper feature selection method for the classification
problem, introduced in our previous work, is
analyzed. The method is based on a modification of the standard SVM
criterion that adds to the basic objective function a third term,
which directly penalizes a chosen set of variables. The criterion
divides the set of all variables into three subsets: deleted,
selected and weighted features. We give a more formal derivation of
the saddle point problem to which the SVM wrapper method reduces.
The saddle point algorithm is described, its convergence is proved, and
an estimate for the step size of the algorithm is given. Effective
calculations of the projections used in the saddle point algorithm are
described. The algorithm is examined on a classification Benchmark
and its ability to improve the SVM recognition results is shown.
%A Fred S. Roberts
%T Computer Science and Decision Theory
%D Thu Nov 8 08:09:11 2007
%Z Thu Nov 8 08:12:27 EST 2007
%I DIMACS
%R 2007-18
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-18.pdf
%X This paper reviews applications in computer science that decision
theorists have addressed for years, discusses the requirements posed
by these applications that place great strain on decision
theory/social science methods, and explores applications in the social
and decision sciences of newer decision-theoretic methods developed
with computer science applications in mind. The paper deals with the
relation between computer science and decision-theoretic methods of
consensus, with the relation between computer science and game theory
and decisions, and with ``algorithmic decision theory.''
%A Habiba
%A Chayant Tantipathananandh
%A Tanya Y. Berger-Wolf
%T Betweenness Centrality Measure in Dynamic Networks
%D Tue Nov 27 18:31:17 2007
%Z Fri Dec 28 08:51:06 EST 2007
%I DIMACS
%R 2007-19
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-19.ps.gz
%X
In this paper we propose three methods of measuring betweenness of individuals in networks which are best modeled as graphs with explicit time ordering on their edges. The betweenness centrality index is one of the basic measures in the analysis of social networks, but most of the work done for measuring the betweenness index of individuals is based on the aggregate representation of the network. Many network problems are based on fundamental relationships involving time. We incorporate the time factor in the aggregate graph representation of social networks to create dynamic networks. We define and measure the betweenness in this dynamic framework.
We compare the three betweenness measures with the standard betweenness measure for
the same network. We show that by incorporating the exact times of
interactions among individuals in a network, we can better study the
betweenness of individuals in the underlying network.
%A Habiba
%A Tanya Y. Berger-Wolf
%T Maximizing the Extent of Spread in a Dynamic Network
%D Tue Nov 27 18:31:18 2007
%Z Fri Dec 28 08:53:00 EST 2007
%I DIMACS
%R 2007-20
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-20.ps.gz
%X
Dynamic population phenomena, such as the spread of diseases, opinions,
and behavior, can be modeled as processes that propagate in a network of
interacting individuals. In this paper, we focus on the problem of
identifying a set of individuals to initiate this spread so that the
resulting extent of the spread is maximized. Kempe et al. ("Maximizing the
Spread of Influence in a Social Network". KDD'03) have solved this problem
for the aggregate representation of the population network. Yet, real
populations are inherently dynamic. In this paper we extend the approach
of Kempe et al. to explicitly dynamic networks. We show that, for two
common models of spread, maximizing the extent of spread in a dynamic
network is an NP-hard problem. We present a (1-1/e)-approximation
algorithm, evaluate the performance of the algorithm experimentally on
real datasets and show that it performs well in practice. In addition, we
compare the dynamic and the aggregate network representations both in
terms of the resulting extent of spread and the actual set of initiating
individuals that maximize the spread. We show that there are significant
differences in both cases. Thus, we demonstrate that ignoring the dynamic
aspects of data may result in extremely inaccurate answers and explicitly
dynamic analysis is necessary.
%A Dr. Srikrishnan Divakaran
%T Algorithms and Heuristics for Constrained Generalized Tree Alignment Problem
%D Fri Dec 28 08:43:17 2007
%Z Fri Dec 28 08:49:02 EST 2007
%I DIMACS
%R 2007-21
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2007/2007-21.pdf
%X
In generalized tree alignment problem, we are given a set

In this paper, we present constant approximation algorithms for the constrained generalized tree alignment problem. For the generalized tree alignment problem, a special case of this problem, our algorithms provide a guaranteed error bound of 2-2/Constrained Generalized Tree Alignment Problem: Given a setof S related sequences and a phylogenetic forest comprising of node-disjoint phylogenetic trees that specify the topological constraints that an evolutionary tree of k needs to satisfy, construct a minimum cost evolutionary tree for S . S

Given a digraph D , the set of all pairs (N-(v), N+(v)) constitutes the neighborhood dihypergraph N(D) of D. The Digraph Realization Problem asks whether a given dihypergraph H coincides with N(D) for some digraph D. This problem was introduced by Aigner and Triesch as a natural generalization of the Open Neighborhood Realization Problem for undirected graphs, which is known to be NP-complete.

We show that the Digraph Realization Problem remains NP-complete for orgraphs (orientations of undirected graphs). As a corollary, we show that the Matrix Skew-Symmetrization Problem for square {0, 1, -1} matrices (a_{ij} = -a_{ji}) is NP-complete. This result can be compared with the known fact that the Matrix Symmetrization Problem for square 0-1 matrices (a_{ij} = a_{ji}) is NP-complete. Extending a negative result of Fomin, Kratochvíl, Lokshtanov, Mancini and Telle, we show that the Digraph Realization Problem remains NP-complete for almost all hereditary classes of digraphs defined by a unique minimal forbidden subdigraph.

Finally, we consider the Matrix Complementation Problem for rectangular 0-1 matrices, and prove that it is polynomial-time equivalent to graph isomorphism. A related known result is that the Matrix Transposability Problem is polynomial-time equivalent to graph isomorphism.

%A Valentina V. Sulimova %A Vadim V. Mottl %A Casimir A. Kulikowski %A Ilya B. Muchnik %T Probabilistic evolutionary model for substitution matrices of PAM and BLOSUM families. %D Thu Jan 8 08:47:19 2009 %Z Thu Jan 8 08:50:28 EST 2009 %I DIMACS %R 2008-16 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2008/2008-16.pdf %X Background: Almost all problems of protein analysis must inevitably be based on comparing the types of amino acids from which protein sequences are composed. Similarities between amino acids are most commonly based on two methods derived from very different approaches: the evolutionary based substitution matrixes of the PAM (Point Accepted Mutation) family, derived from phylogenetic trees, and the BLOSUM substitution matrixes which are statistically inferred from multiple alignments of groups of proteins which, according to their authors, S. and J. Henikoff, are essentially different from the PAM family of matrices. Results: In this paper we prove that the statistical approach for computing substitution matrixes of the BLOSUM family can be explained in terms of the PAM evolutionary model. This means that both of these approaches are actually based on similar types of evolutionary models, and the main difference between them lies in the different initial data for estimating their unknown model parameters. We also show that all PAM substitution matrices can be represented as kernel functions in their mathematical structure, and lose their positive semi-definiteness only because of choice of final representation. Conclusions: The fact that the PAM and BLOSUM substitution matrices are originally positive semidefinite, allows them to be easily used for constructing kernels over a set of proteins, so, without loss of biological meaning, these similarity measures can be applied without correction.
Furthermore, any new substitution matrix will automatically be a kernel if, first, it is estimated by either the Dayhoff or Henikoff techniques and, second, the final representation proposed in the present research is adopted. %A T-H. Hubert Chan %A Khaled Elbassioni %T A QPTAS for TSP with Fat Weakly Disjoint Neighborhoods in Doubling Metrics %D Sun Feb 15 09:15:12 2009 %Z Sun Feb 15 09:19:23 EST 2009 %I DIMACS %R 2009-01 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-01.pdf %X We consider the Traveling Salesman Problem with Neighborhoods (TSPN) in doubling metrics. The goal is to find a shortest tour that visits each of a collection of n subsets (regions or neighborhoods) in the underlying metric space. We give a QPTAS when the regions are what we call $\alpha$-fat weakly disjoint. This notion combines the existing notions of diameter variation, fatness and disjointness for geometric objects and generalizes these notions to any arbitrary metric space. Intuitively, the regions can be grouped into a bounded number of types, where in each type, the regions have diameters within $\alpha$ factor of one another, and each such region can designate a point such that these points are far away from one another. Our result generalizes the PTAS for TSPN on the Euclidean plane by Mitchell [Mit07] and the QPTAS for TSP on doubling metrics by Talwar [Tal04]. We also observe that our techniques directly extend to a QPTAS for the Group Steiner Tree Problem on doubling metrics, with the same assumption on the groups. %A Endre Boros %A Yves Crama %A Peter L.
Hammer %A Toshihide Ibaraki %A Alexander Kogan %A Kazuhisa Makino %T Logical Analysis of Data: Classification with Justification %D Fri Feb 20 01:05:12 2009 %Z Fri Feb 20 01:05:52 EST 2009 %I DIMACS %R 2009-02 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-02.pdf %X Learning from examples is a frequently arising challenge, with a large number of algorithms proposed in the classification and data mining literature. The evaluation of the quality of such algorithms is usually carried out \textit{ex post}, on an experimental basis: their performance is measured either by cross validation on benchmark data sets, or by clinical trials. None of these approaches evaluates directly the learning process \textit{ex ante}, on its own merits. In this paper, we discuss a property of rule-based classifiers which we call ``justifiability", and which focuses on the type of information extracted from the given training set in order to classify new observations. We investigate some interesting mathematical properties of justifiable classifiers. In particular, we establish the existence of justifiable classifiers, and we show that several well-known learning approaches, such as decision trees or nearest neighbor based methods, automatically provide justifiable classifiers. We also identify maximal subsets of observations which must be classified in the same way by every justifiable classifiers. Finally, we illustrate by a numerical example that using classifiers based on ``most justifiable" rules does not seem to lead to overfitting, even though it involves an element of optimization. %A Sarmad Abbasi %A Laeeq Aslam %T The cycle discrepancy of three regular graphs %D Thu Feb 26 11:12:52 2009 %Z Thu Feb 26 11:16:08 EST 2009 %I DIMACS %R 2009-03 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-03.pdf %X Abstract: Let G=(V,E) be an undirected graph and C(G) denote the set of all cycles in G. 
We introduce a graph invariant, cycdisc(G), the cycle discrepancy of G, which is defined as cycdisc(G) = \min_{\chi: V \mapsto \{+1, -1\}} \max_{ c \in C (G)} |\sum_{v \in c} \chi(v)|. We show that, for n >= 6, if G is a three-regular graph with n vertices then cycdisc(G) <= (n + 2)/6. This bound is best possible and is achieved by very simple graphs. Our proof is algorithmic and allows us to compute, in O(n^2) time, a labeling \chi such that \max_{ c \in C(G)} | \sum_{v \in c} \chi(v)| <= (n + 2)/6. Some interesting open problems regarding the cycle discrepancy are also suggested. %A Yada Zhu %A Mingyu Li %A Christina M. Young %A Minge Xie %A Elsayed A. Elsayed %T Impact of measurement error on container inspection policies at port-of-entry %D Wed Mar 4 08:40:46 2009 %Z Wed Mar 4 10:10:28 EST 2009 %I DIMACS %R 2009-11 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-11.pdf %X Containers and cargos arriving at port-of-entry are inspected using sensors and devices to detect drugs, weapons, nuclear materials and other illegal items. Measurement errors associated with the inspection process may result in higher percentage of misclassification of containers. In this paper, we propose and formulate three inspection policies for containers at port-of-entry assuming the presence of sensor measurement errors. The optimization of the policies is carried out and the performance of each in terms of misclassification probabilities is compared. In each of the policies, the optimum settings are determined by minimizing the probability of false rejection while limiting the probability of false acceptance at a very low tolerance level. The results show that the policy of repeat inspections improves the performance in terms of correct container classification. %A Endre Boros %A Vladimir A. Gurvich %A Igor E.
Zverovich %A Wei Shao %T Assignability of 3-dimensional totally tight matrices %D Fri Mar 6 13:57:02 2009 %Z Fri Mar 6 14:01:13 EST 2009 %I DIMACS %R 2009-06 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-06.pdf %X A 3-dimensional totally tight matrix A = (aijk) has the property that every 2 * 2 submatrix has a constant line [a row or a column]. We prove that all such matrices are assignable, that is it is possible to assign a label to each of the axial planes so that every aijk is equal to at least one of the corresponding labels. The result can be easily extended to the case of multi-dimensional matrices. %A Endre Boros %A Khaled Elbassioni %A Vladimir Gurvich %A Kazuhisa Makino %T On effectivity functions of game forms %D Fri Mar 6 14:10:59 2009 %Z Fri Mar 6 14:11:59 EST 2009 %I DIMACS %R 2009-07 %U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-07.pdf %X To each game form g an effectivity function (EFF) Eg can be naturally assigned. An EFF E will be called formal (respectively, formal-minor) if E = Eg (respectively, E<=Eg) for a game form g.(i) An EFF is formal if and only if is superadditive and monotone.

(ii) An EFF is formal-minor if and only if it is weakly superadditive.

Theorem (ii) looks more sophisticated, yet it is simpler and instrumental in the proof of (i). In addition, (ii) has important applications in social choice, game, and even graph theories. Constructive proofs of (i) were given by Moulin, in 1983, and by Peleg, in 1998. (Peleg's proof also works in the case of an infinite set of outcomes.) Both constructions are elegant, yet the set of strategies X_i of each player i in I in g might be doubly exponential in the size of the input EFF E. In this paper, we suggest a third construction such that |X_i| is only linear in the size of E.

One can verify in polynomial time whether an EFF is formal (or superadditive); in contrast, verification of whether an EFF is formal-minor (or weakly superadditive) is a CoNP-complete decision problem.
%A Diogo V. Andrade
%A Endre Boros
%A Vladimir Gurvich
%T Not complementary connected and not CIS d-graphs form weakly monotone families
%D Fri Mar 6 14:23:46 2009
%Z Fri Mar 6 14:24:04 EST 2009
%I DIMACS
%R 2009-08
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-08.pdf
%X
A d-graph G = (V;E_1,...,E_d) is a complete graph whose edges are arbitrarily partitioned into d subsets (colored with d colors); G is a Gallai d-graph if it contains no three-colored triangle $\Delta$; furthermore, G is a CIS d-graph if $\cap_{i=1}^d S_i \not= \emptyset$ for every set-family S = {S_i | i in [d]}, where S_i \subseteq V is a maximal independent set of G_i = (V;E_i), the ith chromatic component of G, for all i in [d] = {1, ..., d}. A conjecture suggested in 1978 by the third author says that every CIS d-graph is a Gallai d-graph. In this paper we obtain a partial result. Let $\pi$ be the two-colored d-graph on four vertices whose two non-empty chromatic components are isomorphic to P_4. It is easily seen that $\pi$ and $\Delta$ are not CIS d-graphs but become CIS after eliminating any vertex. We prove that no other d-graph has this property, that is, every non-CIS d-graph G distinct from $\pi$ and $\Delta$ contains a vertex v in V such that the sub-d-graph G[V \ {v}] is still non-CIS. This result easily follows if the above $\Delta$-conjecture is true, yet we prove it independently.

A d-graph G = (V;E_1,...,E_d) is complementary connected (CC) if the complement $\overline{G_i} = (V; \overline{E_i}) = (V; \cup_{j \in [d] \setminus \{i\}} E_j)$ of its ith chromatic component is connected for every i in [d]. It is known that every CC d-graph G, distinct from $\pi$, $\Delta$, and a single vertex, contains a vertex v in V such that the reduced sub-d-graph G[V \ {v}] is still CC. It is not difficult to show that every non-CC d-graph with contains a vertex v in V such that the sub-d-graph G[V \ {v}] is not CC.
%A Daniel Anderson
%A Vladimir Gurvich
%A Thomas Dueholm Hansen
%T On acyclicity of games with cycles
%D Fri Mar 6 14:28:38 2009
%Z Fri Mar 6 14:28:50 EST 2009
%I DIMACS
%R 2009-09
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-09.pdf
%X
We study restricted improvement cycles (ri-cycles) in finite positional n-person games with perfect information modeled by directed graphs (digraphs) that may contain directed cycles (di-cycles). We obtain criteria of restricted improvement acyclicity (ri-acyclicity) in two cases: for n = 2 and for acyclic digraphs. We also provide several examples that outline the limits of these criteria and show that, essentially, there are no other ri-acyclic cases. We also discuss connections between ri-acyclicity and some open problems related to Nash-solvability.
%A Yezhou Wu
%A Wenan Zang
%A Cun-Quan Zhang
%T A Characterization of Almost CIS Graphs
%D Fri Mar 6 14:32:44 2009
%Z Fri Mar 6 14:33:01 EST 2009
%I DIMACS
%R 2009-10
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-10.pdf
%X
A graph G is called CIS if each maximal clique intersects each maximal stable set in G, and is called almost CIS if it has a unique disjoint pair (C; S) consisting of a maximal clique C and a maximal stable set S.
While it is still unknown if there exists a good structural characterization of all CIS graphs, in this note we prove the following Andrade-Boros-Gurvich conjecture: A graph is almost CIS if and only if it is a split graph with a unique split partition.
%A Jason Perry
%A William M. Pottenger
%A Chinua Umoja
%A Christopher Janneck
%T Supporting Cognitive Models of Sensemaking in Analytics Systems
%D Tue Nov 24 15:55:16 2009
%Z Tue Nov 24 15:58:03 EST 2009
%I DIMACS
%R 2009-12
%U
%X
Cognitive science is beginning to provide us with well-supported models of the stages that professional analysts go through in the course of conducting an investigation, be it reactive or proactive in nature. These process models are generally advanced within the field of Sensemaking, because the analyst's primary task can be viewed as "making sense" of a large body of unorganized information. One of the most well-known long-term investigations into the structure of Sensemaking is that of Pirolli and Card et al. Their resulting model provides an initial basis for our research. In using these models to improve analytics systems, we have at least two distinct problems: (1) how to infer high-level knowledge of the Sensemaking states from a record of user interactions with an interactive analysis system, and (2) how to use this knowledge to provide user guidance that results in better human-machine interaction and a more robust investigative process. The answers to these questions lie at the intersection of research in machine learning, knowledge representation, user interfaces and cognitive science, and addressing them requires an end-to-end system perspective. In this report, we survey these problems and discuss our initial approaches. We describe the description logic we have developed to model the problem domain and define a set of machine learning tasks.
Then we present our initial user interface design, followed by the design of the initial experiments, including the ground truth, which comes from an actual solved crime case. We conclude with the insights gained thus far into building interactive systems that support users' cognitive models.
%A Baiyang Liu
%A Casimir Kulikowski
%A Ilya Muchnik
%T Global Ordering For Multi-Dimensional Data: Comparison with K-means Clustering
%D Tue Apr 28 10:25:17 2009
%Z Tue Apr 28 10:28:45 EDT 2009
%I DIMACS
%R 2009-13
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-13.pdf
%X
This paper describes a novel approach to estimate the quality of clustering based on finding a linear ordering for multi-dimensional data by which the clusters of the data fall into intervals on the ordering scale. This permits assessing the result of such local clustering methods as K-means so as to filter inhomogeneous or outlier clusters that can be produced. Preliminary results reported here indicate that the method is valuable to determine, in two dimensions, the number of visually perceived clusters generated by a mixture of Gaussian distribution model, corresponding to the number of actual generating distributions when the means are far apart, but corresponding to the reduced number of clusters arising from the perceived admixture of overlapping distributions when means are chosen to be close.
%A Joan Feigenbaum
%A Aaron D. Jaggard
%A Michael Schapira
%T Approximate Privacy: Foundations and Quantification
%D Tue May 26 09:04:31 2009
%Z Tue May 26 09:08:05 EDT 2009
%I DIMACS
%R 2009-14
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-14.pdf
%X
Increasing use of computers and networks in business, government, recreation, and almost all aspects of daily life has led to a proliferation of online sensitive data about individuals and organizations. Consequently, concern about the

For both the second-price Vickrey auction and the millionaires
problem, we show that not only is perfect privacy impossible or
infeasibly costly to achieve, but even *close approximations*
of perfect privacy suffer from the same lower bounds. By contrast,
we show that, if the values of the parties are drawn uniformly at
random from {0,...,2^k-1}, then, for both problems, simple
and natural communication protocols have privacy-approximation
ratios that are linear in k (*i.e.*, logarithmic in the size
of the space of possible inputs). We conjecture that this improved
privacy-approximation ratio is achievable for *any* probability
distribution.
%A Pavel Kuksa
%A Vladimir Pavlovic
%T Efficient discovery of common patterns in sequences over large alphabets
%D Mon Jun 29 10:05:34 2009
%Z Mon Jun 29 10:09:20 EDT 2009
%I DIMACS
%R 2009-15
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-15.pdf
%X
We consider the problem of identifying motifs, recurring or conserved patterns, in data modeled as strings or sequences. In particular, we present a new deterministic algorithm for finding patterns that are embedded as exact or inexact instances in all or most of the input strings. The proposed algorithm (1) improves search efficiency compared to existing algorithms, and (2) scales well with the size of the alphabet. Our algorithm is several orders of magnitude faster than existing deterministic algorithms for common pattern identification. We evaluate our algorithm on benchmark motif finding problems and real applications in biological sequence analysis and show that our algorithm maintains predictive performance with significant running time improvements.
%A Alexey Nefedov
%A Jiankuan Ye
%A Casimir Kulikowski
%A Ilya Muchnik
%A Kenton Morgan
%T Experimental Study of Support Vector Machines Based on Linear and Quadratic Optimization Criteria
%D Mon Jun 29 10:14:33 2009
%Z Mon Jun 29 10:17:16 EDT 2009
%I DIMACS
%R 2009-18
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-18.pdf
%X
We present results from a comparative empirical study on the performance of two methods for constructing support vector machines (SVMs). The first method is the conventional one based on the quadratic programming approach, which builds the optimal separating hyperplane maximizing the margin between two classes (optimal SVM). The second method is based on the linear programming approach suggested by Vapnik to build a separating hyperplane with the minimum number of support vectors (heuristic SVM). Using synthetic data from two classes, we compare the classification performance of these SVMs, with an in-depth geometrical comparison of their separating hyperplanes and support vectors. We show that both classifiers achieve practically identical classification accuracy and generalization performance. However, the heuristic SVM has many fewer support vectors than the optimal SVM. In addition, in contrast to the optimal SVM, its support vectors lie on the furthermost borders of the classes, at the maximum distance from the opposite class. In our future work, we will seek to find a theoretical basis to explain these geometrical patterns of the heuristic SVM. We will also compare these classifiers using real benchmark data.
%A Jerry Cheng
%A Minge Xie
%A Rong Chen
%A Fred Roberts
%T A Mobile Sensor Network for the Surveillance of Nuclear Materials in Metropolitan Areas
%D Tue Sep 15 09:20:03 2009
%Z Tue Sep 15 09:21:30 EDT 2009
%I DIMACS
%R 2009-19
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-19.pdf
%X
Nuclear attacks are among the most devastating terrorist attacks, with severe losses of human lives as well as damage to infrastructure. It becomes increasingly vital to have sophisticated nuclear surveillance and detection systems deployed in major cities in the U.S. to deter such threats. In this paper, we outline a robust system of a mobile sensor network and develop statistical algorithms and models to provide consistent and pervasive surveillance of nuclear materials in major cities. Specifically, the network consists of a large number of vehicles, such as taxicabs and police cars, on which nuclear sensors and Global Positioning System (GPS) tracking devices are installed. Real-time readings of the sensors are processed at a central surveillance center, where mathematical and statistical analyses are performed. We use simulations to evaluate the effectiveness and detection power of such a network.
%A Fred J. Rispoli
%A Steven Cosares
%T Statistics for a Random Network Design Problem
%D Tue Sep 15 09:28:20 2009
%Z Tue Sep 15 09:30:46 EDT 2009
%I DIMACS
%R 2009-20
%U ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2009/2009-20.pdf
%X
We investigate a random network design problem specified by a complete graph with n nodes whose edges have associated fixed costs that are independent random variables, and variable costs that are also independent random variables. The objective is to find a spanning tree whose total fixed cost plus total variable cost is minimum, where the total variable cost is the sum of variable costs along paths from a source node to every other node. Here we examine the distributions of total fixed cost, total variable cost, and total cost obtained from random tree generation. In particular, we compare potential heuristic solutions such as the minimum spanning tree, the shortest paths tree and the best tree obtained from random tree generation.
%A Noam Goldberg
%A Jonathan Eckstein
%T Tightened L0-Relaxation Penalties for Classification
%D Thu Nov 19 08:09:06 2009
%Z Thu Nov 19 08:10:57 EST 2009
%I DIMACS
%R 2009-21
%U
%X
In optimization-based classification model selection, for example when
using linear programming formulations, a standard approach is to
penalize the L1 norm of some linear functional in order to select sparse
models. Instead, we propose a novel integer linear program for sparse
classifier selection, generalizing the minimum disagreement hyperplane
problem whose complexity has been investigated in computational
learning theory. Specifically, our mixed-integer problem is that of
finding a separating hyperplane with minimum empirical error subject
to an L0 penalty. We show that common "soft margin" linear
programming formulations for robust classification are equivalent to a
continuous relaxation of our model. Since the initial continuous
relaxation is weak, we suggest a tighter relaxation, using novel
cutting planes, to better approximate the integer solution. We
describe a boosting algorithm, based on linear programming with
dynamic generation of cuts and columns, that solves our relaxation. We
demonstrate the classification performance of our proposed algorithm
with experimental results, and justify our selection of parameters
using a minimum description length, compression interpretation of
learning.
%A M.C. Ganiz
%A N.I. Lytkin
%A W.M. Pottenger
%T Leveraging Higher Order Dependencies Between Features for Text Classification
%D Mon Jan 18 06:46:59 2010
%Z Mon Jan 18 06:49:30 EST 2010
%I DIMACS
%R 2009-16
%U
%X
Traditional machine learning methods only consider relationships
between feature values within individual data instances while
disregarding the dependencies that link features across instances. In
this work, we develop a general approach to supervised learning by
leveraging higher-order dependencies between features. We introduce a
novel Bayesian framework for classification named Higher Order Naive
Bayes (HONB). Unlike approaches that assume data instances are
independent, HONB leverages co-occurrence relations between feature
values across different instances. Additionally, we generalize our
framework by developing a novel data-driven space transformation that
allows any classifier operating in vector spaces to take advantage of
these co-occurrence relations. Results obtained on several benchmark
text corpora demonstrate that higher-order approaches achieve
significant improvements in classification accuracy over the baseline
(first-order) methods.
%A Joan Feigenbaum
%A Aaron D. Jaggard
%A Michael Schapira
%T Approximate Privacy: PARs for Set Problems
%D Sun Feb 14 00:26:17 2010
%Z Sun Feb 14 00:27:04 EST 2010
%I DIMACS
%R 2010-01
%U
%X
In previous work (DIMACS TR 2009-14), we introduced the Privacy Approximation Ratio (PAR) and used it to study the privacy of protocols for second-price Vickrey auctions and Yao's millionaires problem. Here, we study the PARs of multiple protocols for both the disjointness problem (in which two participants, each with a private subset of {1,...,k}, determine whether their sets are disjoint) and the intersection problem (in which the two participants, each with a private subset of {1,...,k}, determine the intersection of their private sets).

We show that the privacy, as measured by the PAR, provided by any protocol for each of these problems is necessarily exponential (in k). We also consider the ratio between the subjective PARs with respect to each player in order to show that one protocol for each of these problems is significantly fairer than the others (in the sense that it has a similarly bad effect on the privacy of both players).
%A Prahladh Harsha
%A Moses Charikar
%T Limits of Approximation Algorithms: PCPs and Unique Games (DIMACS Tutorial Lecture Notes)
%D Tue Mar 2 18:53:16 2010
%Z Tue Mar 2 18:54:07 EST 2010
%I DIMACS
%R 2010-02
%U
%X
These are the lecture notes for the DIMACS Tutorial Limits
of Approximation Algorithms: PCPs and Unique Games held at the DIMACS Center, CoRE Building,
Rutgers University on 20-21 July, 2009. This tutorial was jointly sponsored by the DIMACS
Special Focus on Hardness of Approximation, the DIMACS Special Focus on Algorithmic
Foundations of the Internet, and the Center for Computational Intractability with support
from the National Security Agency and the National Science Foundation.
The speakers at the tutorial were Matthew Andrews, Sanjeev Arora,
Moses Charikar, Prahladh Harsha, Subhash Khot, Dana Moshkovitz and Lisa Zhang. The scribes were
Ashkan Aazami, Dev Desai, Igor Gorodezky, Geetha Jagannathan, Alexander S. Kulikov,
Darakhshan J. Mir, Alantha Newman, Aleksandar Nikolov, David Pritchard and Gwen Spencer.
%A Markus Jakobsson
%A Karl-Anders Johansson
%T Assured Detection of Malware With Applications to Mobile Platforms
%D Sat Mar 13 10:18:17 2010
%Z Sat Mar 13 10:20:58 EST 2010
%I DIMACS
%R 2010-03
%U
%X
We introduce the first software-based attestation approach with provable security properties, and argue for its importance as a component in a new Anti-Virus paradigm. Our new method is practical and efficient. It enables detection of any malware (that does not commit suicide to remain undetected), even if the infection occurred before our security measure was loaded. Our new approach works independently of computing platform, and is eminently suited to address the threat of mobile malware, for which the current Anti-Virus paradigm is poorly suited.
Our approach is based on memory-printing of client devices. Memory-printing is a novel and light-weight cryptographic construction whose core property is that it takes notably longer to compute a function if given less RAM than that for which it was configured. This makes it impossible for a malware agent to remain active (e.g., in RAM) without being detected, when the function is configured to use all space that should be free after all active applications are swapped out. Our approach is based on inherent timing differences for random access of RAM, flash, and other storage, and on the time to communicate with external devices.