DIMACS 1994-00 Special Year on Molecular Biology


The title of the special year, "Mathematical Support for Molecular Biology," is too broad to convey the specific focus of the year. Virtually every area of molecular biology has inherently mathematical problems that need solving, and many areas of computer science and mathematics have something to contribute to the biological sciences. Clearly, not all topics that might constitute "mathematical support for molecular biology" can be covered in one "special year," nor is it appropriate for DIMACS to attempt to address all such issues.

DIMACS has a specific focus, associated with discrete mathematics and theoretical computer science. Discrete mathematics is concerned with designs, patterns, sequences, strings, arrays, and similar discrete structures. It is concerned with the existence of these structures, with their identification and characterization, and with optimization problems associated with them. Discrete mathematicians interact with theoretical computer scientists, and the two communities draw on each other in their efforts to develop algorithms for solving problems about discrete structures. The tools of discrete mathematics and theoretical computer science involve such areas of mathematics as combinatorics, graph theory, and probability theory.

Discrete structures have been used to formulate some of the most fundamental concepts of molecular biology. It is now well known that information is stored within a cell by means of long nucleic acid molecules, which can be thought of as long strings of smaller units called nucleotides. Some of the most important problems of modern molecular biology involve combinatorial and algorithmic questions about these strings. Models of proteins, DNA, and RNA molecules as sequences form a basic tool of molecular biology.

Throughout its history, molecular biology has interacted with methods of discrete mathematics and theoretical computer science. The first nucleic acid sequence was determined in 1965 by R.W. Holley and his co-workers at Cornell University (it contained 77 bases). The method used was the fragmentation stratagem, which involved the combinatorial problem of reconstructing an unknown sequence from fragments obtained by certain enzyme decompositions. Graph-theoretical methods, using Eulerian chains, were developed by Hutchinson to give algorithms for implementing the fragmentation stratagem. These methods are no longer used, and indeed were used only for a short time before other, more efficient methods were adopted. However, they played an important role in the development of molecular biology.
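
To make the idea of reconstructing a sequence from fragments with an Eulerian chain concrete, here is a minimal sketch in Python. It is not Hutchinson's method, which worked with fragments from enzyme digests. It assumes instead a simplified setting in which the fragments are all the overlapping substrings of a fixed length k (at least 2) of the unknown sequence. The function name `reconstruct` and the example sequence are illustrative only. Each fragment becomes a directed edge from its first k-1 letters to its last k-1 letters, and a chain that uses every edge exactly once spells out a candidate sequence.

```python
from collections import Counter, defaultdict


def reconstruct(kmers):
    """Return one string whose length-k substrings are exactly `kmers`.

    Assumes every fragment has the same length k >= 2. The answer need
    not be unique: different Eulerian chains can spell different strings.
    """
    if not kmers:
        raise ValueError("no fragments given")
    k = len(kmers[0])

    # Each fragment is an edge: (first k-1 letters) -> (last k-1 letters).
    graph = defaultdict(list)
    balance = Counter()  # out-degree minus in-degree for each node
    for kmer in kmers:
        prefix, suffix = kmer[:-1], kmer[1:]
        graph[prefix].append(suffix)
        balance[prefix] += 1
        balance[suffix] -= 1

    # An Eulerian chain starts at the node with one extra outgoing edge,
    # or anywhere if the graph is balanced (a closed chain).
    start = next((n for n, b in balance.items() if b == 1), next(iter(graph)))

    # Hierholzer's algorithm, iterative form.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()

    text = path[0] + "".join(node[-1] for node in path[1:])

    # Verify that the result really uses every fragment exactly once.
    found = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if found != Counter(kmers):
        raise ValueError("fragments do not form a single Eulerian chain")
    return text


print(reconstruct(["TAC", "GAT", "ACA", "ATT", "TTA"]))  # GATTACA
```

The final check matters because fragment sets with repeated overlaps can admit several valid reconstructions, and inconsistent fragment sets admit none.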

Overlap data arising from the study of mutations led Caltech geneticist Seymour Benzer in the late 1950s to pose a graph-theoretical question that led to the study of interval graphs. Characterizations of interval graphs led Benzer to conclude that his overlap data was consistent with the hypothesis of linear gene structure, and played a basic role in establishing the idea that DNA or RNA molecules can be viewed as linear words over a 4-letter alphabet. Interval graphs continue to be important today in connection with the study of restriction maps of DNA chains. Their generalizations to higher dimensions, in particular the 2-unit sphere graphs, arise in biochemistry in problems of macro-molecular conformation.
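
As a small illustration of what an interval graph is, the following Python sketch builds one from a set of intervals on a line: each interval is a vertex, and two vertices are joined when their intervals overlap. The names `interval_graph`, `m1`, `m2`, and `m3` and the coordinates are hypothetical, chosen only for the example. Benzer's question runs in the opposite direction (given only the overlap data, decide whether some arrangement of intervals on a line could have produced it), which this sketch does not attempt.

```python
from itertools import combinations


def interval_graph(intervals):
    """Return the set of edges (a, b), a < b, whose closed intervals overlap.

    `intervals` maps a label to a (low, high) pair with low <= high.
    """
    edges = set()
    for (a, (lo_a, hi_a)), (b, (lo_b, hi_b)) in combinations(
        sorted(intervals.items()), 2
    ):
        # Two closed intervals share a point exactly when each one
        # starts no later than the other ends.
        if lo_a <= hi_b and lo_b <= hi_a:
            edges.add((a, b))
    return edges


# Three hypothetical mutated regions on a linear gene.
regions = {"m1": (0, 5), "m2": (3, 8), "m3": (7, 10)}
print(sorted(interval_graph(regions)))
```

Here `m1` overlaps `m2` and `m2` overlaps `m3`, but `m1` and `m3` are disjoint, so the graph is a path on three vertices.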

Because much of modern molecular biology is concerned with information transmission, it should be no surprise that the mathematical methods that have been most useful in the development of information science are perhaps the most useful in the development of molecular biology. It is no accident that DIMACS was founded as a consortium whose two industrial members, AT&T Bell Laboratories and Bellcore, are concerned with information transmission.

The fundamental premise of our special year is that some of the most central problems in molecular biology are essentially problems involving the combinatorial and algorithmic questions that DIMACS-type researchers are good at solving. It is a basic scientific objective of the year to create partnerships between biological scientists and discrete mathematicians/theoretical computer scientists so that some very important questions of molecular biology can be properly and precisely formulated as mathematical problems. It is the hope of the special year organizers that, by bringing together some of the world's leading discrete mathematicians/theoretical computer scientists, some major progress can be made on these problems. Inevitably, some of this "progress" will be purely of a mathematical nature. However, we hope to establish and nurture lines of communication and collaboration so that the mathematical results arising from the special year are modified and revised, so that many of them are quickly communicated to biological scientists, and so that biological data and biological questions continue to play a central role in refinements of the mathematics.

In planning special years, DIMACS has had a tradition of maintaining flexibility so that as the special year progresses, the scientific focus can change. Still, it is necessary to set an agenda before the year begins, and we try to do that by planning some major activities that give the year focus. In this special year, there are two major activities of this kind: the workshops and the algorithm implementation challenge. We have chosen the workshop topics in areas with a proven history of biological relevance and discrete mathematical structure. We will use a series of mini-workshops to be more exploratory, leaving many of them to be planned later.

We are planning five main workshops of three to five days' duration. The fifth will be the culmination of our algorithm implementation challenge, and is a bit different from the others. Generically speaking, the goals for each of our other four main workshops are similar: get biologists and mathematical scientists together to try to establish long-term collaborations, have them work together to identify mathematical and algorithmic problems that are of interest to biology, and get mathematical scientists with existing or potential interest in those problems to work on them, together with biologists, both at the workshop and afterwards. To achieve these goals, we have involved both mathematical scientists and biological scientists as chief organizers or principal advisors of each of the main workshops, and the organizing committees for these workshops will consist of both leading mathematical scientists and leading biological scientists. The organizing committees are or will be involved in planning the workshops, and their members will participate as speakers and discussants.

The first three workshops have all been chosen to reflect biological areas with existing important underlying combinatorial themes.


Document last modified on February 2, 2000.