2004-2011 Special Focus on Information Processing in Biology: Overview

This special focus is jointly sponsored by the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS),
the Biological, Mathematical, and Physical Sciences Interfaces Institute for Quantitative Biology (BioMaPS),
and the Rutgers Center for Molecular Biophysics and Biophysical Chemistry (MB Center).

The Center for Discrete Mathematics and Theoretical Computer Science (DIMACS), the Biological, Mathematical, and Physical Sciences Interfaces Institute for Quantitative Biology (BioMaPS), and the Center for Molecular Biophysics and Biophysical Chemistry (MB Center) plan a ``special focus'' on Information Processing in Biology. This will follow up on DIMACS' highly successful special foci on ``Computational Molecular Biology'' and ``Mathematical Support for Molecular Biology.''

Increasingly, many aspects of biology can be viewed as involving the processing of information. Modern information and computer science have played an important role in such major biological accomplishments as the sequencing of the human genome. On the other hand, biological ideas can inspire new concepts and methods in information science. This special focus is motivated by these two observations. The idea for the focus arose during a November 2003 workshop on ``Information Processing in the Biological Organism'' (http://dimacs.rutgers.edu/Workshops/InfoProcess/) organized by Fred Roberts. The workshop was sponsored by NSF and integrated with the NIH Biomedical Information Science and Technology Initiative (BISTI; see http://www.bisti.nih.gov/). At the workshop, internationally known researchers discussed model systems for information processing in the biological organism, relevant mathematical foundations and algorithms, and how the topic might inform other disciplines, including computer science. A group of workshop leaders, discussing possible follow-up projects, came up with the idea of a series of workshops that would enhance the interdisciplinary collaborations beginning to form and introduce outstanding junior people to the problems and topics of biological information processing. That idea led to this proposal.

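The special focus activities will be organized around a series of workshops with four themes:

Theme 1: Algorithmic Approaches to Biological Information Processing
Theme 2: Computer Science, Engineering, and Biology: Applications and Analogies
Theme 3: Biological Circuits and Cellular Signaling
Theme 4: Proteomics
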
Two of these themes represent approaches and two represent areas of application of these approaches.

Theme 1: Algorithmic Approaches to Biological Information Processing.

A major theme of the special focus will revolve around algorithms for biological information processing. We take two points of view here. One involves how biological organisms use ``algorithms'' to process information and another involves how we use algorithmic methods to understand how organisms process information. The two points of view are interrelated and will be reflected in three workshops.

Understanding information processing in the biological organism involves dealing with huge data sets. Modern algorithmic methods for dealing with such data sets, especially algorithms involved in pattern recognition, learning, cluster analysis, and, generally speaking, data mining, are especially relevant. Biological information processing takes advantage of regularities such as repetition, structural motifs and patterns, clustering, etc. Understanding such biological processes might, by analogy, lead us to new data mining algorithms and, in turn, methods of data mining might be useful in understanding how organisms process such regularities. One workshop, Detecting and Processing Regularities in High Throughput Biological Data, will be devoted to this topic.

The massive amounts of information gathered in recent years have made it possible to study complex cellular networks using algorithmic methods of data analysis and information science. Predicting the structure and behavior of gene regulatory networks is a major challenge for this kind of approach. One of our workshops, Machine Learning Approaches for Understanding Gene Regulation, will examine machine learning approaches to understanding gene regulation. Modern methods of machine learning are especially appropriate given the nature of the data -- copious but noisy and incomplete -- and also provide tools that have been a major area of research at DIMACS.

One potential goal of work on understanding biological information processing is to provide insight into the potential treatment of diseases. Cancer is a case in point. There are many interconnected processes in tumorigenesis, involving tumor cell signaling and information processing. The development of computational models and algorithms that reflect these interconnected processes is the subject of the workshop on Computational Tumor Modeling.

Theme 2: Computer Science, Engineering, and Biology: Applications and Analogies.

The study of analogies between information processing in biology and information processing in computer science and engineering offers promise for the understanding of both and we will investigate these analogies. More generally, we will investigate applications of ideas from the biological sciences in computer science and engineering and vice versa. Such analogies and applications are a second major theme of the special focus.

Nanotechnology is a prime example of the role of analogy and application we have in mind. Thanks to analytical tools capable of probing cells at nanometer levels (one atom or molecule at a time), we can learn a great deal about the chemical and mechanical properties of cells and use these to develop and verify computational models of the ``bio-nanosystem.'' Conversely, molecular building blocks of life such as proteins, nucleic acids, lipids, and carbohydrates, which have critical properties at the nanoscale, can lead to ``bio-inspired'' nanosystems, materials, and computational tools with many uses. These topics will be explored in the Workshop on Nanotechnology and Biology.

A second workshop (Control, Communication, and Computing in Biology) along these lines will explore the close analogies between biochemical regulatory networks and engineered automatic control systems, such as those common in the aerospace, chemical, consumer electronics, and automobile industries. The workshop will emphasize feedback, which is a central theme in such analogies, and will explore topics such as connections to fault-tolerant computing and analogies between computer and biological immune systems.

RNA interference (RNAi) is a tool for disrupting or inhibiting the expression of specific genes. This initiates a process of post-transcriptional gene silencing that has roles in viral defense and elsewhere. The mechanism of RNAi presents challenging problems. The workshop on The Mechanism and Applications of the RNA Interference Process will investigate both the mechanism and the applications of RNAi, considering the uses of algorithmic and experimental methods for understanding the phenomenon and the potential insights about information processing to be gained from a better understanding of the mechanism of RNAi.

Theme 3: Biological Circuits and Cellular Signaling.

Biochemical networks in the cell are responsible for processing environmental signals, inducing appropriate cellular responses, and sequencing internal events such as gene expression. Through elaborate mechanisms, they allow cells and entire organisms to perform their basic functions. A third theme of the special focus is the elucidation of the function and role of biological circuits and cellular signaling, with an eye to how non-biological networks can be applied to biological ones and vice versa.

Recent years have witnessed remarkable advances in elucidating the components of cellular networks, thanks to technological achievements such as gene chips. Gene chips provide a snapshot of the complete genetic activity of a cell, yet the overall connectivity and functional characteristics of cellular networks are still poorly understood. One of our workshops, Strategies for Reverse Engineering Biological Circuits, will address the problem of ``reverse engineering'' network structure from gene expression and other data, a fundamental step in understanding the architecture of cellular networks.
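As a rough illustration of what inferring network structure from expression data can look like, the sketch below links two genes whenever their expression profiles are strongly correlated. This co-expression thresholding is only one simple approach, chosen here for illustration and not a method named in the workshop description; the gene names, the toy measurements, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return gene pairs whose absolute correlation reaches the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            edges.append((g1, g2))
    return edges


# Hypothetical expression levels for three genes across five conditions.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.5],
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],
}

print(coexpression_edges(toy_expression))
```

Correlation alone cannot distinguish direct regulation from indirect effects or shared upstream causes, which is part of why reverse engineering remains a hard problem.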

Once the architecture of a particular network is understood, the next step is to characterize its signal processing capabilities. Signal transduction pathways integrate and filter the myriad signals which the cell receives from its environment, and induce events such as transcription initiation or other intracellular responses. A second workshop, Cell Communication and Information Processing in Developing Tissues, will focus on the signaling pathways inherent in tissue development.

The homeostatic control of critical variables within viable ranges, the regulation of metabolic networks that break down nutrients and provide the cell with energy and materials, and the role of genetic networks in timing when different proteins are expressed are all manifestations of the key role played by feedback in life. Feedback mechanisms are most naturally studied as dynamical systems, and the Workshop on Dynamics of Biological Networks will concern itself with questions of dynamics.
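To make the link between feedback and dynamical systems concrete, here is a minimal sketch of a single quantity x that represses its own production, dx/dt = k / (1 + (x/K)^n) - d*x, integrated with the forward Euler method. The model form, all parameter values, and the function name are assumptions chosen for illustration, not taken from the workshop description.

```python
def simulate_negative_feedback(x0, k=1.0, K=1.0, n=2, d=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = k / (1 + (x/K)**n) - d*x.

    All parameters are hypothetical: k is the maximal production rate,
    K the repression threshold, n the steepness, d the degradation rate.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        production = k / (1.0 + (x / K) ** n)  # falls as x rises: negative feedback
        x += dt * (production - d * x)
        trajectory.append(x)
    return trajectory


# Start far below and far above the steady state.
low_start = simulate_negative_feedback(0.0)
high_start = simulate_negative_feedback(5.0)
print(low_start[-1], high_start[-1])
```

Both runs settle at the same level regardless of the starting value, which is the homeostatic behavior the paragraph above attributes to feedback.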

Gene regulatory networks (GRNs) dynamically orchestrate the level of expression for each gene in the genome by controlling whether and how vigorously that gene will be transcribed into RNA. They are at the heart of the information processing function of both the individual cell and the developmental process. But how do these networks change in evolutionary terms, preserving or altering their functionality? How can the evolutionary mechanisms of GRNs be applied to the development of other (possibly non-biological) information processing units? The workshop on Evolution of Gene Regulatory Networks will address these questions.

Theme 4: Proteomics.

The fourth theme of the special focus revolves around proteomics. We will seek to build on the knowledge gained from genomics to understand the activities and interactions of proteins in the cell. Studying the complete set of proteins expressed by the genome of an organism, cell or tissue type during its lifetime is a complex problem because the number of proteins is so large compared to the number of genes, because proteins can undergo numerous modifications, and because the makeup of the proteome changes frequently in response to the environment.

Understanding how information is encoded in the three-dimensional structures that underlie complex protein-DNA and protein-protein network interactions is one of the fundamental challenges of biology. The workshop on Information Processing by Protein Structures in Molecular Recognition will emphasize algorithms for discovering spatial patterns, for uncovering relationships of proteins preceding the emergence of folds, and for simulating the protein-protein and protein-DNA recognition process.

A number of recent studies have aimed to produce mathematical models for the evolution of the proteome networks that include all proteins in an organism and their interactions. A mathematical model that captures some of the basic properties of known proteome networks may provide great help in better understanding genome evolution. The workshop on Proteome Network Evolution will concentrate on random graph models for the evolution of proteome networks with an emphasis on gene duplication.
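A minimal sketch of one such random graph model, assuming a simple duplication-divergence scheme: at each step a randomly chosen protein is duplicated, and the copy keeps each of the parent's interactions independently with a fixed probability. The function name, the two-protein starting network, and the parameter values are hypothetical; the models discussed at the workshop may differ.

```python
import random


def duplication_divergence(n_final, retain_prob, seed=0):
    """Grow an undirected interaction graph by gene duplication and edge loss.

    Returns an adjacency dict mapping each node to the set of its neighbors.
    """
    rng = random.Random(seed)
    # Assumed starting point: two proteins that interact with each other.
    adj = {0: {1}, 1: {0}}
    while len(adj) < n_final:
        parent = rng.choice(sorted(adj))  # pick a gene to duplicate
        child = len(adj)                  # label for the new copy
        # Divergence: each inherited interaction survives with retain_prob.
        inherited = {nb for nb in sorted(adj[parent]) if rng.random() < retain_prob}
        adj[child] = set(inherited)
        for nb in inherited:
            adj[nb].add(child)
    return adj


def degree_sequence(adj):
    """Node degrees sorted from highest to lowest."""
    return sorted((len(nbs) for nbs in adj.values()), reverse=True)


network = duplication_divergence(n_final=200, retain_prob=0.5, seed=1)
print(degree_sequence(network)[:10])
```

Comparing the degree sequence of such simulated networks with that of a measured proteome network is one way to check whether a model captures basic properties of the real data.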

Understanding the proteome can give us insight into the organization and dynamics of the metabolic, signaling, and regulatory networks underlying the life of a cell and help us to understand how these networks can fail during progression of a disease or how their function can be manipulated through drug or genetic interventions. Two of our workshops have such ``functional proteomics'' as their theme. Defective protein folding has been implicated in the etiology of a number of degenerative diseases. The Functional Proteomics of Neurodegenerative Diseases workshop will investigate computational and experimental approaches for understanding the mechanism of misfolding and, in particular, of amyloid assembly, in which proteins that are normally soluble undergo aggregation to form various intermediate species.

Another workshop on functional proteomics, Implications of Mathematical Models of Infection and Molecular Modeling of Hepatitis B Virus, will study mathematical models of infection and molecular modeling of Hepatitis B virus (HBV). Models of disease can inspire new methods and concepts in computer science. Most of the steps in the HBV life cycle, such as reorganization, receptor binding, penetration, release of the viral genome into the host-cell nucleus, encapsidation of proviral RNA, assembly of the virus, and budding, are not well understood. This workshop will study models of the different steps, with some emphasis on the information processing involved in virus replication.

Opportunities to Participate: The Special Focus will include:

Document last modified on September 1, 2010.