Toward Systems Biology 


The adaptation of living organisms to their environment is controlled at the molecular level by large and complex networks of genes, mRNAs,
proteins, metabolites, and their mutual interactions. We have analyzed the network of global transcription regulators controlling the
adaptation of the bacterium Escherichia coli to environmental stress conditions. Even though E. coli is one of the best studied model
organisms, it is currently little understood how a stress signal is sensed and propagated throughout the network of global regulators, and
leads the cell to respond in an adequate way. Using a qualitative method that is able to overcome the current lack of quantitative data
on kinetic parameters and molecular concentrations, we have modeled the carbon starvation response network and simulated the response of
E. coli cells to carbon deprivation. This has allowed us to identify essential features of the transition between exponential and
stationary phase and to make new predictions on the qualitative system behavior following a carbon upshift. The model predictions have been
tested experimentally by means of gene reporter systems.

Data generated by global approaches such as transcriptomics, proteomics or metabolomics give only a partial view of biological systems
unless they are thoroughly analysed, and such analysis is still considered a major hurdle. Within this context, our biological investigation focuses on the
transcriptional regulation of the sulphur metabolism pathway in the model plant Arabidopsis thaliana in response to a metal stress
induced by cadmium. The purpose of the study is to identify the molecular factors involved in the adaptation of the cell to metal stress.
More precisely, our aim is to extract coarse-grained knowledge about the transcriptional network coordinating the plant response and to discriminate
the metal-specific response from generic stress responses. As a first step, the expected results are: i. identification of transcriptional
modules containing metal-responsive genes; ii. identification of putative transcriptional regulators coordinating the metal response;
and iii. identification of the associated putative regulatory cis-elements.

To achieve these goals, several approaches have already been proposed. They rely on different methodological frameworks, but all require a large number
of measurements. In our case, only a few microarray datasets on the metal stress response are available today. As a first stage, we chose to perform
a comparative exploration of a large number of Arabidopsis expression microarray datasets across various experimental conditions (including
cadmium stress). The main idea is to select conditions in which a subset of genes exhibits a similar pattern of regulation. In this talk,
I will mainly focus on the methodological approach we have adopted, present our preliminary results, and discuss open questions.

Joint work with J. Bourguignon and Y. Vandenbrouck

The goal of synthetic biology is to design and construct biological systems that present a desired behavior. The construction of synthetic
gene networks implementing simple functions has demonstrated the feasibility of this approach. However, the design of these networks is
difficult, notably because existing techniques and tools are not adapted to deal with uncertainties on molecular concentrations and parameter values. 
We propose an approach for the analysis of a class of uncertain piecewise-multiaffine differential equation models. This
modeling framework is well adapted to the experimental data currently available. Moreover, these models present interesting
mathematical properties that allow the development of efficient algorithms for solving robustness analysis and tuning problems.
These algorithms are implemented in the tool RoVerGeNe, and their practical applicability and biological relevance are demonstrated on the
analysis of the tuning of a synthetic transcriptional cascade built in E. coli.
Joint work with C. Belta and R. Weiss

In this talk, we present recent reachability techniques for continuous and hybrid systems and their potential applications to the analysis of
dynamical systems models in biology. In particular, we focus on a reachability technique for systems with polynomial differential equations,
which are a useful model for a variety of biological systems. The essence behind the technique we propose can be described as extending
traditional numerical integration to set integration, and set computations in numerical schemes are performed using techniques from
computer-aided geometric design, such as Bézier techniques.

There is currently a need in systems biology for formal modeling methods. Cellular systems can often be viewed as interaction networks
with feedback loops. In addition, some kinds of data are generally lacking, especially kinetic parameters, and the way several
interactions combine on a given node is often ill-defined. In this context, our goal is to provide formal tools to assist in reasoning,
inference of model parameters, model revision, and hypothesis generation. Constraints provide a natural framework for working with complex systems.
Constraint programming is a family of computer science technologies that allow a problem to be described in terms of mathematical relationships or equations.
It is a declarative approach in which all knowledge about structure and behaviour is described. If parameters are unknown, they are treated as
problem variables. No 'reasonable choices' need to be made, contrary to what is done when performing simulations. We are currently using two constraint
technologies: Constraint Logic Programming (CLP) and Boolean satisfiability (SAT). They differ in their expressiveness and their
underlying solvers. In SAT all knowledge must be represented in propositional (Boolean) logic, which is weakly expressive, but very
efficient solvers exist to check the (un)satisfiability of large Boolean formulae.
We focus on a particular kind of network with sigmoidal interactions. Many regulatory systems can be described in this way.
These interactions can be approximated by step functions, and the behaviour of such systems can be described by discrete equations in
place of ordinary differential equations (ODEs), resulting in the so-called Thomas networks and their generalization by de Jong and colleagues.
The discrete nature of this formalism leads to a representation of all the possible behaviours in the form of a transition graph.
When model parameters are unknown, a set of transition graphs has to be considered.
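As an illustration of how a discrete model yields a transition graph, the following minimal Python sketch enumerates the asynchronous transitions of a Boolean network, the simplest special case of the Thomas formalism. The two-gene mutual-inhibition network and the function names (`targets`, `async_successors`, `transition_graph`, `stable_states`) are hypothetical and chosen only for the example; they are not taken from the E. coli model discussed in the talk.

```python
from itertools import product

# Hypothetical two-gene network with mutual inhibition (illustrative only):
# each gene tends toward 1 when its repressor is 0, and toward 0 otherwise.
def targets(state):
    a, b = state
    return (1 - b, 1 - a)

def async_successors(state):
    """Asynchronous update: exactly one variable moves to its target value."""
    target = targets(state)
    successors = []
    for i, (value, goal) in enumerate(zip(state, target)):
        if value != goal:
            nxt = list(state)
            nxt[i] = goal
            successors.append(tuple(nxt))
    return successors

def transition_graph(n_genes=2):
    """Map every discrete state to the list of its successor states."""
    return {s: async_successors(s) for s in product((0, 1), repeat=n_genes)}

def stable_states(graph):
    """States without outgoing transitions."""
    return sorted(s for s, succ in graph.items() if not succ)

graph = transition_graph()
for state, succ in sorted(graph.items()):
    print(state, "->", succ)
print("stable states:", stable_states(graph))
```

With unknown parameters, one such graph would be obtained for each admissible choice of `targets`, which is why a set of transition graphs has to be considered.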
When all knowledge and hypotheses have been represented as constraints, the model can be queried. If inconsistencies appear in the
resulting constraint system, critical constraints must be identified in order to revise the model. This approach is illustrated with a
model of nutritional stress in E. coli. The discrete model deduced from the ODEs allows us to identify the origin of a discrepancy between
model and observations. From this we can go back to the differential representation and propose new biological hypotheses.
Joint work with Fabien Corblin, Sébastien Tripodi, Laurent Trilling

Cancer is recognized to be a family of gene-based diseases whose causes are to be found in disruptions of basic biologic processes.
An increasingly deep catalogue of canonical networks details the specific molecular interactions of genes and their products.
However, mapping of disease phenotypes to alterations of these networks of interactions is accomplished indirectly and non-systematically.
Here we objectively identify pathways associated with malignancy, staging, and outcome in cancer through application of an analytic approach that
systematically evaluates differences in the activity and consistency of interactions within canonical biologic processes.
Using large collections of publicly accessible genome-wide gene expression data, we identify small, common sets of pathways – Trka Receptor, Apoptosis
response to DNA Damage, Ceramide, Telomerase, CD40L and Calcineurin – whose differences robustly distinguish diverse tumor types from
corresponding normal samples, predict tumor grade, and distinguish phenotypes such as estrogen receptor status and p53 mutation state.
Pathways identified through this analysis perform as well as or better than phenotypes used in the original studies in predicting cancer outcome.
This approach provides a means to use genome-wide characterizations to map key biological processes to important clinical features in disease.

The complexity of biological regulatory networks calls for the development of proper mathematical methods to model their structures
and to gain insight into their dynamical behaviours. One qualitative approach consists in modelling regulatory networks in terms of logical
equations, using Boolean or multi-level variables. Recently, we have proposed a novel implementation of the multi-level logical
modelling approach by means of Multi-valued Decision Diagrams. This representation enabled the development of two efficient
algorithms for the dynamical analysis of parameterised regulatory graphs. A first algorithm allows the identification of all stable
states without generating the state transition graph. A second algorithm assesses the conditions ensuring the functionality of the
feedback circuits found in the regulatory graph. These algorithms have been implemented in a novel development version of our logical
modelling software GINsim. Their application to logical models of T cell activation and differentiation will be briefly presented.

The increasing volume of sequenced genomes, and the recent techniques for performing in vitro molecular evolution, have rekindled
interest in questions about the origin of life. Nevertheless, a gap continues to exist between the research on prebiotic chemistry and
molecule generation, on one hand, and the study of molecular fossils preserved in genomes, on the other. Here we attempt to fill this gap
by using some assumptions about the prebiotic scenario (including a strong stereochemical basis for the genetic code) to determine the RNA
sequences most likely to appear and subsist. A set of minimal RNA rings is exhaustively determined; a subset of them is then selected
through stability arguments, and a particular ring ("AL ring") is finally singled out as the most likely winner of this prebiotic game.
The rings happen to have several structural and statistical properties of modern genes: a repeated AUG codon appears spontaneously
(and is thus made available for becoming a start signal), the form AUG/STOP emerges, and frequency patterns resemble those of present genes.
The whole set of rings was also compared to a database of tRNAs, considering the conserved positions (located in the free parts of the
molecule, essentially the loops); the ring that most closely matched tRNA sequences (and matched, in fact, the consensus of tRNA at all the
aligned positions) was AL, the same ring independently selected before. The unselected emergence of gene-like features through two
simple selection steps, and the close similarity between the finally selected ring and tRNA (including some remarkable features of the
resulting alignment), suggest a possible link between the prebiotic world and the first biological molecules, which is amenable to
experimental testing. Even if our scenario is partially wrong, the unlikely coincidences should provide useful hints for other efforts.

Based on the discrete definition of biological regulatory networks developed by René Thomas, we provide a formal computer science approach to treat
temporal properties of biological regulatory networks, expressed in Computation Tree Logic (CTL).
It is then possible to automatically compute all the models whose behaviour satisfies a set of given temporal properties. The chosen temporal properties
can reflect established knowledge about the model as well as hypotheses which motivate the biological research.
If the set of computed models is not empty, then we can manipulate the temporal formulae which formalize the hypotheses in a computer-aided manner,
according to some logical rules. This allows us to derive a set of sensible wet experiments capable of refuting or validating the hypotheses.
Our approach is illustrated on the cytotoxicity example in Pseudomonas aeruginosa.

Logical modeling approaches are recognized as a useful tool for analyzing biological regulatory networks based on qualitative
data. Among others, R. Thomas introduced a discrete formalism that captures the structure and the qualitative behavior of a
system. However, the resulting representation of the network's dynamics is coarse and non-deterministic due to the restrictive data
incorporated in the model. A more detailed description of the dynamics is possible if we allow for the integration of temporal information on
the different processes involved in the system's behavior. In this talk, we present an extension of the logical Thomas formalism using
the framework of timed automata, and discuss advantages and difficulties of this approach.
The discovery of fluorescent proteins and the development of image acquisition technology have revolutionized cell biological research.
Intracellular protein and membrane transport can now be studied using microscopy of intact living cells. This approach allows the direct
qualitative and quantitative analysis of the dynamics of a wide range of membrane transport processes. The living cell is thus emerging as a
remarkably complex experimental system, even in the case where a simple unidirectional route of a single fluorescent protein is visualized and
analyzed. In this talk I will try to summarize a decade-long study of the transport dynamics of a fluorescently tagged membrane cargo protein
called VSVG. This protein travels “upon request” owing to a shift to permissive temperature that promotes its folding and export from the
endoplasmic reticulum, through the Golgi apparatus to the cell surface, by membrane bound transport carriers. Studying the intracellular
transport of this single protein has provided us with a wealth of information on the dynamic properties of intracellular transport.
Quantitative kinetic modeling allowed us for the first time to obtain the precise kinetic parameters that accurately describe the entire
intracellular route of VSVG using a series of simple mono-exponential equations. These and other emerging dynamic properties prompted us to
challenge well established dogmas related to transport mechanisms as well as morphological-functional properties of the Golgi apparatus, a
central secretory organelle. Thus far, these are still modest steps in the long journey towards unraveling the mechanisms underlying the
formation and maintenance of cellular complexity.

This talk is a response to the increasing difficulty biologists find in agreeing upon a definition of the gene, and indeed, the increasing disarray
in which that concept finds itself. After briefly reviewing these problems, we propose an alternative to both the concept and the word gene – an
alternative that, like the gene, is intended to capture the essence of inheritance, but which is both richer and more expressive. It is also
clearer in its separation of what the organism statically is (what it tangibly inherits) and what it dynamically does (its functionality and
behavior). Our proposal of a genetic functor, or genitor, is a sweeping extension of the classical genotype/phenotype paradigm, yet it appears to be
faithful to the findings of contemporary biology, encompassing many of the recently emerging, and surprisingly complex, links between structure and
functionality. 
Joint work with Evelyn Fox Keller.
Much of the behavioral repertoire of an organism is determined by the dynamics of the underlying genetic regulatory network. We would
therefore like to (i) determine the connections within the genetic regulatory network and (ii) predict the resulting dynamics of gene
expression. Extensive and expensive experiments in molecular genetics can, of course, provide this information. However, we would like to
derive the relevant properties of the system from much more easily obtained expression data. We have used this approach in bacterial
model systems. We will present a very simple method for reconstructing a genetic regulatory network in the special case of a closed and
linear system: the mutual regulation of the five sigma factors of the cyanobacterium Synechocystis. More recently, we have acquired time
series expression data during growth transitions in the model bacterium Escherichia coli. We will show how to obtain sufficiently
informative time series of gene expression that can be used to explore the underlying genetic regulatory network of the organism.

Stochastic phenomena in cellular processes have certainly received a lot of attention in recent years. Opinions and attitudes on the
relevance of this aspect vary widely, and this is probably one of the most illustrative 'cultural' issues in the field of systems biology modeling.
On one hand it is clear that the size of cells forces us to at least consider effects due to the small numbers of molecules involved. On the
other, there are results that show that at least in some cases, Nature goes to great lengths to control or eliminate the effects of noise and
stochasticity. Either way, ignoring noise and stochasticity is not realistic. There are several well established and quite transparent methods
to treat these effects and I will attempt to cover some of them. I will illustrate the effect of including noise and how some of these methods
work (or not) on the well studied lac operon.
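
One well-established way to treat small-number effects is Gillespie's stochastic simulation algorithm. The sketch below applies it to a single molecular species that is produced at a constant rate and degraded in proportion to its copy number. This birth-death system is only an assumed toy example, not a model of the lac operon, and the function name and parameter values are hypothetical.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate production (rate k_prod) and degradation (rate k_deg * n).

    Returns the reaction times and the copy number after each reaction.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * n        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # no reaction can fire any more
        # The waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

# Illustrative parameter values (assumptions, arbitrary units).
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, n0=0, t_end=50.0)
print("final copy number:", counts[-1])
```

Repeating the run with different seeds gives a distribution of trajectories around the deterministic steady state k_prod / k_deg, which is the kind of variability a purely deterministic model cannot show.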

A fundamental problem in cell biology is that of the nature of the coordination between and within metabolic and signalling pathways. 
We have wondered whether this coordination might involve what we term functioning-dependent structures (FDSs), an FDS being 
an assembly of proteins that associate with one another when performing a task and that disassociate when the task is over. In this investigation, 
we have studied numerically the steady-state kinetics of a model FDS made of two sequential monomeric enzymes. Our calculations have
shown that such an FDS can display kinetic properties [Thellier et al., FEBS J., 2006] that the individual enzymes cannot [Legent et al., C.R.Biol., 2006].
These include basic input/output characteristics found in electronic circuits such as linearity, invariance, pulsing and switching. Hence, FDSs can
generate kinetics that might regulate and coordinate metabolism and signalling. Finally, we suggest that the occurrence of terms representative of 
the assembly and disassembly of FDSs in the classical expression of the density of entropy production is characteristic of living systems.

Joint work with G. Legent, P. Amar, V. Norris and C. Ripoll

Recently there has been a growing interest in the application of hybrid systems techniques to biological modeling and analysis, since
it has been observed that several biological processes exhibit the interaction of continuous and discrete phenomena. It has also been
recognized that many biological processes are intrinsically uncertain; stochastic phenomena have in fact been shown to be instrumental in
improving the robustness of certain biological processes, or in inducing variability. In this talk we will describe the development of
a stochastic hybrid model for DNA replication, one of the most fundamental processes behind the life of every cell. We will discuss
how the model was instantiated for the fission yeast and present analysis results that suggest that the predictions of the model do not
match conventional biological wisdom and experimental evidence. Interestingly, the problem appears to be not in the model,
but in conventional biological wisdom. This has motivated follow-on experiments (in vitro and in silico) to test two competing biological
hypotheses that could explain the mismatch.

Computational modeling of biological systems is becoming increasingly common as scientists attempt to understand biological phenomena in
their full complexity. We distinguish between two types of biological models – mathematical and computational – which differ in their
representations of biological phenomena. We call the approach of constructing computational models of biological systems Executable
Biology, as it focuses on the design of executable computer algorithms that mimic biological phenomena. In this talk I will survey the main
modeling efforts in this direction, emphasize the applicability and benefits of executable models in biological research, and highlight
some of the main challenges that executable biology poses for Biology and Computer Science.
Joint work with Thomas Henzinger

In recent years, knowledge about molecules that regulate cell growth has increased exponentially, but our ability to make sense of this
detailed information has not. This gap between data accumulation and understanding concerns the cancer research community in particular,
because many of the signaling pathways involved in cellular growth are mutated in cancer and some of the major targets of cancer therapy are
proteins that interact with the complex molecular networks. To better understand and simulate the complex regulatory pathways involved in
cancers, one of us (Kurt W. Kohn) has developed a diagrammatic notation, which we refer to as the Molecular Interaction Map (MIM) language.
In recent years, we have created MIMs of various cellular signaling pathways (p53, apoptosis, HIF, EGFR, cell cycle, ATM-Chk2).
The MIMs have attracted wide interest from diverse laboratories around the globe for the powerful way in which they organize biological information.
The MIMs have since evolved toward computer simulation and we have developed the first prototypes of electronic MIMs, which are available
on the Internet and allow easy access to annotations and databases. These maps can operate as educational tools, but can also serve as guides
to simulation-based studies aimed at understanding the control principles underlying bioregulatory networks.
Joint work with Kurt W. Kohn, Mirit Aladjem, and John N. Weinstein
To a large extent, the biological properties of biochemical systems known from experiments can be formalized in temporal logics, both
qualitatively and quantitatively. Such a formal specification of the behaviors of the system, under various conditions, opens the way to the use
of automated reasoning tools, not only for validating models and their refinements (e.g. by model-checking techniques) but also for inferring
parameter values and reaction rules from temporal properties. We report on our experience in the design of the Biochemical Abstract Machine environment
BIOCHAM and in its use in models of signal transduction and of the cell cycle.
Smart-pooling is an experimental methodology capable of increasing efficiency, accuracy and coverage in high-throughput screening
projects. It consists in assaying well-chosen pools of probes, such that each probe is present in several pools, hence tested several
times. The goal is to construct the pools so that the positive probes can usually be identified from the pattern of positive pools, despite
the occurrence of false positives and false negatives. While striving for this goal, several interesting mathematical or computational
problems emerge. In this talk I will discuss these questions and our contributions, from the pooling problem (how should the pools be
designed?), to the decoding problem (how to interpret the outcomes?) and finally to an experimental validation in the context of Y2H
interactome mapping.
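
The pooling and decoding problems can be illustrated with a deliberately simple Python sketch. It assumes a row/column grid design and noise-free outcomes, so it does not handle the false positives and false negatives that the talk addresses, and it is not the pooling design used in the work presented. All function names are hypothetical.

```python
def grid_pools(n_rows, n_cols):
    """Row pools and column pools for probes numbered row * n_cols + col."""
    rows = [{r * n_cols + c for c in range(n_cols)} for r in range(n_rows)]
    cols = [{r * n_cols + c for r in range(n_rows)} for c in range(n_cols)]
    return rows + cols

def assay(pools, positives):
    """Noise-free outcome: a pool is positive if it contains a positive probe."""
    return [bool(pool & positives) for pool in pools]

def decode(pools, outcomes, n_probes):
    """Keep only the probes that never appear in a negative pool."""
    candidates = set(range(n_probes))
    for pool, is_positive in zip(pools, outcomes):
        if not is_positive:
            candidates -= pool
    return candidates

pools = grid_pools(3, 4)                 # 12 probes tested in 7 pools
outcomes = assay(pools, positives={6})
print(decode(pools, outcomes, n_probes=12))
```

With a single positive probe the grid identifies it unambiguously. With two positives in different rows and columns, four candidates remain, which shows why more carefully constructed pools are needed.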
C. elegans, a small worm, is one of the best studied animals, and serves as a "model" organism that helps us understand fundamental biological
processes that are also conserved in more complex organisms. This talk surveys efforts to model C. elegans behavior over the past few years,
describes insights gained through the modeling process and outlines some of the main challenges remaining to make such modeling efforts scalable
to large systems and accessible to biologists.