MéForBio - Formal Methods for Bioinformatics
With recognized expertise in the field of formal methods for modelling, analysis, verification and control of dynamic timekeeping systems, our contribution consists in applying these methods to biological systems to help biologists analyze and deduce their dynamic properties. The main areas of research are temporal and hybrid modelling methods (in addition to purely discrete approaches) in order to take into account evolutionary delays. These parameters are crucial to more accurately determine the actual behaviours of the systems analyzed. In addition, static analysis on the one hand, and constraint approaches on the other, allow us to deal with very large systems.
Integration of time series data into discrete models, simulations and accessibility analyses.
We approach the confrontation of large regulatory networks with time-dependent experimental data. The main idea is to model a biological system with process strikes and to verify if various measures of gene variation over time can be explained by the model. A first approach to this problem is to match the in-silico (stochastic) simulation of genes with experimental measurements. To do this, several simulator parameters must be set according to hypotheses established on the method of discretization of quantitative data, which represents an experimental time in our simulation. A second approach is to conduct an accurate accessibility analysis to fully verify whether there is a set of conditions in the model that allow us to obtain a certain qualitative expression scenario. This theme therefore proposes to mix stochastic and accurate reasoning approaches to manage the integration of time series data into discrete models.
o discover the underlying dysfunction of a cellular process, a pathological condition or a biological process, molecular biologists often perform high throughput experimental measurements to identify a set of key molecules that control most experimental data. Ideally, we would like to be able to propose optimal experimental designs (acting on the behaviour of key molecules) that will allow us to understand and control cellular behaviour. This very search for the design of the ideal experiment can be approached by using algorithms on a discrete and qualitative representation of the biological system. We have shown in previous work that reasoning on a discrete, rather than continuous representation of information from a large-scale biological system allows us to integrate high throughput data, known to be heterogeneous, incomplete and noisy. The combinatorial and computational problems behind high throughput data integration are difficult and require the combination of two main aspects: modelling the knowledge of mechanical effects and the rapid and complete exploration of the solution space for discrete variable problems.
In this context, we modeled the transcription and signalling of molecular events using discrete constraints and solved the constraint system using exhaustive methods (based on decision diagrams, dependence graphs, programming by response set). Our results are able to study: (1) system predictions, defined as the intersection of all possible coherent models, (2) the causes of a system disruption and the robustness of the predictions, (3) spatial optimization of incompatible models and (4) spatial optimization of Boolean models, which minimizes the error between high throughput data and qualitative signalling networks. These methods concern difficult computational problems and offer a solution for modelling equilibrium biological systems or their immediate response.
Formalization of Biological Knowledgeation de la Connaissance Biologique
Currently, many electronic and online databases store knowledge on biological regulations. However, in order to model this information and reason about it, existing knowledge must be formalized and limited to a specific vocabulary that can later be interpreted and resolved using specific frameworks. The vocabulary we are interested in is not based on precise quantitative data (expensive and difficult to obtain and reproduce), but rather on discrete and qualitative data. In a first step we proposed a method to automatically obtain interaction graphs from a public database, such as the Interaction Pathways Database. Results of ongoing work propose a list of patterns (biochemical reactions) that can be extracted automatically from public repositories and translated into discrete constraints or regulatory actions, allowing us to model this system in a second step. We can then simulate and reason automatically on the models obtained in an efficient and exhaustive way with frameworks such as Process Strikes or programming by set responses.