About Omnicrobe
Context
In recent years, developments in molecular technologies have led to an exponential growth of experimental data and publications, many of which are open, yet separately accessible. It is therefore crucial to provide researchers with bioinformatics applications that offer unified access to both data and related scientific articles. With the help of text mining tools, researchers can rapidly access and process textual data, link them with other data and make the results available for microbiology research and technology.
Goal
The Omnicrobe application aggregates knowledge from the literature with knowledge available in other databases on microbial biodiversity, which makes it possible to compare them for further analysis. The key knowledge covers microbial biotopes and microbial phenotypes.
Information extraction
Information extraction tools analyze and standardize information content. They automatically analyze textual descriptions of microorganism biotopes so that biotope descriptions originating from different experiments can be compared at a large scale. Here, analysis means not only the extraction of the relevant spans of text, but also their normalization or categorization against reference resources (e.g. taxonomy of organisms, ontology of habitats, ontology of phenotypes, ontology of physico-chemical properties, etc.). Information retrieval is supported by a semantic search engine that enables ontology-based queries.
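As a rough illustration of what an ontology-based query can mean, the sketch below expands a queried class to all of its descendants before matching annotated documents. The class hierarchy, the document identifiers and the function names are hypothetical and chosen only for this example; they do not reflect the actual OntoBiotope structure or the Omnicrobe search engine.

```python
# Hypothetical parent -> children hierarchy (not the real OntoBiotope ontology).
CHILDREN = {
    "food": ["dairy product"],
    "dairy product": ["cheese", "milk"],
}

# Hypothetical annotations: (document id, habitat class assigned to it).
ANNOTATIONS = [
    ("doc1", "cheese"),
    ("doc2", "milk"),
    ("doc3", "soil"),
]


def descendants(cls):
    """Return the class itself plus every class below it in the hierarchy."""
    found = {cls}
    stack = [cls]
    while stack:
        for child in CHILDREN.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(cls):
    """Return documents annotated with the class or any of its descendants."""
    wanted = descendants(cls)
    return [doc for doc, habitat in ANNOTATIONS if habitat in wanted]


print(search("dairy product"))
```

With this toy data, a query for "dairy product" also retrieves documents annotated with the more specific classes "cheese" and "milk", which a plain keyword match would miss.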
More on Omnicrobe
Public data
The information managed by the Omnicrobe application is publicly accessible online. It can be used across many fields, such as food security, ecology, or human health. The main source of information in Omnicrobe is scientific references from PubMed. Omnicrobe also includes an increasing volume of textual and non-textual information from relevant biological databases, such as those of Biological Resource Centers (e.g. INRAE CIRM, DSMZ) and major genetic databases (GenBank). The available data is the result of an automatic predictive text-mining process. Users must be aware that this information is not curated.
Text mining process
The text-mining process behind Omnicrobe has been set up by INRAE using the AlvisNLP environment. It consists of extracting the relevant information, mostly textual, from scientific literature and databases. Words or word groups are identified and assigned a type ("habitat", "phenotype", "use" or "taxon"). They are then normalized, meaning they are assigned either a finer category (e.g. cheese as habitat) or an ID that is shared with other public databases (e.g. 1639 is the ID of Listeria monocytogenes in the NCBI taxonomy). Reference semantic resources such as nomenclatures and ontologies define these IDs or categories. For example, "Irish dairy farms", "dairy cattle farms" or "dairy farms environment" are designated by the same habitat reference class "dairy farm" according to the OntoBiotope ontology.
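The normalization step can be pictured with a minimal sketch. It uses a plain dictionary lookup, which is a deliberate simplification and not the AlvisNLP pipeline itself. The habitat variants and the taxon ID are the examples given above; the lexicon structure and the function names are assumptions made for illustration.

```python
import re

# Toy lexicons built from the examples in the text; illustrative only,
# not the OntoBiotope ontology or the NCBI taxonomy.
HABITAT_LEXICON = {
    "irish dairy farms": "dairy farm",
    "dairy cattle farms": "dairy farm",
    "dairy farms environment": "dairy farm",
}
TAXON_LEXICON = {
    "listeria monocytogenes": "1639",  # NCBI taxonomy ID
}


def _key(mention):
    """Lower-case the mention and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", mention.strip().lower())


def normalize(mention):
    """Return (type, reference class or ID) for a mention, or None if unknown."""
    key = _key(mention)
    if key in HABITAT_LEXICON:
        return ("habitat", HABITAT_LEXICON[key])
    if key in TAXON_LEXICON:
        return ("taxon", TAXON_LEXICON[key])
    return None


for text in ["Irish dairy farms", "dairy cattle farms", "Listeria monocytogenes"]:
    print(text, "->", normalize(text))
```

Different surface forms end up with the same reference class, which is what allows descriptions from different sources to be compared.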
Who benefits?
Researchers - Rapid overview of microorganisms and their functionality in their ecosystems, leading to a better ability to understand, control and use them in food processing.
Agro-industrial Technology Institutes - Provide more efficient access to research results and therefore boost innovation.
Agrofood Companies and artisans - Gather information which helps identify food microbes more quickly, thereby increasing food safety and speeding up the development of new products.
Food Safety Agencies - Better ability to determine which microbe might interfere with food products and the origin of harmful microbes.
How to cite Omnicrobe?
The Omnicrobe database and the associated data are free to use and available under the CC-BY license. If you share or adapt them, you must give appropriate credit, i.e. provide a link to the license, indicate if changes were made and cite the paper: Dérozier S, Bossy R, Deléger L, Ba M, Chaix E, et al. (2023) Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach. PLOS ONE 18(1): e0272473. https://doi.org/10.1371/journal.pone.0272473.
About us
Core group
Participant teams
Bibliography
Omnicrobe database
Posters
- Dérozier, S., Bossy, R., Deléger, L., Ba, M., Chaix, E., Loux, V., Falentin, H., Nédellec, C. Omnicrobe, an open-access database of microbial habitats, phenotypes and uses extracted from text. Presented at JOBIM 2022, Rennes (2022-07-05 - 2022-07-08).
- Falentin, H., Harlé, O., Dérozier, S., Deléger, L., Chaix, E., Ba, M., Bossy, R., Loux, V., Nédellec, C. Omnicrobe : une base de données d’habitats et de phénotypes microbiens. Presented at 23ème édition du colloque du Club des Bactéries Lactiques, Rennes (2022-06-08 - 2022-06-10).
- Dérozier, S., Deléger, L., Chaix, E., Mekdad R., Ba, M., Bossy, R., Sicard, D., Loux, V., Falentin, H. & Nédellec, C. Florilege: a database gathering microbial habitats, phenotypes and uses. Presented at JOBIM 2020, virtual edition (2020-06-30 - 2020-07-03).
- Chaix, E., Dérozier, S., Deléger, L., Falentin, H., Bohuon, J.B., Ba, M., Bossy, R., Sicard, D., Loux, V. & Nédellec, C. Florilege: an integrative database using text mining and ontologies. In: Abstract JOBIM 2018 (p. 563). Presented at JOBIM 2018, Marseille, FRA (2018-07-03 - 2018-07-06).
- Falentin, H., Chaix, E., Dérozier, S., Weber, M., Buchin, S., Dridi, B., Deutsch, S.-M., Valence-Bertel, F., Casaregola, S., Renault, P., Champomier-Vergès, M.-C., Thierry, A., Zagorec, M., Irlinger, F., Delbès, C., Aubin, S., Bessières, P., Loux, V., Bossy, R., Dibie, J., Sicard, D., Nédellec, C. (2017, October). Florilege: a database gathering microbial phenotypes of food interest. In 4th International Conference on Microbial Diversity 2017, Bari, ITA (2017-10-24 - 2017-10-26).
Data production by text-mining and ontologies
Publications
- Robert Bossy, Louise Deléger, Estelle Chaix, Mouhamadou Ba, Claire Nédellec. Bacteria Biotope at BioNLP Open Shared Tasks 2019, Proceedings of the 5th Workshop on BioNLP Open Shared Tasks joint to EMNLP-IJCNLP 2019, Hong Kong, Nov 2019. DOI: 10.18653/v1/D19-5719.
- Chaix, E., Deléger, L., Bossy, R., Nédellec, C. (2018). Text-mining tools for extracting information about microbial biodiversity in food. Food Microbiology, 1-13. DOI: 10.1016/j.fm.2018.04.011.
- Nédellec, C., Chaix, E., Bossy, R., Deléger, L., Dérozier, S., Bohuon, J.-B., Loux, V. (2018). L'ontologie OntoBiotope pour l'étude de la biodiversité microbienne. Actes des Journées Francophones d'Extraction et de Gestion des Connaissances (EGC'2018), Paris, janvier 2018. pp.353-358.
- Nédellec, C., Bossy, R., Chaix, E., Deléger, L. (2017). Text-mining and ontologies: new approaches to knowledge discovery of microbial diversity. In: Proceedings of the 4th International Microbial Diversity Conference (p. 221-227). Bari, ed. Marco Gobetti. Baris, Pub. Simtra. ISBN 978-88-943010-0-7, arXiv:1805.04107.
Report
- Robert Bossy, Claire Nédellec, Julien Jourde, Mouhamadou Ba, Estelle Chaix, Louise Deleger. Bacteria biotope annotation guidelines. 2019, 30 p. hal-02787110.
Analysis of needs
Publications
- Chaix, E., Aubin, S., Deléger, L., Nédellec, C. (2017). Text-mining needs of the food microbiology research community. Presented at 2017 EFITA WCCA Congress, Montpellier, FRA (2017-07-02 - 2017-07-06).
- Przybyła, P., Shardlow, M., Aubin, S., Bossy, R., de Castilho, R. E., Piperidis, S., McNaught, J., Ananiadou, S. (2016). Text mining resources for the life sciences. Database, november (25), 1-30. DOI: 10.1093/database/baw145.
Using Omnicrobe for knowledge discovery
Oral presentation at meetings
- Hélène Falentin. Florilège : une base de données de phénotypes microbiens d’intérêt agro-alimentaire. Journées qualiment, 4 February 2020, Paris, France.
- Hélène Falentin, Stéphanie-Marie Deutsch, Valérie Gagnaire, Anne Thierry, Sandra Dérozier, Claire Nédellec. Bioinformatics tools as a way to select microbial strains for fermented food products, 15th Symposium on Bacterial Genetics and Ecology "Ecosystem drivers in a changing planet" (BAGECO), 27 May 2019, Lisbon, Portugal.
- Hélène Falentin, Claire Nédellec, Estelle Chaix, Bedis Dridi, Philippe Bessières, et al. Florilege: a database gathering microbial phenotypes of food interest. 2017 Scientific MEM days: Journées scientifiques MEM (Métaomiques et écosystèmes microbiens), Jan 2017, Paris, France.
Terms of use and Copyright
Access to Omnicrobe is free for academic/non-commercial users. The Omnicrobe webpage can be browsed and all text downloads can be freely copied. The republication of information is permitted provided that the source is indicated (see the Source column in the Omnicrobe interface). The redistribution or commercial use of data from BacDive requires written permission from the Leibniz-Institut DSMZ.
Sources
Disclaimer
The authors of Omnicrobe information have taken every available measure to ensure that its content is accurate, consistent and lawful. However, neither the project consortium as a whole nor the individual partners that implicitly or explicitly participated in the creation and publication of this information accept any responsibility for consequences that might arise from using its content.