The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome
1 Biofrontiers Institute, University of Colorado, Boulder, CO, USA
2 Department of Chemistry & Biochemistry, University of Colorado, Boulder, CO, USA
3 Second Genome, San Bruno, CA, USA
4 Department of Computer Science, Northern Arizona University, Flagstaff, AZ, USA
5 Argonne National Laboratory, Argonne, IL, USA
6 Marine Biological Laboratory, Woods Hole, MA, USA
7 Howard Hughes Medical Institute, Boulder, CO, USA
GigaScience 2012, 1:7. doi:10.1186/2047-217X-1-7. Published: 12 July 2012
We present the Biological Observation Matrix (BIOM, pronounced “biome”) format: a JSON-based file format for representing arbitrary observation-by-sample contingency tables with associated sample and observation metadata. As the number of categories of comparative omics data types (collectively, the “ome-ome”) grows rapidly, a general format to represent and archive these data will facilitate the interoperability of existing bioinformatics tools and future meta-analyses.
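To make the idea of a JSON-encoded observation-by-sample table concrete, the following is a minimal sketch in Python using only the standard library. The field names (`rows`, `columns`, `shape`, `matrix_type`, `data`), the sparse `[row, column, value]` triplet layout, and all identifiers and values are illustrative assumptions modeled on the description above; the format specification remains the authority on the actual keys and required fields.

```python
import json

# Illustrative table: 2 observations (rows) by 3 samples (columns).
# All keys and values here are assumptions for illustration only;
# consult the BIOM format specification for the authoritative schema.
table = {
    "format": "Biological Observation Matrix 1.0.0",  # assumed format label
    "type": "OTU table",                              # assumed table type
    "matrix_type": "sparse",
    "matrix_element_type": "int",
    "shape": [2, 3],  # [number of observations, number of samples]
    "rows": [  # one entry per observation, with optional metadata
        {"id": "OTU_1", "metadata": {"taxonomy": ["Bacteria", "Firmicutes"]}},
        {"id": "OTU_2", "metadata": None},
    ],
    "columns": [  # one entry per sample, with optional metadata
        {"id": "Sample_A", "metadata": {"body_site": "gut"}},
        {"id": "Sample_B", "metadata": {"body_site": "skin"}},
        {"id": "Sample_C", "metadata": None},
    ],
    # Sparse counts as [row index, column index, value] triplets.
    "data": [[0, 0, 5], [0, 2, 1], [1, 1, 7]],
}


def to_dense(tbl):
    """Expand sparse [row, col, value] triplets into a dense list of lists."""
    n_rows, n_cols = tbl["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for row, col, value in tbl["data"]:
        dense[row][col] = value
    return dense


# Round-trip through JSON text, as a file on disk would be.
serialized = json.dumps(table, indent=2)
restored = json.loads(serialized)
dense = to_dense(restored)

for row_info, counts in zip(restored["rows"], dense):
    print(row_info["id"], counts)
```

Storing only the non-zero entries keeps the file compact when most observations are absent from most samples, which is typical of contingency tables of this kind.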
The BIOM file format is supported by an independent open-source software project (the biom-format project). The project initially contains Python objects that support the use and manipulation of BIOM data in Python programs, and it is intended to be an open development effort in which developers can submit implementations of these objects in other programming languages.
The BIOM file format and the biom-format project are steps toward reducing the “bioinformatics bottleneck” currently experienced across diverse areas of the biological sciences, and will help us move toward the next phase of comparative omics, in which basic science is translated into clinical and environmental applications. The BIOM file format is currently recognized as an Earth Microbiome Project Standard, and as a Candidate Standard by the Genomic Standards Consortium.