Open Access Technical Note

The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome

Daniel McDonald1, Jose C Clemente2, Justin Kuczynski3, Jai Ram Rideout4, Jesse Stombaugh2, Doug Wendel2, Andreas Wilke5, Susan Huse6, John Hufnagle6, Folker Meyer5, Rob Knight1,2,7 and J Gregory Caporaso4,5*

Author Affiliations

1 Biofrontiers Institute, University of Colorado, Boulder, CO, USA

2 Department of Chemistry & Biochemistry, University of Colorado, Boulder, CO, USA

3 Second Genome, San Bruno, CA, USA

4 Department of Computer Science, Northern Arizona University, Flagstaff, AZ, USA

5 Argonne National Laboratory, Argonne, IL, USA

6 Marine Biological Laboratory, Woods Hole, MA, USA

7 Howard Hughes Medical Institute, Boulder, CO, USA

GigaScience 2012, 1:7  doi:10.1186/2047-217X-1-7

Published: 12 July 2012

Abstract

Background

We present the Biological Observation Matrix (BIOM, pronounced “biome”) format: a JSON-based file format for representing arbitrary observation-by-sample contingency tables with associated sample and observation metadata. As the number of categories of comparative omics data types (collectively, the “ome-ome”) grows rapidly, a general format to represent and archive these data will facilitate the interoperability of existing bioinformatics tools and future meta-analyses.
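To make the idea of a JSON-based observation-by-sample table concrete, the following is a minimal Python sketch. It is an illustration rather than the reference implementation: the field names (rows, columns, matrix_type, shape, data) are modeled on the BIOM 1.0 layout and should be checked against the specification, the identifiers and metadata values are hypothetical, and the helper to_dense is introduced here only for the example.

```python
import json

# Illustrative only: field names are modeled on the BIOM 1.0 layout and
# should be verified against the specification. All IDs, metadata values,
# and counts below are hypothetical.
table = {
    "rows": [  # observations (e.g. OTUs), each with optional metadata
        {"id": "OTU_1", "metadata": {"taxonomy": ["Bacteria", "Firmicutes"]}},
        {"id": "OTU_2", "metadata": None},
    ],
    "columns": [  # samples, each with optional metadata
        {"id": "Sample_A", "metadata": {"body_site": "gut"}},
        {"id": "Sample_B", "metadata": {"body_site": "skin"}},
        {"id": "Sample_C", "metadata": None},
    ],
    "matrix_type": "sparse",
    "shape": [2, 3],  # [number of observations, number of samples]
    # Sparse entries as [row index, column index, count]; omitted cells are 0.
    "data": [[0, 0, 5], [0, 2, 1], [1, 1, 7]],
}


def to_dense(tbl):
    """Expand sparse triplets into a dense observation-by-sample matrix."""
    n_rows, n_cols = tbl["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for r, c, value in tbl["data"]:
        dense[r][c] = value
    return dense


# Round-trip through JSON text, as the table would be stored in a file.
restored = json.loads(json.dumps(table))
dense = to_dense(restored)
```

Storing only the non-zero cells keeps the file small when most observations are absent from most samples, which is typical of comparative omics tables.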

Findings

The BIOM file format is supported by an independent open-source software project (the biom-format project). The project initially contains Python objects that support the use and manipulation of BIOM data in Python programs, and it is intended to be an open development effort in which developers can submit implementations of these objects in other programming languages.

Conclusions

The BIOM file format and the biom-format project are steps toward reducing the “bioinformatics bottleneck” currently experienced across diverse areas of the biological sciences, and will help us move toward the next phase of comparative omics, in which basic science is translated into clinical and environmental applications. The BIOM file format is currently recognized as an Earth Microbiome Project Standard, and as a Candidate Standard by the Genomic Standards Consortium.

Keywords:
Microbial ecology; Comparative genomics; Metagenomics; QIIME; MG-RAST; VAMPS; BIOM