Table 2

One possible set of DNA sequence data compression factors for the various experimental classes
Class | Description | Rate for physically unique samples | Rate for physically archived/archivable samples
1 | Historical sampling of environment or time specific elements | 1.0 | 1.0
2 | Very rare objects | 1.0 | 1.0
3 | Longitudinal studies which could in theory be rerun in the future but have a > 10 year horizon to recreate | 1.0 | 2.0
4 | Samples acquired from patients or animals with a high individual acquisition cost, but a conceptually continuous generation | 1.0 | 10.0
5 | A complex experiment with > 6 month resource development | 10.0 | 100.0
6 | A routine experiment with < 6 month resource development | 20.0 | 200.0
7 | Verification experiment as a component in an overall flow | 1000.0 | ∞*

* Infinite compression of data indicates no data archiving; it may, however, be useful simply to record that the experiment was carried out.

Compression is higher for data that are easy or inexpensive to reproduce, and lower for data derived from unique or irreproducible samples.
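
As a rough illustration of how the factors in Table 2 might be applied, the following Python sketch looks up a factor by experimental class and sample type and estimates the stored size of a data set. It assumes that a factor of N means the stored data occupy 1/N of the raw size, which is an interpretation rather than something the table states; the names COMPRESSION_FACTORS and archived_size are hypothetical and chosen only for this example.

```python
import math

# Factors copied from Table 2.
# class -> (physically unique samples, physically archived/archivable samples)
COMPRESSION_FACTORS = {
    1: (1.0, 1.0),
    2: (1.0, 1.0),
    3: (1.0, 2.0),
    4: (1.0, 10.0),
    5: (10.0, 100.0),
    6: (20.0, 200.0),
    7: (1000.0, math.inf),  # infinite factor: no data archived
}


def archived_size(raw_bytes, experiment_class, sample_archived):
    """Estimate stored size, assuming stored = raw / factor (our assumption)."""
    unique_factor, archived_factor = COMPRESSION_FACTORS[experiment_class]
    factor = archived_factor if sample_archived else unique_factor
    # Dividing by math.inf yields 0.0, i.e. nothing is kept.
    return raw_bytes / factor
```

Under this assumption, a 1000-byte data set from a class 5 experiment on physically unique samples would be stored in about 100 bytes, while a class 7 data set from archivable samples would not be stored at all.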

Cochrane et al. GigaScience 2012 1:2   doi:10.1186/2047-217X-1-2