CGtag: complete genomics toolkit and annotation in a cloud-based Galaxy
- Equal contributors
1 Department of Bioinformatics, Erasmus MC, Dr. Molewaterplein 5, 3015 GE Rotterdam, The Netherlands
2 Department of Urology, Erasmus MC, Dr. Molewaterplein 5, 3015 GE Rotterdam, The Netherlands
3 Netherlands Bioinformatics Center, NBIC, Geert Grooteplein 28, 6525 GA Nijmegen, The Netherlands
4 Department of Microbial Ecology, Netherlands Institute of Ecology, NIOO-KNAW, Droevendaalsesteeg 10, 6708 PB Wageningen, The Netherlands
GigaScience 2014, 3:1 doi:10.1186/2047-217X-3-1 Published: 24 January 2014
Complete Genomics provides an open-source suite of command-line tools for the analysis of their CG-formatted mapped sequencing files. Determining, for example, the functional impact of detected variants requires annotation with various databases, and using these often requires command-line and/or programming experience, which limits their use by the average research scientist. We have therefore implemented this CG toolkit, together with a number of annotation, visualisation and file manipulation tools, in Galaxy as CGtag (Complete Genomics Toolkit and Annotation in a Cloud-based Galaxy).
To provide research scientists with simple, accurate, web-based analytical and visualisation applications for selecting candidate mutations from Complete Genomics data, we have implemented the open-source Complete Genomics tool set, CGATools, in Galaxy. In addition, we implemented some of the most popular command-line annotation and visualisation tools to allow research scientists to select candidate pathological mutations (SNVs and indels). Furthermore, we have developed a cloud-based public Galaxy instance to host the CGtag toolkit and other associated modules.
CGtag provides a user-friendly interface for all research scientists wishing to select candidate variants from data generated by CG or other next-generation sequencing platforms. By using a cloud-based infrastructure, we can also ensure sufficient, on-demand computation and storage resources to handle the analysis tasks. The tools are freely available for use from an NBIC/CTMM-TraIT (The Netherlands Bioinformatics Center/Center for Translational Molecular Medicine) cloud-based Galaxy instance, or can be installed to a local (production) Galaxy via the NBIC Galaxy tool shed.