Open Access Data Note

High-coverage sequencing and annotated assemblies of the budgerigar genome

Ganeshkumar Ganapathy1, Jason T Howard1, James M Ward2, Jianwen Li3, Bo Li3, Yingrui Li3, Yingqi Xiong3, Yong Zhang3, Shiguo Zhou4, David C Schwartz4, Michael Schatz5, Robert Aboukhalil5, Olivier Fedrigo6, Lisa Bukovnik13,6, Ty Wang2, Greg Wray7, Isabelle Rasolonjatovo8, Roger Winer9, James R Knight9, Sergey Koren10,12, Wesley C Warren11, Guojie Zhang3*, Adam M Phillippy10,12* and Erich D Jarvis1*

Author Affiliations

1 Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA

2 National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health, Research Triangle Park, Raleigh, NC 27709, USA

3 China National Genebank, BGI-Shenzhen, Shenzhen 518083, China

4 Department of Chemistry, The Laboratory for Molecular and Computational Genomics, Laboratory of Genetics and Biotechnology Center, University of Wisconsin, Madison, WI 53706, USA

5 Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, NY 11724, USA

6 Institute for Genome Sciences & Policy, Duke University, Durham, NC 27710, USA

7 Department of Biology, Center for Systems Biology, Duke University, Durham, NC 27710, USA

8 Illumina Cambridge Ltd, Cambridge, UK

9 454 Life Sciences, Branford, Connecticut 06405, USA

10 Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20740, USA

11 The Genome Institute, Washington University School of Medicine, St. Louis, MO 63110, USA

12 National Biodefense Analysis and Countermeasures Center, Frederick, MD 21702, USA

13 Advanced Liquid Logic Morrisville, Morrisville, NC 27560, USA

GigaScience 2014, 3:11. doi:10.1186/2047-217X-3-11

Published: 8 July 2014



Parrots belong to a group of behaviorally advanced vertebrates and have a more advanced vocal-learning ability than other vocal-learning birds. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts in which sounds carry referential meaning. However, little is known about the genetics of these traits. Elucidating their genetic bases requires whole-genome sequencing and a robust assembly of a parrot genome.


We present a genomic resource for the budgerigar (Melopsittacus undulatus), an Australian parakeet and the most widely studied parrot species in neuroscience and behavior. The genomic sequence data include over 300× raw read coverage from multiple sequencing technologies and chromosome optical maps from a single male animal. The reads and optical maps were used to create three hybrid assemblies representing some of the largest genomic scaffolds to date for a bird. Two of these assemblies were annotated based on similarity to reference sets of non-redundant human, zebra finch and chicken proteins and to budgerigar transcriptome sequence assemblies. The sequence reads for this project were in part generated and used for both the Assemblathon 2 competition and the first de novo assembly of a giga-scale vertebrate genome utilizing PacBio single-molecule sequencing.


Across several quality metrics, these budgerigar assemblies are comparable to or better than the chicken and zebra finch genome assemblies built from traditional Sanger sequencing reads. They are sufficient to analyze regions that are difficult to sequence and assemble, including regions not yet assembled in prior bird genomes and promoter regions of genes differentially regulated in vocal-learning brain regions. This work provides valuable data and material for genome technology development and for investigating the genomics of complex behavioral traits.

Keywords: Melopsittacus undulatus; Budgerigar; Parakeet; Next-generation sequencing; Hybrid assemblies; Optical maps; Vocal learning