Avocado can be used for single sample germline variant calling, trio calling,
and joint variant calling. Avocado has >99% SNP calling accuracy, and >96%
INDEL calling accuracy when paired with ADAM's INDEL realignment pipeline.
When run on a single 32 core machine, Avocado can call variants on a 60x
coverage whole genome sequencing (WGS) dataset in approximately 7 hours. By
using Apache Spark to scale across multiple machines, Avocado can process the
same WGS dataset in approximately 15 minutes when using 1,024 cores.
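As a rough illustration of what these two timings imply, the sketch below computes the speedup, parallel efficiency, and core-hours from the approximate figures quoted above (7 hours on 32 cores, 15 minutes on 1,024 cores). The function name `scaling_summary` and its parameters are hypothetical and not part of Avocado. Because the inputs are approximate, the results are estimates only.

```python
def scaling_summary(base_cores, base_minutes, scaled_cores, scaled_minutes):
    """Summarize scaling between two runs of the same workload.

    Hypothetical helper for illustration; not part of Avocado.
    """
    # Speedup: how many times faster the scaled run finishes.
    speedup = base_minutes / scaled_minutes
    # Factor by which the core count grew.
    core_ratio = scaled_cores / base_cores
    # Parallel efficiency: fraction of ideal (linear) speedup achieved.
    efficiency = speedup / core_ratio
    return {
        "speedup": speedup,
        "core_ratio": core_ratio,
        "efficiency": efficiency,
        # Total compute consumed by each run, in core-hours.
        "base_core_hours": base_cores * base_minutes / 60,
        "scaled_core_hours": scaled_cores * scaled_minutes / 60,
    }


# Approximate figures from the text: 7 hours on 32 cores, 15 minutes on 1,024 cores.
summary = scaling_summary(32, 7 * 60, 1024, 15)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these figures, a 32-fold increase in cores gives roughly a 28-fold speedup, or about 87.5% parallel efficiency. The total compute used rises from about 224 to about 256 core-hours.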
How avocado?
Building Avocado
Avocado is built with Maven. To build Avocado, cd
into the repository and run "mvn package".
Avocado binaries
Nightly builds of Avocado are available from the OSS Sonatype
repository.
Additionally, we make a Docker image available from Quay.
Avocado is described in a PhD thesis. To cite it, please use:
@phdthesis{nothaft17,
title={Scalable Systems and Algorithms for Genomic Variant Analysis},
author={Nothaft, Frank Austin},
school = {EECS Department, University of California, Berkeley},
year = {2017},
month = {Dec},
URL = {https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-204.html},
number = {UCB/EECS-2017-204}
}
A preprint describing Avocado should be released by the end of January 2018.