avocado

A Variant Caller, Distributed

This README is the TL;DR documentation for avocado. More detailed documentation is hosted at Read the Docs.

Who/What/When/Where/Why avocado?

Avocado is a distributed variant caller built on top of the ADAM format and APIs and Apache Spark. Avocado is an open source project and is released under the Apache 2.0 license.

Avocado can be used for single-sample germline variant calling, trio calling, and joint variant calling. When paired with ADAM's INDEL realignment pipeline, Avocado has >99% SNP calling accuracy and >96% INDEL calling accuracy. When run on a single 32-core machine, Avocado can call variants on a 60x coverage whole genome sequencing (WGS) dataset in approximately 7 hours. By using Apache Spark to scale across multiple machines, Avocado can process the same WGS dataset in approximately 15 minutes on 1,024 cores.

How avocado?

Building Avocado

Avocado uses Maven to build. To build avocado, cd into the repository and run "mvn package".
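
The build step above can be wrapped in a small shell helper. This is only a sketch: the function name `build_avocado` and its checks are illustrative and not part of the avocado repository, and it assumes Maven (`mvn`) is already installed and on your PATH.

```shell
# Hypothetical helper (not shipped with avocado): runs "mvn package"
# inside a checkout of the repository.
build_avocado() {
  # Directory of the avocado checkout; defaults to the current directory.
  repo_dir="${1:-.}"

  # The Maven build is driven by pom.xml at the repository root.
  if [ ! -f "$repo_dir/pom.xml" ]; then
    echo "no pom.xml in $repo_dir; is this the avocado checkout?" >&2
    return 1
  fi

  # Assumption: Maven is installed separately and available as "mvn".
  if ! command -v mvn >/dev/null 2>&1; then
    echo "mvn not found on PATH" >&2
    return 2
  fi

  # Run the build in a subshell so the caller's working directory is unchanged.
  (cd "$repo_dir" && mvn package)
}
```

For example, `build_avocado path/to/avocado` runs the build in that checkout, while calling it from a directory without a pom.xml returns a non-zero status instead of invoking Maven.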

Avocado binaries

Nightly builds of Avocado are available from the OSS Sonatype repository. Additionally, we make a Docker image available from Quay.

License

Avocado is released under the Apache License, Version 2.0.

Citing Avocado

Avocado has been described in a PhD thesis. To cite it, please use:

@phdthesis{nothaft17,
  title = {Scalable Systems and Algorithms for Genomic Variant Analysis},
  author = {Nothaft, Frank Austin},
  school = {EECS Department, University of California, Berkeley},
  year = {2017},
  month = {Dec},
  URL = {https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-204.html},
  number = {UCB/EECS-2017-204}
}

A preprint describing Avocado should be released by the end of January 2018.
