nom is a performant yet easy-to-use parser combinator library written in Rust. nom can handle either text or binary data. Here, based on nom, I implemented a parser for FASTA, a text file format containing one or more protein/DNA/RNA sequences.
Example
extern crate fasta;

use fasta::Records;
use std::fs::File;
use std::io::prelude::*;

fn main() {
    let mut f = File::open("e.fasta").unwrap();
    let mut data = String::with_capacity(512);
    f.read_to_string(&mut data).unwrap();
    let a: Vec<_> = Records(&data).into_iter().collect();
    println!("{:?}", a);
}
[Ok(Record { id: "gi|2765644|emb|Z78519.1|CPZ78519", desc: Some("C.pubescens 5.8S rRNA gene and ITS1 and ITS2 DNA"), seq: "ATATGATCGAGTGAATCTGGTGGACTTGTGGTTACTCAGCTCGCCATAGGCTTTGCTTTTGCGGTGACCCTAATTTGTCATTGGGCCTCCTCCCAAGCTTTCCTTGTGGGTTTGAACCTCTAGCACGGTGCAGTA" }), Ok(Record { id: "gi|2765643|emb|Z78518.1|CRZ78518", desc: Some("C.reginae 5.8S rRNA gene and ITS1 and ITS2 DNA"), seq: "CGTAACAAGGTTTCCGTAGGTGAACCTGCGGGAGGATCATTGTTGAGATAGTAGAATATTCGATCGAGTGAATCCGGAGGACTTGTGGTTACTCGGCTCGTCGAAGGCTTAACTTTTGTGGTGACCCTGATTTGT" }), Ok(Record { id: "gi|2765642|emb|Z78517.1|CFZ78517", desc: Some("C.flavum 5.8S rRNA gene and ITS1 and ITS2 DNA"), seq: "CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGTTGA...
// --snip-- //
...TTAGTTGGGCC" })]
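To make the shape of the output above more concrete, here is a minimal sketch of how FASTA text maps onto records with `id`, `desc` and `seq` fields. It uses only the standard library and is not this crate's nom-based implementation. The `Record` struct here is a stand-in that copies the field names from the debug output, and `parse_fasta` is a hypothetical helper. The sketch assumes that the id is the header text up to the first whitespace, that the remainder of the header is the description, and that sequence lines are concatenated.

```rust
/// One FASTA record. Field names mirror the debug output shown above;
/// the struct itself is a stand-in, not the crate's own type.
#[derive(Debug, PartialEq)]
pub struct Record {
    pub id: String,
    pub desc: Option<String>,
    pub seq: String,
}

/// Parse FASTA text into records using only the standard library.
/// Hypothetical helper for illustration, not the crate's nom-based parser.
pub fn parse_fasta(input: &str) -> Result<Vec<Record>, String> {
    let mut records: Vec<Record> = Vec::new();
    for (n, raw) in input.lines().enumerate() {
        let line = raw.trim_end(); // tolerate a trailing '\r' or spaces
        if line.is_empty() {
            continue;
        }
        if let Some(header) = line.strip_prefix('>') {
            // Assumption: the id runs up to the first whitespace and
            // whatever follows is the optional description.
            let (id, desc) = match header.split_once(char::is_whitespace) {
                Some((id, rest)) => (id, Some(rest.trim()).filter(|d| !d.is_empty())),
                None => (header, None),
            };
            records.push(Record {
                id: id.to_string(),
                desc: desc.map(String::from),
                seq: String::new(),
            });
        } else {
            // A sequence line extends the most recent record.
            match records.last_mut() {
                Some(rec) => rec.seq.push_str(line.trim()),
                None => {
                    return Err(format!(
                        "line {}: sequence data before any '>' header",
                        n + 1
                    ))
                }
            }
        }
    }
    Ok(records)
}
```

Calling `parse_fasta(">seq1 first record\nACGT\nTTGA\n")` yields a single record whose `seq` is the two sequence lines joined together. Unlike the iterator in the example above, which yields one `Result` per record, this sketch returns a single `Result` for the whole input to keep the code short.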
Comparison With bio::io::fasta
As of 2020-08-24, bio::io::fasta is about three times faster in the benchmark below (roughly 157 µs versus 467 µs per iteration):
test tests::bench_read_bio_io_fasta ... bench: 157,047 ns/iter (+/- 19,803)
test tests::bench_read_my_fasta ... bench: 467,353 ns/iter (+/- 53,507)