ngrams-loader: Ngrams loader based on https://www.ngrams.info format
Downloads
- ngrams-loader-0.1.0.1.tar.gz [browse] (Cabal source package)
- Package description (as included in the package)
| Versions [RSS] | 0.1.0.0, 0.1.0.1 |
|---|---|
| Dependencies | attoparsec (>=0.11.1 && <0.11.2), base (>=4.6 && <4.7), machines (>=0.2.5 && <0.3), mtl, ngrams-loader, parseargs (>=0.1.5 && <0.1.6), resourcet (>=0.4.3 && <0.5), sqlite-simple (>=0.4.5 && <0.5), text (>=0.11 && <1.2) [details] |
| Tested with | ghc ==7.6.3 |
| License | MIT |
| Copyright | Copyright (C) 2014 Yorick Laupa |
| Author | Yorick Laupa |
| Maintainer | Yorick Laupa <yo.eight@gmail.com> |
| Uploaded | by YorickLaupa at 2014-03-25T09:40:50Z |
| Category | Data |
| Home page | https://github.com/YoEight/ngrams-loader |
| Bug tracker | https://github.com/YoEight/ngrams-loader/issues |
| Source repo | head: git clone git://github.com/YoEight/ngrams-loader.git |
| Distributions | |
| Reverse Dependencies | 1 direct, 0 indirect [details] |
| Executables | ngrams-loader |
| Downloads | 1898 total (6 in the last 30 days) |
| Rating | (no votes yet) [estimated by Bayesian average] |
| Status | Docs available [build log] Successful builds reported [all 1 reports] |
Readme for ngrams-loader-0.1.0.1
ngrams-loader
Ngrams loader based on https://www.ngrams.info format
Installation
Assuming you have cabal 1.18 or later installed:
$ cabal sandbox init
$ cabal install --only-dependencies
$ cabal configure
$ cabal install
-- the program is installed in .cabal-sandbox/bin (inside the directory where the sandbox was created)
Usage
usage: ngrams-loader [options] <n-grams file> <SQLite file>
[-2,--bigram] Parses bigrams
[-3,--trigram] Parses trigrams
[-4,--quadgram] Parses 4-grams
[-5,--pentagram] Parses 5-grams
[-c,--create] Creates table before inserts
<n-grams file> N-grams file
<SQLite file> SQlite db file
Example
ngrams-loader --bigram --create w2.txt bigram.db
This parses each line of w2.txt as a bigram, creates the bigrams table before performing inserts, and saves everything in bigram.db.
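The behaviour described above can be sketched in a few lines of Python with the standard sqlite3 module. This is only an illustration, not the package's Haskell implementation: the function name load_bigrams is hypothetical, and the input is assumed to hold one tab-separated record per line (frequency first, then the two words).

```python
import sqlite3


# Hypothetical helper: mirrors what `--bigram --create` is described as doing.
def load_bigrams(lines, db_path, create=True):
    """Insert bigram records into a SQLite database and return the row count.

    Assumption: each line holds a frequency and two words separated by tabs.
    """
    conn = sqlite3.connect(db_path)
    try:
        if create:
            conn.execute(
                "create table bigrams("
                "frequence int, word1 varchar(100), word2 varchar(100))"
            )
        rows = []
        for line in lines:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            freq, word1, word2 = line.split("\t")
            rows.append((int(freq), word1, word2))
        # A single transaction for all inserts avoids one commit per row.
        with conn:
            conn.executemany("insert into bigrams values (?, ?, ?)", rows)
        return len(rows)
    finally:
        conn.close()
```

Passing an open file object as lines (for example the handle for w2.txt) streams the file line by line; a list of strings works as well.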
Figures
Specs
- Core i7 3770 @ 3.4GHz
- Gentoo with Linux kernel 3.12.13 (64-bit)
- Bigram file of 1,055,386 lines
Running ngrams-loader --bigram --create w2.txt bigram.db takes:
real 0m16.244s
user 0m15.597s
sys 0m0.143s
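To put these timings in perspective, the short calculation below derives the throughput from the figures above. It uses only the numbers already reported; the variable names are illustrative.

```python
# Figures copied from the benchmark above.
lines = 1_055_386
real_seconds = 16.244
user_seconds = 15.597

# Lines processed per second of wall-clock time.
lines_per_second = lines / real_seconds
# Fraction of wall-clock time spent in user mode.
cpu_share = user_seconds / real_seconds

print(f"{lines_per_second:,.0f} lines/s, {cpu_share:.0%} of wall time in user mode")
```

That works out to roughly 65,000 lines per second, with almost all of the wall-clock time spent in user mode rather than in the kernel.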
SQL Schemas
Bigram
create table bigrams(
frequence int,
word1 varchar(100),
word2 varchar(100)
);
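Once a bigram database has been loaded, the table can be queried like any other SQLite table. The sketch below assumes the bigrams schema shown above; the helper name top_followers is hypothetical and the demo rows are made up, not taken from a real corpus.

```python
import sqlite3


def top_followers(conn, word, limit=5):
    """Return (word2, frequence) pairs for bigrams whose first word is `word`."""
    return conn.execute(
        "select word2, frequence from bigrams "
        "where word1 = ? order by frequence desc limit ?",
        (word, limit),
    ).fetchall()


# Tiny in-memory demo with made-up rows (not real corpus data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table bigrams(frequence int, word1 varchar(100), word2 varchar(100))"
)
conn.executemany(
    "insert into bigrams values (?, ?, ?)",
    [(30, "the", "cat"), (50, "the", "dog"), (10, "a", "cat")],
)
print(top_followers(conn, "the"))  # [('dog', 50), ('cat', 30)]
```

The schema defines no indexes, so on a large table it may be worth adding one on word1 before running lookups like this.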
Trigram
create table tridgrams(
frequence int,
word1 varchar(100),
word2 varchar(100),
word3 varchar(100)
);
4-gram
create table quadgrams(
frequence int,
word1 varchar(100),
word2 varchar(100),
word3 varchar(100),
word4 varchar(100)
);
5-gram
create table pentagrams(
frequence int,
word1 varchar(100),
word2 varchar(100),
word3 varchar(100),
word4 varchar(100),
word5 varchar(100)
);
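All four schemas follow the same pattern: one frequence column followed by one word column per position in the n-gram. A minimal sketch of that pattern is given below; create_table_sql is a hypothetical helper, not part of the package, and the table name is supplied by the caller.

```python
def create_table_sql(table, n):
    """Build a CREATE TABLE statement with a frequence column and n word columns.

    The table name is interpolated directly, so only pass trusted names.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    columns = ["frequence int"]
    columns += [f"word{i} varchar(100)" for i in range(1, n + 1)]
    return f"create table {table}(\n  " + ",\n  ".join(columns) + "\n);"


print(create_table_sql("pentagrams", 5))
```

Calling it with the table name pentagrams and n = 5 yields a statement with the same columns as the 5-gram schema above.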
