q's purpose is to bring SQL expressive power to the Linux command line and to provide easy access to text as actual data.
q allows the following:
- Performing SQL-like statements directly on tabular text data, auto-caching the data in order to accelerate additional querying on the same file.
- Performing SQL statements directly on multi-file sqlite3 databases, without having to merge them or load them into memory.
The following table shows the impact of using caching:
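| Rows      | Columns | File Size | Query time without caching | Query time with caching | Speed Improvement |
|-----------|---------|-----------|----------------------------|-------------------------|-------------------|
| 5,000,000 | 100     | 4.8GB     | 4 minutes, 47 seconds      | 1.92 seconds            | x149              |
| 1,000,000 | 100     | 983MB     | 50.9 seconds               | 0.461 seconds           | x110              |
| 1,000,000 | 50      | 477MB     | 27.1 seconds               | 0.272 seconds           | x99               |
| 100,000   | 100     | 99MB      | 5.2 seconds                | 0.141 seconds           | x36               |
| 100,000   | 50      | 48MB      | 2.7 seconds                | 0.105 seconds           | x25               |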
Note that in the current version, caching is not enabled by default, because the cache files take up disk space. Use -C readwrite or -C read to enable it for a single query, or add caching_mode to .qrc to set a new default.
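To make the trade-off concrete, here is a minimal Python sketch of the general idea behind such a cache: parse the text file once, keep the parsed table in an on-disk SQLite file, and reuse that file on later queries. It is an illustration only, not q's actual implementation. The function name open_table, the cache file suffix, and the exact meaning given to the "read" and "readwrite" modes (read reuses an existing cache, readwrite also creates one) are assumptions made for this example, and the sketch does not detect a stale cache when the source file changes.

```python
import csv
import os
import sqlite3
import tempfile


def open_table(csv_path, mode="none"):
    """Return (connection, source), where source is 'cache' or 'parsed'.

    Hypothetical helper: the modes only mirror the names of q's -C values.
    """
    cache_path = csv_path + ".cache.sqlite3"  # hypothetical cache file name
    # Reuse an existing cache when the mode allows reading it.
    if mode in ("read", "readwrite") and os.path.exists(cache_path):
        return sqlite3.connect(cache_path), "cache"
    # Otherwise parse the text; persist the result only in readwrite mode.
    target = cache_path if mode == "readwrite" else ":memory:"
    conn = sqlite3.connect(target)
    with open(csv_path, newline="") as f:
        rows = [r for r in csv.reader(f) if r]
    columns = ", ".join(f"c{i + 1}" for i in range(len(rows[0])))
    marks = ", ".join("?" for _ in rows[0])
    conn.execute(f"CREATE TABLE data ({columns})")
    conn.executemany(f"INSERT INTO data VALUES ({marks})", rows)
    conn.commit()
    return conn, "parsed"


# Small demonstration file in a temporary directory.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "clicks_file.csv")
with open(csv_path, "w", newline="") as f:
    f.write("a,1\nb,2\nc,3\n")

sources = []
for mode in ("read", "readwrite", "read"):
    conn, source = open_table(csv_path, mode)
    total = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    sources.append((mode, source, total))
    conn.close()
print(sources)
```

In this sketch, the first "read" call finds no cache and parses the file, the "readwrite" call parses it again and writes the cache, and the final "read" call is served from the cache without touching the text file. The cost is the extra file on disk, which is the reason given above for leaving caching off by default.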
q treats ordinary files as database tables, and supports all SQL constructs, such as WHERE, GROUP BY, JOINs, etc. It supports automatic column name and type detection, and provides full support for multiple character encodings.
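As a rough picture of what treating a file as a table involves, the following Python sketch loads headerless delimited text into an in-memory SQLite table, names the columns c1, c2, and so on, and guesses a type for each column. It is a sketch under stated assumptions, not q's own code: the helper names detect_type and load_csv_as_table, the sample data, and the simple try-to-parse type detection are all invented for illustration, and the input is assumed to be rectangular (every row has the same number of fields).

```python
import csv
import io
import sqlite3


def detect_type(values):
    """Pick INTEGER, REAL, or TEXT by trying to parse every value in a column."""
    for sql_type, parse in (("INTEGER", int), ("REAL", float)):
        try:
            for v in values:
                parse(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def load_csv_as_table(conn, table, text, delimiter=","):
    """Load headerless delimited text into `table`, naming columns c1..cN."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        raise ValueError("no data rows found")
    width = len(rows[0])
    columns = [f"c{i + 1}" for i in range(width)]
    types = [detect_type([r[i] for r in rows]) for i in range(width)]
    col_defs = ", ".join(f"{c} {t}" for c, t in zip(columns, types))
    conn.execute(f"CREATE TABLE {table} ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    # SQLite's column affinity converts numeric-looking strings on insert.
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return list(zip(columns, types))


# Made-up sample data standing in for a click log.
SAMPLE = "alice,home,12.5\nbob,cart,40.0\ncarol,home,33.1\n"

conn = sqlite3.connect(":memory:")
schema = load_csv_as_table(conn, "clicks", SAMPLE)
count = conn.execute("SELECT COUNT(*) FROM clicks WHERE c3 > 32.3").fetchone()[0]
print(schema, count)
conn.close()
```

Because the third column is detected as REAL, the comparison c3 > 32.3 is numeric rather than textual, which is why type detection matters for queries of this kind.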
Here are some example commands to get the idea:
$ q "SELECT COUNT(*) FROM ./clicks_file.csv WHERE c3 > 32.3"
$ ps -ef | q -H "SELECT UID, COUNT(*) cnt FROM - GROUP BY UID ORDER BY cnt DESC LIMIT 3"
$ q "select count(*) from some_db.sqlite3:::albums a left join another_db.sqlite3:::tracks t on (a.album_id = t.album_id)"
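The last command joins tables that live in two separate sqlite3 files. One way to get the same effect in plain SQLite is ATTACH DATABASE, shown in the Python sketch below. Whether q uses this mechanism internally is not stated here, so treat it as an illustration of a cross-file join rather than a description of q. The helper make_db, the table contents, and the aliases db1 and db2 are made up for the example; only the file, table, and column names are taken from the command above.

```python
import os
import sqlite3
import tempfile


def make_db(path, ddl, insert_sql, rows):
    """Create a small SQLite file with one table (hypothetical test fixture)."""
    conn = sqlite3.connect(path)
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    conn.close()


workdir = tempfile.mkdtemp()
albums_path = os.path.join(workdir, "some_db.sqlite3")
tracks_path = os.path.join(workdir, "another_db.sqlite3")

make_db(
    albums_path,
    "CREATE TABLE albums (album_id INTEGER, title TEXT)",
    "INSERT INTO albums VALUES (?, ?)",
    [(1, "First"), (2, "Second")],
)
make_db(
    tracks_path,
    "CREATE TABLE tracks (track_id INTEGER, album_id INTEGER)",
    "INSERT INTO tracks VALUES (?, ?)",
    [(10, 1), (11, 1)],
)

# Attach both files to one connection so a single query can span them.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ? AS db1", (albums_path,))
conn.execute("ATTACH DATABASE ? AS db2", (tracks_path,))
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM db1.albums a "
    "LEFT JOIN db2.tracks t ON (a.album_id = t.album_id)"
).fetchone()[0]
print(joined_rows)
conn.close()
```

With this sample data the left join yields three rows: two for the album that has tracks and one for the album that has none. Neither file is merged into the other or copied; the query reads both in place.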