ecsv is a fast NIF CSV parser and writer based on libcsv.
The main purpose of the module is fast parsing of CSV data in gigabyte volumes.
This requirement makes a stream-oriented API necessary (see ecsv:parse_stream/4).
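
To illustrate what "stream-oriented" means here, the following is a minimal C sketch of an incremental CSV counter whose state survives between calls, so the input can arrive in chunks of any size. It is not ecsv's or libcsv's code: the names `csv_counter`, `csv_count_init`, `csv_count_feed` and `csv_count_finish` are hypothetical, and the quoting rules are deliberately simplified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parser state; it survives between calls to csv_count_feed(). */
typedef struct {
    size_t fields;   /* fields completed so far */
    size_t rows;     /* rows completed so far */
    int in_quotes;   /* non-zero while inside a quoted field */
    int row_open;    /* non-zero if the current row has unflushed data */
} csv_counter;

static void csv_count_init(csv_counter *c) {
    c->fields = 0;
    c->rows = 0;
    c->in_quotes = 0;
    c->row_open = 0;
}

/* Consume one chunk. A chunk may end anywhere, even in the middle of a field. */
static void csv_count_feed(csv_counter *c, const char *buf, size_t len) {
    assert(c != NULL && (buf != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        char ch = buf[i];
        if (ch == '"') {
            /* Simplification: a doubled "" toggles twice, so the state is kept. */
            c->in_quotes = !c->in_quotes;
            c->row_open = 1;
        } else if (c->in_quotes) {
            c->row_open = 1;      /* delimiters inside quotes are plain data */
        } else if (ch == ',') {
            c->fields++;          /* a comma closes the current field */
            c->row_open = 1;
        } else if (ch == '\n') {
            c->fields++;          /* a newline closes the last field of the row */
            c->rows++;
            c->row_open = 0;
        } else {
            c->row_open = 1;
        }
    }
}

/* Flush a final row that has no trailing newline. */
static void csv_count_finish(csv_counter *c) {
    if (c->row_open) {
        c->fields++;
        c->rows++;
        c->row_open = 0;
    }
}
```

Because all state lives in `csv_counter`, feeding the same data in one call or one byte at a time gives the same counts. A real stream parser must additionally keep the partially parsed field data itself between calls.
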
ecsv:write/1 and ecsv:write_lines/1 don't perform as
expected. For an unknown reason, they are slower than a pure Erlang implementation
when that implementation is compiled using HiPE and used in a real application. On the other hand,
ecsv:parse_stream/5 met expectations and parses around 100MB/s on
commodity HW (i7 2.6GHz).
The current implementation uses enif_make_new_binary() for parsed fields. In
our experience, this call allocates small binaries on the process heap, in
contrast to enif_alloc_binary(), which always allocates on the binary heap and
is slightly slower. Fields from the currently parsed line are kept between NIF
calls in a separate environment, which can lead to bad behavior when parse_raw/3 is called with very short binaries and there is a long row with
many fields. In the extreme case, one long line with short or, at worst,
empty fields leads to quadratic behavior if fed one byte at a time. If you
need to parse CSV files with rows longer than 20kB that contain thousands of fields,
you should probably use another parser or fix the issue.
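
The arithmetic behind the quadratic case can be sketched with a small cost model in C. It is an assumption, not a measurement of ecsv: the model supposes that every call does work proportional to the number of fields already kept for the unfinished row, and that the row consists of empty fields so each input byte completes one field. The function name `pending_field_work` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed cost model: a row of n_fields empty fields is fed in chunks of
 * `chunk` bytes, each byte completes one field, and every call touches all
 * fields kept so far for the unfinished row. Returns the total work units.
 */
static unsigned long long pending_field_work(size_t n_fields, size_t chunk) {
    assert(chunk > 0);
    unsigned long long work = 0;
    size_t pending = 0;
    while (pending < n_fields) {
        size_t left = n_fields - pending;
        size_t step = left < chunk ? left : chunk;
        pending += step;   /* fields completed by this call */
        work += pending;   /* the call revisits every pending field */
    }
    return work;
}
```

Under this model a row of 1000 empty fields costs 1000 units when fed in one call, 5500 units in 100-byte chunks, and 500500 units (1000 * 1001 / 2) when fed one byte at a time, which is why larger input chunks avoid the problem.
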