This repository was archived by the owner on Jul 28, 2020. It is now read-only.
support various input and output formats (but not too many)
support usage as either a command-line tool or as a programmatic library
make it fast
Language Variants
A big part of my goal was to learn functional programming by implementing this functionality in various languages.
The goal is for each variant to be functionally equivalent, but I’m not quite there yet.
Here are some notes on the status of each one:
Clojure
Fast — possibly the fastest with small windows
Should already be usable as a library
CoffeeScript / Node
Fast
Probably not safe to use as a library yet
Scala
Slow
Window spec is currently hard-coded to 1 day
To Dos
Definite
Support other event timestamp formats
Support other output formats (starting with JSON)
Support a command-line argument for the CSV separator
Refactor to be usable as a general-purpose library
CoffeeScript: either in Node or in a browser
Possible
Decide whether data must be passed in sorted or not (would allow for some optimizations)
Add behaviour tests!!
Support parallelization (off by default)
Add an option to specify whether weeks should start on Sunday or Monday
Support rollup windows of N months
Support input that is already a rollup, from which we'd produce a coarser rollup
for example, you might store a per-minute rollup in a file and generate a per-hour rollup from it
similar to a re-reduce step
This would work really well with a companion tool that supports incremental processing: it would keep track of its position in a file or stream, read only the new part of the data, then update the pointer
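The rollup-of-a-rollup idea can be sketched roughly as follows. This is only an illustration in Python, not code from any of the variants in this repository. The function name `rerollup`, the epoch-aligned windows, and the dict-of-counts input shape are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def rerollup(counts, window):
    """Merge an existing rollup into a coarser one by summing counts.

    `counts` maps the start of each fine-grained window to its count
    (a hypothetical input shape). `window` is the coarser window size.
    """
    merged = defaultdict(int)
    for start, count in counts.items():
        # Floor each fine-grained window start to its coarser window.
        bucket = EPOCH + ((start - EPOCH) // window) * window
        merged[bucket] += count
    return dict(sorted(merged.items()))


def utc(hour, minute):
    """Helper for the example: a UTC timestamp on 2013-01-01."""
    return datetime(2013, 1, 1, hour, minute, tzinfo=timezone.utc)


per_minute = {utc(0, 5): 3, utc(0, 6): 2, utc(1, 30): 4}
per_hour = rerollup(per_minute, timedelta(hours=1))
```

Summing like this is only correct when the coarser window is a whole multiple of the finer one and both are aligned the same way, so that no fine-grained window straddles two coarse ones.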
About
Tools and libraries to “roll up” sets of event timestamps into counts for specified windows of time
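As a rough illustration of what "rolling up" means here, the sketch below counts event timestamps per fixed-size window. It is written in Python for brevity and is not taken from the Clojure, CoffeeScript, or Scala variants. The names `window_start` and `rollup`, the ISO 8601 input format, and the choice to align windows to the Unix epoch are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts, window):
    """Floor a timezone-aware timestamp to the start of its window."""
    return EPOCH + ((ts - EPOCH) // window) * window


def rollup(timestamps, window):
    """Count events per window, returned in chronological order."""
    counts = Counter(window_start(ts, window) for ts in timestamps)
    return dict(sorted(counts.items()))


raw = [
    "2013-01-01T00:05:00+00:00",
    "2013-01-01T00:45:00+00:00",
    "2013-01-01T01:10:00+00:00",
]
events = [datetime.fromisoformat(s) for s in raw]
hourly = rollup(events, timedelta(hours=1))
# hourly has two entries: the 00:00 window with 2 events
# and the 01:00 window with 1 event.
```

Fixed-length windows such as minutes, hours, and days fit this flooring approach. Calendar-based windows such as weeks with a configurable start day or N-month windows would need different bucketing logic.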