miller/internal/pkg/stream
John Kerl f233923351
Performance improvement: record-batching (#779)
* Rename inputChannel,outputChannel to readerChannel,writerChannel (#772)

* Start batched-reader API mods

* Singleton-list step for reader-batching at input

* CLI options for records-per-batch and hash-records

* Push channelized-reader logic into DKVP reader

* Push batching logic into chain-transformer, transformers, and channel-writer

* foo

* cmd/mprof and cmd/mprof2

* cmd/mprof3 and cmd/mprof4

* narrowed in on regexp-splitting on IFS/IPS as perf-hit

* neaten

* channelize nidx

* cmd/mprof5

* channelize CSV reader

* channelize NIDX reader

* Dedupe DKVP-reader and NIDX-reader source files

* channelize CSV-lite reader

* channelize XTAB reader

* batchify JSON reader

* channelize GEN pseudo-reader

* scripts for perf-testing on larger files

* merge with main for #776

* Fix record-batching for join and repl

* Fix comment-handling in channelized XTAB reader

* Fix bug found in positional-rename
2021-12-13 00:57:52 -05:00
doc.go Standardize Go-package structure (#746) 2021-11-11 14:15:13 -05:00
README.md Standardize Go-package structure (#746) 2021-11-11 14:15:13 -05:00
stream.go Performance improvement: record-batching (#779) 2021-12-13 00:57:52 -05:00

The streamer uses Go channels to connect the stages of the processing pipeline: file reads feed record reading/parsing, which feeds a chain of record transformers, which feeds record writing/formatting, which writes to standard output.

This is the high-level sketch of Miller.