Commit graph

7 commits

Author SHA1 Message Date
John Kerl
66c4a077fd
Make TSV finally true TSV (#923)
* Spec-TSV

* doc mods; more test cases
2022-02-06 00:13:55 -05:00
John Kerl
a2a9118ad8
Implement shift-lead option for mlr step (#893)
* iterating

* stepper-input refactor in prep for sliding-window PR

* window-keeper util class

* integrate window-keeper into step-transformer
2022-01-23 00:54:39 -05:00
John Kerl
e10fee0724
Improve type-inference performance (#809)
* To-do items for broader platform/go-version benchmarking

* neaten inferrer API

* extend type-inference unit-test cases

* Add benchmark scripts for comparing compiler versions

* mlr version in addition to mlr --version

* some go-benchmark files for Mac/Linux perf comparisons

* neaten perf-scripts

* merge

* type-scan optimization tests

* type-scan optimization infra

* test new inferrer

* mlr --time option

* include --cpuprofile and --traceprofile in on-line help

* sharpen inferred/deferred-type API distinction

* replace old inferrer with newer/faster

* update docs for new type-inferrer
2021-12-27 00:54:21 -05:00
John Kerl
157e567909
Dedupe field names by default (#794)
2021-12-22 21:07:29 -05:00
John Kerl
7a97c9b868
Performance improvement by JIT type inference (#786)
* JIT mlrval type-inference: mlrval package

* mlrmap refactor

* complete merge from #779

* iterating

* mlrval/format.go

* mlrval/copy.go

* bifs/arithmetic_test.go

* iterate on bifs/collections_test.go

* mlrval_cmp.go

* mlrval JSON iterate

* iterate applying mlrval refactors to dependent packages

* first clean compile in a long while on this branch

* results of first post-compile profiling

* testing

* bugfix in ofmt formatting

* bugfix in octal-suppress

* go fmt

* neaten

* regression tests all passing
2021-12-20 23:56:04 -05:00
John Kerl
f233923351
Performance improvement: record-batching (#779)
* Rename inputChannel,outputChannel to readerChannel,writerChannel

* Rename inputChannel,outputChannel to readerChannel,writerChannel (#772)

* Start batched-reader API mods

* Singleton-list step for reader-batching at input

* CLI options for records-per-batch and hash-records

* Push channelized-reader logic into DKVP reader

* Push batching logic into chain-transformer, transformers, and channel-writer

* foo

* cmd/mprof and cmd/mprof2

* cmd/mprof3 and cmd/mprof4

* narrowed in on regexp-splitting on IFS/IPS as perf-hit

* neaten

* channelize nidx

* cmd/mprof5

* channelize CSV reader

* channelize NIDX reader

* Dedupe DKVP-reader and NIDX-reader source files

* channelize CSV-lite reader

* channelize XTAB reader

* batchify JSON reader

* channelize GEN pseudo-reader

* scripts for perf-testing on larger files

* merge with main for #776

* Fix record-batching for join and repl

* Fix comment-handling in channelized XTAB reader

* Fix bug found in positional-rename
2021-12-13 00:57:52 -05:00
John Kerl
bc72cd1857
More Go-package restructuring (#748)
2021-11-12 12:49:55 -05:00