Commit graph

30 commits

Author SHA1 Message Date
Abirdcfly
17420e9594
delete unreachable test code caused by os.Exit (#1073)
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-08-11 08:13:49 -04:00
Fulvio Scapin
c9e0559cf6
[Docs] moving --xvright out of the FLATTEN-UNFLATTEN FLAGS section (#1065) 2022-08-01 09:53:40 -04:00
John Kerl
de9dbfc212 Fix panic on 'mlr sort -n' 2022-03-28 23:26:37 -04:00
John Kerl
8f04d7671d
Restore --tsvlite (#984)
* Restore --tsvlite flag

* todo

* doc-build artifacts

* doc note on --tsv vs --tsvlite and backslashed data
2022-03-15 09:01:20 -04:00
John Kerl
228f73415e
Add --implicit-tsv-header as alias for --implicit-csv-header, etc (#952)
* Add --implicit-tsv-header as alias for --implicit-csv-header, etc

* doc-build artifacts for previous commit
2022-02-21 12:48:43 -05:00
John Kerl
d637559fea
On-line help for -s flag (#926)
* On-line help for -s flag

* doc-build artifacts
2022-02-06 11:20:10 -05:00
John Kerl
9f5a11f707
New --lazy-quotes flag for helping with malformed CSV (#925)
* Lazy-quotes option for CSV parser

* doc-build artifacts

* test cases

* doc-proofreads
2022-02-06 01:05:35 -05:00
John Kerl
66c4a077fd
Make TSV finally true TSV (#923)
* Spec-TSV

* doc mods; more test cases
2022-02-06 00:13:55 -05:00
John Kerl
56ef6d30b1
--nidx --fs x should be the same as --fs x --nidx (#912)
* Fix multiple on-line-help issues from #907

* build-artifacts for previous commit

* --nidx --fs x should be the same as --fs x --nidx
2022-02-01 00:14:02 -05:00
John Kerl
bef2fa74de
Update default colorization (#904)
* colorization experiment

* todo

* Add dependency on github.com/johnkerl/lumin

* lumin dependency

* more badges in README.md

* on-line help for bold/underline/reverse

* update webdocs
2022-01-30 14:12:47 -05:00
Stephen Kitt
d536318ed6
Use int64 wherever "64-bit integer" is assumed (#902)
Miller assumes 64-bit integers, but in Go, the int type varies in size
depending on the architecture: 32-bit architectures have int
equivalent to int32. As a result, the supported range of integer
values is greatly reduced on 32-bit architectures compared to what is
suggested by the documentation.

This patch explicitly uses int64 wherever 64-bit integers are
assumed.

Test cases affected by the behaviour of the random generator are
updated to reflect the new values (the existing seed doesn't produce
the same behaviour since the way random values are generated has
changed).

Signed-off-by: Stephen Kitt <steve@sk2.org>
2022-01-27 12:06:25 -05:00
John Kerl
dad6456022
Clarify source for printf-style formatting (#895) 2022-01-24 22:52:38 -05:00
John Kerl
da91878939
Distinguish between JSON and JSON Lines formats (#844)
* Draw a distinction between JSON and JSON Lines formats

* Add JSON Lines to on-line help example

* Have JSON format default to --jlistwrap and --jvstack

* Update test cases for --jlistwrap output for JSON output format

* Have JSON format default to --jlistwrap and --jvstack for --{X}2j as well

* Make --jlistwrap / --jvstack as legacy flags, since now --json and --jsonl

* Add --c2l, --l2c, etc. command-line flags

* docmods for JSON Lines

* Update regression-test cases for JSON / JSON Lines distinction
2022-01-09 11:11:54 -05:00
John Kerl
1b9526e585
More codespell fixes (#834)
* Fix mlr tail -n4

* More codespell fixes
2022-01-03 21:40:53 -05:00
John Kerl
e10fee0724
Improve type-inference performance (#809)
* To-do items for broader platform/go-version benchmarking

* neaten inferrer API

* extend type-inference unit-test cases

* Add benchmark scripts for comparing compiler versions

* mlr version in addition to mlr --version

* some go-benchmark files for Mac/Linux perf comparisons

* neaten perf-scripts

* merge

* type-scan optimization tests

* type-scan optimization infra

* test new inferrer

* mlr --time option

* include --cpuprofile and --traceprofile in on-line help

* sharpen inferred/deferred-type API distinction

* replace old inferrer with newer/faster

* update docs for new type-inferrer
2021-12-27 00:54:21 -05:00
John Kerl
984df274bb Minor neatens 2021-12-25 12:16:10 -05:00
John Kerl
096bb9bc12
Make --ifs-regex and --ips-regex explicit command-line flags (#799)
* Function-pointerize IXS/IXSRegex to reduce runtime iffelsing

* remove IsRegexString and SuppressIXSRegex

* regression tests passing

* doc updates
2021-12-25 00:00:18 -05:00
John Kerl
9c8f8680d6 Fix minor doc-formatting issue on PR #794 2021-12-22 21:29:39 -05:00
John Kerl
157e567909
Dedupe field names by default (#794) 2021-12-22 21:07:29 -05:00
John Kerl
93862f16f9
update mlr -O behavior for #756 (#788) 2021-12-21 22:40:34 -05:00
John Kerl
7a97c9b868
Performance improvement by JIT type inference (#786)
* JIT mlrval type-inference: mlrval package

* mlrmap refactor

* complete merge from #779

* iterating

* mlrval/format.go

* mlrval/copy.go

* bifs/arithmetic_test.go

* iterate on bifs/collections_test.go

* mlrval_cmp.go

* mlrval JSON iterate

* iterate applying mlrval refactors to dependent packages

* first clean compile in a long while on this branch

* results of first post-compile profiling

* testing

* bugfix in ofmt formatting

* bugfix in octal-suppress

* go fmt

* neaten

* regression tests all passing
2021-12-20 23:56:04 -05:00
John Kerl
58d9ad19bc
Record hashing perf (#781)
* todo

* Rename inputChannel,outputChannel to readerChannel,writerChannel

* Rename inputChannel,outputChannel to readerChannel,writerChannel (#772)

* Start batched-reader API mods

* Singleton-list step for reader-batching at input

* CLI options for records-per-batch and hash-records

* Push channelized-reader logic into DKVP reader

* Push batching logic into chain-transformer, transformers, and channel-writer

* foo

* cmd/mprof and cmd/mprof2

* cmd/mprof3 and cmd/mprof4

* narrowed in on regexp-splitting on IFS/IPS as perf-hit

* neaten

* channelize nidx

* cmd/mprof5

* channelize CSV reader

* channelize NIDX reader

* Dedupe DKVP-reader and NIDX-reader source files

* channelize CSV-lite reader

* channelize XTAB reader

* batchify JSON reader

* channelize GEN pseudo-reader

* scripts for perf-testing on larger files

* merge with main for #776

* Fix record-batching for join and repl

* Fix comment-handling in channelized XTAB reader

* Fix bug found in positional-rename

* Use --no-hash-records by default
2021-12-14 22:41:40 -05:00
John Kerl
f233923351
Performance improvement: record-batching (#779)
* Rename inputChannel,outputChannel to readerChannel,writerChannel

* Rename inputChannel,outputChannel to readerChannel,writerChannel (#772)

* Start batched-reader API mods

* Singleton-list step for reader-batching at input

* CLI options for records-per-batch and hash-records

* Push channelized-reader logic into DKVP reader

* Push batching logic into chain-transformer, transformers, and channel-writer

* foo

* cmd/mprof and cmd/mprof2

* cmd/mprof3 and cmd/mprof4

* narrowed in on regexp-splitting on IFS/IPS as perf-hit

* neaten

* channelize nidx

* cmd/mprof5

* channelize CSV reader

* channelize NIDX reader

* Dedupe DKVP-reader and NIDX-reader source files

* channelize CSV-lite reader

* channelize XTAB reader

* batchify JSON reader

* channelize GEN pseudo-reader

* scripts for perf-testing on larger files

* merge with main for #776

* Fix record-batching for join and repl

* Fix comment-handling in channelized XTAB reader

* Fix bug found in positional-rename
2021-12-13 00:57:52 -05:00
John Kerl
7c9cc61ac9
Fix issue with pipe or dot as IFS for DKVP/NIDX/CSV-lite (#778) 2021-12-12 23:25:16 -05:00
John Kerl
107ab5c4c9
Performance improvement for CSV-lite and DKVP (#774)
* Clarify build-from-source steps

* Performance improvement for CSV-lite and DKVP
2021-12-08 22:47:53 -05:00
John Kerl
cef753c232
Performance improvement: buffered output (#765)
* Experimental #!/usr/bin/env mlr -s feature

* Allow flags after verbs, for shebang support

* Performance optimization: less frequent syscall.write on output
2021-12-01 20:12:31 -05:00
John Kerl
d9dbb1e92e
Improvements for shebang-script handling (#758)
* Experimental #!/usr/bin/env mlr -s feature

* Allow flags after verbs, for shebang support
2021-11-24 01:15:28 -05:00
John Kerl
4394819cf8
Approximate-match feature for online help (#754)
* Explicitly support approximate-match help

* Unit tests for exact and approximate help
2021-11-17 23:37:48 -05:00
John Kerl
bc72cd1857
More Go-package restructuring (#748) 2021-11-12 12:49:55 -05:00
John Kerl
e2b6ec2391
Standardize Go-package structure (#746) 2021-11-11 14:15:13 -05:00