162 commits

Author SHA1 Message Date
Balki
e67bdef98e
cut: Consider -o flag even when using regexes with -r (#1823)
* cut: Consider `-o` flag even when using regexes with `-r`

* update doc for cut -r flag
2025-07-03 18:54:09 -04:00
Christian G. Warden
df73ad8ec0
Add surv Verb to Estimate a Survival Curve (#1788)
Add a surv verb to estimate a survival curve using Kaplan-Meier.  It
requires duration and status (event or censored) columns, and outputs
each distinct duration and corresponding probability of survival.
2025-05-15 18:17:08 -04:00
John Kerl
34bc8a1c3d
Fix print within begin{}/end{} (#1795)
* codemod per se

* unit-test coverage

* lint
2025-05-01 17:18:17 -04:00
John Kerl
100166532c
Fix joinv with "" separator (#1794)
* codemod per se

* unit-test coverage
2025-05-01 17:08:55 -04:00
John Kerl
cc1cd954ea
Fix unflatten with field names like . .x or x..y (#1735)
* Fix unflatten with field name like `.` `.x` or `x..y`

* docs & test data
2024-12-23 12:27:08 -05:00
John Kerl
9f77bbe096
Add help strings for -a/-r in sub/gsub/ssub (#1721)
* Help strings for `-a`/`-r` in `sub`/`gsub`/`ssub`

* `mlr regtest -p test/cases/cli-help` to update expected outputs

* artifacts from `make dev`
2024-11-23 10:13:36 -05:00
John Kerl
047cb4bc28
Static-check fixes from @lespea #1657, batch 1/n (#1703)
2024-10-27 11:42:43 -04:00
John Kerl
05aa16cfcf
Join docs wrong link (#1695)
* Fix join-docs link in online help

* run `make dev` and commit the artifacts
2024-10-17 09:11:03 -04:00
Stephen Kitt
7a0320fc27
Typo fix: programmatically (#1679)
Signed-off-by: Stephen Kitt <steve@sk2.org>
2024-10-06 17:30:12 -04:00
John Kerl
31d6164181
Fix 1668 error-source (#1672)
* Fix 1668 error-source

* run `make dev`
2024-10-05 09:25:47 -04:00
John Kerl
4a2f349289
Update source material for #1665 (#1666)
* Fix source info for #1665

* run `make dev`
2024-10-02 08:46:27 -04:00
John Kerl
f33c0b2cd6
Error in splita/splitax when field contains a single non-string value (#1629)
2024-08-25 19:00:24 -04:00
John Kerl
73e2117b43
Misc. codespell findings (#1628)
2024-08-25 17:40:57 -04:00
John Kerl
1015f18e7b
Fix prepipe handling when filenames have whitespace (#1627)
* Fix prepipe handling when filenames have whitespace

* unit-test data

* Windows-only unit-test item

* Fix Windows fails; neaten
2024-08-25 17:40:07 -04:00
John Kerl
16a898cff4
Fix binary data in JSON output (#1626)
2024-08-25 15:00:51 -04:00
John Kerl
202a79d0e2
On-line help for mlr summary --transpose (#1581)
* On-line help for `mlr summary --transpose`

* run `make dev`
2024-06-08 13:37:07 -04:00
John Kerl
16ab199194
Add mad accumulator for stats1 DSL function (#1561)
* Add `mad` accumulator for `stats1` DSL function

* regression files

* make dev output
2024-05-11 15:55:27 -04:00
John Kerl
5ac48516f7
Add a stat DSL function (#1560)
* Add a `stat` DSL function [WIP]

* artifacts from `make dev`

* regression test
2024-05-09 18:39:44 -04:00
John Kerl
83c44e6d74
Add descriptions for put and filter verbs (#1529)
* Add more info in online help about what put/filter do

* `make dev` artifacts
2024-03-16 17:09:01 -04:00
John Kerl
f01bb92da7
Avoid spurious [] on JSON output in some cases (#1528)
* JSON empty vs `[]` handling [WIP]

* unit-test mods
2024-03-16 17:00:59 -04:00
John Kerl
aff4b9f32d
Improved file-not-found handling (#1508)
2024-02-26 00:12:31 -05:00
John Kerl
f5eaf290cf
mlr sparsify (#1498)
* mlr sparsify

* regression-test cases

* typofix

* Remove mods due to processor-architecture change
2024-02-18 10:56:26 -05:00
John Kerl
e5ec9f67bd
Implement all/by-regex field selection (-a/-r) for mlr sub, gsub, and ssub (#1480)
* Code-dedupe `sub`, `gsub`, and `ssub` verbs

* More dedupe

* Start with -a

* Implement -r

* unit-test cases

* Windows command-line parsing
2024-01-23 17:18:13 -05:00
John Kerl
81d11365a0
mlr reorder with regex support [WIP] (#1473)
* mlr reorder with regex support for field-name selection

* neaten

* -r -b/-a; unit-test cases
2024-01-21 15:17:33 -05:00
John Kerl
ac65675ab1
Auto-unsparsify CSV and TSV on output (#1479)
* Auto-unsparsify CSV

* Update unit-test cases

* More unit-test cases

* Key-change handling for CSV output

* Same for TSV, with unit-test and doc updates
2024-01-20 18:43:49 -05:00
John Kerl
af021f28d7
Support markdown format on input (#1478)
* Support markdown on input

* unit-test files

* doc mods

* Unit-test cases for I/O-format keystroke-savers

* -i/-o md as well as -i/-o markdown
2024-01-20 16:51:15 -05:00
John Kerl
36b4654445
Fix typos in tests for PPRINT barred input (#1476)
2024-01-20 14:07:27 -05:00
John Kerl
794a754c36
Support PPRINT barred input (#1472)
* Support PPRINT barred input

* regression-test files

* output from `make dev`

* doc updates
2024-01-20 12:59:12 -05:00
John Kerl
d2559b8387
Have clean_whitespace re-run type inference (#1464)
* Have `clean_whitespace` re-infer types

* make dev output

* unit-test files

* drive-by typofix

* make dev output
2024-01-01 18:39:27 -05:00
John Kerl
e3b98cd621
On-line help info for mlr join --lk "" (#1458)
* Doc info for `mlr join --lk ""`

* make dev output
2023-12-24 12:43:26 -05:00
John Kerl
0e3a54ed68
Implement mlr uniq -x (#1457)
* mlr uniq -x

* unit-test cases

* make dev
2023-12-23 16:20:11 -05:00
John Kerl
c6b745537a
New strmatch/strmatchx DSL functions (#1448)
* New `match`/`matchx` DSL functions

* unit-test cases

* match/matchx -> strmatch/strmatchx

* help strings for strmatch and strmatchx

* update regex doc page re strmatch/strmatchx

* unit-test update
2023-12-19 14:34:54 -05:00
John Kerl
4706b4bb78
Document and unit-test regex-capture reset logic (#1451)
* mlr --norc cat was erroring

* Document and unit-test regex-capture reset logic
2023-12-19 09:47:59 -05:00
John Kerl
b13adbe6c0
mlr --norc cat was erroring (#1450)
2023-12-19 09:33:34 -05:00
John Kerl
4053d7684c
Preserve regex captures across stack frames (#1447)
* privatize state.RegexCaptures

* stack frame for regex captures

* merge

* unit-test case

* docs re stack frames for regex captures

* more
2023-12-18 10:21:09 -05:00
John Kerl
18a9eaa377
Fix ragged-CSV auto-pad (#1428)
2023-11-19 23:53:53 -05:00
John Kerl
5b6a1d4713
JSONL output does not properly handle keys with quotes (#1425)
* mlr --l2j, --j2l

* make dev for previous commit

* fix #1424

* unit-test cases

* iterate
2023-11-11 18:58:49 -05:00
John Kerl
0493a0debd
Fatal-on-data-error mlr -x option (#1373)
* Fatal-on-data-error `mlr -x` option [WIP]

* arithmetic.go error-reason propagation

* more

* more

* more

* renames

* doc page

* namefix

* fix broken test

* make dev
2023-08-30 19:39:22 -04:00
John Kerl
879f272f79
Typofix in uif/uof percentiles (#1375)
* typofix in uif/uof percentiles

* fix regression-test data
2023-08-30 11:13:35 -04:00
John Kerl
5146dd7f90
New contains DSL function (#1374)
* New `contains` DSL function

* unit-test files, and docs
2023-08-27 21:46:24 -04:00
John Kerl
069c068298
Summing up empty data (#1370)
* empty plus value is value

* unit-test cases

* make-docs output

* docs files

* on-line table for null-handling arithmetic rules

* doc mods
2023-08-26 21:24:34 -04:00
John Kerl
d341cc6dd3
DSL functions for summary stats over arrays / maps (#1364)
* DSL stats functions [WIP]

* refactor

* move percentile computation to bifs module; iterate

* mode and antimode

* percentile iterate

* percentile sketching

* neaten

* unit-test iterate

* unify old & new min & max functions

* unit-test cases

* code-dedupe between mode and antimode

* make mode/antimode ties deterministic via first-found-wins rule

* online help strings for new stats DSL functions

* artifacts from `make dev`

* help info on how min/max now recurse into collections

* artifacts from `make dev`

* typofix
2023-08-26 16:02:30 -04:00
John Kerl
4405f732a1
make-dev artifacts from previous commit
2023-08-23 16:19:37 -04:00
Mr. Lance E Sloan
e2338195ba
filename options for split (iss. #1365) (#1366)
* #1365 - filename options for `split`

* Don't use joiner string when prefix is empty.
* Add option to specify joiner string.
* Add option to not URL-escape file names.

* #1365 - update documentation

* #1365 - don't URL-escape file name prefix

I **_thought_** it'd be cool to apply URL-escaping to the file name prefix as well, just in case it included spaces or other characters.  I forgot that a common use for the prefix is to specify a directory path that will contain the file.  When the slashes ("`/`") of the path are URL-escaped, they become "`%2F`" and the directories will not be created.  So, I moved the prefix handling code to come after the URL-escaping.

* #1365 - new `split` options for CLI help output

* #1365 - fix escape/suffix logic error

Trying to make the `return` statement cleaner, I thought it'd be good to add the file name suffix immediately after the file name is URL-escaped.  I'd forgotten that the suffix will not be added if the new `-e` option is used to skip URL-escaping.  So, I put the suffix back where I had it.

* #1365 - add `split` to the "10 minutes" document

Not strictly part of this issue, but as I was checking for docs that I should update as a result of my changes, I noticed this document showed how to split data using the `put` and `tee` combination, but not about the `split` verb.

* #1365 - updated manpage

When I ran `make dev`, generating `data-diving-examples.md` failed.  The two `manpage.txt` files ended up empty, but `mlr.1` seems to be correct.

---------

Co-authored-by: Mr. Lance E Sloan (sloanlance) <sloanlance@users.noreply.github.com>
2023-08-23 16:08:48 -04:00
John Kerl
2107d520fa
Can't use ${field_name} if it contains UTF-8 characters also encodeable as Latin-1 (#1363)
* unit-test data

* docgen

* windows unit-test accommodations
2023-08-20 12:20:15 -04:00
John Kerl
9d1d2e07ca
Do wildcard globbing on Windows (#1362)
* Glob wildcards on Windows

* test/cases/globbing/0001
2023-08-19 17:40:35 -04:00
John Kerl
793f52c470
sub, gsub, and ssub verbs (#1361)
* sub, gsub, and ssub verbs

* doc mods

* content for verbs reference page

* test/cases/verb-sub-gsub-ssub/
2023-08-19 17:23:01 -04:00
John Kerl
d4a3bf99b2
Support ZSTD compression in-process (#1360)
* Support ZSTD compression in-process

* doc mods

* unit-test cases

* doc-gen artifacts
2023-08-19 15:22:59 -04:00
John Kerl
52db2bf422
Small typos in documentation of mlr nest (#1352)
* Typofix in `nest` documentation

* update test/cases/cli-help

* artifacts from `make dev`
2023-08-09 10:50:26 -04:00
John Kerl
b30aceae36
Add %s format specifier for strftime (#1335)
2023-07-04 17:00:02 -04:00