Fix alignment & formatting for webdoc/manpage autogen

John Kerl 2021-09-07 23:40:15 -04:00
parent be1b026ff3
commit a34fdafe8a
16 changed files with 2778 additions and 1657 deletions


@ -547,14 +547,41 @@ While you can do format conversion using `mlr --icsv --ojson cat myfile.csv`, th
<b>mlr help format-conversion</b>
</pre>
<pre class="pre-non-highlight-in-pair">
TO DO: brief list of formats w/ xref to m6 webdocs.
Examples: --csv for CSV-formatted input and output; --icsv --opprint for
CSV-formatted input and pretty-printed output.
Please use --iformat1 --oformat2 rather than --format1 --oformat2.
The latter sets up input and output flags for format1, not all of which
are overridden in all cases by setting output format to format2.
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help basic-examples
mlr help data-formats
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
</pre>
<!---
@ -583,21 +610,41 @@ You can include comments within your data files, and either have them ignored, o
<b>mlr help comments-in-data</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Miller lets you put comments in your data, such as
# This is a comment for a CSV file
a,b,c
1,2,3
4,5,6
Notes:
* Comments are only honored at the start of a line.
* In the absence of any of the below four options, comments are data like
any other text. (The comments-in-data feature is opt-in.)
* When `--pass-comments` is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help basic-examples
mlr help data-formats
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
</pre>
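The comment-handling notes in this hunk can be illustrated with a small Python model. This is only a sketch of the documented behavior, not Miller's implementation: the function name `read_with_comments` and its `mode` and `passthrough` parameters are hypothetical, chosen for illustration.

```python
import io


def read_with_comments(lines, mode=None, prefix="#", passthrough=None):
    """Return the data lines, handling comment lines according to `mode`.

    mode=None   : comments are ordinary data (the feature is opt-in)
    mode="skip" : comment lines are dropped (models --skip-comments)
    mode="pass" : comment lines are written to `passthrough` as soon as
                  they are read, outside the record stream
                  (models --pass-comments)
    """
    data = []
    for line in lines:
        # Comments are only honored at the start of a line.
        if mode is not None and line.startswith(prefix):
            if mode == "pass" and passthrough is not None:
                passthrough.write(line + "\n")
            continue
        data.append(line)
    return data


lines = ["# This is a comment for a CSV file", "a,b,c", "1,2,3", "4,5,6"]
passed = io.StringIO()
kept_as_data = read_with_comments(lines)
skipped = read_with_comments(lines, mode="skip")
passed_through = read_with_comments(lines, mode="pass", passthrough=passed)
```

Because passed comments bypass the record stream, they can appear in the output before records that preceded them in the input, which is why the notes suggest placing comments at the start of data files.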
Examples:


@ -28,9 +28,9 @@ NAME
as CSV and tabular JSON.
SYNOPSIS
Usage: mlr [I/O options] {verb} [verb-dependent options ...] {zero or
more file names} Output of one verb may be chained as input to another
using "then", e.g.
Usage: mlr [flags] {verb} [verb-dependent options ...] {zero or more
file names} Output of one verb may be chained as input to another using
"then", e.g.
mlr stats1 -a min,mean,max -f flag,u,v -g color then sort -f color
Please see 'mlr help topics' for more information. Please also see
https://johnkerl.org/miller6
@ -116,6 +116,43 @@ DATA FORMATS
| fox jumped | Record 2: "1":"fox", "2":"jumped"
+---------------------+
HELP OPTIONS
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help basic-examples
mlr help data-formats
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
VERB LIST
altkv bar bootstrap cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
@ -149,117 +186,171 @@ FUNCTION LIST
version ! != !=~ % & && * ** + - . .* .+ .- ./ / // &lt; &lt;&lt; &lt;= == =~ &gt; &gt;= &gt;&gt; &gt;&gt;&gt;
?: ?? ??? ^ ^^ | || ~
HELP OPTIONS
Type 'mlr help {topic}' for any of the following:
mlr help topics
mlr help auxents
mlr help basic-examples
mlr help comments-in-data
mlr help compressed-data
mlr help csv-options
mlr help data-format-options
mlr help data-formats
mlr help double-quoting
mlr help format-conversion
mlr help function
mlr help keyword
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help list-functions-as-paragraph
mlr help list-functions-as-table
mlr help list-keywords
mlr help list-keywords-as-paragraph
mlr help list-verbs
mlr help list-verbs-as-paragraph
mlr help misc
mlr help mlrrc
mlr help number-formatting
mlr help output-colorization
mlr help separator-options
mlr help type-arithmetic-info
mlr help usage-functions
mlr help usage-functions-by-class
mlr help usage-keywords
mlr help usage-verbs
mlr help verb
Shorthands:
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
COMMENTS-IN-DATA FLAGS
Miller lets you put comments in your data, such as
OPTIONS
In the following option flags, the version with "i" designates the
input stream, "o" the output stream, and the version without prefix
sets the option for both input and output stream. For example: --irs
sets the input record separator, --ors the output record separator, and
--rs sets both the input and output separator to the given value.
# This is a comment for a CSV file
a,b,c
1,2,3
4,5,6
DATA-FORMAT OPTIONS
--idkvp --odkvp --dkvp Delimited key-value pairs, e.g. "a=1,b=2"
(Miller's default format).
Notes:
--inidx --onidx --nidx Implicitly-integer-indexed fields (Unix-toolkit style).
-T Synonymous with "--nidx --fs tab".
* Comments are only honored at the start of a line.
* In the absence of any of the below four options, comments are data like
any other text. (The comments-in-data feature is opt-in.)
* When `--pass-comments` is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
--icsv --ocsv --csv Comma-separated value (or tab-separated with --fs tab, etc.)
--pass-comments Immediately print commented lines (prefixed by `#`)
within the input.
--pass-comments-with {string}
Immediately print commented lines within input, with
specified prefix.
--skip-comments Ignore commented lines (prefixed by `#`) within the
input.
--skip-comments-with {string}
Ignore commented lines within input, with specified
prefix.
--itsv --otsv --tsv Keystroke-savers for "--icsv --ifs tab",
"--ocsv --ofs tab", "--csv --fs tab".
--iasv --oasv --asv Similar but using ASCII FS 0x1f and RS 0x1e
--iusv --ousv --usv Similar but using Unicode FS U+241F (UTF-8 0xe2909f)
and RS U+241E (UTF-8 0xe2909e)
COMPRESSED-DATA FLAGS
Miller offers a few different ways to handle reading data files which have been compressed.
--icsvlite --ocsvlite --csvlite Comma-separated value (or tab-separated with --fs tab, etc.).
The 'lite' CSV does not handle RFC-CSV double-quoting rules; is
slightly faster and handles heterogeneity in the input stream via
empty newline followed by new header line. See also
https://johnkerl.org/miller6/file-formats.html#csv-tsv-asv-usv-etc
* Decompression done within the Miller process itself: `--bz2in` `--gzin` `--zin`
* Decompression done outside the Miller process: `--prepipe` `--prepipex`
--itsvlite --otsvlite --tsvlite Keystroke-savers for "--icsvlite --ifs tab",
"--ocsvlite --ofs tab", "--csvlite --fs tab".
-t Synonymous with --tsvlite.
--iasvlite --oasvlite --asvlite Similar to --itsvlite et al. but using ASCII FS 0x1f and RS 0x1e
--iusvlite --ousvlite --usvlite Similar to --itsvlite et al. but using Unicode FS U+241F (UTF-8 0xe2909f)
and RS U+241E (UTF-8 0xe2909e)
Using `--prepipe` and `--prepipex` you can specify an action to be
taken on each input file. The prepipe command must be able to read from
standard input; it will be invoked with `{command} &lt; {filename}`. The
prepipex command must take a filename as argument; it will be invoked with
`{command} {filename}`.
--ipprint --opprint --pprint Pretty-printed tabular (produces no
output until all input is in).
--right Right-justifies all fields for PPRINT output.
--barred Prints a border around PPRINT output
(only available for output).
Examples:
--omd Markdown-tabular (only available for output).
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--ixtab --oxtab --xtab Pretty-printed vertical-tabular.
--xvright Right-justifies values for XTAB format.
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
`mlr ... | {your compression command} &gt; outputfilenamegoeshere`
--ijson --ojson --json JSON tabular: sequence or list of one-level
maps: {...}{...} or [{...},{...}].
--jvstack Put one key-value pair per line for JSON output.
--no-jvstack Put objects/arrays all on one line for JSON output.
--jsonx --ojsonx Keystroke-savers for --json --jvstack
--jsonx --ojsonx and --ojson --jvstack, respectively.
--jlistwrap Wrap JSON output in outermost [ ].
--flatsep {string} Separator for flattening multi-level JSON keys,
e.g. '{"a":{"b":3}}' becomes a:b =&gt; 3 for
non-JSON formats. Defaults to ".".
Lastly, note that if `--prepipe` or `--prepipex` is specified, it replaces any
decisions that might have been made based on the file suffix. Likewise,
`--gzin`/`--bz2in`/`--zin` are ignored if `--prepipe` is also specified.
-p is a keystroke-saver for --nidx --fs space --repifs
--bz2in Uncompress bzip2 within the Miller process. Done by
default if file ends in `.bz2`.
--gzin Uncompress gzip within the Miller process. Done by
default if file ends in `.gz`.
--prepipe {decompression command}
You can, of course, already do without this for
single input files, e.g. `gunzip &lt; myfile.csv.gz |
mlr ...`. Allowed at the command line, but not in
`.mlrrc` to avoid unexpected code execution.
--prepipe-bz2 Same as `--prepipe bz2`, except this is allowed in
`.mlrrc`.
--prepipe-gunzip Same as `--prepipe gunzip`, except this is allowed in
`.mlrrc`.
--prepipe-zcat Same as `--prepipe zcat`, except this is allowed in
`.mlrrc`.
--prepipex {decompression command}
Like `--prepipe` with one exception: doesn't insert
`&lt;` between command and filename at runtime. Useful
for some commands like `unzip -qc` which don't read
standard input. Allowed at the command line, but not
in `.mlrrc` to avoid unexpected code execution.
--zin Uncompress zlib within the Miller process. Done by
default if file ends in `.z`.
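The difference between `--prepipe` and `--prepipex`, and the rule that either one overrides suffix-based and flag-based decompression, can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Miller's code: `plan_input` and `SUFFIX_CODECS` are hypothetical names, the shell-quoting of the filename is an assumption, and the text does not say what happens when both `--prepipe` and `--prepipex` are given or when an explicit in-process flag disagrees with the suffix, so the orderings below are guesses.

```python
import shlex

# Suffixes that trigger in-process decompression by default, per the flag table.
SUFFIX_CODECS = {".bz2": "bzip2", ".gz": "gzip", ".z": "zlib"}


def plan_input(filename, prepipe=None, prepipex=None, in_process=None):
    """Describe how a file would be opened.

    Returns a shell command string for prepipe/prepipex, otherwise the name
    of the in-process codec ("gzip", "bzip2", "zlib") or "plain".
    """
    quoted = shlex.quote(filename)  # assumption: the filename is shell-quoted
    if prepipe is not None:
        # --prepipe: the command reads standard input.
        return f"{prepipe} < {quoted}"
    if prepipex is not None:
        # --prepipex: the command takes the filename as an argument.
        return f"{prepipex} {quoted}"
    if in_process is not None:
        # Assumption: an explicit --gzin/--bz2in/--zin wins over the suffix.
        return in_process
    for suffix, codec in SUFFIX_CODECS.items():
        if filename.endswith(suffix):
            return codec
    return "plain"
```

For example, `plan_input("myfile.csv.gz", prepipe="zcat -cf")` yields the command string with `<` inserted, while `plan_input("archive.zip", prepipex="unzip -qc")` yields the command followed directly by the filename.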
Examples: --csv for CSV-formatted input and output; --icsv --opprint for
CSV-ONLY FLAGS
--allow-ragged-csv-input or --ragged
If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line
has more fields than the header line, use integer
field labels as in the implicit-header case.
--headerless-csv-output Print only CSV data lines; do not print CSV header
lines.
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line
1 of input files. Tip: combine with `label` to
recreate missing headers.
--no-implicit-csv-header Opposite of `--implicit-csv-header`. This is the
default anyway -- the main use is for the flags to
`mlr join` if you have main file(s) which are
headerless but you want to join in on a file which
does have a CSV header. Then you could use `mlr --csv
--implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ...
your-headerless.csv`.
-N Keystroke-saver for `--implicit-csv-header
--headerless-csv-output`.
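As a rough illustration of the `--allow-ragged-csv-input` rule described above, the following Python sketch builds one record from a header and a data line of a different length. The function name `ragged_record` is hypothetical, and the labeling of surplus fields by their 1-up position is my reading of "integer field labels as in the implicit-header case", not something this diff spells out.

```python
def ragged_record(header, fields):
    """Pair header keys with data fields, tolerating a length mismatch."""
    record = {}
    for position, value in enumerate(fields, start=1):
        if position <= len(header):
            key = header[position - 1]
        else:
            # Surplus data field: label it by its 1-up position.
            key = str(position)
        record[key] = value
    # Too few data fields: fill the remaining keys with the empty string.
    for key in header[len(fields):]:
        record[key] = ""
    return record


short = ragged_record(["a", "b", "c"], ["1", "2"])
long = ragged_record(["a", "b", "c"], ["1", "2", "3", "4"])
```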
FILE-FORMAT FLAGS
TO DO: brief list of formats w/ xref to m6 webdocs.
Examples: `--csv` for CSV-formatted input and output; `--icsv --opprint` for
CSV-formatted input and pretty-printed output.
Please use --iformat1 --oformat2 rather than --format1 --oformat2.
The latter sets up input and output flags for format1, not all of which
are overridden in all cases by setting output format to format2.
Please use `--iformat1 --oformat2` rather than `--format1 --oformat2`.
The latter sets up input and output flags for `format1`, not all of which
are overridden in all cases by setting output format to `format2`.
FORMAT-CONVERSION KEYSTROKE-SAVERS
--asv or --asvlite Use ASV format for input and output data.
--csv or -c Use CSV format for input and output data.
--csvlite Use CSV-lite format for input and output data.
--dkvp Use DKVP format for input and output data.
--iasv or --iasvlite Use ASV format for input data.
--icsv Use CSV format for input data.
--icsvlite Use CSV-lite format for input data.
--idkvp Use DKVP format for input data.
--ijson Use JSON format for input data.
--inidx Use NIDX format for input data.
--io {format name} Use format name for input and output data. For
example: `--io csv` is the same as `--csv`.
--ipprint Use PPRINT format for input data.
--itsv Use TSV format for input data.
--itsvlite Use TSV-lite format for input data.
--iusv or --iusvlite Use USV format for input data.
--ixtab Use XTAB format for input data.
--json or -j Use JSON format for input and output data.
--nidx Use NIDX format for input and output data.
--oasv or --oasvlite Use ASV format for output data.
--ocsv Use CSV format for output data.
--ocsvlite Use CSV-lite format for output data.
--odkvp Use DKVP format for output data.
--ojson Use JSON format for output data.
--omd Use markdown-tabular format for output data.
--onidx Use NIDX format for output data.
--opprint Use PPRINT format for output data.
--otsv Use TSV format for output data.
--otsvlite Use TSV-lite format for output data.
--ousv or --ousvlite Use USV format for output data.
--oxtab Use XTAB format for output data.
--pprint Use PPRINT format for input and output data.
--tsv Use TSV format for input and output data.
--tsvlite or -t Use TSV-lite format for input and output data.
--usv or --usvlite Use USV format for input and output data.
--xtab Use XTAB format for input and output data.
-i {format name} Use format name for input data. For example: `-i csv`
is the same as `--icsv`.
-o {format name} Use format name for output data. For example: `-o
csv` is the same as `--ocsv`.
FLATTEN-UNFLATTEN FLAGS
--flatsep or --jflatsep or --oflatsep {string}
Separator for flattening multi-level JSON keys, e.g.
`{"a":{"b":3}}` becomes `a:b =&gt; 3` for non-JSON
formats. Defaults to `.`.
--no-auto-flatten
--no-auto-unflatten
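The `--flatsep` behavior can be sketched as a recursive key-joining function. This is a minimal Python model that handles nested maps only; arrays and empty maps are deliberately left out, and the name `flatten` and its parameters are illustrative rather than Miller's API.

```python
def flatten(value, flatsep=".", prefix=""):
    """Flatten nested dicts into one level, joining keys with `flatsep`."""
    flat = {}
    for key, child in value.items():
        name = f"{prefix}{flatsep}{key}" if prefix else str(key)
        if isinstance(child, dict):
            # Recurse, carrying the joined key as the new prefix.
            flat.update(flatten(child, flatsep, name))
        else:
            flat[name] = child
    return flat


default_sep = flatten({"a": {"b": 3}})
colon_sep = flatten({"a": {"b": 3}}, flatsep=":")
```

With the default separator the key is `a.b`; with `flatsep=":"` it is `a:b`, matching the example in the flag description.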
FORMAT-CONVERSION KEYSTROKE-SAVER FLAGS
As keystroke-savers for format-conversion you may use the following:
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--t2c --t2d --t2n --t2j --t2x --t2p --t2m
--d2c --d2t --d2n --d2j --d2x --d2p --d2m
--n2c --n2t --n2d --n2j --n2x --n2p --n2m
@ -270,130 +361,270 @@ OPTIONS
PPRINT, and markdown, respectively. Note that markdown format is available for
output only.
SEPARATORS
THIS IS STILL TBD FOR MILLER 6
--c2b Use CSV for input, PPRINT with `--barred` for output.
--c2d Use CSV for input, DKVP for output.
--c2j Use CSV for input, JSON for output.
--c2m Use CSV for input, markdown-tabular for output.
--c2n Use CSV for input, NIDX for output.
--c2p Use CSV for input, PPRINT for output.
--c2t Use CSV for input, TSV for output.
--c2x Use CSV for input, XTAB for output.
--d2b Use DKVP for input, PPRINT with `--barred` for
output.
--d2c Use DKVP for input, CSV for output.
--d2j Use DKVP for input, JSON for output.
--d2m Use DKVP for input, markdown-tabular for output.
--d2n Use DKVP for input, NIDX for output.
--d2p Use DKVP for input, PPRINT for output.
--d2t Use DKVP for input, TSV for output.
--d2x Use DKVP for input, XTAB for output.
--j2b Use JSON for input, PPRINT with --barred for output.
--j2c Use JSON for input, CSV for output.
--j2d Use JSON for input, DKVP for output.
--j2m Use JSON for input, markdown-tabular for output.
--j2n Use JSON for input, NIDX for output.
--j2p Use JSON for input, PPRINT for output.
--j2t Use JSON for input, TSV for output.
--j2x Use JSON for input, XTAB for output.
--n2b Use NIDX for input, PPRINT with `--barred` for
output.
--n2c Use NIDX for input, CSV for output.
--n2d Use NIDX for input, DKVP for output.
--n2j Use NIDX for input, JSON for output.
--n2m Use NIDX for input, markdown-tabular for output.
--n2p Use NIDX for input, PPRINT for output.
--n2t Use NIDX for input, TSV for output.
--n2x Use NIDX for input, XTAB for output.
--p2c Use PPRINT for input, CSV for output.
--p2d Use PPRINT for input, DKVP for output.
--p2j Use PPRINT for input, JSON for output.
--p2m Use PPRINT for input, markdown-tabular for output.
--p2n Use PPRINT for input, NIDX for output.
--p2t Use PPRINT for input, TSV for output.
--p2x Use PPRINT for input, XTAB for output.
--t2b Use TSV for input, PPRINT with `--barred` for output.
--t2c Use TSV for input, CSV for output.
--t2d Use TSV for input, DKVP for output.
--t2j Use TSV for input, JSON for output.
--t2m Use TSV for input, markdown-tabular for output.
--t2n Use TSV for input, NIDX for output.
--t2p Use TSV for input, PPRINT for output.
--t2x Use TSV for input, XTAB for output.
--x2b Use XTAB for input, PPRINT with `--barred` for
output.
--x2c Use XTAB for input, CSV for output.
--x2d Use XTAB for input, DKVP for output.
--x2j Use XTAB for input, JSON for output.
--x2m Use XTAB for input, markdown-tabular for output.
--x2n Use XTAB for input, NIDX for output.
--x2p Use XTAB for input, PPRINT for output.
--x2t Use XTAB for input, TSV for output.
-p Keystroke-saver for `--nidx --fs space --repifs`.
-T Keystroke-saver for `--nidx --fs tab`.
COMPRESSED I/O
Decompression done within the Miller process itself:
--gzin Uncompress gzip within the Miller process. Done by default if file ends in ".gz".
--bz2in Uncompress bzip2 within the Miller process. Done by default if file ends in ".bz2".
--zin Uncompress zlib within the Miller process. Done by default if file ends in ".z".
JSON-ONLY FLAGS
These are flags which are applicable to JSON format.
Decompression done outside the Miller process:
--prepipe {command} You can, of course, already do without this for single input files,
e.g. "gunzip &lt; myfile.csv.gz | mlr ..."
--prepipex {command} Like --prepipe with one exception: doesn't insert '&lt;' between
command and filename at runtime. Useful for some commands like 'unzip -qc'
which don't read standard input.
--jlistwrap or --jl Wrap JSON output in outermost `[ ]`.
--jvstack Put one key-value pair per line for JSON output
(multi-line output).
--no-jvstack Put objects/arrays all on one line for JSON output.
Using --prepipe and --prepipex you can specify an action to be taken on each
input file. This prepipe command must be able to read from standard input; it
will be invoked with {command} &lt; {filename}.
LEGACY FLAGS
These are flags which don't do anything in the current Miller version.
They are accepted as no-op flags in order to keep old scripts from breaking.
Examples:
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--jknquoteint Type information from JSON input files is now
preserved throughout the processing stream.
--jquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--json-fatal-arrays-on-input
Miller now supports arrays as of version 6.
--json-map-arrays-on-input
Miller now supports arrays as of version 6.
--json-skip-arrays-on-input
Miller now supports arrays as of version 6.
--jsonx The `--jvstack` flag is now default true in Miller 6.
--jvquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--mmap Miller no longer uses memory-mapping to access data
files.
--no-fflush The current implementation of Miller does not use
buffered output, so there is no longer anything to
suppress here.
--no-mmap Miller no longer uses memory-mapping to access data
files.
--ojsonx The `--jvstack` flag is now default true in Miller 6.
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
mlr ... | {your compression command} &gt; outputfilenamegoeshere
MISCELLANEOUS FLAGS
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once.
Example: `mlr --from a.dat --from b.dat cat` is the
same as `mlr cat a.dat b.dat`.
--load {filename} Load DSL script file for all put/filter operations on
the command line. If the name following `--load` is a
directory, load all `*.mlr` files in that directory.
This is just like `put -f` and `filter -f` except
it's up-front on the command line, so you can do
something like `alias mlr='mlr --load ~/myscripts'`
if you like.
--mfrom {filenames} Use this to specify one or more input files before
the verb(s), rather than after. May be used more than
once. The list of filenames must end with `--`. This
is useful for example since `--from *.csv` doesn't do
what you might hope but `--mfrom *.csv --` does.
--mload {filenames} Like `--load` but works with more than one filename,
e.g. `--mload *.mlr --`.
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style
codes for floating-point numbers. If not specified,
default formatting is used. See also the `fmtnum`
function and the `format-values` verb.
--seed {n} with `n` of the form `12345678` or `0xcafefeed`. For
`put`/`filter` `urand`, `urandint`, and `urand32`.
-I Process files in-place. For each file name on the
command line, output is written to a temp file in the
same directory, which is then renamed over the
original. Each file is processed in isolation: if the
output format is CSV, CSV headers will be present in
each output file, statistics are only over each
file's own records; and so on.
-n Process no input files, nor standard input either.
Useful for `mlr put` with `begin`/`end` statements
only. (Same as `--from /dev/null`.) Also useful in
`mlr -n put -v '...'` for analyzing abstract syntax
trees (if that's your thing).
Lastly, note that if --prepipe or --prepipex is specified, it replaces any
decisions that might have been made based on the file suffix. Also,
--gzin/--bz2in/--zin are ignored if --prepipe is also specified.
OUTPUT-COLORIZATION FLAGS
Miller uses colors to highlight outputs. You can specify color preferences.
Note: output colorization does not work on Windows.
COMMENTS IN DATA
--skip-comments Ignore commented lines (prefixed by "#")
within the input.
--skip-comments-with {string} Ignore commented lines within input, with
specified prefix.
--pass-comments Immediately print commented lines (prefixed by "#")
within the input.
--pass-comments-with {string} Immediately print commented lines within input, with
specified prefix.
Things having colors:
Notes:
* Comments are only honored at the start of a line.
* In the absence of any of the above four options, comments are data like
any other text.
* When pass-comments is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc in regression-test output
* Some online-help strings
CSV-SPECIFIC OPTIONS
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line 1
of input files. Tip: combine with "label" to recreate
missing headers.
--no-implicit-csv-header Do not use --implicit-csv-header. This is the default
anyway -- the main use is for the flags to 'mlr join' if you have
main file(s) which are headerless but you want to join in on
a file which does have a CSV header. Then you could use
'mlr --csv --implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ... your-headerless.csv'
--allow-ragged-csv-input|--ragged If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line has more
fields than the header line, use integer field labels as in
the implicit-header case.
--headerless-csv-output Print only CSV data lines.
-N Keystroke-saver for --implicit-csv-header --headerless-csv-output.
Rules for coloring:
DOUBLE-QUOTING FOR CSV/CSVLITE OUTPUT
THIS IS STILL WIP FOR MILLER 6
--quote-all Wrap all fields in double quotes
--quote-none Do not wrap any fields in double quotes, even if they have
OFS or ORS in them
--quote-minimal Wrap fields in double quotes only if they have OFS or ORS
in them (default)
--quote-numeric Wrap fields in double quotes only if they have numbers
in them
--quote-original Wrap fields in double quotes if and only if they were
quoted on input. This isn't sticky for computed fields:
e.g. if fields a and b were quoted on input and you do
"put '$c = $a . $b'" then field c won't inherit a or b's
was-quoted-on-input flag.
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: `mlr --csv cat foo.csv`
* Example: no color: `mlr --csv cat foo.csv &gt; bar.csv`
* Example: no color: `mlr --csv cat foo.csv | less`
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
NUMBER FORMATTING
THIS IS STILL WIP FOR MILLER 6
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style codes for
floating-point numbers. If not specified, default formatting is used.
See also the fmtnum function within mlr put (mlr --help-all-functions);
see also the format-values verb.
Mechanisms for coloring:
OTHER OPTIONS
--seed {n} with n of the form 12345678 or 0xcafefeed. For put/filter
urand()/urandint()/urand32().
--nr-progress-mod {m}, with m a positive integer: print filename and record
count to os.Stderr every m input records.
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once. Example:
"mlr --from a.dat --from b.dat cat" is the same as
"mlr cat a.dat b.dat".
--mfrom {filenames} -- Use this to specify one or more input files before the verb(s),
rather than after. May be used more than once.
The list of filenames must end with "--". This is useful
for example since "--from *.csv" doesn't do what you might
hope but "--mfrom *.csv --" does.
--load {filename} Load DSL script file for all put/filter operations on the command line.
If the name following --load is a directory, load all "*.mlr" files
in that directory. This is just like "put -f" and "filter -f"
except it's up-front on the command line, so you can do something like
alias mlr='mlr --load ~/myscripts' if you like.
--mload {names} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-n Process no input files, nor standard input either. Useful
for mlr put with begin/end statements only. (Same as --from
/dev/null.) Also useful in "mlr -n put -v '...'" for
analyzing abstract syntax trees (if that's your thing).
-I Process files in-place. For each file name on the command
line, output is written to a temp file in the same
directory, which is then renamed over the original. Each
file is processed in isolation: if the output format is
CSV, CSV headers will be present in each output file;
statistics are only over each file's own records; and so on.
* Miller uses ANSI escape sequences only. This does not work on Windows except within Cygwin.
* Requires `TERM` environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable `export MLR_NO_COLOR=true` means don't color even if stdout+TTY.
* Environment variable `export MLR_ALWAYS_COLOR=true` means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to `less -r`.
* Command-line flags `--no-color` or `-M`, `--always-color` or `-C`.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* `export MLR_KEY_COLOR=208`, `MLR_VALUE_COLOR=33`, etc.:
`MLR_KEY_COLOR` `MLR_VALUE_COLOR` `MLR_PASS_COLOR` `MLR_FAIL_COLOR`
`MLR_REPL_PS1_COLOR` `MLR_REPL_PS2_COLOR` `MLR_HELP_COLOR`
* Command-line flags `--key-color 208`, `--value-color 33`, etc.:
`--key-color` `--value-color` `--pass-color` `--fail-color`
`--repl-ps1-color` `--repl-ps2-color` `--help-color`
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do `mlr --list-color-codes` to see the available color codes (like 170), and
`mlr --list-color-names` to see available names (like `orchid`).
--always-color or -C
--fail-color
--help-color
--key-color
--list-color-codes
--list-color-names
--no-color or -M
--pass-color
--value-color
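The suppression rules above amount to a small decision procedure, sketched here in Python. Several points are assumptions, not statements from this text: that the "flags take precedence over environment variables" rule applies to suppression as well as to color choices, that any non-empty value counts as "set", and that `MLR_NO_COLOR` wins if both variables are set. The `TERM` requirement is omitted. The function name `should_colorize` is hypothetical.

```python
def should_colorize(stdout_is_tty, env=None, flag=None):
    """Decide whether output is colorized.

    stdout_is_tty : True if writing to stdout and stdout is a TTY
    env           : mapping of environment variables (e.g. os.environ)
    flag          : one of "--no-color", "-M", "--always-color", "-C", or None
    """
    env = env or {}
    # Command-line flags are checked first (assumed to override the environment).
    if flag in ("--no-color", "-M"):
        return False
    if flag in ("--always-color", "-C"):
        return True
    # Environment variables next; non-empty is treated as "set" (assumption).
    if env.get("MLR_NO_COLOR"):
        return False
    if env.get("MLR_ALWAYS_COLOR"):
        return True
    # Default: colorize only when writing to a terminal.
    return stdout_is_tty
```

Under this model, piping to `less -r` with `MLR_ALWAYS_COLOR=true` corresponds to `should_colorize(False, {"MLR_ALWAYS_COLOR": "true"})`.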
PPRINT-ONLY FLAGS
These are flags which are applicable to PPRINT output format.
--barred Prints a border around PPRINT output (not available
for input).
--right Right-justifies all fields for PPRINT output.
SEPARATOR FLAGS
Separator options:
--rs --irs --ors Record separators, e.g. 'lf' or '\\r\\n'
--fs --ifs --ofs --repifs Field separators, e.g. comma
--ps --ips --ops Pair separators, e.g. equals sign
TODO: auto-detect is still TBD for Miller 6
Notes about line endings:
* Default line endings (`--irs` and `--ors`) are "auto" which means autodetect from
the input file format, as long as the input file(s) have lines ending in either
LF (also known as linefeed, `\n`, `0x0a`, or Unix-style) or CRLF (also known as
carriage-return/linefeed pairs, `\r\n`, `0x0d 0x0a`, or Windows-style).
* If both `irs` and `ors` are `auto` (which is the default) then LF input will lead to LF
output and CRLF input will lead to CRLF output, regardless of the platform you're
running on.
* The line-ending autodetector triggers on the first line ending detected in the input
stream. E.g. if you specify a CRLF-terminated file on the command line followed by an
LF-terminated file then autodetected line endings will be CRLF.
* If you use `--ors {something else}` with (default or explicitly specified) `--irs auto`
then line endings are autodetected on input and set to what you specify on output.
* If you use `--irs {something else}` with (default or explicitly specified) `--ors auto`
then the output line endings used are LF on Unix/Linux/BSD/MacOSX, and CRLF on Windows.
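The line-ending rules listed above can be condensed into a short Python model. It follows the bullets as written and is not Miller's implementation (the text itself marks auto-detect as still TBD for Miller 6). The names `detect_first_ending` and `resolve_line_endings` are hypothetical.

```python
LF, CRLF = "\n", "\r\n"


def detect_first_ending(text):
    """Return the first line ending seen in the input stream, or None."""
    idx = text.find("\n")
    if idx == -1:
        return None
    return CRLF if idx > 0 and text[idx - 1] == "\r" else LF


def resolve_line_endings(irs, ors, detected, on_windows=False):
    """Return (input_rs, output_rs) given --irs, --ors and the detected ending."""
    platform_default = CRLF if on_windows else LF
    input_rs = detected if irs == "auto" else irs
    if ors != "auto":
        output_rs = ors               # explicit --ors is used as given
    elif irs == "auto":
        output_rs = detected          # both auto: output follows input
    else:
        output_rs = platform_default  # explicit --irs with --ors auto
    return input_rs, output_rs


# A CRLF-terminated file followed by an LF-terminated file: CRLF is detected.
stream = "a,b\r\n1,2\r\n" + "c,d\n3,4\n"
first = detect_first_ending(stream)
both_auto = resolve_line_endings("auto", "auto", first)
```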
Notes about all other separators:
* IPS/OPS are only used for DKVP and XTAB formats, since only in these formats
do key-value pairs appear juxtaposed.
* IRS/ORS are ignored for XTAB format. Nominally IFS and OFS are newlines;
XTAB records are separated by two or more consecutive IFS/OFS -- i.e.
a blank line. Everything above about `--irs/--ors/--rs auto` becomes `--ifs/--ofs/--fs auto`
for XTAB format. (XTAB's default IFS/OFS are "auto".)
* OFS must be single-character for PPRINT format. This is because it is used
with repetition for alignment; multi-character separators would make
alignment impossible.
* OPS may be multi-character for XTAB format, in which case alignment is
disabled.
* TSV is simply CSV using tab as field separator (`--fs tab`).
* FS/PS are ignored for markdown format; RS is used.
* All FS and PS options are ignored for JSON format, since they are not relevant
to the JSON format.
* You can specify separators in any of the following ways, shown by example:
- Type them out, quoting as necessary for shell escapes, e.g.
`--fs '|' --ips :`
- C-style escape sequences, e.g. `--rs '\r\n' --fs '\t'`.
- To avoid backslashing, you can use any of the following names:
TODO desc-to-chars map
* Default separators by format:
TODO default_xses
--fs {string} Specify FS for input and output.
--ifs {string} Specify FS for input.
--ips {string} Specify PS for input.
--irs {string} Specify RS for input.
--ofs {string} Specify FS for output.
--ops {string} Specify PS for output.
--ors {string} Specify RS for output.
--ps {string} Specify PS for input and output.
--repifs Let IFS be repeated: e.g. for splitting on multiple
spaces.
--rs {string} Specify RS for input and output.
AUXILIARY COMMANDS
Available subcommands:
@ -407,80 +638,6 @@ AUXILIARY COMMANDS
repl
For more information, please invoke mlr {subcommand} --help.
REPL
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
OUTPUT COLORIZATION
Things having colors:
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc
* "PASS" and "FAIL" in regression-test output
* Some online-help strings
Rules for coloring:
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: mlr --csv cat foo.csv
* Example: no color: mlr --csv cat foo.csv &gt; bar.csv
* Example: no color: mlr --csv cat foo.csv | less
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
Mechanisms for coloring:
* Miller uses ANSI escape sequences only. This does not work on Windows except on Cygwin.
* Requires TERM environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable export MLR_NO_COLOR=true means don't color even if stdout+TTY.
* Environment variable export MLR_ALWAYS_COLOR=true means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to less -r.
* Command-line flags --no-color or -M, --always-color or -C.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* export MLR_KEY_COLOR=208, MLR_VALUE_COLOR=33, etc.:
MLR_KEY_COLOR MLR_VALUE_COLOR MLR_PASS_COLOR MLR_FAIL_COLOR
MLR_REPL_PS1_COLOR MLR_REPL_PS2_COLOR MLR_HELP_COLOR
* Command-line flags --key-color 208, --value-color 33, etc.:
--key-color --value-color --pass-color --fail-color
--repl-ps1-color --repl-ps2-color --help-color
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do mlr --list-color-codes to see the available color codes (like 170), and
mlr --list-color-names to see available names (like orchid).
MLRRC
You can set up personal defaults via a $HOME/.mlrrc and/or ./.mlrrc.
For example, if you usually process CSV, then you can put "--csv" in your .mlrrc file
@ -514,6 +671,36 @@ MLRRC
See also:
https://miller.readthedocs.io/en/latest/customization.html
REPL
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
VERBS
altkv
Usage: mlr altkv [options]
@ -2547,5 +2734,5 @@ SEE ALSO
2021-09-05 MILLER(1)
2021-09-08 MILLER(1)
</pre>
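
The line-ending notes in the separator help above can be read as a small decision rule. The following Python sketch is illustrative only and is not Miller's implementation (the help text itself marks auto-detect as still TBD for Miller 6). The function names `detect_line_ending` and `resolve_output_ending`, and the fallback to the platform default when the input contains no line ending at all, are assumptions made for this example.

```python
import os
from typing import Optional


def detect_line_ending(data: bytes) -> Optional[str]:
    """Return the first line ending seen in data: CRLF, LF, or None if absent."""
    idx = data.find(b"\n")
    if idx < 0:
        return None
    # A linefeed immediately preceded by a carriage return is a CRLF pair.
    if idx > 0 and data[idx - 1:idx] == b"\r":
        return "\r\n"
    return "\n"


def resolve_output_ending(irs: str, ors: str, data: bytes,
                          platform_default: Optional[str] = None) -> str:
    """Pick the output line ending from the irs/ors settings described above."""
    if platform_default is None:
        # LF on Unix-like systems, CRLF on Windows.
        platform_default = os.linesep
    if ors != "auto":
        # An explicit --ors is used as given, whatever the input looks like.
        return ors
    if irs != "auto":
        # Explicit --irs with --ors auto: use the platform's line ending.
        return platform_default
    # Both auto: the first line ending detected in the input wins.
    detected = detect_line_ending(data)
    # Assumption: fall back to the platform default if nothing was detected.
    return detected if detected is not None else platform_default
```

Passing the concatenated bytes of all input files models the "first line ending detected in the input stream" rule: a CRLF-terminated file followed by an LF-terminated file resolves to CRLF.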

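
The colorization rules in the OUTPUT COLORIZATION help above (color by default only when stdout is a TTY, with environment variables and flags to suppress or force it) can likewise be sketched as a decision function. This is a hedged illustration rather than Miller's code: `should_colorize` is a hypothetical name, treating any non-empty value of `MLR_NO_COLOR` or `MLR_ALWAYS_COLOR` as "set" is an assumption (the help text only shows the value `true`), and so is applying "flags take precedence over environment variables" to suppression as well as to color choices.

```python
import os
import sys
from typing import List, Mapping, Optional


def should_colorize(flags: List[str],
                    env: Optional[Mapping[str, str]] = None,
                    stdout_is_tty: Optional[bool] = None) -> bool:
    """Decide whether output should be colored, following the rules above."""
    if env is None:
        env = os.environ
    if stdout_is_tty is None:
        stdout_is_tty = sys.stdout.isatty()

    # Command-line flags are checked first (assumed to override the environment).
    if "--no-color" in flags or "-M" in flags:
        return False
    if "--always-color" in flags or "-C" in flags:
        return True

    # Environment variables come next; any non-empty value counts (assumption).
    if env.get("MLR_NO_COLOR"):
        return False
    if env.get("MLR_ALWAYS_COLOR"):
        return True

    # Default: color only when writing to a terminal.
    return stdout_is_tty
```

For instance, `should_colorize([], env={"MLR_ALWAYS_COLOR": "true"}, stdout_is_tty=False)` models the `less -r` case mentioned in the help text, where color is wanted even though stdout is a pipe.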

@ -7,9 +7,9 @@ NAME
as CSV and tabular JSON.
SYNOPSIS
Usage: mlr [I/O options] {verb} [verb-dependent options ...] {zero or
more file names} Output of one verb may be chained as input to another
using "then", e.g.
Usage: mlr [flags] {verb} [verb-dependent options ...] {zero or more
file names} Output of one verb may be chained as input to another using
"then", e.g.
mlr stats1 -a min,mean,max -f flag,u,v -g color then sort -f color
Please see 'mlr help topics' for more information. Please also see
https://johnkerl.org/miller6
@ -95,6 +95,43 @@ DATA FORMATS
| fox jumped | Record 2: "1":"fox", "2":"jumped"
+---------------------+
HELP OPTIONS
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help basic-examples
mlr help data-formats
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
VERB LIST
altkv bar bootstrap cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
@ -128,117 +165,171 @@ FUNCTION LIST
version ! != !=~ % & && * ** + - . .* .+ .- ./ / // < << <= == =~ > >= >> >>>
?: ?? ??? ^ ^^ | || ~
HELP OPTIONS
Type 'mlr help {topic}' for any of the following:
mlr help topics
mlr help auxents
mlr help basic-examples
mlr help comments-in-data
mlr help compressed-data
mlr help csv-options
mlr help data-format-options
mlr help data-formats
mlr help double-quoting
mlr help format-conversion
mlr help function
mlr help keyword
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help list-functions-as-paragraph
mlr help list-functions-as-table
mlr help list-keywords
mlr help list-keywords-as-paragraph
mlr help list-verbs
mlr help list-verbs-as-paragraph
mlr help misc
mlr help mlrrc
mlr help number-formatting
mlr help output-colorization
mlr help separator-options
mlr help type-arithmetic-info
mlr help usage-functions
mlr help usage-functions-by-class
mlr help usage-keywords
mlr help usage-verbs
mlr help verb
Shorthands:
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
COMMENTS-IN-DATA FLAGS
Miller lets you put comments in your data, such as
OPTIONS
In the following option flags, the version with "i" designates the
input stream, "o" the output stream, and the version without prefix
sets the option for both input and output stream. For example: --irs
sets the input record separator, --ors the output record separator, and
--rs sets both the input and output separator to the given value.
# This is a comment for a CSV file
a,b,c
1,2,3
4,5,6
DATA-FORMAT OPTIONS
--idkvp --odkvp --dkvp Delimited key-value pairs, e.g. "a=1,b=2"
(Miller's default format).
Notes:
--inidx --onidx --nidx Implicitly-integer-indexed fields (Unix-toolkit style).
-T Synonymous with "--nidx --fs tab".
* Comments are only honored at the start of a line.
* In the absence of any of the below four options, comments are data like
any other text. (The comments-in-data feature is opt-in.)
* When `--pass-comments` is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
--icsv --ocsv --csv Comma-separated value (or tab-separated with --fs tab, etc.)
--pass-comments Immediately print commented lines (prefixed by `#`)
within the input.
--pass-comments-with {string}
Immediately print commented lines within input, with
specified prefix.
--skip-comments Ignore commented lines (prefixed by `#`) within the
input.
--skip-comments-with {string}
Ignore commented lines within input, with specified
prefix.
--itsv --otsv --tsv Keystroke-savers for "--icsv --ifs tab",
"--ocsv --ofs tab", "--csv --fs tab".
--iasv --oasv --asv Similar but using ASCII FS 0x1f and RS 0x1e
--iusv --ousv --usv Similar but using Unicode FS U+241F (UTF-8 0xe2909f)
and RS U+241E (UTF-8 0xe2909e)
COMPRESSED-DATA FLAGS
Miller offers a few different ways to handle reading data files which have been compressed.
--icsvlite --ocsvlite --csvlite Comma-separated value (or tab-separated with --fs tab, etc.).
The 'lite' CSV does not handle RFC-CSV double-quoting rules; is
slightly faster and handles heterogeneity in the input stream via
empty newline followed by new header line. See also
https://johnkerl.org/miller6/file-formats.html#csv-tsv-asv-usv-etc
* Decompression done within the Miller process itself: `--bz2in` `--gzin` `--zin`
* Decompression done outside the Miller process: `--prepipe` `--prepipex`
--itsvlite --otsvlite --tsvlite Keystroke-savers for "--icsvlite --ifs tab",
"--ocsvlite --ofs tab", "--csvlite --fs tab".
-t Synonymous with --tsvlite.
--iasvlite --oasvlite --asvlite Similar to --itsvlite et al. but using ASCII FS 0x1f and RS 0x1e
--iusvlite --ousvlite --usvlite Similar to --itsvlite et al. but using Unicode FS U+241F (UTF-8 0xe2909f)
and RS U+241E (UTF-8 0xe2909e)
Using `--prepipe` and `--prepipex` you can specify an action to be
taken on each input file. The prepipe command must be able to read from
standard input; it will be invoked with `{command} < {filename}`. The
prepipex command must take a filename as argument; it will be invoked with
`{command} {filename}`.
--ipprint --opprint --pprint Pretty-printed tabular (produces no
output until all input is in).
--right Right-justifies all fields for PPRINT output.
--barred Prints a border around PPRINT output
(only available for output).
Examples:
--omd Markdown-tabular (only available for output).
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--ixtab --oxtab --xtab Pretty-printed vertical-tabular.
--xvright Right-justifies values for XTAB format.
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
`mlr ... | {your compression command} > outputfilenamegoeshere`
--ijson --ojson --json JSON tabular: sequence or list of one-level
maps: {...}{...} or [{...},{...}].
--jvstack Put one key-value pair per line for JSON output.
--no-jvstack Put objects/arrays all on one line for JSON output.
--jsonx --ojsonx Keystroke-savers for --json --jvstack
and --ojson --jvstack, respectively.
--jlistwrap Wrap JSON output in outermost [ ].
--flatsep {string} Separator for flattening multi-level JSON keys,
e.g. '{"a":{"b":3}}' becomes a:b => 3 for
non-JSON formats. Defaults to '.'.
Lastly, note that if `--prepipe` or `--prepipex` is specified, it replaces any
decisions that might have been made based on the file suffix. Likewise,
`--gzin`/`--bz2in`/`--zin` are ignored if `--prepipe` is also specified.
-p is a keystroke-saver for --nidx --fs space --repifs
--bz2in Uncompress bzip2 within the Miller process. Done by
default if file ends in `.bz2`.
--gzin Uncompress gzip within the Miller process. Done by
default if file ends in `.gz`.
--prepipe {decompression command}
You can, of course, already do without this for
single input files, e.g. `gunzip < myfile.csv.gz |
mlr ...`. Allowed at the command line, but not in
`.mlrrc` to avoid unexpected code execution.
--prepipe-bz2 Same as `--prepipe bz2`, except this is allowed in
`.mlrrc`.
--prepipe-gunzip Same as `--prepipe gunzip`, except this is allowed in
`.mlrrc`.
--prepipe-zcat Same as `--prepipe zcat`, except this is allowed in
`.mlrrc`.
--prepipex {decompression command}
Like `--prepipe` with one exception: doesn't insert
`<` between command and filename at runtime. Useful
for some commands like `unzip -qc` which don't read
standard input. Allowed at the command line, but not
in `.mlrrc` to avoid unexpected code execution.
--zin Uncompress zlib within the Miller process. Done by
default if file ends in `.z`.
Examples: --csv for CSV-formatted input and output; --icsv --opprint for
CSV-ONLY FLAGS
--allow-ragged-csv-input or --ragged
If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line
has more fields than the header line, use integer
field labels as in the implicit-header case.
--headerless-csv-output Print only CSV data lines; do not print CSV header
lines.
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line
1 of input files. Tip: combine with `label` to
recreate missing headers.
--no-implicit-csv-header Opposite of `--implicit-csv-header`. This is the
default anyway -- the main use is for the flags to
`mlr join` if you have main file(s) which are
headerless but you want to join in on a file which
does have a CSV header. Then you could use `mlr --csv
--implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ...
your-headerless.csv`.
-N Keystroke-saver for `--implicit-csv-header
--headerless-csv-output`.
FILE-FORMAT FLAGS
TO DO: brief list of formats w/ xref to m6 webdocs.
Examples: `--csv` for CSV-formatted input and output; `--icsv --opprint` for
CSV-formatted input and pretty-printed output.
Please use --iformat1 --oformat2 rather than --format1 --oformat2.
The latter sets up input and output flags for format1, not all of which
are overridden in all cases by setting output format to format2.
Please use `--iformat1 --oformat2` rather than `--format1 --oformat2`.
The latter sets up input and output flags for `format1`, not all of which
are overridden in all cases by setting output format to `format2`.
FORMAT-CONVERSION KEYSTROKE-SAVERS
--asv or --asvlite Use ASV format for input and output data.
--csv or -c Use CSV format for input and output data.
--csvlite Use CSV-lite format for input and output data.
--dkvp Use DKVP format for input and output data.
--iasv or --iasvlite Use ASV format for input data.
--icsv Use CSV format for input data.
--icsvlite Use CSV-lite format for input data.
--idkvp Use DKVP format for input data.
--ijson Use JSON format for input data.
--inidx Use NIDX format for input data.
--io {format name} Use format name for input and output data. For
example: `--io csv` is the same as `--csv`.
--ipprint Use PPRINT format for input data.
--itsv Use TSV format for input data.
--itsvlite Use TSV-lite format for input data.
--iusv or --iusvlite Use USV format for input data.
--ixtab Use XTAB format for input data.
--json or -j Use JSON format for input and output data.
--nidx Use NIDX format for input and output data.
--oasv or --oasvlite Use ASV format for output data.
--ocsv Use CSV format for output data.
--ocsvlite Use CSV-lite format for output data.
--odkvp Use DKVP format for output data.
--ojson Use JSON format for output data.
--omd Use markdown-tabular format for output data.
--onidx Use NIDX format for output data.
--opprint Use PPRINT format for output data.
--otsv Use TSV format for output data.
--otsvlite Use TSV-lite format for output data.
--ousv or --ousvlite Use USV format for output data.
--oxtab Use XTAB format for output data.
--pprint Use PPRINT format for input and output data.
--tsv Use TSV format for input and output data.
--tsvlite or -t Use TSV-lite format for input and output data.
--usv or --usvlite Use USV format for input and output data.
--xtab Use XTAB format for input and output data.
-i {format name} Use format name for input data. For example: `-i csv`
is the same as `--icsv`.
-o {format name} Use format name for output data. For example: `-o
csv` is the same as `--ocsv`.
FLATTEN-UNFLATTEN FLAGS
--flatsep or --jflatsep or --oflatsep {string}
Separator for flattening multi-level JSON keys, e.g.
`{"a":{"b":3}}` becomes `a:b => 3` for non-JSON
formats. Defaults to `.`.
--no-auto-flatten
--no-auto-unflatten
FORMAT-CONVERSION KEYSTROKE-SAVER FLAGS
As keystroke-savers for format-conversion you may use the following:
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--t2c --t2d --t2n --t2j --t2x --t2p --t2m
--d2c --d2t --d2n --d2j --d2x --d2p --d2m
--n2c --n2t --n2d --n2j --n2x --n2p --n2m
@ -249,130 +340,270 @@ OPTIONS
PPRINT, and markdown, respectively. Note that markdown format is available for
output only.
SEPARATORS
THIS IS STILL TBD FOR MILLER 6
--c2b Use CSV for input, PPRINT with `--barred` for output.
--c2d Use CSV for input, DKVP for output.
--c2j Use CSV for input, JSON for output.
--c2m Use CSV for input, markdown-tabular for output.
--c2n Use CSV for input, NIDX for output.
--c2p Use CSV for input, PPRINT for output.
--c2t Use CSV for input, TSV for output.
--c2x Use CSV for input, XTAB for output.
--d2b Use DKVP for input, PPRINT with `--barred` for
output.
--d2c Use DKVP for input, CSV for output.
--d2j Use DKVP for input, JSON for output.
--d2m Use DKVP for input, markdown-tabular for output.
--d2n Use DKVP for input, NIDX for output.
--d2p Use DKVP for input, PPRINT for output.
--d2t Use DKVP for input, TSV for output.
--d2x Use DKVP for input, XTAB for output.
--j2b Use JSON for input, PPRINT with `--barred` for output.
--j2c Use JSON for input, CSV for output.
--j2d Use JSON for input, DKVP for output.
--j2m Use JSON for input, markdown-tabular for output.
--j2n Use JSON for input, NIDX for output.
--j2p Use JSON for input, PPRINT for output.
--j2t Use JSON for input, TSV for output.
--j2x Use JSON for input, XTAB for output.
--n2b Use NIDX for input, PPRINT with `--barred` for
output.
--n2c Use NIDX for input, CSV for output.
--n2d Use NIDX for input, DKVP for output.
--n2j Use NIDX for input, JSON for output.
--n2m Use NIDX for input, markdown-tabular for output.
--n2p Use NIDX for input, PPRINT for output.
--n2t Use NIDX for input, TSV for output.
--n2x Use NIDX for input, XTAB for output.
--p2c Use PPRINT for input, CSV for output.
--p2d Use PPRINT for input, DKVP for output.
--p2j Use PPRINT for input, JSON for output.
--p2m Use PPRINT for input, markdown-tabular for output.
--p2n Use PPRINT for input, NIDX for output.
--p2t Use PPRINT for input, TSV for output.
--p2x Use PPRINT for input, XTAB for output.
--t2b Use TSV for input, PPRINT with `--barred` for output.
--t2c Use TSV for input, CSV for output.
--t2d Use TSV for input, DKVP for output.
--t2j Use TSV for input, JSON for output.
--t2m Use TSV for input, markdown-tabular for output.
--t2n Use TSV for input, NIDX for output.
--t2p Use TSV for input, PPRINT for output.
--t2x Use TSV for input, XTAB for output.
--x2b Use XTAB for input, PPRINT with `--barred` for
output.
--x2c Use XTAB for input, CSV for output.
--x2d Use XTAB for input, DKVP for output.
--x2j Use XTAB for input, JSON for output.
--x2m Use XTAB for input, markdown-tabular for output.
--x2n Use XTAB for input, NIDX for output.
--x2p Use XTAB for input, PPRINT for output.
--x2t Use XTAB for input, TSV for output.
-p Keystroke-saver for `--nidx --fs space --repifs`.
-T Keystroke-saver for `--nidx --fs tab`.
COMPRESSED I/O
Decompression done within the Miller process itself:
--gzin Uncompress gzip within the Miller process. Done by default if file ends in ".gz".
--bz2in Uncompress bzip2 within the Miller process. Done by default if file ends in ".bz2".
--zin Uncompress zlib within the Miller process. Done by default if file ends in ".z".
JSON-ONLY FLAGS
These are flags which are applicable to JSON format.
Decompression done outside the Miller process:
--prepipe {command} You can, of course, already do without this for single input files,
e.g. "gunzip < myfile.csv.gz | mlr ..."
--prepipex {command} Like --prepipe with one exception: doesn't insert '<' between
command and filename at runtime. Useful for some commands like 'unzip -qc'
which don't read standard input.
--jlistwrap or --jl Wrap JSON output in outermost `[ ]`.
--jvstack Put one key-value pair per line for JSON output
(multi-line output).
--no-jvstack Put objects/arrays all on one line for JSON output.
Using --prepipe and --prepipex you can specify an action to be taken on each
input file. This prepipe command must be able to read from standard input; it
will be invoked with {command} < {filename}.
LEGACY FLAGS
These are flags which don't do anything in the current Miller version.
They are accepted as no-op flags in order to keep old scripts from breaking.
Examples:
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--jknquoteint Type information from JSON input files is now
preserved throughout the processing stream.
--jquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--json-fatal-arrays-on-input
Miller now supports arrays as of version 6.
--json-map-arrays-on-input
Miller now supports arrays as of version 6.
--json-skip-arrays-on-input
Miller now supports arrays as of version 6.
--jsonx The `--jvstack` flag is now default true in Miller 6.
--jvquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--mmap Miller no longer uses memory-mapping to access data
files.
--no-fflush The current implementation of Miller does not use
buffered output, so there is no longer anything to
suppress here.
--no-mmap Miller no longer uses memory-mapping to access data
files.
--ojsonx The `--jvstack` flag is now default true in Miller 6.
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
mlr ... | {your compression command} > outputfilenamegoeshere
MISCELLANEOUS FLAGS
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once.
Example: `mlr --from a.dat --from b.dat cat` is the
same as `mlr cat a.dat b.dat`.
--load {filename} Load DSL script file for all put/filter operations on
the command line. If the name following `--load` is a
directory, load all `*.mlr` files in that directory.
This is just like `put -f` and `filter -f` except
it's up-front on the command line, so you can do
something like `alias mlr='mlr --load ~/myscripts'`
if you like.
--mfrom {filenames} Use this to specify one or more input files before
the verb(s), rather than after. May be used more than
once. The list of filenames must end with `--`. This
is useful for example since `--from *.csv` doesn't do
what you might hope but `--mfrom *.csv --` does.
--mload {filenames} Like `--load` but works with more than one filename,
e.g. `--mload *.mlr --`.
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style
codes for floating-point numbers. If not specified,
default formatting is used. See also the `fmtnum`
function and the `format-values` verb.
--seed {n} with `n` of the form `12345678` or `0xcafefeed`. For
`put`/`filter` `urand`, `urandint`, and `urand32`.
-I Process files in-place. For each file name on the
command line, output is written to a temp file in the
same directory, which is then renamed over the
original. Each file is processed in isolation: if the
output format is CSV, CSV headers will be present in
each output file; statistics are only over each
file's own records; and so on.
-n Process no input files, nor standard input either.
Useful for `mlr put` with `begin`/`end` statements
only. (Same as `--from /dev/null`.) Also useful in
`mlr -n put -v '...'` for analyzing abstract syntax
trees (if that's your thing).
Lastly, note that if --prepipe or --prepipex is specified, it replaces any
decisions that might have been made based on the file suffix. Also,
--gzin/--bz2in/--zin are ignored if --prepipe is also specified.
OUTPUT-COLORIZATION FLAGS
Miller uses colors to highlight outputs. You can specify color preferences.
Note: output colorization does not work on Windows.
COMMENTS IN DATA
--skip-comments Ignore commented lines (prefixed by "#")
within the input.
--skip-comments-with {string} Ignore commented lines within input, with
specified prefix.
--pass-comments Immediately print commented lines (prefixed by "#")
within the input.
--pass-comments-with {string} Immediately print commented lines within input, with
specified prefix.
Things having colors:
Notes:
* Comments are only honored at the start of a line.
* In the absence of any of the above four options, comments are data like
any other text.
* When pass-comments is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc
* "PASS" and "FAIL" in regression-test output
* Some online-help strings
CSV-SPECIFIC OPTIONS
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line 1
of input files. Tip: combine with "label" to recreate
missing headers.
--no-implicit-csv-header Do not use --implicit-csv-header. This is the default
anyway -- the main use is for the flags to 'mlr join' if you have
main file(s) which are headerless but you want to join in on
a file which does have a CSV header. Then you could use
'mlr --csv --implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ... your-headerless.csv'
--allow-ragged-csv-input|--ragged If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line has more
fields than the header line, use integer field labels as in
the implicit-header case.
--headerless-csv-output Print only CSV data lines.
-N Keystroke-saver for --implicit-csv-header --headerless-csv-output.
Rules for coloring:
DOUBLE-QUOTING FOR CSV/CSVLITE OUTPUT
THIS IS STILL WIP FOR MILLER 6
--quote-all Wrap all fields in double quotes
--quote-none Do not wrap any fields in double quotes, even if they have
OFS or ORS in them
--quote-minimal Wrap fields in double quotes only if they have OFS or ORS
in them (default)
--quote-numeric Wrap fields in double quotes only if they have numbers
in them
--quote-original Wrap fields in double quotes if and only if they were
quoted on input. This isn't sticky for computed fields:
e.g. if fields a and b were quoted on input and you do
"put '$c = $a . $b'" then field c won't inherit a or b's
was-quoted-on-input flag.
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: `mlr --csv cat foo.csv`
* Example: no color: `mlr --csv cat foo.csv > bar.csv`
* Example: no color: `mlr --csv cat foo.csv | less`
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
NUMBER FORMATTING
THIS IS STILL WIP FOR MILLER 6
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style codes for
floating-point numbers. If not specified, default formatting is used.
See also the fmtnum function within mlr put (mlr --help-all-functions);
see also the format-values function.
Mechanisms for coloring:
OTHER OPTIONS
--seed {n} with n of the form 12345678 or 0xcafefeed. For put/filter
urand()/urandint()/urand32().
--nr-progress-mod {m}, with m a positive integer: print filename and record
count to os.Stderr every m input records.
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once. Example:
"mlr --from a.dat --from b.dat cat" is the same as
"mlr cat a.dat b.dat".
--mfrom {filenames} -- Use this to specify one or more input files before the verb(s),
rather than after. May be used more than once.
The list of filenames must end with "--". This is useful
for example since "--from *.csv" doesn't do what you might
hope but "--mfrom *.csv --" does.
--load {filename} Load DSL script file for all put/filter operations on the command line.
If the name following --load is a directory, load all "*.mlr" files
in that directory. This is just like "put -f" and "filter -f"
except it's up-front on the command line, so you can do something like
alias mlr='mlr --load ~/myscripts' if you like.
--mload {names} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-n Process no input files, nor standard input either. Useful
for mlr put with begin/end statements only. (Same as --from
/dev/null.) Also useful in "mlr -n put -v '...'" for
analyzing abstract syntax trees (if that's your thing).
-I Process files in-place. For each file name on the command
line, output is written to a temp file in the same
directory, which is then renamed over the original. Each
file is processed in isolation: if the output format is
CSV, CSV headers will be present in each output file;
statistics are only over each file's own records; and so on.
* Miller uses ANSI escape sequences only. This does not work on Windows except within Cygwin.
* Requires `TERM` environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable `export MLR_NO_COLOR=true` means don't color even if stdout+TTY.
* Environment variable `export MLR_ALWAYS_COLOR=true` means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to `less -r`.
* Command-line flags `--no-color` or `-M`, `--always-color` or `-C`.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* `export MLR_KEY_COLOR=208`, `MLR_VALUE_COLOR=33`, etc.:
`MLR_KEY_COLOR` `MLR_VALUE_COLOR` `MLR_PASS_COLOR` `MLR_FAIL_COLOR`
`MLR_REPL_PS1_COLOR` `MLR_REPL_PS2_COLOR` `MLR_HELP_COLOR`
* Command-line flags `--key-color 208`, `--value-color 33`, etc.:
`--key-color` `--value-color` `--pass-color` `--fail-color`
`--repl-ps1-color` `--repl-ps2-color` `--help-color`
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do `mlr --list-color-codes` to see the available color codes (like 170), and
`mlr --list-color-names` to see available names (like `orchid`).
--always-color or -C
--fail-color
--help-color
--key-color
--list-color-codes
--list-color-names
--no-color or -M
--pass-color
--value-color
PPRINT-ONLY FLAGS
These are flags which are applicable to PPRINT output format.
--barred Prints a border around PPRINT output (not available
for input).
--right Right-justifies all fields for PPRINT output.
SEPARATOR FLAGS
Separator options:
--rs --irs --ors Record separators, e.g. 'lf' or '\\r\\n'
--fs --ifs --ofs --repifs Field separators, e.g. comma
--ps --ips --ops Pair separators, e.g. equals sign
TODO: auto-detect is still TBD for Miller 6
Notes about line endings:
* Default line endings (`--irs` and `--ors`) are "auto" which means autodetect from
the input file format, as long as the input file(s) have lines ending in either
LF (also known as linefeed, `\n`, `0x0a`, or Unix-style) or CRLF (also known as
carriage-return/linefeed pairs, `\r\n`, `0x0d 0x0a`, or Windows-style).
* If both `irs` and `ors` are `auto` (which is the default) then LF input will lead to LF
output and CRLF input will lead to CRLF output, regardless of the platform you're
running on.
* The line-ending autodetector triggers on the first line ending detected in the input
stream. E.g. if you specify a CRLF-terminated file on the command line followed by an
LF-terminated file then autodetected line endings will be CRLF.
* If you use `--ors {something else}` with (default or explicitly specified) `--irs auto`
then line endings are autodetected on input and set to what you specify on output.
* If you use `--irs {something else}` with (default or explicitly specified) `--ors auto`
then the output line endings used are LF on Unix/Linux/BSD/MacOSX, and CRLF on Windows.
Notes about all other separators:
* IPS/OPS are only used for DKVP and XTAB formats, since only in these formats
do key-value pairs appear juxtaposed.
* IRS/ORS are ignored for XTAB format. Nominally IFS and OFS are newlines;
XTAB records are separated by two or more consecutive IFS/OFS -- i.e.
a blank line. Everything above about `--irs/--ors/--rs auto` becomes `--ifs/--ofs/--fs auto`
for XTAB format. (XTAB's default IFS/OFS are "auto".)
* OFS must be single-character for PPRINT format. This is because it is used
with repetition for alignment; multi-character separators would make
alignment impossible.
* OPS may be multi-character for XTAB format, in which case alignment is
disabled.
* TSV is simply CSV using tab as field separator (`--fs tab`).
* FS/PS are ignored for markdown format; RS is used.
* All FS and PS options are ignored for JSON format, since they are not relevant
to the JSON format.
* You can specify separators in any of the following ways, shown by example:
- Type them out, quoting as necessary for shell escapes, e.g.
`--fs '|' --ips :`
- C-style escape sequences, e.g. `--rs '\r\n' --fs '\t'`.
- To avoid backslashing, you can use any of the following names:
TODO desc-to-chars map
* Default separators by format:
TODO default_xses
--fs {string} Specify FS for input and output.
--ifs {string} Specify FS for input.
--ips {string} Specify PS for input.
--irs {string} Specify RS for input.
--ofs {string} Specify FS for output.
--ops {string} Specify PS for output.
--ors {string} Specify RS for output.
--ps {string} Specify PS for input and output.
--repifs Let IFS be repeated: e.g. for splitting on multiple
spaces.
--rs {string} Specify RS for input and output.
AUXILIARY COMMANDS
Available subcommands:
@ -386,80 +617,6 @@ AUXILIARY COMMANDS
repl
For more information, please invoke mlr {subcommand} --help.
REPL
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expressions's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
OUTPUT COLORIZATION
Things having colors:
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc
in regression-test output
* Some online-help strings
Rules for coloring:
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: mlr --csv cat foo.csv
* Example: no color: mlr --csv cat foo.csv > bar.csv
* Example: no color: mlr --csv cat foo.csv | less
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
Mechanisms for coloring:
* Miller uses ANSI escape sequences only. This does not work on Windows except on Cygwin.
* Requires TERM environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable export MLR_NO_COLOR=true means don't color even if stdout+TTY.
* Environment variable export MLR_ALWAYS_COLOR=true means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to less -r.
* Command-line flags --no-color or -M, --always-color or -C.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* export MLR_KEY_COLOR=208, MLR_VALUE_COLOR-33, etc.:
MLR_KEY_COLOR MLR_VALUE_COLOR MLR_PASS_COLOR MLR_FAIL_COLOR
MLR_REPL_PS1_COLOR MLR_REPL_PS2_COLOR MLR_HELP_COLOR
* Command-line flags --key-color 208, --value-color 33, etc.:
--key-color --value-color --pass-color --fail-color
--repl-ps1-color --repl-ps2-color --help-color
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided,the latter take precedence.
Please do mlr --list-color-codes to see the available color codes (like 170), and
mlr --list-color-names to see available names (like orchid).
MLRRC
You can set up personal defaults via a $HOME/.mlrrc and/or ./.mlrrc.
For example, if you usually process CSV, then you can put "--csv" in your .mlrrc file
@ -493,6 +650,36 @@ MLRRC
See also:
https://miller.readthedocs.io/en/latest/customization.html
REPL
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
VERBS
altkv
Usage: mlr altkv [options]
@ -2526,4 +2713,4 @@ SEE ALSO
2021-09-05 MILLER(1)
2021-09-08 MILLER(1)

View file

@ -48,10 +48,6 @@ EOF
help = `mlr help show-help-for-flag '#{flag}'`
puts "* `#{headline}`: #{help}"
end
#puts '```'
#system("mlr help list-flags-for-section '#{section_name}'")
#puts '```'
end
puts

View file

@ -38,36 +38,34 @@ Please also see https://johnkerl.org/miller6
</pre>
<pre class="pre-non-highlight-in-pair">
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help auxents
mlr help basic-examples
mlr help data-formats
mlr help function
mlr help keyword
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help list-functions-as-paragraph
mlr help list-keywords
mlr help list-keywords-as-paragraph
mlr help list-verbs
mlr help list-verbs-as-paragraph
mlr help mlrrc
mlr help number-formatting
mlr help type-arithmetic-info
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help usage-verbs
mlr help verb
mlr help comments-in-data
mlr help compressed-data
mlr help data-format-options
mlr help double-quoting
mlr help format-conversion
mlr help separator-options
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
@ -81,36 +79,34 @@ Shorthands:
</pre>
<pre class="pre-non-highlight-in-pair">
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help auxents
mlr help basic-examples
mlr help data-formats
mlr help function
mlr help keyword
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help list-functions-as-paragraph
mlr help list-keywords
mlr help list-keywords-as-paragraph
mlr help list-verbs
mlr help list-verbs-as-paragraph
mlr help mlrrc
mlr help number-formatting
mlr help type-arithmetic-info
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help usage-verbs
mlr help verb
mlr help comments-in-data
mlr help compressed-data
mlr help data-format-options
mlr help double-quoting
mlr help format-conversion
mlr help separator-options
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
@ -121,8 +117,14 @@ Shorthands:
Etc.
## Command-line flags
This is a command-line version of the [List of command-line flags](reference-main-flag-list.md) page.
See `mlr help flags` for a full listing.
## Per-verb help
This is a command-line version of the [List of verbs](reference-verbs.md) page.
Given the name of a verb (from `mlr -l`) you can invoke it with `--help` or `-h` -- or, use `mlr help verb`:
<pre class="pre-highlight-in-pair">
@ -177,6 +179,7 @@ Etc.
## Per-function help
This is a command-line version of the [DSL built-in functions](reference-dsl-builtin-functions.md) page.
Given the name of a DSL function (from `mlr -f`) you can use `mlr help function` for details:
<pre class="pre-highlight-in-pair">

View file

@ -20,8 +20,14 @@ GENMD_EOF
Etc.
## Command-line flags
This is a command-line version of the [List of command-line flags](reference-main-flag-list.md) page.
See `mlr help flags` for a full listing.
## Per-verb help
This is a command-line version of the [List of verbs](reference-verbs.md) page.
Given the name of a verb (from `mlr -l`) you can invoke it with `--help` or `-h` -- or, use `mlr help verb`:
GENMD_RUN_COMMAND
@ -40,6 +46,7 @@ Etc.
## Per-function help
This is a command-line version of the [DSL built-in functions](reference-dsl-builtin-functions.md) page.
Given the name of a DSL function (from `mlr -f`) you can use `mlr help function` for details:
GENMD_RUN_COMMAND

View file

@ -9,10 +9,6 @@ E flatten/unflatten page
? twi-dm re all-contribs: all-contributors.org
C flags LUTs
o bootstrap internal-only list-sections and help-for-section and a prototype autogen.rb
o implement autogen for docs6 flags page
- internal-only help-verbs as support for docs6 autogen
- audit backticking
o audit `mlr --list` section output
- file-format top
- flatten/unflatten top
@ -21,12 +17,30 @@ C flags LUTs
- then nilabend on empty help
o connect to climain & remove old if-else-if chains
- needs forReader / forWriter flags
o width-32 programmatic w/ splits on "\n"
- uncolor " or "
o assert non-nulls on all flag/section/table elements
o width-80 audits on mlr --list
o desc-to-chars map
o default_xses
o work off all TODOs in src/auxents/help/entry.go, src/cli/flag_types.go, src/cli/option_parse.go
- also just be happy w/ the code
? mlr help topics autogen names from FLAG_TABLE -- ?
// ----------------------------------------------------------------
// TODO: REMOVE
// Callsites:
// * src/climain/mlrcli_parse.go
// ParseCommandLine
// MainOptions (--cpuprofile, --version, etc)
// ParseReaderOptions
// ParseWriterOptions
// ParseReaderWriterOptions
// ParseMiscOptions
// help.ParseTerminalUsage
// * handleMlrrcLine
// * nest/tee/join/put/filter:
// ParseReaderOptions
// ParseWriterOptions
// !! must use only cli package, not climain package
* nikos materials -> fold in

View file

@ -162,8 +162,6 @@ are overridden in all cases by setting output format to `format2`.
`: Use XTAB format for input data.
* `--json or -j
`: Use JSON format for input and output data.
* `--jsonx
`: TODO
* `--nidx
`: Use NIDX format for input and output data.
* `--oasv or --oasvlite
@ -176,8 +174,6 @@ are overridden in all cases by setting output format to `format2`.
`: Use DKVP format for output data.
* `--ojson
`: Use JSON format for output data.
* `--ojsonx
`: TODO
* `--omd
`: Use markdown-tabular format for output data.
* `--onidx
@ -268,10 +264,12 @@ They are accepted as no-op flags in order to keep old scripts from breaking.
`: Type information from JSON input files is now preserved throughout the processing stream.
* `--json-fatal-arrays-on-input
`: Miller now supports arrays as of version 6.
* `--json-skip-arrays-on-input
* `--json-map-arrays-on-input
`: Miller now supports arrays as of version 6.
* `--json-skip-arrays-on-input
`: Miller now supports arrays as of version 6.
* `--jsonx
`: The `--jvstack` flag is now default true in Miller 6.
* `--jvquoteall
`: Type information from JSON input files is now preserved throughout the processing stream.
* `--mmap
@ -280,6 +278,8 @@ They are accepted as no-op flags in order to keep old scripts from breaking.
`: The current implementation of Miller does not use buffered output, so there is no longer anything to suppress here.
* `--no-mmap
`: Miller no longer uses memory-mapping to access data files.
* `--ojsonx
`: The `--jvstack` flag is now default true in Miller 6.
## Miscellaneous flags
@ -294,8 +294,8 @@ They are accepted as no-op flags in order to keep old scripts from breaking.
`: Use this to specify one or more input files before the verb(s), rather than after. May be used more than once. The list of filenames must end with `--`. This is useful for example since `--from *.csv` doesn't do what you might hope but `--mfrom *.csv --` does.
* `--mload {filenames}
`: Like `--load` but works with more than one filename, e.g. `--mload *.mlr --`.
* `--ofmt
`:
* `--ofmt {format}
`: E.g. %.18f, %.0f, %9.6e. Please use sprintf-style codes for floating-point numbers. If not specified, default formatting is used. See also the `fmtnum` function and the `format-values` verb.
* `--seed {n}
`: with `n` of the form `12345678` or `0xcafefeed`. For `put`/`filter` `urand`, `urandint`, and `urand32`.
* `-I
@ -341,13 +341,13 @@ How you can control colorization:
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* `export MLR_KEY_COLOR=208`, `MLR_VALUE_COLOR=33`, etc.:
`MLR_KEY_COLOR` `MLR_VALUE_COLOR` `MLR_PASS_COLOR` `MLR_FAIL_COLOR`
`MLR_REPL_PS1_COLOR` `MLR_REPL_PS2_COLOR` `MLR_HELP_COLOR`
* Command-line flags `--key-color 208`, `--value-color 33`, etc.:
`--key-color` `--value-color` `--pass-color` `--fail-color`
`--repl-ps1-color` `--repl-ps2-color` `--help-color`
* This is particularly useful if your terminal's background color clashes with current settings.
* `export MLR_KEY_COLOR=208`, `MLR_VALUE_COLOR=33`, etc.:
`MLR_KEY_COLOR` `MLR_VALUE_COLOR` `MLR_PASS_COLOR` `MLR_FAIL_COLOR`
`MLR_REPL_PS1_COLOR` `MLR_REPL_PS2_COLOR` `MLR_HELP_COLOR`
* Command-line flags `--key-color 208`, `--value-color 33`, etc.:
`--key-color` `--value-color` `--pass-color` `--fail-color`
`--repl-ps1-color` `--repl-ps2-color` `--help-color`
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.

View file

@ -21,82 +21,174 @@ import (
type tZaryHandlerFunc func()
type tUnaryHandlerFunc func(arg string)
type shorthandInfo struct {
shorthand string
longhand string
type tHandlerLookupTable struct {
sections []tHandlerInfoSection
}
type handlerInfo struct {
name string
zaryHandlerFunc tZaryHandlerFunc
unaryHandlerFunc tUnaryHandlerFunc
type tHandlerInfoSection struct {
name string
handlerInfos []tHandlerInfo
// Some handlers are used only for webdoc/manpage autogen and needn't
// clutter up the on-line help experience for the interactive user
internal bool
}
type tHandlerInfo struct {
name string
zaryHandlerFunc tZaryHandlerFunc
unaryHandlerFunc tUnaryHandlerFunc
}
type tShorthandTable struct {
shorthandInfos []tShorthandInfo
}
type tShorthandInfo struct {
shorthand string
longhand string
}
// We get a Golang "initialization loop" if this is defined statically. So, we
// use a "package init" function.
var shorthandLookupTable = []shorthandInfo{}
var handlerLookupTable = []handlerInfo{}
var handlerLookupTable = tHandlerLookupTable{}
var shorthandLookupTable = tShorthandTable{}
func init() {
// For things like 'mlr -f', invoked through the CLI parser which does not
// go through our HelpMain().
shorthandLookupTable = []shorthandInfo{
{shorthand: "-l", longhand: "list-verbs"},
{shorthand: "-L", longhand: "usage-verbs"},
{shorthand: "-f", longhand: "list-functions"},
{shorthand: "-F", longhand: "usage-functions"},
{shorthand: "-k", longhand: "list-keywords"},
{shorthand: "-K", longhand: "usage-keywords"},
}
// For things like 'mlr help foo', invoked through the auxent framework
// which goes through our HelpMain().
handlerLookupTable = []handlerInfo{
{name: "topics", zaryHandlerFunc: listTopics},
{name: "auxents", zaryHandlerFunc: helpAuxents},
{name: "basic-examples", zaryHandlerFunc: helpBasicExamples},
{name: "data-formats", zaryHandlerFunc: helpDataFormats},
{name: "function", unaryHandlerFunc: helpForFunction},
{name: "keyword", unaryHandlerFunc: helpForKeyword},
{name: "list-functions", zaryHandlerFunc: listFunctions},
{name: "list-function-classes", zaryHandlerFunc: listFunctionClasses},
{name: "list-functions-in-class", unaryHandlerFunc: listFunctionsInClass},
{name: "list-functions-as-paragraph", zaryHandlerFunc: listFunctionsAsParagraph},
{name: "list-keywords", zaryHandlerFunc: listKeywords},
{name: "list-keywords-as-paragraph", zaryHandlerFunc: listKeywordsAsParagraph},
{name: "list-verbs", zaryHandlerFunc: listVerbs},
{name: "list-verbs-as-paragraph", zaryHandlerFunc: listVerbsAsParagraph},
{name: "mlrrc", zaryHandlerFunc: helpMlrrc},
{name: "number-formatting", zaryHandlerFunc: helpNumberFormatting},
{name: "type-arithmetic-info", zaryHandlerFunc: helpTypeArithmeticInfo},
{name: "usage-functions", zaryHandlerFunc: usageFunctions},
{name: "usage-functions-by-class", zaryHandlerFunc: usageFunctionsByClass},
{name: "usage-keywords", zaryHandlerFunc: usageKeywords},
{name: "usage-verbs", zaryHandlerFunc: usageVerbs},
{name: "verb", unaryHandlerFunc: helpForVerb},
handlerLookupTable = tHandlerLookupTable{
sections: []tHandlerInfoSection{
{
name: "Essentials",
handlerInfos: []tHandlerInfo{
{name: "topics", zaryHandlerFunc: listTopics},
{name: "basic-examples", zaryHandlerFunc: helpBasicExamples},
{name: "data-formats", zaryHandlerFunc: helpDataFormats},
},
},
{
name: "Flags",
handlerInfos: []tHandlerInfo{
{name: "flags", zaryHandlerFunc: showFlagHelp},
// Per-section entries will be computed and installed below
},
},
{
name: "Verbs",
handlerInfos: []tHandlerInfo{
{name: "list-verbs", zaryHandlerFunc: listVerbs},
{name: "usage-verbs", zaryHandlerFunc: usageVerbs},
{name: "verb", unaryHandlerFunc: helpForVerb},
},
},
{
name: "Functions",
handlerInfos: []tHandlerInfo{
{name: "list-functions", zaryHandlerFunc: listFunctions},
{name: "list-function-classes", zaryHandlerFunc: listFunctionClasses},
{name: "list-functions-in-class", unaryHandlerFunc: listFunctionsInClass},
{name: "usage-functions", zaryHandlerFunc: usageFunctions},
{name: "usage-functions-by-class", zaryHandlerFunc: usageFunctionsByClass},
{name: "function", unaryHandlerFunc: helpForFunction},
},
},
{
name: "Keywords",
handlerInfos: []tHandlerInfo{
{name: "list-keywords", zaryHandlerFunc: listKeywords},
{name: "usage-keywords", zaryHandlerFunc: usageKeywords},
{name: "keyword", unaryHandlerFunc: helpForKeyword},
},
},
{
name: "Other",
handlerInfos: []tHandlerInfo{
{name: "auxents", zaryHandlerFunc: helpAuxents},
{name: "mlrrc", zaryHandlerFunc: helpMlrrc},
{name: "output-colorization", zaryHandlerFunc: helpOutputColorization},
{name: "type-arithmetic-info", zaryHandlerFunc: helpTypeArithmeticInfo},
},
},
{
name: "Internal/docgen",
internal: true,
handlerInfos: []tHandlerInfo{
{name: "list-verbs-as-paragraph", zaryHandlerFunc: listVerbsAsParagraph},
{name: "list-functions-as-paragraph", zaryHandlerFunc: listFunctionsAsParagraph},
{name: "list-keywords-as-paragraph", zaryHandlerFunc: listKeywordsAsParagraph},
{name: "list-functions-as-table", zaryHandlerFunc: listFunctionsAsTable},
{name: "list-flag-sections", zaryHandlerFunc: listFlagSections},
{name: "print-info-for-section", unaryHandlerFunc: printInfoForSection},
{name: "list-flags-for-section", unaryHandlerFunc: listFlagsForSection},
{name: "show-help-for-section", unaryHandlerFunc: showHelpForSection},
{name: "show-help-for-section-via-downdash", unaryHandlerFunc: showHelpForSectionViaDowndash},
{name: "show-headline-for-flag", unaryHandlerFunc: showHeadlineForFlag},
{name: "show-help-for-flag", unaryHandlerFunc: showHelpForFlag},
},
},
},
}
// TODO: to flags-sections
{name: "comments-in-data", zaryHandlerFunc: helpCommentsInData},
{name: "compressed-data", zaryHandlerFunc: helpCompressedDataOptions},
{name: "data-format-options", zaryHandlerFunc: helpDataFormatOptions},
{name: "double-quoting", zaryHandlerFunc: helpDoubleQuoting},
{name: "format-conversion", zaryHandlerFunc: helpFormatConversionKeystrokeSaverOptions},
{name: "separator-options", zaryHandlerFunc: helpSeparatorOptions},
// This is a wee bit clever. The rest of the topics in the table have names
// manually keyed in. But we want to produce `mlr help csv-only-flags` for
// flag-section named "CSV-only flags", etc. Here we can't key in the names
// since we want to compute them dynamically from cli.FLAG_TABLE which is
// Miller's way of tracking command-line flags.
// Internal-only
{name: "list-functions-as-table", zaryHandlerFunc: listFunctionsAsTable, internal: true},
{name: "list-flag-sections", zaryHandlerFunc: listFlagSections, internal: true},
{name: "print-info-for-section", unaryHandlerFunc: printInfoForSection, internal: true},
{name: "list-flags-for-section", unaryHandlerFunc: listFlagsForSection, internal: true},
{name: "show-headline-for-flag", unaryHandlerFunc: showHeadlineForFlag, internal: true},
{name: "show-help-for-flag", unaryHandlerFunc: showHelpForFlag, internal: true},
//{name: "comments-in-data-flags"},
//{name: "compressed-data-flags"},
//{name: "csv-only-flags"},
//{name: "file-format-flags"},
//{name: "flatten-unflatten-flags"},
//{name: "format-conversion-keystroke-saver-flags"},
//{name: "json-only-flags"},
//{name: "legacy-flags"},
//{name: "miscellaneous-flags"},
//{name: "output-colorization-flags"},
//{name: "pprint-only-flags"},
//{name: "separator-flags"},
// TBD: have an info-only handler in addition to flags-section
{name: "output-colorization", zaryHandlerFunc: helpOutputColorization},
// For this file's topic-lookup table, find and extend the section called "Flags".
for i, section := range handlerLookupTable.sections {
if section.name != "Flags" {
continue
}
// Ask the flags table for a list of flag-section names, downcased and
// with spaces replaced with dashes -- "downdashed" -- making the
// punctuation/casing style for online help.
downdashSectionNames := cli.FLAG_TABLE.GetDowndashSectionNames()
// Note: `j, _` rather than `_, downdashSectionName` since with the latter
// every closure below captures the same shared loop variable, while the
// former lets each closure capture its own per-iteration variable. The
// latter won't produce correct lookup-table data.
for j, _ := range downdashSectionNames {
downdashSectionName := downdashSectionNames[j]
// Patch a new entry into the "Flags" section of our lookup table.
entry := tHandlerInfo{
name: downdashSectionName,
// Make a function which passes in "csv-only-flags" etc. to the FLAG_TABLE.
zaryHandlerFunc: func() {
showHelpForSectionViaDowndash(downdashSectionName)
},
}
handlerLookupTable.sections[i].handlerInfos = append(handlerLookupTable.sections[i].handlerInfos, entry)
}
}
// For things like 'mlr -f', invoked through the CLI parser which does not
// go through our HelpMain().
shorthandLookupTable = tShorthandTable{
shorthandInfos: []tShorthandInfo{
{shorthand: "-g", longhand: "flags"},
{shorthand: "-l", longhand: "list-verbs"},
{shorthand: "-L", longhand: "usage-verbs"},
{shorthand: "-f", longhand: "list-functions"},
{shorthand: "-F", longhand: "usage-functions"},
{shorthand: "-k", longhand: "list-keywords"},
{shorthand: "-K", longhand: "usage-keywords"},
},
}
}
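The closure-capture concern in the init block above can be sketched in isolation. This is a minimal illustration, not Miller's code: `downdash` and `makeHandlers` are hypothetical names, and `downdash` only approximates what `GetDowndashSectionNames` is described as doing (lowercase, spaces to dashes). Declaring the captured variable inside the loop body gives each handler its own copy under any Go version; as far as I know, Go 1.22 and later also make range variables per-iteration.

```go
package main

import (
	"fmt"
	"strings"
)

// downdash lowercases a section name and replaces spaces with dashes,
// e.g. "CSV-only flags" -> "csv-only-flags". Hypothetical helper.
func downdash(sectionName string) string {
	return strings.ReplaceAll(strings.ToLower(sectionName), " ", "-")
}

// makeHandlers returns one zero-argument handler per name. Each closure
// captures a variable declared inside the loop body, so every handler
// reports its own name rather than whatever the loop saw last.
func makeHandlers(names []string) []func() string {
	handlers := make([]func() string, 0, len(names))
	for i := range names {
		name := names[i] // fresh variable for this iteration
		handlers = append(handlers, func() string {
			return "help for " + name
		})
	}
	return handlers
}

func main() {
	sections := []string{"CSV-only flags", "Separator flags"}
	names := make([]string, len(sections))
	for i, s := range sections {
		names[i] = downdash(s)
	}
	for _, h := range makeHandlers(names) {
		fmt.Println(h())
	}
}
```

Calling the handlers only after the loop has finished, as `main` does here, is what would expose a shared-variable bug if one existed.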
@ -115,25 +207,27 @@ func HelpMain(args []string) int {
// "mlr help something" where we recognize the something
name := args[0]
for _, info := range handlerLookupTable {
if info.name == name {
if info.zaryHandlerFunc != nil {
if len(args) != 1 {
fmt.Printf("mlr help %s takes no additional argument.\n", name)
for _, section := range handlerLookupTable.sections {
for _, info := range section.handlerInfos {
if info.name == name {
if info.zaryHandlerFunc != nil {
if len(args) != 1 {
fmt.Printf("mlr help %s takes no additional argument.\n", name)
return 0
}
info.zaryHandlerFunc()
return 0
}
info.zaryHandlerFunc()
return 0
}
if info.unaryHandlerFunc != nil {
if len(args) < 2 {
fmt.Printf("mlr help %s takes at least one required argument.\n", name)
if info.unaryHandlerFunc != nil {
if len(args) < 2 {
fmt.Printf("mlr help %s takes at least one required argument.\n", name)
return 0
}
for _, arg := range args[1:] {
info.unaryHandlerFunc(arg)
}
return 0
}
for _, arg := range args[1:] {
info.unaryHandlerFunc(arg)
}
return 0
}
}
}
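The two-level lookup in this hunk (sections, then handlers, then a zero-argument or one-argument call) can be sketched as a pure function that returns its output lines instead of printing them. The type and function names here (`handlerInfo`, `handlerSection`, `dispatch`, `demoTable`) are illustrative assumptions rather than the identifiers in the diff, and the handlers are stand-ins.

```go
package main

import "fmt"

// handlerInfo models a help topic with either a zero-argument handler or a
// one-argument handler; exactly one of the two is expected to be non-nil.
type handlerInfo struct {
	name  string
	zary  func() string
	unary func(arg string) string
}

// handlerSection groups topics under a heading such as "Verbs".
type handlerSection struct {
	name     string
	handlers []handlerInfo
}

// dispatch looks up args[0] across all sections. It returns the output
// lines and whether the topic was recognized at all.
func dispatch(sections []handlerSection, args []string) ([]string, bool) {
	if len(args) == 0 {
		return nil, false
	}
	name := args[0]
	for _, section := range sections {
		for _, info := range section.handlers {
			if info.name != name {
				continue
			}
			if info.zary != nil {
				if len(args) != 1 {
					return []string{fmt.Sprintf("mlr help %s takes no additional argument.", name)}, true
				}
				return []string{info.zary()}, true
			}
			if info.unary != nil {
				if len(args) < 2 {
					return []string{fmt.Sprintf("mlr help %s takes at least one required argument.", name)}, true
				}
				var out []string
				// A unary handler is applied once per remaining argument.
				for _, arg := range args[1:] {
					out = append(out, info.unary(arg))
				}
				return out, true
			}
		}
	}
	return nil, false
}

// demoTable builds a tiny stand-in table with one handler of each kind.
func demoTable() []handlerSection {
	return []handlerSection{
		{name: "Essentials", handlers: []handlerInfo{
			{name: "topics", zary: func() string { return "topic list" }},
		}},
		{name: "Verbs", handlers: []handlerInfo{
			{name: "verb", unary: func(arg string) string { return "usage for " + arg }},
		}},
	}
}

func main() {
	lines, ok := dispatch(demoTable(), []string{"verb", "cat", "sort"})
	fmt.Println(ok)
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

Returning lines rather than printing makes the argument-count rules easy to check in a unit test.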
@ -164,12 +258,14 @@ func ParseTerminalUsage(arg string) bool {
return true
}
// "mlr -l" is shorthand for "mlr help list-verbs", etc.
for _, sinfo := range shorthandLookupTable {
for _, sinfo := range shorthandLookupTable.shorthandInfos {
if sinfo.shorthand == arg {
for _, info := range handlerLookupTable {
if info.name == sinfo.longhand {
info.zaryHandlerFunc()
return true
for _, section := range handlerLookupTable.sections {
for _, info := range section.handlerInfos {
if info.name == sinfo.longhand {
info.zaryHandlerFunc()
return true
}
}
}
}
@ -185,17 +281,25 @@ func handleDefault() {
// ----------------------------------------------------------------
func listTopics() {
fmt.Println("Type 'mlr help {topic}' for any of the following:")
for _, info := range handlerLookupTable {
if !info.internal {
fmt.Printf(" mlr help %s\n", info.name)
for _, section := range handlerLookupTable.sections {
if !section.internal {
fmt.Printf("%s:\n", section.name)
for _, info := range section.handlerInfos {
fmt.Printf(" mlr help %s\n", info.name)
}
}
}
fmt.Println("Shorthands:")
for _, info := range shorthandLookupTable {
for _, info := range shorthandLookupTable.shorthandInfos {
fmt.Printf(" mlr %s = mlr help %s\n", info.shorthand, info.longhand)
}
}
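The sectioned listing above, with internal-only sections hidden from the interactive user, can be sketched as follows. `topicSection` and `topicLines` are hypothetical names, the indentation of the topic lines is a guess, and the shorthands block is left out for brevity.

```go
package main

import "fmt"

// topicSection is a simplified stand-in for a section of help topics.
type topicSection struct {
	name     string
	topics   []string
	internal bool // true for docgen-only sections hidden from listings
}

// topicLines renders the listing: a banner, then each non-internal
// section's heading followed by one indented line per topic.
func topicLines(sections []topicSection) []string {
	lines := []string{"Type 'mlr help {topic}' for any of the following:"}
	for _, section := range sections {
		if section.internal {
			continue
		}
		lines = append(lines, section.name+":")
		for _, topic := range section.topics {
			lines = append(lines, "  mlr help "+topic)
		}
	}
	return lines
}

func main() {
	sections := []topicSection{
		{name: "Essentials", topics: []string{"topics", "basic-examples"}},
		{name: "Internal/docgen", topics: []string{"list-flag-sections"}, internal: true},
	}
	for _, line := range topicLines(sections) {
		fmt.Println(line)
	}
}
```

Keeping the `internal` flag on the section rather than on each handler means one check per section hides the whole docgen group.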
// ----------------------------------------------------------------
func showFlagHelp() {
cli.FLAG_TABLE.ShowHelp()
}
// ----------------------------------------------------------------
func helpAuxents() {
fmt.Print(`Miller has a few otherwise-standalone executables packaged within it.
@ -219,16 +323,6 @@ mlr --icsv --opprint --from example.csv sort -nr index then cut -f shape,quantit
`)
}
// ----------------------------------------------------------------
func helpCommentsInData() {
cli.CommentsInDataPrintInfo()
}
// ----------------------------------------------------------------
func helpCompressedDataOptions() {
cli.CompressedDataPrintInfo()
}
// ----------------------------------------------------------------
func helpDataFormats() {
fmt.Printf(
@ -295,36 +389,6 @@ NIDX: implicitly numerically indexed (Unix-toolkit style)
`)
}
// ----------------------------------------------------------------
func helpDataFormatOptions() {
cli.FileFormatPrintInfo()
}
// ----------------------------------------------------------------
// TBD FOR MILLER 6:
func helpDoubleQuoting() {
fmt.Printf("THIS IS STILL WIP FOR MILLER 6\n")
fmt.Println(
`--quote-all Wrap all fields in double quotes
--quote-none Do not wrap any fields in double quotes, even if they have
OFS or ORS in them
--quote-minimal Wrap fields in double quotes only if they have OFS or ORS
in them (default)
--quote-numeric Wrap fields in double quotes only if they have numbers
in them
--quote-original Wrap fields in double quotes if and only if they were
quoted on input. This isn't sticky for computed fields:
e.g. if fields a and b were quoted on input and you do
"put '$c = $a . $b'" then field c won't inherit a or b's
was-quoted-on-input flag.`)
}
// ----------------------------------------------------------------
func helpFormatConversionKeystrokeSaverOptions() {
cli.FileFormatPrintInfo()
}
// ----------------------------------------------------------------
func helpMlrrc() {
fmt.Print(
@ -367,24 +431,6 @@ func helpOutputColorization() {
cli.OutputColorizationPrintInfo()
}
// ----------------------------------------------------------------
// TBD FOR MILLER 6:
func helpNumberFormatting() {
fmt.Printf("THIS IS STILL WIP FOR MILLER 6\n")
fmt.Printf(" --ofmt {format} E.g. %%.18f, %%.0f, %%9.6e. Please use sprintf-style codes for\n")
fmt.Printf(" floating-point nummbers. If not specified, default formatting is used.\n")
fmt.Printf(" See also the fmtnum function within mlr put (mlr --help-all-functions);\n")
fmt.Printf(" see also the format-values function.\n")
}
// ----------------------------------------------------------------
// TBD FOR MILLER 6:
func helpSeparatorOptions() {
cli.SeparatorPrintInfo()
}
// ----------------------------------------------------------------
func helpTypeArithmeticInfo() {
mlrvals := []*types.Mlrval{
@ -420,10 +466,14 @@ func helpTypeArithmeticInfo() {
}
// ----------------------------------------------------------------
// listFlagSections is for webdoc/manpage autogen.
// listFlagSections et al. are for webdoc/manpage autogen in the miller/docs
// and miller/man subdirectories. In showFlagHelp, all looping over the flags
// table, its sections, and the flags within each section is done within this
// Go program. By contrast, the following few methods expose the hierarchy to
// standard output, letting the calling programs (nominally Ruby autogen
// scripts) control their own looping and formatting.
func listFlagSections() {
// xxx temp factorization
cli.FLAG_TABLE.ListFlagSections()
}
@ -443,18 +493,66 @@ func listFlagsForSection(sectionName string) {
}
}
// For manpage autogen: just produce text
func showHelpForSection(sectionName string) {
if !cli.FLAG_TABLE.ShowHelpForSection(sectionName) {
fmt.Printf(
"mlr: flag-section \"%s\" not found. Please use \"mlr help list-flag-sections\" for a list.\n",
sectionName)
}
}
// For on-the-fly `mlr help foo-bar-flags` where `Foo-bar flags` is the name of
// a section in the FLAG_TABLE. See the func-init block at the top of this
// file.
func showHelpForSectionViaDowndash(downdashSectionName string) {
if !cli.FLAG_TABLE.ShowHelpForSectionViaDowndash(downdashSectionName) {
fmt.Printf("mlr: flag-section \"%s\" not found.\n", downdashSectionName)
}
}
// For webdocs autogen: we want the headline separately so we can backtick it.
func showHeadlineForFlag(flagName string) {
if !cli.FLAG_TABLE.ShowHeadlineForFlag(flagName) {
fmt.Printf("mlr: flag \"%s\" not found.\n", flagName)
}
}
// For webdocs autogen
func showHelpForFlag(flagName string) {
if !cli.FLAG_TABLE.ShowHelpForFlag(flagName) {
fmt.Printf("mlr: flag \"%s\" not found.\n", flagName)
}
}
// ----------------------------------------------------------------
func listVerbs() {
if isatty.IsTerminal(os.Stdout.Fd()) {
transformers.ListVerbNamesAsParagraph()
} else {
transformers.ListVerbNamesVertically()
}
}
func listVerbsAsParagraph() {
transformers.ListVerbNamesAsParagraph()
}
func helpForVerb(arg string) {
transformerSetup := transformers.LookUp(arg)
if transformerSetup != nil {
transformerSetup.UsageFunc(os.Stdout, true, 0)
} else {
fmt.Printf(
"mlr: verb \"%s\" not found. Please use \"mlr help list-verbs\" for a list.\n",
arg)
}
}
func usageVerbs() {
transformers.UsageVerbs()
}
// ----------------------------------------------------------------
func listFunctions() {
if isatty.IsTerminal(os.Stdout.Fd()) {
@ -512,31 +610,3 @@ func usageKeywords() {
func helpForKeyword(arg string) {
cst.UsageForKeyword(arg)
}
// ----------------------------------------------------------------
func listVerbs() {
if isatty.IsTerminal(os.Stdout.Fd()) {
transformers.ListVerbNamesAsParagraph()
} else {
transformers.ListVerbNamesVertically()
}
}
func listVerbsAsParagraph() {
transformers.ListVerbNamesAsParagraph()
}
func helpForVerb(arg string) {
transformerSetup := transformers.LookUp(arg)
if transformerSetup != nil {
transformerSetup.UsageFunc(os.Stdout, true, 0)
} else {
fmt.Printf(
"mlr: verb \"%s\" not found. Please use \"mlr help list-verbs\" for a list.\n",
arg)
}
}
func usageVerbs() {
transformers.UsageVerbs()
}


@ -1,7 +1,39 @@
// TODO: comment
// TODO: note complexity b/c serving many uses: main CLI, .mlrrc, some verbs; OLH/man/docs autogen
// TODO: why not go flags
// TODO: auto-alpha
// ================================================================
// Miller support for command-line flags.
//
// * Flags are used for several purposes:
//
// o Command-line parsing for the main mlr program.
//
// o Record-reader and record-writer options for a few verbs such as join and
// tee. E.g. `mlr --csv join -f foo.tsv --tsv ...`: the main input files are
// CSV but the join-in file is TSV.
//
// o Processing .mlrrc files.
//
// o Autogenerating on-line help for `mlr help flags`.
//
// o Autogenerating the manpage for `man mlr`.
//
// o Autogenerating webdocs (mkdocs).
//
// * For these reasons, flags are organized into tables; for documentation
// purposes, flags are organized into sections (see src/cli/option_parse.go).
//
// * The Flag struct separates out flag name (e.g. `--csv`), any alternate
// names (e.g. `-c`), any arguments the flag may take, a help string, and a
// command-line parser function.
//
// * The tabular structure may seem overwrought; in fact it has been a blessing
// to develop the tabular structure since these flags objects need to serve
// so many roles as listed above.
//
// * I don't use Go flags for a few reasons. The most important one is that I
// need to handle repeated flags, e.g. --from can be used more than once for
// mlr, and -f/-n/-r etc can be used more than once for mlr sort, etc. I also
// insist on total control of flag formatting including alphabetization of
// flags for on-line help and documentation systems.
// ================================================================
package cli
@ -11,36 +43,23 @@ import (
"strings"
"mlr/src/colorizer"
"mlr/src/lib"
)
// ----------------------------------------------------------------
// Callsites:
// * src/climain/mlrcli_parse.go
// ParseCommandLine
// MainOptions (--cpuprofile, --version, etc)
// ParseReaderOptions
// ParseWriterOptions
// ParseReaderWriterOptions
// ParseMiscOptions
// help.ParseTerminalUsage
// * handleMlrrcLine
// * nest/tee/join/put/filter:
// ParseReaderOptions
// ParseWriterOptions
// !! must use only cli package, not cli package
// sections:
// how to factor
// reader/writer/readerwriter/misc
// -> split necessary for verbs & what they do/don't accept
// vs
// data-format options, --x2y, separators, compressed, comments in data,
// csv-specific, number-formatting, other
// -> split useful for on-line help
// ----------------------------------------------------------------
// TODO: comment
// Data types used within the flags table.
// FlagParser is a function which takes a flag such as `--foo`.
//
// * It should assume that a flag.Owns method has already been invoked to be
// sure that this function is indeed the right one to call for `--foo`.
//
// * The FlagParser function is responsible for advancing *pargi by 1 (if
// `--foo`) or 2 (if `--foo bar`), checking to see if argc is long enough in
// the latter case, and mutating the options struct.
//
// * Successful handling of the flag is indicated by this function making a
// non-zero increment of *pargi.
type FlagParser func(
args []string,
argc int,
@ -48,35 +67,62 @@ type FlagParser func(
options *TOptions,
)
type SectionInfoPrinter func()
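// The FlagParser contract above can be sketched as follows. This is a minimal
// illustration, not Miller's code: the options struct, the parser names, and
// the choice to report a short argument list on stderr and return are all
// hypothetical stand-ins (the real parsers use TOptions and CheckArgCount).
// It shows a no-argument flag advancing *pargi by 1 and a one-argument flag
// advancing it by 2 only when the argument is present.

```go
package main

import (
	"fmt"
	"os"
)

// options is a hypothetical stand-in for Miller's TOptions struct.
type options struct {
	ifs     string
	verbose bool
}

// flagParser mirrors the shape of the FlagParser type described above.
type flagParser func(args []string, argc int, pargi *int, opts *options)

// Compile-time check that the sketch parsers match the signature.
var _ flagParser = parseVerbose
var _ flagParser = parseIFS

// parseVerbose handles a flag with no argument: it advances *pargi by 1.
func parseVerbose(args []string, argc int, pargi *int, opts *options) {
	opts.verbose = true
	*pargi += 1
}

// parseIFS handles a flag with one argument: it checks that the argument is
// present, then advances *pargi by 2.
func parseIFS(args []string, argc int, pargi *int, opts *options) {
	if argc-*pargi < 2 {
		fmt.Fprintf(os.Stderr, "option %s requires an argument\n", args[*pargi])
		return // *pargi is left unchanged, so the caller sees no progress
	}
	opts.ifs = args[*pargi+1]
	*pargi += 2
}

func main() {
	args := []string{"--ifs", "pipe", "-v"}
	opts := &options{}
	argi := 0
	parseIFS(args, len(args), &argi, opts)
	parseVerbose(args, len(args), &argi, opts)
	fmt.Println(argi, opts.ifs, opts.verbose) // prints "3 pipe true"
}
```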
// ----------------------------------------------------------------
// FlagTable holds all the flags for Miller, organized into sections.
type FlagTable struct {
sections []*FlagSection
}
// FlagSection holds all the flags in a given category, where these
// categories exist for documentation purposes.
//
// The name should be right-cased for webdocs. For on-line help and
// manpage use, it will get fully uppercased.
//
// The infoPrinter provides summary/overview for all flags in the
// section, for on-line help / webdocs.
type FlagSection struct {
name string // TODO: lowercase? capcase? upper? make methods?
// xxx common-info func
infoPrinter SectionInfoPrinter
name string
infoPrinter func()
flags []Flag
}
// Flag is a container for all runtime as well as documentation information for
// a flag.
type Flag struct {
// More common case: the flag has just one spelling, like "--ifs".
// In most cases, the flag has just one spelling, like "--ifs".
name string
// Less common case: the flag has more than one spelling, like "-h" and "--help",
// or "-c" and "--csv".
// In some cases, the flag has more than one spelling, like "-h" and
// "--help", or "-c" and "--csv". The altNames field can be omitted from
// struct initializers, which in Go means it will read as nil.
altNames []string
// If not "", a name for the flag's argument, for on-line help. E.g. the "bar" in "--foo {bar}".
arg string
// If not "", a name for the flag's argument, for on-line help. E.g. the
// "bar" in "--foo {bar}". It should always be written in curly braces.
arg string
// Help string for `mlr help flags`, `man mlr`, and webdocs.
// * It should be all one line within the source code. The text will be
// reformatted as a paragraph for on-line help / manpage, so there should
// be no attempt at line-breaking within the help string.
// * Any code bits should be marked with backticks. These look OK for
// on-line help / manpage, and render marvelously for webdocs which
// take markdown.
// * After changing flags you can run `sh build-go-src-test-man-doc.sh`
// followed by `git diff` to see how the output looks. See also
// the README.md files in the docs6 and man6 directories for how
// to look at the autogenned docs pre-commit.
help string
// A function for parsing the command line, as described above.
parser FlagParser
// TODO: comment
// reader, writer, reader/writer, misc = neither
// Any flag intended for record-reading only (e.g. for `mlr join`)
// should set forReader = true.
// Any flag intended for record-writing only (e.g. for `mlr tee`)
// should set forWriter = true.
// TODO: rethink this to make the normal case non-error-prone.
forReader bool
forWriter bool
}
@ -84,8 +130,9 @@ type Flag struct {
// ================================================================
// FlagTable methods
// Sort organizes the sections in the table alphabetically, to make on-line help
// easier to read.
// Sort organizes the sections in the table alphabetically, to make on-line
// help easier to read. This is done from func-init context so on-line help
// will always be easy to navigate.
func (ft *FlagTable) Sort() {
// Go sort API: for ascending sort, return true if element i < element j.
sort.Slice(ft.sections, func(i, j int) bool {
@ -93,6 +140,10 @@ func (ft *FlagTable) Sort() {
})
}
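// A minimal sketch of the sort.Slice idiom used by Sort. The comparator body
// is not visible in this hunk, so comparing section names with plain `<` is an
// assumption here, and the section type is a hypothetical stand-in for
// FlagSection.

```go
package main

import (
	"fmt"
	"sort"
)

// section is a hypothetical stand-in for FlagSection.
type section struct {
	name string
}

// sortSections orders sections by name, ascending, using byte-wise string
// comparison.
func sortSections(sections []*section) {
	sort.Slice(sections, func(i, j int) bool {
		return sections[i].name < sections[j].name
	})
}

func main() {
	sections := []*section{
		{name: "Legacy flags"},
		{name: "CSV-only flags"},
		{name: "JSON-only flags"},
	}
	sortSections(sections)
	for _, s := range sections {
		fmt.Println(s.name)
	}
}
```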
// Parse is for parsing a flag on the command line. Given say `--foo`, if a
// Flag object is found which owns the flag, and if its parser accepts it (e.g.
// `bar` is present and spelt correctly if the flag-parser expects `--foo bar`)
// then the return value is true, else false.
func (ft *FlagTable) Parse(
args []string,
argc int,
@ -106,16 +157,18 @@ func (ft *FlagTable) Parse(
// Let the flag-parser advance *pargi, depending on how many
// arguments follow the flag. E.g. `--ifs pipe` will advance
// *pargi by 2; `-I` will advance it by 1.
oargi := *pargi
flag.parser(args, argc, pargi, options)
return true
nargi := *pargi
return nargi > oargi
}
}
}
return false
}
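// The progress check in Parse can be sketched in isolation. This is a
// simplified illustration under stated assumptions: the flag and flagTable
// types here are hypothetical, the table is flat rather than split into
// sections, and ownership is a plain name match rather than the Owns method.
// The point is only that "handled" means the parser moved *pargi forward.

```go
package main

import "fmt"

// flag is a hypothetical, pared-down stand-in for the Flag struct.
type flag struct {
	name   string
	parser func(args []string, argc int, pargi *int)
}

// flagTable is a hypothetical flat table of flags.
type flagTable struct {
	flags []flag
}

// parse returns true only if some flag matches args[*pargi] and its parser
// advanced *pargi.
func (ft *flagTable) parse(args []string, argc int, pargi *int) bool {
	if *pargi >= argc {
		return false
	}
	for _, f := range ft.flags {
		if f.name == args[*pargi] {
			oargi := *pargi
			f.parser(args, argc, pargi)
			return *pargi > oargi
		}
	}
	return false
}

func main() {
	table := &flagTable{flags: []flag{
		{name: "-I", parser: func(args []string, argc int, pargi *int) { *pargi += 1 }},
		{name: "--ifs", parser: func(args []string, argc int, pargi *int) {
			// Advance only if the flag's argument is actually present.
			if argc-*pargi >= 2 {
				*pargi += 2
			}
		}},
	}}
	// The trailing --ifs has no argument, so parsing stops there.
	args := []string{"--ifs", "pipe", "-I", "--ifs"}
	argi := 0
	for argi < len(args) && table.parse(args, len(args), &argi) {
		fmt.Println("consumed up to index", argi)
	}
	fmt.Println("stopped at index", argi)
}
```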
// TODO: more options for OLH
func (ft *FlagTable) ListTemp() {
// ShowHelp prints all-in-one on-line help, nominally for `mlr help flags`.
func (ft *FlagTable) ShowHelp() {
for i, section := range ft.sections {
if i > 0 {
fmt.Println()
@ -127,14 +180,45 @@ func (ft *FlagTable) ListTemp() {
}
}
// TODO: comment more. For webdoc/manpage autogen.
// ListFlagSections exposes some of the flags-table structure, so Ruby autogen
// scripts for on-line help and webdocs can traverse the structure with looping
// inside their own code.
func (ft *FlagTable) ListFlagSections() {
for _, section := range ft.sections {
fmt.Println(section.name)
}
}
// TODO: comment more. For webdoc/manpage autogen.
// ShowHelpForSection prints the info for the named section followed by the
// help for each of its flags, so autogen scripts can request one section at a
// time. It returns false if no section has the given name.
func (ft *FlagTable) ShowHelpForSection(sectionName string) bool {
for _, section := range ft.sections {
if sectionName == section.name {
section.PrintInfo()
section.ShowHelpForFlags()
return true
}
}
return false
}
// TODO: comment
func (ft *FlagTable) ShowHelpForSectionViaDowndash(downdashSectionName string) bool {
for _, section := range ft.sections {
if downdashSectionName == section.GetDowndashSectionName() {
fmt.Println(colorizer.MaybeColorizeHelp(strings.ToUpper(section.name), true))
section.PrintInfo()
section.ShowHelpForFlags()
return true
}
}
return false
}
// PrintInfoForSection exposes some of the flags-table structure, so Ruby
// autogen scripts for on-line help and webdocs can traverse the structure with
// looping inside their own code.
func (ft *FlagTable) PrintInfoForSection(sectionName string) bool {
for _, section := range ft.sections {
if sectionName == section.name {
@ -145,7 +229,9 @@ func (ft *FlagTable) PrintInfoForSection(sectionName string) bool {
return false
}
// TODO: comment more. For webdoc/manpage autogen.
// ListFlagsForSection exposes some of the flags-table structure, so Ruby
// autogen scripts for on-line help and webdocs can traverse the structure with
// looping inside their own code.
func (ft *FlagTable) ListFlagsForSection(sectionName string) bool {
for _, section := range ft.sections {
if sectionName == section.name {
@ -156,7 +242,10 @@ func (ft *FlagTable) ListFlagsForSection(sectionName string) bool {
return false
}
// TODO: comment more. For webdoc/manpage autogen.
// Given flag named `--foo`, altName `-f`, and argument spec `{bar}`, the
// headline is `--foo or -f {bar}`. This is the bit which is highlighted in
// on-line help; its length is also used for alignment decisions in the on-line
// help and the manpage.
func (ft *FlagTable) ShowHeadlineForFlag(flagName string) bool {
for _, fs := range ft.sections {
for _, flag := range fs.flags {
@ -169,7 +258,10 @@ func (ft *FlagTable) ShowHeadlineForFlag(flagName string) bool {
return false
}
// TODO: comment more. For webdoc/manpage autogen.
// TODO: individualize these comments
// ShowHelpForFlag prints the flag's help-string all on one line. This is for
// webdoc usage where the browser does dynamic line-wrapping, as the user
// resizes the browser window.
func (ft *FlagTable) ShowHelpForFlag(flagName string) bool {
for _, fs := range ft.sections {
for _, flag := range fs.flags {
@ -182,13 +274,25 @@ func (ft *FlagTable) ShowHelpForFlag(flagName string) bool {
return false
}
// Map "CSV-only flags" to "csv-only-flags" etc. for the benefit of per-section
// help in `mlr help topics`.
func (ft *FlagTable) GetDowndashSectionNames() []string {
downdashSectionNames := make([]string, len(ft.sections))
for i, fs := range ft.sections {
// Get names like "CSV-only flags" from the FLAG_TABLE.
// Downcase and replace spaces with dashes to get names like
// "csv-only-flags"
downdashSectionNames[i] = fs.GetDowndashSectionName()
}
return downdashSectionNames
}
// ================================================================
// FlagSection methods
// TODO: more options for OLH
// Sort organizes the flags in the section alphabetically, to make on-line help
// easier to read.
// easier to read. This is done from func-init context so on-line help will
// always be easy to navigate.
func (fs *FlagSection) Sort() {
// Go sort API: for ascending sort, return true if element i < element j.
sort.Slice(fs.flags, func(i, j int) bool {
@ -196,6 +300,17 @@ func (fs *FlagSection) Sort() {
})
}
// ShowHelpForFlags prints all-in-one on-line help, nominally for `mlr help
// flags`.
func (fs *FlagSection) ShowHelpForFlags() {
for _, flag := range fs.flags {
flag.ShowHelp()
}
}
// PrintInfo exposes some of the flags-table structure, so Ruby autogen scripts
// for on-line help and webdocs can traverse the structure with looping inside
// their own code.
func (fs *FlagSection) PrintInfo() {
// TODO: remove with nilabend check
if fs.infoPrinter != nil {
@ -204,18 +319,19 @@ func (fs *FlagSection) PrintInfo() {
}
}
// TODO: more options for OLH
// ListFlags exposes some of the flags-table structure, so Ruby autogen scripts
// for on-line help and webdocs can traverse the structure with looping inside
// their own code.
func (fs *FlagSection) ListFlags() {
for _, flag := range fs.flags {
fmt.Println(flag.name)
}
}
// TODO: more options for OLH
func (fs *FlagSection) ShowHelpForFlags() {
for _, flag := range fs.flags {
flag.ListTemp()
}
// Map "CSV-only flags" to "csv-only-flags" etc. for the benefit of per-section
// help in `mlr help topics`.
func (fs *FlagSection) GetDowndashSectionName() string {
return strings.ReplaceAll(strings.ToLower(fs.name), " ", "-")
}
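// A standalone sketch of the downdash mapping, using section names that appear
// in this file. The free function name `downdash` is hypothetical; the body
// matches GetDowndashSectionName above.

```go
package main

import (
	"fmt"
	"strings"
)

// downdash lowercases a section name and replaces spaces with dashes.
func downdash(sectionName string) string {
	return strings.ReplaceAll(strings.ToLower(sectionName), " ", "-")
}

func main() {
	for _, name := range []string{"CSV-only flags", "PPRINT-only flags", "Legacy flags"} {
		fmt.Printf("%s -> %s\n", name, downdash(name))
	}
}
```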
// ================================================================
@ -236,17 +352,82 @@ func (flag *Flag) Owns(input string) bool {
return false
}
func (flag *Flag) ListTemp() {
displayText := fmt.Sprintf("%-31s", flag.GetHeadline())
// TODO: abend if flag.help == ""
help := flag.help
if help == "" {
help = "TODO WRITEME"
// ShowHelp produces formatting for `mlr help flags` and manpage use.
// Example:
// * Flag name is `--foo`
// * altName is `-f`
// * Argument spec is `{bar}`
// * Help string is "Lorem ipsum dolor sit amet, consectetur adipiscing elit,
// sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
// ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
// ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
// velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
// cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
// est laborum."
// * The headline (see the GetHeadline function) is `--foo or -f {bar}`.
// * We place the headline left in a 25-character column, colorized with the
// help color.
// * We format the help text as 55-character lines and place them
// to the right.
// * The result looks like
//
// --foo or -f {bar} Lorem ipsum dolor sit amet, consectetur adipiscing
// elit, sed do eiusmod tempor incididunt ut labore et
// dolore magna aliqua. Ut enim ad minim veniam, quis
// nostrud exercitation ullamco laboris nisi ut aliquip
// ex ea commodo consequat. Duis aute irure dolor in
// reprehenderit in voluptate velit esse cillum dolore
// eu fugiat nulla pariatur. Excepteur sint occaecat
// cupidatat non proident, sunt in culpa qui officia
// deserunt mollit anim id est laborum.
//
// * If the headline is too long we put the first help line a line below like this:
//
// --foo-flag-is-very-very-long {bar}
// Lorem ipsum dolor sit amet, consectetur adipiscing
// elit, sed do eiusmod tempor incididunt ut labore et
// dolore magna aliqua. Ut enim ad minim veniam, quis
// nostrud exercitation ullamco laboris nisi ut aliquip
// ex ea commodo consequat. Duis aute irure dolor in
// reprehenderit in voluptate velit esse cillum dolore
// eu fugiat nulla pariatur. Excepteur sint occaecat
// cupidatat non proident, sunt in culpa qui officia
// deserunt mollit anim id est laborum.
//
func (flag *Flag) ShowHelp() {
headline := flag.GetHeadline()
displayHeadline := fmt.Sprintf("%-25s", headline)
broken := len(headline) >= 25
helpLines := lib.FormatAsParagraph(flag.help, 55)
if broken {
fmt.Printf("%s\n", colorizer.MaybeColorizeHelp(displayHeadline, true))
for _, helpLine := range helpLines {
fmt.Printf("%25s%s\n", " ", helpLine)
}
} else {
fmt.Printf("%s", colorizer.MaybeColorizeHelp(displayHeadline, true))
if len(helpLines) == 0 {
fmt.Println()
}
for i, helpLine := range helpLines {
if i == 0 {
fmt.Printf("%s\n", helpLine)
} else {
fmt.Printf("%25s%s\n", " ", helpLine)
}
}
}
fmt.Printf("%s %s\n", colorizer.MaybeColorizeHelp(displayText, true), help)
}
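// The two-column layout described above can be sketched without the colorizer.
// Assumptions, stated loudly: wrapWords is a hypothetical greedy word-wrapper
// standing in for lib.FormatAsParagraph (whose exact behavior is not shown
// here), and formatHelp returns lines instead of printing them so the result
// is easy to inspect.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapWords greedily packs words into lines of at most width characters
// (a single word longer than width gets a line to itself).
func wrapWords(text string, width int) []string {
	var lines []string
	line := ""
	for _, word := range strings.Fields(text) {
		if line == "" {
			line = word
		} else if len(line)+1+len(word) <= width {
			line += " " + word
		} else {
			lines = append(lines, line)
			line = word
		}
	}
	if line != "" {
		lines = append(lines, line)
	}
	return lines
}

// formatHelp lays out a headline in a 25-character left column and the help
// text in 55-character lines to its right. A headline of 25 or more
// characters goes on its own line, with all help lines indented below it.
func formatHelp(headline, help string) []string {
	const headlineWidth = 25
	const helpWidth = 55
	indent := strings.Repeat(" ", headlineWidth)
	helpLines := wrapWords(help, helpWidth)

	var out []string
	if len(headline) >= headlineWidth {
		out = append(out, headline)
		for _, helpLine := range helpLines {
			out = append(out, indent+helpLine)
		}
		return out
	}

	padded := fmt.Sprintf("%-*s", headlineWidth, headline)
	if len(helpLines) == 0 {
		return append(out, padded)
	}
	for i, helpLine := range helpLines {
		if i == 0 {
			out = append(out, padded+helpLine)
		} else {
			out = append(out, indent+helpLine)
		}
	}
	return out
}

func main() {
	help := "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
	for _, line := range formatHelp("--foo or -f {bar}", help) {
		fmt.Println(line)
	}
	for _, line := range formatHelp("--foo-flag-is-very-very-long {bar}", help) {
		fmt.Println(line)
	}
}
```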
// TODO: comment
// GetHeadline puts together the flag name, any altNames, and any argument spec
// into a single string for the left column of online help / manpage content.
// Given flag named `--foo`, altName `-f`, and argument spec `{bar}`, the
// headline is `--foo or -f {bar}`. This is the bit which is highlighted in
// on-line help; its length is also used for alignment decisions in the on-line
// help and the manpage.
func (flag *Flag) GetHeadline() string {
displayNames := make([]string, 1)
displayNames[0] = flag.name
@ -261,6 +442,10 @@ func (flag *Flag) GetHeadline() string {
return displayText
}
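// A sketch of the headline assembly described in the comment above. Part of
// the GetHeadline body is elided in this hunk, so this is a reconstruction
// from the comment's example (`--foo or -f {bar}`), not the function itself;
// the free function name and signature are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// headline joins the flag name and any alternate names with " or ", then
// appends the argument spec if there is one.
func headline(name string, altNames []string, arg string) string {
	names := append([]string{name}, altNames...)
	text := strings.Join(names, " or ")
	if arg != "" {
		text += " " + arg
	}
	return text
}

func main() {
	fmt.Println(headline("--foo", []string{"-f"}, "{bar}")) // prints "--foo or -f {bar}"
	fmt.Println(headline("--ifs", nil, "{string}"))         // prints "--ifs {string}"
	fmt.Println(headline("--jlistwrap", []string{"--jl"}, "")) // prints "--jlistwrap or --jl"
}
```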
// GetHelpOneLine returns the help string all on one line (just in case anyone
// typed it in using multiline string-literal backtick notation in Go). This
// is suitable for webdoc use, where we emit a single line and the browser
// dynamically line-wraps as the user resizes the window.
func (flag *Flag) GetHelpOneLine() string {
return strings.Join(strings.Split(flag.help, "\n"), " ")
}
@ -273,5 +458,3 @@ func (flag *Flag) GetHelpOneLine() string {
func NoOpParse1(args []string, argc int, pargi *int, options *TOptions) {
*pargi += 1
}
var NoOpHelp string = "No-op pass-through for backward compatibility with Miller 5."


@ -1247,9 +1247,10 @@ func ParseMiscOptions(
}
argi += 2
} else if args[argi] == "--list" {
// TODO: some terminal/main/something
} else if args[argi] == "-g" {
argi += 1
FLAG_TABLE.ListTemp()
FLAG_TABLE.ShowHelp()
os.Exit(0)
}
@ -1265,6 +1266,7 @@ var FLAG_TABLE = FlagTable{
&SeparatorFlagSection,
&FileFormatFlagSection,
&FormatConversionKeystrokeSaverFlagSection,
// TODO: &HelpFlags, here or in climain?
&JSONOnlyFlagSection,
&CSVOnlyFlagSection,
&PPRINTOnlyFlagSection,
@ -1294,19 +1296,19 @@ TODO: auto-detect is still TBD for Miller 6
Notes about line endings:
* Default line endings (`+"`--irs`"+` and `+"`--ors`"+`) are "auto" which means autodetect from
* Default line endings (` + "`--irs`" + ` and ` + "`--ors`" + `) are "auto" which means autodetect from
the input file format, as long as the input file(s) have lines ending in either
LF (also known as linefeed, `+"`\\n`"+`, `+"`0x0a`"+`, or Unix-style) or CRLF (also known as
carriage-return/linefeed pairs, `+"`\\r\\n`"+`, `+"`0x0d 0x0a`"+`, or Windows-style).
* If both `+"`irs`"+` and `+"`ors`"+` are `+"`auto`"+` (which is the default) then LF input will lead to LF
LF (also known as linefeed, ` + "`\\n`" + `, ` + "`0x0a`" + `, or Unix-style) or CRLF (also known as
carriage-return/linefeed pairs, ` + "`\\r\\n`" + `, ` + "`0x0d 0x0a`" + `, or Windows-style).
* If both ` + "`irs`" + ` and ` + "`ors`" + ` are ` + "`auto`" + ` (which is the default) then LF input will lead to LF
output and CRLF input will lead to CRLF output, regardless of the platform you're
running on.
* The line-ending autodetector triggers on the first line ending detected in the input
stream. E.g. if you specify a CRLF-terminated file on the command line followed by an
LF-terminated file then autodetected line endings will be CRLF.
* If you use `+"`--ors {something else}`"+` with (default or explicitly specified) `+"`--irs auto`"+`
* If you use ` + "`--ors {something else}`" + ` with (default or explicitly specified) ` + "`--irs auto`" + `
then line endings are autodetected on input and set to what you specify on output.
* If you use `+"`--irs {something else}`"+` with (default or explicitly specified) `+"`--ors auto`"+`
* If you use ` + "`--irs {something else}`" + ` with (default or explicitly specified) ` + "`--ors auto`" + `
then the output line endings used are LF on Unix/Linux/BSD/MacOSX, and CRLF on Windows.
Notes about all other separators:
@ -1315,21 +1317,21 @@ Notes about all other separators:
do key-value pairs appear juxtaposed.
* IRS/ORS are ignored for XTAB format. Nominally IFS and OFS are newlines;
XTAB records are separated by two or more consecutive IFS/OFS -- i.e.
a blank line. Everything above about `+"`--irs/--ors/--rs auto`"+` becomes `+"`--ifs/--ofs/--fs`"+`
a blank line. Everything above about ` + "`--irs/--ors/--rs auto`" + ` becomes ` + "`--ifs/--ofs/--fs`" + `
auto for XTAB format. (XTAB's default IFS/OFS are "auto".)
* OFS must be single-character for PPRINT format. This is because it is used
with repetition for alignment; multi-character separators would make
alignment impossible.
* OPS may be multi-character for XTAB format, in which case alignment is
disabled.
* TSV is simply CSV using tab as field separator (`+"`--fs tab`"+`).
* TSV is simply CSV using tab as field separator (` + "`--fs tab`" + `).
* FS/PS are ignored for markdown format; RS is used.
* All FS and PS options are ignored for JSON format, since they are not relevant
to the JSON format.
* You can specify separators in any of the following ways, shown by example:
- Type them out, quoting as necessary for shell escapes, e.g.
`+"`--fs '|' --ips :`"+`
- C-style escape sequences, e.g. `+"`--rs '\\r\\n' --fs '\\t'`"+`.
` + "`--fs '|' --ips :`" + `
- C-style escape sequences, e.g. ` + "`--rs '\\r\\n' --fs '\\t'`" + `.
- To avoid backslashing, you can use any of the following names:
TODO desc-to-chars map
@ -1359,7 +1361,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--ifs",
arg: "{string}",
arg: "{string}",
help: "Specify FS for input.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1373,7 +1375,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--ips",
arg: "{string}",
arg: "{string}",
help: "Specify PS for input.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1387,7 +1389,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--irs",
arg: "{string}",
arg: "{string}",
help: "Specify RS for input.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1412,7 +1414,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--ors",
arg: "{string}",
arg: "{string}",
help: "Specify RS for output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1424,7 +1426,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--ofs",
arg: "{string}",
arg: "{string}",
help: "Specify FS for output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1436,7 +1438,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--ops",
arg: "{string}",
arg: "{string}",
help: "Specify PS for output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1448,7 +1450,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--rs",
arg: "{string}",
arg: "{string}",
help: "Specify RS for input and output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1464,7 +1466,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--fs",
arg: "{string}",
arg: "{string}",
help: "Specify FS for input and output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1480,7 +1482,7 @@ var SeparatorFlagSection = FlagSection{
{
name: "--ps",
arg: "{string}",
arg: "{string}",
help: "Specify PS for input and output.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1506,7 +1508,7 @@ func JSONOnlyPrintInfo() {
func init() { JSONOnlyFlagSection.Sort() }
var JSONOnlyFlagSection = FlagSection{
name: "JSON-only flags",
name: "JSON-only flags",
infoPrinter: JSONOnlyPrintInfo,
flags: []Flag{
@ -1531,16 +1533,15 @@ var JSONOnlyFlagSection = FlagSection{
},
{
name: "--jlistwrap",
name: "--jlistwrap",
altNames: []string{"--jl"},
help: "Wrap JSON output in outermost `[ ]`.",
help: "Wrap JSON output in outermost `[ ]`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
options.WriterOptions.WrapJSONOutputInOuterList = true
*pargi += 1
},
forWriter: true,
},
},
}
@ -1554,7 +1555,7 @@ func PPRINTOnlyPrintInfo() {
func init() { PPRINTOnlyFlagSection.Sort() }
var PPRINTOnlyFlagSection = FlagSection{
name: "PPRINT-only flags",
name: "PPRINT-only flags",
infoPrinter: PPRINTOnlyPrintInfo,
flags: []Flag{
@ -1589,31 +1590,42 @@ They are accepted as no-op flags in order to keep old scripts from breaking.`)
func init() { LegacyFlagSection.Sort() }
var LegacyFlagSection = FlagSection{
name: "Legacy flags",
name: "Legacy flags",
infoPrinter: LegacyFlagInfoPrint,
flags: []Flag{
{
name: "--mmap",
help: "Miller no longer uses memory-mapping to access data files.",
help: "Miller no longer uses memory-mapping to access data files.",
parser: NoOpParse1,
forReader: true,
},
{
name: "--no-mmap",
help: "Miller no longer uses memory-mapping to access data files.",
help: "Miller no longer uses memory-mapping to access data files.",
parser: NoOpParse1,
forReader: true,
},
{
name: "--no-fflush",
help: "The current implementation of Miller does not use buffered output, so there is no longer anything to suppress here.",
help: "The current implementation of Miller does not use buffered output, so there is no longer anything to suppress here.",
parser: NoOpParse1,
forWriter: true,
},
{
name: "--jsonx",
help: "The `--jvstack` flag is now default true in Miller 6.",
parser: NoOpParse1,
},
{
name: "--ojsonx",
help: "The `--jvstack` flag is now default true in Miller 6.",
parser: NoOpParse1,
},
{
name: "--jknquoteint",
help: "Type information from JSON input files is now preserved throughout the processing stream.",
@ -1639,7 +1651,7 @@ var LegacyFlagSection = FlagSection{
},
{
name: "--json-skip-arrays-on-input",
name: "--json-map-arrays-on-input",
help: "Miller now supports arrays as of version 6.",
parser: NoOpParse1,
},
@ -1659,12 +1671,12 @@ func FileFormatPrintInfo() {
// TODO
fmt.Println(`TO DO: brief list of formats w/ xref to m6 webdocs.
Examples: `+"`--csv`"+` for CSV-formatted input and output; `+"`--icsv --opprint`"+` for
Examples: ` + "`--csv`" + ` for CSV-formatted input and output; ` + "`--icsv --opprint`" + ` for
CSV-formatted input and pretty-printed output.
Please use `+"`--iformat1 --oformat2`"+` rather than `+"`--format1 --oformat2`"+`.
The latter sets up input and output flags for `+"`format1`"+`, not all of which
are overridden in all cases by setting output format to `+"`format2`"+`.`)
Please use ` + "`--iformat1 --oformat2`" + ` rather than ` + "`--format1 --oformat2`" + `.
The latter sets up input and output flags for ` + "`format1`" + `, not all of which
are overridden in all cases by setting output format to ` + "`format2`" + `.`)
}
//--idkvp --odkvp --dkvp Delimited key-value pairs, e.g "a=1,b=2"
@ -1708,10 +1720,6 @@ are overridden in all cases by setting output format to `+"`format2`"+`.`)
//--ijson --ojson --json JSON tabular: sequence or list of one-level
// maps: {...}{...} or [{...},{...}].
// --jsonx --ojsonx Keystroke-savers for --json --jvstack
// --jsonx --ojsonx and --ojson --jvstack, respectively.
func init() { FileFormatFlagSection.Sort() }
var FileFormatFlagSection = FlagSection{
@ -1835,7 +1843,7 @@ var FileFormatFlagSection = FlagSection{
{
name: "-i",
arg: "{format name}",
arg: "{format name}",
help: "Use format name for input data. For example: `-i csv` is the same as `--icsv`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1890,7 +1898,7 @@ var FileFormatFlagSection = FlagSection{
{
name: "-o",
arg: "{format name}",
arg: "{format name}",
help: "Use format name for output data. For example: `-o csv` is the same as `--ocsv`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -1993,15 +2001,6 @@ var FileFormatFlagSection = FlagSection{
*pargi += 1
},
},
{
name: "--ojsonx",
help: "TODO", // move to legacy
parser: func(args []string, argc int, pargi *int, options *TOptions) {
// --jvstack is now the default in Miller 6 so this is just for backward compatibility
options.WriterOptions.OutputFileFormat = "json"
*pargi += 1
},
},
{
name: "--onidx",
@ -2034,7 +2033,7 @@ var FileFormatFlagSection = FlagSection{
{
name: "--io",
arg: "{format name}",
arg: "{format name}",
help: "Use format name for input and output data. For example: `--io csv` is the same as `--csv`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -2159,16 +2158,6 @@ var FileFormatFlagSection = FlagSection{
*pargi += 1
},
},
{
name: "--jsonx",
help: "TODO", // move to legacy
parser: func(args []string, argc int, pargi *int, options *TOptions) {
// --jvstack is now the default in Miller 6 so this is just for backward compatibility
options.ReaderOptions.InputFileFormat = "json"
options.WriterOptions.OutputFileFormat = "json"
*pargi += 1
},
},
{
name: "--nidx",
@ -2212,17 +2201,20 @@ var FileFormatFlagSection = FlagSection{
// FORMAT-CONVERSION KEYSTROKE-SAVER FLAGS
func FormatConversionKeystrokeSaverPrintInfo() {
fmt.Println(`As keystroke-savers for format-conversion you may use the following:
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--t2c --t2d --t2n --t2j --t2x --t2p --t2m
--d2c --d2t --d2n --d2j --d2x --d2p --d2m
--n2c --n2t --n2d --n2j --n2x --n2p --n2m
--j2c --j2t --j2d --j2n --j2x --j2p --j2m
--x2c --x2t --x2d --x2n --x2j --x2p --x2m
--p2c --p2t --p2d --p2n --p2j --p2x --p2m
The letters c t d n j x p m refer to formats CSV, TSV, DKVP, NIDX, JSON, XTAB,
PPRINT, and markdown, respectively. Note that markdown format is available for
output only.`)
fmt.Println(`As keystroke-savers for format-conversion you may use the following.
The letters c, t, j, d, n, x, p, and m refer to formats CSV, TSV, JSON, DKVP,
NIDX, XTAB, PPRINT, and markdown, respectively. Note that markdown format is
available for output only.
| In \ out | CSV | TSV | JSON | DKVP | NIDX | XTAB | PPRINT | Markdown |
| CSV | | --c2t | --c2j | --c2d | --c2n | --c2x | --c2p | --c2m |
| TSV | --t2c | | --t2j | --t2d | --t2n | --t2x | --t2p | --t2m |
| JSON | --j2c | --j2t | | --j2d | --j2n | --j2x | --j2p | --j2m |
| DKVP | --d2c | --d2t | --d2j | | --d2n | --d2x | --d2p | --d2m |
| NIDX | --n2c | --n2t | --n2j | --n2d | | --n2x | --n2p | --n2m |
| XTAB | --x2c | --x2t | --x2j | --x2d | --x2n | | --x2p | --x2m |
| PPRINT | --p2c | --p2t | --p2j | --p2d | --p2n | --p2x | | --p2m |
`)
}
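The table above follows a regular pattern: each flag is `--{input letter}2{output letter}`, with markdown available on the output side only. The following is a minimal sketch of that pattern, not the code in this commit; the `format` type and the `keystrokeSavers` function are hypothetical names, and the table in the help text does not cover extra entries such as `--c2b`.

```go
package main

import "fmt"

// format pairs a one-letter abbreviation with a format name.
// This type is hypothetical and exists only for this sketch.
type format struct {
	letter string
	name   string
}

// Formats usable for input, in the same order as the help table.
var inputFormats = []format{
	{"c", "CSV"}, {"t", "TSV"}, {"j", "JSON"}, {"d", "DKVP"},
	{"n", "NIDX"}, {"x", "XTAB"}, {"p", "PPRINT"},
}

// Markdown is available for output only, so it is appended here alone.
var outputFormats = append(append([]format{}, inputFormats...),
	format{"m", "markdown-tabular"})

// keystrokeSavers builds a flag-name-to-help-text map for every
// input/output pair, skipping the diagonal (same format on both sides).
func keystrokeSavers() map[string]string {
	savers := make(map[string]string)
	for _, in := range inputFormats {
		for _, out := range outputFormats {
			if in.letter == out.letter {
				continue
			}
			flag := "--" + in.letter + "2" + out.letter
			savers[flag] = fmt.Sprintf("Use %s for input, %s for output.", in.name, out.name)
		}
	}
	return savers
}

func main() {
	savers := keystrokeSavers()
	fmt.Println(len(savers), "flags")
	fmt.Println("--c2p:", savers["--c2p"])
}
```

Generating the names this way keeps the letter-to-format mapping in one place, which is where the help text is easiest to get out of order.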
func init() { FormatConversionKeystrokeSaverFlagSection.Sort() }
@ -2896,7 +2888,7 @@ var CSVOnlyFlagSection = FlagSection{
{
name: "--allow-ragged-csv-input",
altNames: []string{"--ragged"},
help: "If a data line has fewer fields than the header line, fill remaining keys with empty string. If a data line has more fields than the header line, use integer field labels as in the implicit-header case.",
help: "If a data line has fewer fields than the header line, fill remaining keys with empty string. If a data line has more fields than the header line, use integer field labels as in the implicit-header case.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
options.ReaderOptions.AllowRaggedCSVInput = true
*pargi += 1
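The help text for `--allow-ragged-csv-input` describes two cases: short data lines are padded with empty strings, and long data lines get integer labels for the surplus fields. The following is a minimal sketch of that rule, not Miller's reader code. The `field` type and `raggedRecord` function are hypothetical, and labeling a surplus field by its 1-up position is an assumption based on "as in the implicit-header case".

```go
package main

import (
	"fmt"
	"strconv"
)

// field is a hypothetical key-value pair; a slice of them keeps field order.
type field struct {
	key   string
	value string
}

// raggedRecord pairs a header line with a data line that may be shorter
// or longer than the header.
func raggedRecord(header, data []string) []field {
	record := make([]field, 0, len(header))
	// Fewer data fields than header fields: remaining keys get empty strings.
	for i, key := range header {
		value := ""
		if i < len(data) {
			value = data[i]
		}
		record = append(record, field{key, value})
	}
	// More data fields than header fields: label the surplus by 1-up position
	// (an assumption for this sketch).
	for i := len(header); i < len(data); i++ {
		record = append(record, field{strconv.Itoa(i + 1), data[i]})
	}
	return record
}

func main() {
	header := []string{"a", "b", "c"}
	fmt.Println(raggedRecord(header, []string{"4", "5"}))
	fmt.Println(raggedRecord(header, []string{"6", "7", "8", "9"}))
}
```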
@ -2967,6 +2959,23 @@ var CSVOnlyFlagSection = FlagSection{
// },
//},
//func helpDoubleQuoting() {
// fmt.Printf("THIS IS STILL WIP FOR MILLER 6\n")
// fmt.Println(
// `--quote-all Wrap all fields in double quotes
//--quote-none Do not wrap any fields in double quotes, even if they have
// OFS or ORS in them
//--quote-minimal Wrap fields in double quotes only if they have OFS or ORS
// in them (default)
//--quote-numeric Wrap fields in double quotes only if they have numbers
// in them
//--quote-original Wrap fields in double quotes if and only if they were
// quoted on input. This isn't sticky for computed fields:
// e.g. if fields a and b were quoted on input and you do
// "put '$c = $a . $b'" then field c won't inherit a or b's
// was-quoted-on-input flag.`)
//}
},
}
@ -2976,14 +2985,14 @@ var CSVOnlyFlagSection = FlagSection{
func CompressedDataPrintInfo() {
fmt.Print(`Miller offers a few different ways to handle reading data files which have been compressed.
* Decompression done within the Miller process itself: `+"`--bz2in`"+` `+"`--gzin`"+` `+"`--zin`"+`
* Decompression done outside the Miller process: `+"`--prepipe`"+` `+"`--prepipex`"+`
* Decompression done within the Miller process itself: ` + "`--bz2in`" + ` ` + "`--gzin`" + ` ` + "`--zin`" + `
* Decompression done outside the Miller process: ` + "`--prepipe`" + ` ` + "`--prepipex`" + `
Using `+"`--prepipe`"+` and `+"`--prepipex`"+` you can specify an action to be
Using ` + "`--prepipe`" + ` and ` + "`--prepipex`" + ` you can specify an action to be
taken on each input file. The prepipe command must be able to read from
standard input; it will be invoked with `+"`{command} < {filename}`"+`. The
standard input; it will be invoked with ` + "`{command} < {filename}`" + `. The
prepipex command must take a filename as argument; it will be invoked with
`+"`{command} {filename}`"+`.
` + "`{command} {filename}`" + `.
Examples:
@ -2995,11 +3004,11 @@ Examples:
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
`+"`mlr ... | {your compression command} > outputfilenamegoeshere`"+`
` + "`mlr ... | {your compression command} > outputfilenamegoeshere`" + `
Lastly, note that if `+"`--prepipe`"+` or `+"`--prepipex`"+` is specified, it replaces any
Lastly, note that if ` + "`--prepipe`" + ` or ` + "`--prepipex`" + ` is specified, it replaces any
decisions that might have been made based on the file suffix. Likewise,
`+"`--gzin`"+`/`+"`--bz2in`"+`/`+"`--zin`"+` are ignored if `+"`--prepipe`"+` is also specified.
` + "`--gzin`" + `/` + "`--bz2in`" + `/` + "`--zin`" + ` are ignored if ` + "`--prepipe`" + ` is also specified.
`)
}
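The difference between `--prepipe` and `--prepipex` described above comes down to whether `<` is placed between the command and the filename. The following is a minimal sketch of that difference; `prepipeCommandLine` is a hypothetical helper, not a function from this commit, and it does no shell quoting, which real code would have to consider.

```go
package main

import "fmt"

// prepipeCommandLine builds the shell command line for one input file.
// With isPrepipex false the command reads the file on standard input;
// with isPrepipex true the filename is passed as an argument.
// No quoting or escaping is done here (a simplification for this sketch).
func prepipeCommandLine(command, filename string, isPrepipex bool) string {
	if isPrepipex {
		return command + " " + filename
	}
	return command + " < " + filename
}

func main() {
	fmt.Println(prepipeCommandLine("gunzip", "myfile.csv.gz", false))
	fmt.Println(prepipeCommandLine("unzip -qc", "myfile.zip", true))
}
```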
@ -3012,7 +3021,7 @@ var CompressedDataFlagSection = FlagSection{
{
name: "--prepipe",
arg: "{decompression command}",
arg: "{decompression command}",
help: "You can, of course, already do without this for single input files, e.g. `gunzip < myfile.csv.gz | mlr ...`. Allowed at the command line, but not in `.mlrrc` to avoid unexpected code execution.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3024,7 +3033,7 @@ var CompressedDataFlagSection = FlagSection{
{
name: "--prepipex",
arg: "{decompression command}",
arg: "{decompression command}",
help: "Like `--prepipe` with one exception: doesn't insert `<` between command and filename at runtime. Useful for some commands like `unzip -qc` which don't read standard input. Allowed at the command line, but not in `.mlrrc` to avoid unexpected code execution.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3125,7 +3134,7 @@ var CommentsInDataFlagSection = FlagSection{
{
name: "--skip-comments",
help: "Ignore commented lines (prefixed by `"+DEFAULT_COMMENT_STRING+"`) within the input.",
help: "Ignore commented lines (prefixed by `" + DEFAULT_COMMENT_STRING + "`) within the input.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
options.ReaderOptions.CommentString = DEFAULT_COMMENT_STRING
options.ReaderOptions.CommentHandling = SkipComments
@ -3135,7 +3144,7 @@ var CommentsInDataFlagSection = FlagSection{
{
name: "--skip-comments-with",
arg: "{string}",
arg: "{string}",
help: "Ignore commented lines within input, with specified prefix.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3147,7 +3156,7 @@ var CommentsInDataFlagSection = FlagSection{
{
name: "--pass-comments",
help: "Immediately print commented lines (prefixed by `"+DEFAULT_COMMENT_STRING+"`) within the input.",
help: "Immediately print commented lines (prefixed by `" + DEFAULT_COMMENT_STRING + "`) within the input.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
options.ReaderOptions.CommentString = DEFAULT_COMMENT_STRING
options.ReaderOptions.CommentHandling = PassComments
@ -3157,7 +3166,7 @@ var CommentsInDataFlagSection = FlagSection{
{
name: "--pass-comments-with",
arg: "{string}",
arg: "{string}",
help: "Immediately print commented lines within input, with specified prefix.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3185,16 +3194,16 @@ Things having colors:
Rules for coloring:
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: `+"`mlr --csv cat foo.csv`"+`
* Example: no color: `+"`mlr --csv cat foo.csv > bar.csv`"+`
* Example: no color: `+"`mlr --csv cat foo.csv | less`"+`
* Example: color: ` + "`mlr --csv cat foo.csv`" + `
* Example: no color: ` + "`mlr --csv cat foo.csv > bar.csv`" + `
* Example: no color: ` + "`mlr --csv cat foo.csv | less`" + `
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
Mechanisms for coloring:
* Miller uses ANSI escape sequences only. This does not work on Windows except within Cygwin.
* Requires `+"`TERM`"+` environment variable to be set to non-empty string.
* Requires ` + "`TERM`" + ` environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
@ -3202,25 +3211,25 @@ Mechanisms for coloring:
How you can control colorization:
* Suppression/unsuppression:
* Environment variable `+"`export MLR_NO_COLOR=true`"+` means don't color even if stdout+TTY.
* Environment variable `+"`export MLR_ALWAYS_COLOR=true`"+` means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to `+"`less -r`"+`.
* Command-line flags `+"`--no-color`"+` or `+"`-M`"+`, `+"`--always-color`"+` or `+"`-C`"+`.
* Environment variable ` + "`export MLR_NO_COLOR=true`" + ` means don't color even if stdout+TTY.
* Environment variable ` + "`export MLR_ALWAYS_COLOR=true`" + ` means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to ` + "`less -r`" + `.
* Command-line flags ` + "`--no-color`" + ` or ` + "`-M`" + `, ` + "`--always-color`" + ` or ` + "`-C`" + `.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* `+"`export MLR_KEY_COLOR=208`"+`, `+"`MLR_VALUE_COLOR=33`"+`, etc.:
`+"`MLR_KEY_COLOR`"+` `+"`MLR_VALUE_COLOR`"+` `+"`MLR_PASS_COLOR`"+` `+"`MLR_FAIL_COLOR`"+`
`+"`MLR_REPL_PS1_COLOR`"+` `+"`MLR_REPL_PS2_COLOR`"+` `+"`MLR_HELP_COLOR`"+`
* Command-line flags `+"`--key-color 208`"+`, `+"`--value-color 33`"+`, etc.:
`+"`--key-color`"+` `+"`--value-color`"+` `+"`--pass-color`"+` `+"`--fail-color`"+`
`+"`--repl-ps1-color`"+` `+"`--repl-ps2-color`"+` `+"`--help-color`"+`
* This is particularly useful if your terminal's background color clashes with current settings.
* ` + "`export MLR_KEY_COLOR=208`" + `, ` + "`MLR_VALUE_COLOR=33`" + `, etc.:
` + "`MLR_KEY_COLOR`" + ` ` + "`MLR_VALUE_COLOR`" + ` ` + "`MLR_PASS_COLOR`" + ` ` + "`MLR_FAIL_COLOR`" + `
` + "`MLR_REPL_PS1_COLOR`" + ` ` + "`MLR_REPL_PS2_COLOR`" + ` ` + "`MLR_HELP_COLOR`" + `
* Command-line flags ` + "`--key-color 208`" + `, ` + "`--value-color 33`" + `, etc.:
` + "`--key-color`" + ` ` + "`--value-color`" + ` ` + "`--pass-color`" + ` ` + "`--fail-color`" + `
` + "`--repl-ps1-color`" + ` ` + "`--repl-ps2-color`" + ` ` + "`--help-color`" + `
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do mlr `+"`--list-color-codes`"+` to see the available color codes (like 170), and
`+"`mlr --list-color-names`"+` to see available names (like `+"`orchid`"+`).
Please do ` + "`mlr --list-color-codes`" + ` to see the available color codes (like 170), and
` + "`mlr --list-color-names`" + ` to see available names (like ` + "`orchid`" + `).
`)
}
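The suppression rules above can be read as a short decision procedure. The following is a minimal sketch under stated assumptions, not Miller's implementation. The names `colorFlag` and `shouldColorize` are hypothetical. It treats any non-empty environment value as set, lets `MLR_NO_COLOR` win if both variables are set, and applies the `TERM` requirement before everything else; the help text does not specify any of these three points.

```go
package main

import "fmt"

// colorFlag records which command-line flag, if any, was given.
type colorFlag int

const (
	flagUnset       colorFlag = iota // neither flag given
	flagNoColor                      // --no-color or -M
	flagAlwaysColor                  // --always-color or -C
)

// shouldColorize decides whether to emit ANSI color escapes.
func shouldColorize(stdoutIsTTY bool, env map[string]string, flag colorFlag) bool {
	// TERM must be non-empty (assumed here to apply in every case).
	if env["TERM"] == "" {
		return false
	}
	// Command-line flags take precedence over environment variables.
	switch flag {
	case flagNoColor:
		return false
	case flagAlwaysColor:
		return true
	}
	// Environment variables come next; a non-empty value counts as set.
	if env["MLR_NO_COLOR"] != "" {
		return false
	}
	if env["MLR_ALWAYS_COLOR"] != "" {
		return true
	}
	// Default: color only when writing to a TTY on stdout.
	return stdoutIsTTY
}

func main() {
	env := map[string]string{"TERM": "xterm-256color"}
	fmt.Println(shouldColorize(true, env, flagUnset))
	fmt.Println(shouldColorize(false, env, flagUnset))
	env["MLR_ALWAYS_COLOR"] = "true"
	fmt.Println(shouldColorize(false, env, flagUnset))
	fmt.Println(shouldColorize(false, env, flagNoColor))
}
```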
@ -3356,8 +3365,8 @@ var FlattenUnflattenFlagSection = FlagSection{
{
name: "--flatsep",
altNames: []string{"--jflatsep", "--oflatsep"}, // TODO: really need all for miller5 back-compat?
arg: "{string}",
help: "Separator for flattening multi-level JSON keys, e.g. `{\"a\":{\"b\":3}}` becomes `a:b => 3` for non-JSON formats. Defaults to `"+DEFAULT_JSON_FLATTEN_SEPARATOR+"`.",
arg: "{string}",
help: "Separator for flattening multi-level JSON keys, e.g. `{\"a\":{\"b\":3}}` becomes `a:b => 3` for non-JSON formats. Defaults to `" + DEFAULT_JSON_FLATTEN_SEPARATOR + "`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
options.WriterOptions.FLATSEP = SeparatorFromArg(args[*pargi+1])
@ -3429,7 +3438,7 @@ var MiscFlagSection = FlagSection{
{
name: "--from",
arg: "{filename}",
arg: "{filename}",
help: "Use this to specify an input file before the verb(s), rather than after. May be used more than once. Example: `mlr --from a.dat --from b.dat cat` is the same as `mlr cat a.dat b.dat`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3440,7 +3449,7 @@ var MiscFlagSection = FlagSection{
{
name: "--mfrom",
arg: "{filenames}",
arg: "{filenames}",
help: "Use this to specify one or more input files before the verb(s), rather than after. May be used more than once. The list of filenames must end with `--`. This is useful for example since `--from *.csv` doesn't do what you might hope but `--mfrom *.csv --` does.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3457,6 +3466,8 @@ var MiscFlagSection = FlagSection{
{
name: "--ofmt",
arg: "{format}",
help: "E.g. %.18f, %.0f, %9.6e. Please use sprintf-style codes for floating-point numbers. If not specified, default formatting is used. See also the `fmtnum` function and the `format-values` verb.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
options.WriterOptions.FPOFMT = args[*pargi+1]
@ -3467,7 +3478,7 @@ var MiscFlagSection = FlagSection{
// TODO: move to another (or new) section
{
name: "--load",
arg: "{filename}",
arg: "{filename}",
help: "Load DSL script file for all put/filter operations on the command line. If the name following `--load` is a directory, load all `*.mlr` files in that directory. This is just like `put -f` and `filter -f` except it's up-front on the command line, so you can do something like `alias mlr='mlr --load ~/myscripts'` if you like.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3478,7 +3489,7 @@ var MiscFlagSection = FlagSection{
{
name: "--mload",
arg: "{filenames}",
arg: "{filenames}",
help: "Like `--load` but works with more than one filename, e.g. `--mload *.mlr --`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {
CheckArgCount(args, *pargi, argc, 2)
@ -3518,7 +3529,7 @@ var MiscFlagSection = FlagSection{
{
name: "--seed",
arg: "{n}",
arg: "{n}",
help: "with `n` of the form `12345678` or `0xcafefeed`. For `put`/`filter` `urand`, `urandint`, and `urand32`.",
parser: func(args []string, argc int, pargi *int, options *TOptions) {


@ -49,18 +49,18 @@ func ParseCommandLine(args []string) (
os.Exit(0)
// TODO
// } else if cli.FLAG_TABLE.Parse(args, argc, &argi, &options) {
// // handled
} else if cli.FLAG_TABLE.Parse(args, argc, &argi, &options) {
// handled
} else if cli.ParseReaderOptions(args, argc, &argi, &options.ReaderOptions) {
// handled
} else if cli.ParseWriterOptions(args, argc, &argi, &options.WriterOptions) {
// handled
} else if cli.ParseReaderWriterOptions(args, argc, &argi,
&options.ReaderOptions, &options.WriterOptions) {
// handled
} else if cli.ParseMiscOptions(args, argc, &argi, &options) {
// handled
//} else if cli.ParseReaderOptions(args, argc, &argi, &options.ReaderOptions) {
//// handled
//} else if cli.ParseWriterOptions(args, argc, &argi, &options.WriterOptions) {
//// handled
//} else if cli.ParseReaderWriterOptions(args, argc, &argi,
//&options.ReaderOptions, &options.WriterOptions) {
//// handled
//} else if cli.ParseMiscOptions(args, argc, &argi, &options) {
//// handled
} else {
// unhandled
fmt.Fprintf(os.Stderr, "%s: option \"%s\" not recognized.\n", "mlr", args[argi])
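This hunk replaces four separate option parsers with a single lookup in a flag table. The following is a minimal sketch of how a table-driven parser of this shape can work, not the code in this commit. The types `options`, `flagEntry`, and `flagTable`, and the function `parseMainFlags`, are hypothetical simplifications; only the flag names `--icsv`, `--csv`, `-c`, and `-o` come from the help text.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical stand-in for the real options struct.
type options struct {
	inputFormat  string
	outputFormat string
}

// flagEntry is a simplified flag-table row. The parser consumes the flag at
// args[*pargi], plus its argument if it takes one, and advances *pargi.
type flagEntry struct {
	name     string
	altNames []string
	parser   func(args []string, pargi *int, opts *options) error
}

type flagTable []flagEntry

// matches reports whether arg is this entry's name or one of its alternates.
func (e flagEntry) matches(arg string) bool {
	if e.name == arg {
		return true
	}
	for _, alt := range e.altNames {
		if alt == arg {
			return true
		}
	}
	return false
}

// parse tries every entry against args[*pargi]; it returns false if none match.
func (t flagTable) parse(args []string, pargi *int, opts *options) (bool, error) {
	for _, entry := range t {
		if entry.matches(args[*pargi]) {
			return true, entry.parser(args, pargi, opts)
		}
	}
	return false, nil
}

var table = flagTable{
	{name: "--icsv", parser: func(args []string, pargi *int, opts *options) error {
		opts.inputFormat = "csv"
		*pargi += 1
		return nil
	}},
	{name: "--csv", altNames: []string{"-c"}, parser: func(args []string, pargi *int, opts *options) error {
		opts.inputFormat = "csv"
		opts.outputFormat = "csv"
		*pargi += 1
		return nil
	}},
	{name: "-o", parser: func(args []string, pargi *int, opts *options) error {
		if *pargi+1 >= len(args) {
			return errors.New("-o requires a format name")
		}
		opts.outputFormat = args[*pargi+1]
		*pargi += 2
		return nil
	}},
}

// parseMainFlags consumes leading flags and returns the index of the first
// non-flag argument, which would be the verb.
func parseMainFlags(args []string) (options, int, error) {
	var opts options
	argi := 0
	for argi < len(args) && len(args[argi]) > 0 && args[argi][0] == '-' {
		handled, err := table.parse(args, &argi, &opts)
		if err != nil {
			return opts, argi, err
		}
		if !handled {
			return opts, argi, fmt.Errorf("option %q not recognized", args[argi])
		}
	}
	return opts, argi, nil
}

func main() {
	opts, verbIndex, err := parseMainFlags([]string{"--icsv", "-o", "json", "cat", "myfile.csv"})
	fmt.Println(opts.inputFormat, opts.outputFormat, verbIndex, err)
}
```

A single table also lets the help output be generated from the same entries that do the parsing, so the two cannot drift apart.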


@ -1,7 +1,9 @@
package lib
import (
"bytes"
"fmt"
"strings"
)
// For online help contexts like printing all the built-in DSL functions, or
@ -31,3 +33,39 @@ func PrintWordsAsParagraph(words []string) {
fmt.Printf("\n")
}
// For online help contexts like printing all the built-in DSL functions, or
// the list of all verbs. Max width is nominally 80.
func FormatAsParagraph(text string, maxWidth int) []string {
lines := make([]string, 0)
words := strings.Fields(text)
separator := " "
separatorlen := len(separator)
linelen := 0
j := 0
var buffer bytes.Buffer
for _, word := range words {
wordlen := len(word)
linelen += separatorlen + wordlen
if linelen >= maxWidth {
line := buffer.String()
lines = append(lines, line)
buffer.Reset()
linelen = separatorlen + wordlen
j = 0
}
if j > 0 {
buffer.WriteString(separator)
}
buffer.WriteString(word)
j++
}
line := buffer.String()
if line != "" {
lines = append(lines, line)
}
return lines
}
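`FormatAsParagraph` wraps greedily: it adds words to the current line until the running length reaches `maxWidth`, then starts a new line. The following is a self-contained sketch of the same logic for experimentation. `wrapWords` is a hypothetical name that restates the function above with `strings.Join` in place of a `bytes.Buffer`; it is not part of this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapWords splits text on whitespace and packs the words greedily into
// lines, counting one separator character per word as the function above does.
func wrapWords(text string, maxWidth int) []string {
	lines := []string{}
	current := []string{}
	linelen := 0
	for _, word := range strings.Fields(text) {
		linelen += 1 + len(word)
		if linelen >= maxWidth {
			// Flush the line built so far and start a new one with this word.
			lines = append(lines, strings.Join(current, " "))
			current = current[:0]
			linelen = 1 + len(word)
		}
		current = append(current, word)
	}
	if len(current) > 0 {
		lines = append(lines, strings.Join(current, " "))
	}
	return lines
}

func main() {
	for _, line := range wrapWords("the quick brown fox jumped over", 12) {
		fmt.Println(line)
	}
}
```

Because the flush happens before the current word is written, a first word of length `maxWidth - 1` or more produces an empty leading line in this logic. Callers with very long words may want to guard against that.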


@ -7,9 +7,9 @@ NAME
as CSV and tabular JSON.
SYNOPSIS
Usage: mlr [I/O options] {verb} [verb-dependent options ...] {zero or
more file names} Output of one verb may be chained as input to another
using "then", e.g.
Usage: mlr [flags] {verb} [verb-dependent options ...] {zero or more
file names} Output of one verb may be chained as input to another using
"then", e.g.
mlr stats1 -a min,mean,max -f flag,u,v -g color then sort -f color
Please see 'mlr help topics' for more information. Please also see
https://johnkerl.org/miller6
@ -95,6 +95,43 @@ DATA FORMATS
| fox jumped | Record 2: "1":"fox", "2":"jumped"
+---------------------+
HELP OPTIONS
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help basic-examples
mlr help data-formats
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
VERB LIST
altkv bar bootstrap cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
@ -128,117 +165,171 @@ FUNCTION LIST
version ! != !=~ % & && * ** + - . .* .+ .- ./ / // < << <= == =~ > >= >> >>>
?: ?? ??? ^ ^^ | || ~
HELP OPTIONS
Type 'mlr help {topic}' for any of the following:
mlr help topics
mlr help auxents
mlr help basic-examples
mlr help comments-in-data
mlr help compressed-data
mlr help csv-options
mlr help data-format-options
mlr help data-formats
mlr help double-quoting
mlr help format-conversion
mlr help function
mlr help keyword
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help list-functions-as-paragraph
mlr help list-functions-as-table
mlr help list-keywords
mlr help list-keywords-as-paragraph
mlr help list-verbs
mlr help list-verbs-as-paragraph
mlr help misc
mlr help mlrrc
mlr help number-formatting
mlr help output-colorization
mlr help separator-options
mlr help type-arithmetic-info
mlr help usage-functions
mlr help usage-functions-by-class
mlr help usage-keywords
mlr help usage-verbs
mlr help verb
Shorthands:
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
COMMENTS-IN-DATA FLAGS
Miller lets you put comments in your data, such as
OPTIONS
In the following option flags, the version with "i" designates the
input stream, "o" the output stream, and the version without prefix
sets the option for both input and output stream. For example: --irs
sets the input record separator, --ors the output record separator, and
--rs sets both the input and output separator to the given value.
# This is a comment for a CSV file
a,b,c
1,2,3
4,5,6
DATA-FORMAT OPTIONS
--idkvp --odkvp --dkvp Delimited key-value pairs, e.g "a=1,b=2"
(Miller's default format).
Notes:
--inidx --onidx --nidx Implicitly-integer-indexed fields (Unix-toolkit style).
-T Synonymous with "--nidx --fs tab".
* Comments are only honored at the start of a line.
* In the absence of any of the below four options, comments are data like
any other text. (The comments-in-data feature is opt-in.)
* When `--pass-comments` is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
--icsv --ocsv --csv Comma-separated value (or tab-separated with --fs tab, etc.)
--pass-comments Immediately print commented lines (prefixed by `#`)
within the input.
--pass-comments-with {string}
Immediately print commented lines within input, with
specified prefix.
--skip-comments Ignore commented lines (prefixed by `#`) within the
input.
--skip-comments-with {string}
Ignore commented lines within input, with specified
prefix.
--itsv --otsv --tsv Keystroke-savers for "--icsv --ifs tab",
"--ocsv --ofs tab", "--csv --fs tab".
--iasv --oasv --asv Similar but using ASCII FS 0x1f and RS 0x1e\n",
--iusv --ousv --usv Similar but using Unicode FS U+241F (UTF-8 0xe2909f)\n",
and RS U+241E (UTF-8 0xe2909e)\n",
COMPRESSED-DATA FLAGS
Miller offers a few different ways to handle reading data files which have been compressed.
--icsvlite --ocsvlite --csvlite Comma-separated value (or tab-separated with --fs tab, etc.).
The 'lite' CSV does not handle RFC-CSV double-quoting rules; is
slightly faster and handles heterogeneity in the input stream via
empty newline followed by new header line. See also
https://johnkerl.org/miller6/file-formats.html#csv-tsv-asv-usv-etc
* Decompression done within the Miller process itself: `--bz2in` `--gzin` `--zin`
* Decompression done outside the Miller process: `--prepipe` `--prepipex`
--itsvlite --otsvlite --tsvlite Keystroke-savers for "--icsvlite --ifs tab",
"--ocsvlite --ofs tab", "--csvlite --fs tab".
-t Synonymous with --tsvlite.
--iasvlite --oasvlite --asvlite Similar to --itsvlite et al. but using ASCII FS 0x1f and RS 0x1e\n",
--iusvlite --ousvlite --usvlite Similar to --itsvlite et al. but using Unicode FS U+241F (UTF-8 0xe2909f)\n",
and RS U+241E (UTF-8 0xe2909e)\n",
Using `--prepipe` and `--prepipex` you can specify an action to be
taken on each input file. The prepipe command must be able to read from
standard input; it will be invoked with `{command} < {filename}`. The
prepipex command must take a filename as argument; it will be invoked with
`{command} {filename}`.
--ipprint --opprint --pprint Pretty-printed tabular (produces no
output until all input is in).
--right Right-justifies all fields for PPRINT output.
--barred Prints a border around PPRINT output
(only available for output).
Examples:
--omd Markdown-tabular (only available for output).
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--ixtab --oxtab --xtab Pretty-printed vertical-tabular.
--xvright Right-justifies values for XTAB format.
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
`mlr ... | {your compression command} > outputfilenamegoeshere`
--ijson --ojson --json JSON tabular: sequence or list of one-level
maps: {...}{...} or [{...},{...}].
--jvstack Put one key-value pair per line for JSON output.
--no-jvstack Put objects/arrays all on one line for JSON output.
--jsonx --ojsonx Keystroke-savers for --json --jvstack
--jsonx --ojsonx and --ojson --jvstack, respectively.
--jlistwrap Wrap JSON output in outermost [ ].
--flatsep {string} Separator for flattening multi-level JSON keys,
e.g. '{"a":{"b":3}}' becomes a:b => 3 for
non-JSON formats. Defaults to ..\n",
Lastly, note that if `--prepipe` or `--prepipex` is specified, it replaces any
decisions that might have been made based on the file suffix. Likewise,
`--gzin`/`--bz2in`/`--zin` are ignored if `--prepipe` is also specified.
-p is a keystroke-saver for --nidx --fs space --repifs
--bz2in Uncompress bzip2 within the Miller process. Done by
default if file ends in `.bz2`.
--gzin Uncompress gzip within the Miller process. Done by
default if file ends in `.gz`.
--prepipe {decompression command}
You can, of course, already do without this for
single input files, e.g. `gunzip < myfile.csv.gz |
mlr ...`. Allowed at the command line, but not in
`.mlrrc` to avoid unexpected code execution.
--prepipe-bz2 Same as `--prepipe bz2`, except this is allowed in
`.mlrrc`.
--prepipe-gunzip Same as `--prepipe gunzip`, except this is allowed in
`.mlrrc`.
--prepipe-zcat Same as `--prepipe zcat`, except this is allowed in
`.mlrrc`.
--prepipex {decompression command}
Like `--prepipe` with one exception: doesn't insert
`<` between command and filename at runtime. Useful
for some commands like `unzip -qc` which don't read
standard input. Allowed at the command line, but not
in `.mlrrc` to avoid unexpected code execution.
--zin Uncompress zlib within the Miller process. Done by
default if file ends in `.z`.
Examples: --csv for CSV-formatted input and output; --icsv --opprint for
CSV-ONLY FLAGS
--allow-ragged-csv-input or --ragged
If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line
has more fields than the header line, use integer
field labels as in the implicit-header case.
--headerless-csv-output Print only CSV data lines; do not print CSV header
lines.
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line
1 of input files. Tip: combine with `label` to
recreate missing headers.
--no-implicit-csv-header Opposite of `--implicit-csv-header`. This is the
default anyway -- the main use is for the flags to
`mlr join` if you have main file(s) which are
headerless but you want to join in on a file which
does have a CSV header. Then you could use `mlr --csv
--implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ...
your-headerless.csv`.
-N Keystroke-saver for `--implicit-csv-header
--headerless-csv-output`.
FILE-FORMAT FLAGS
TO DO: brief list of formats w/ xref to m6 webdocs.
Examples: `--csv` for CSV-formatted input and output; `--icsv --opprint` for
CSV-formatted input and pretty-printed output.
Please use --iformat1 --oformat2 rather than --format1 --oformat2.
The latter sets up input and output flags for format1, not all of which
are overridden in all cases by setting output format to format2.
Please use `--iformat1 --oformat2` rather than `--format1 --oformat2`.
The latter sets up input and output flags for `format1`, not all of which
are overridden in all cases by setting output format to `format2`.
FORMAT-CONVERSION KEYSTROKE-SAVERS
--asv or --asvlite Use ASV format for input and output data.
--csv or -c Use CSV format for input and output data.
--csvlite Use CSV-lite format for input and output data.
--dkvp Use DKVP format for input and output data.
--iasv or --iasvlite Use ASV format for input data.
--icsv Use CSV format for input data.
--icsvlite Use CSV-lite format for input data.
--idkvp Use DKVP format for input data.
--ijson Use JSON format for input data.
--inidx Use NIDX format for input data.
--io {format name} Use format name for input and output data. For
example: `--io csv` is the same as `--csv`.
--ipprint Use PPRINT format for input data.
--itsv Use TSV format for input data.
--itsvlite Use TSV-lite format for input data.
--iusv or --iusvlite Use USV format for input data.
--ixtab Use XTAB format for input data.
--json or -j Use JSON format for input and output data.
--nidx Use NIDX format for input and output data.
--oasv or --oasvlite Use ASV format for output data.
--ocsv Use CSV format for output data.
--ocsvlite Use CSV-lite format for output data.
--odkvp Use DKVP format for output data.
--ojson Use JSON format for output data.
--omd Use markdown-tabular format for output data.
--onidx Use NIDX format for output data.
--opprint Use PPRINT format for output data.
--otsv Use TSV format for output data.
--otsvlite Use TSV-lite format for output data.
--ousv or --ousvlite Use USV format for output data.
--oxtab Use XTAB format for output data.
--pprint Use PPRINT format for input and output data.
--tsv Use TSV format for input and output data.
--tsvlite or -t Use TSV-lite format for input and output data.
--usv or --usvlite Use USV format for input and output data.
--xtab Use XTAB format for input and output data.
-i {format name} Use format name for input data. For example: `-i csv`
is the same as `--icsv`.
-o {format name} Use format name for output data. For example: `-o
csv` is the same as `--ocsv`.
FLATTEN-UNFLATTEN FLAGS
--flatsep or --jflatsep or --oflatsep {string}
Separator for flattening multi-level JSON keys, e.g.
`{"a":{"b":3}}` becomes `a:b => 3` for non-JSON
formats. Defaults to `.`.
--no-auto-flatten
--no-auto-unflatten
FORMAT-CONVERSION KEYSTROKE-SAVER FLAGS
As keystroke-savers for format-conversion you may use the following:
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--t2c --t2d --t2n --t2j --t2x --t2p --t2m
--d2c --d2t --d2n --d2j --d2x --d2p --d2m
--n2c --n2t --n2d --n2j --n2x --n2p --n2m
@ -249,130 +340,270 @@ OPTIONS
PPRINT, and markdown, respectively. Note that markdown format is available for
output only.
SEPARATORS
THIS IS STILL TBD FOR MILLER 6
--c2b Use CSV for input, PPRINT with `--barred` for output.
--c2d Use CSV for input, DKVP for output.
--c2j Use CSV for input, JSON for output.
--c2m Use CSV for input, markdown-tabular for output.
--c2n Use CSV for input, NIDX for output.
--c2p Use CSV for input, PPRINT for output.
--c2t Use CSV for input, TSV for output.
--c2x Use CSV for input, XTAB for output.
--d2b Use DKVP for input, PPRINT with `--barred` for
output.
--d2c Use DKVP for input, CSV for output.
--d2j Use DKVP for input, JSON for output.
--d2m Use DKVP for input, markdown-tabular for output.
--d2n Use DKVP for input, NIDX for output.
--d2p Use DKVP for input, PPRINT for output.
--d2t Use DKVP for input, TSV for output.
--d2x Use DKVP for input, XTAB for output.
--j2b Use JSON for input, PPRINT with --barred for output.
--j2c Use JSON for input, CSV for output.
--j2d Use JSON for input, DKVP for output.
--j2m Use JSON for input, markdown-tabular for output.
--j2n Use JSON for input, NIDX for output.
--j2p Use JSON for input, PPRINT for output.
--j2t Use JSON for input, TSV for output.
--j2x Use JSON for input, XTAB for output.
--n2b Use NIDX for input, PPRINT with `--barred` for
output.
--n2c Use NIDX for input, CSV for output.
--n2d Use NIDX for input, DKVP for output.
--n2j Use NIDX for input, JSON for output.
--n2m Use NIDX for input, markdown-tabular for output.
--n2p Use NIDX for input, PPRINT for output.
--n2t Use NIDX for input, TSV for output.
--n2x Use NIDX for input, XTAB for output.
--p2c Use PPRINT for input, CSV for output.
--p2d Use PPRINT for input, DKVP for output.
--p2j Use PPRINT for input, JSON for output.
--p2m Use PPRINT for input, markdown-tabular for output.
--p2n Use PPRINT for input, NIDX for output.
--p2t Use PPRINT for input, TSV for output.
--p2x Use PPRINT for input, XTAB for output.
--t2b Use TSV for input, PPRINT with `--barred` for output.
--t2c Use TSV for input, CSV for output.
--t2d Use TSV for input, DKVP for output.
--t2j Use TSV for input, JSON for output.
--t2m Use TSV for input, markdown-tabular for output.
--t2n Use TSV for input, NIDX for output.
--t2p Use TSV for input, PPRINT for output.
--t2x Use TSV for input, XTAB for output.
--x2b Use XTAB for input, PPRINT with `--barred` for
output.
--x2c Use XTAB for input, CSV for output.
--x2d Use XTAB for input, DKVP for output.
--x2j Use XTAB for input, JSON for output.
--x2m Use XTAB for input, markdown-tabular for output.
--x2n Use XTAB for input, NIDX for output.
--x2p Use XTAB for input, PPRINT for output.
--x2t Use XTAB for input, TSV for output.
-p Keystroke-saver for `--nidx --fs space --repifs`.
-T Keystroke-saver for `--nidx --fs tab`.
COMPRESSED I/O
Decompression done within the Miller process itself:
--gzin Uncompress gzip within the Miller process. Done by default if file ends in ".gz".
--bz2in Uncompress bzip2 within the Miller process. Done by default if file ends in ".bz2".
--zin Uncompress zlib within the Miller process. Done by default if file ends in ".z".
JSON-ONLY FLAGS
These are flags which are applicable to JSON format.
Decompression done outside the Miller process:
--prepipe {command} You can, of course, already do without this for single input files,
e.g. "gunzip < myfile.csv.gz | mlr ..."
--prepipex {command} Like --prepipe with one exception: doesn't insert '<' between
command and filename at runtime. Useful for some commands like 'unzip -qc'
which don't read standard input.
--jlistwrap or --jl Wrap JSON output in outermost `[ ]`.
--jvstack Put one key-value pair per line for JSON output
(multi-line output).
--no-jvstack Put objects/arrays all on one line for JSON output.
Using --prepipe and --prepipex you can specify an action to be taken on each
input file. This prepipe command must be able to read from standard input; it
will be invoked with {command} < {filename}.
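The difference between the two flags can be sketched in Ruby as follows. This is only an illustration of the invocation rule described above, not Miller's implementation: the helper name `prepipe_command_line` and the shell-quoting of the filename are assumptions made for this example.

```ruby
require 'shellwords'

# Hypothetical helper (not part of Miller): builds the shell command line
# that would be run for one input file.
#   insert_redirect: true  models --prepipe  ({command} < {filename})
#   insert_redirect: false models --prepipex ({command} {filename})
# Quoting the filename with Shellwords is an assumption of this sketch.
def prepipe_command_line(command, filename, insert_redirect: true)
  quoted = Shellwords.escape(filename)
  insert_redirect ? "#{command} < #{quoted}" : "#{command} #{quoted}"
end

puts prepipe_command_line('gunzip', 'myfile.csv.gz')
puts prepipe_command_line('unzip -qc', 'myfile.zip', insert_redirect: false)
```

The first call models a command that reads standard input; the second models a command such as `unzip -qc` that takes the filename as an argument.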
LEGACY FLAGS
These are flags which don't do anything in the current Miller version.
They are accepted as no-op flags in order to keep old scripts from breaking.
Examples:
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--jknquoteint Type information from JSON input files is now
preserved throughout the processing stream.
--jquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--json-fatal-arrays-on-input
Miller now supports arrays as of version 6.
--json-map-arrays-on-input
Miller now supports arrays as of version 6.
--json-skip-arrays-on-input
Miller now supports arrays as of version 6.
--jsonx The `--jvstack` flag is now default true in Miller 6.
--jvquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--mmap Miller no longer uses memory-mapping to access data
files.
--no-fflush The current implementation of Miller does not use
buffered output, so there is no longer anything to
suppress here.
--no-mmap Miller no longer uses memory-mapping to access data
files.
--ojsonx The `--jvstack` flag is now default true in Miller 6.
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
mlr ... | {your compression command} > outputfilenamegoeshere
MISCELLANEOUS FLAGS
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once.
Example: `mlr --from a.dat --from b.dat cat` is the
same as `mlr cat a.dat b.dat`.
--load {filename} Load DSL script file for all put/filter operations on
the command line. If the name following `--load` is a
directory, load all `*.mlr` files in that directory.
This is just like `put -f` and `filter -f` except
it's up-front on the command line, so you can do
something like `alias mlr='mlr --load ~/myscripts'`
if you like.
--mfrom {filenames} Use this to specify one or more input files before
the verb(s), rather than after. May be used more than
once. The list of filenames must end with `--`. This
is useful for example since `--from *.csv` doesn't do
what you might hope but `--mfrom *.csv --` does.
--mload {filenames} Like `--load` but works with more than one filename,
e.g. `--mload *.mlr --`.
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style
codes for floating-point numbers. If not specified,
default formatting is used. See also the `fmtnum`
function and the `format-values` verb.
--seed {n} with `n` of the form `12345678` or `0xcafefeed`. For
`put`/`filter` `urand`, `urandint`, and `urand32`.
-I Process files in-place. For each file name on the
command line, output is written to a temp file in the
same directory, which is then renamed over the
original. Each file is processed in isolation: if the
output format is CSV, CSV headers will be present in
each output file; statistics are only over each
file's own records; and so on.
-n Process no input files, nor standard input either.
Useful for `mlr put` with `begin`/`end` statements
only. (Same as `--from /dev/null`.) Also useful in
`mlr -n put -v '...'` for analyzing abstract syntax
trees (if that's your thing).
Lastly, note that if --prepipe or --prepipex is specified, it replaces any
decisions that might have been made based on the file suffix. Also,
--gzin/--bz2in/--zin are ignored if --prepipe is also specified.
OUTPUT-COLORIZATION FLAGS
Miller uses colors to highlight outputs. You can specify color preferences.
Note: output colorization does not work on Windows.
COMMENTS IN DATA
--skip-comments Ignore commented lines (prefixed by "#")
within the input.
--skip-comments-with {string} Ignore commented lines within input, with
specified prefix.
--pass-comments Immediately print commented lines (prefixed by "#")
within the input.
--pass-comments-with {string} Immediately print commented lines within input, with
specified prefix.
Things having colors:
Notes:
* Comments are only honored at the start of a line.
* In the absence of any of the above four options, comments are data like
any other text.
* When pass-comments is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc in regression-test output
* Some online-help strings
CSV-SPECIFIC OPTIONS
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line 1
of input files. Tip: combine with "label" to recreate
missing headers.
--no-implicit-csv-header Do not use --implicit-csv-header. This is the default
anyway -- the main use is for the flags to 'mlr join' if you have
main file(s) which are headerless but you want to join in on
a file which does have a CSV header. Then you could use
'mlr --csv --implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ... your-headerless.csv'
--allow-ragged-csv-input|--ragged If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line has more
fields than the header line, use integer field labels as in
the implicit-header case.
--headerless-csv-output Print only CSV data lines.
-N Keystroke-saver for --implicit-csv-header --headerless-csv-output.
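As a reading of the `--allow-ragged-csv-input` description above, the following Ruby sketch shows how one data line could be keyed against a header. It is an illustration under stated assumptions, not Miller's code: the function name `ragged_record` is hypothetical, and the 1-up position labels for extra fields follow the "1,2,3,..." convention given for `--implicit-csv-header`.

```ruby
# Hypothetical helper (not part of Miller): pairs a header with one data line.
# - Fewer fields than header keys: remaining keys get the empty string.
# - More fields than header keys: extra fields are keyed by their 1-up
#   position, as in the implicit-header case.
def ragged_record(header, fields)
  record = {}
  header.each_with_index do |key, i|
    record[key] = i < fields.length ? fields[i] : ''
  end
  (header.length...fields.length).each do |i|
    record[(i + 1).to_s] = fields[i]
  end
  record
end

p ragged_record(%w[a b c], %w[4 5])
p ragged_record(%w[a b c], %w[6 7 8 9])
```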
Rules for coloring:
DOUBLE-QUOTING FOR CSV/CSVLITE OUTPUT
THIS IS STILL WIP FOR MILLER 6
--quote-all Wrap all fields in double quotes
--quote-none Do not wrap any fields in double quotes, even if they have
OFS or ORS in them
--quote-minimal Wrap fields in double quotes only if they have OFS or ORS
in them (default)
--quote-numeric Wrap fields in double quotes only if they have numbers
in them
--quote-original Wrap fields in double quotes if and only if they were
quoted on input. This isn't sticky for computed fields:
e.g. if fields a and b were quoted on input and you do
"put '$c = $a . $b'" then field c won't inherit a or b's
was-quoted-on-input flag.
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: `mlr --csv cat foo.csv`
* Example: no color: `mlr --csv cat foo.csv > bar.csv`
* Example: no color: `mlr --csv cat foo.csv | less`
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
NUMBER FORMATTING
THIS IS STILL WIP FOR MILLER 6
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style codes for
floating-point numbers. If not specified, default formatting is used.
See also the fmtnum function within mlr put (mlr --help-all-functions);
see also the format-values verb.
Mechanisms for coloring:
OTHER OPTIONS
--seed {n} with n of the form 12345678 or 0xcafefeed. For put/filter
urand()/urandint()/urand32().
--nr-progress-mod {m}, with m a positive integer: print filename and record
count to os.Stderr every m input records.
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once. Example:
"mlr --from a.dat --from b.dat cat" is the same as
"mlr cat a.dat b.dat".
--mfrom {filenames} -- Use this to specify one or more input files before the verb(s),
rather than after. May be used more than once.
The list of filenames must end with "--". This is useful
for example since "--from *.csv" doesn't do what you might
hope but "--mfrom *.csv --" does.
--load {filename} Load DSL script file for all put/filter operations on the command line.
If the name following --load is a directory, load all "*.mlr" files
in that directory. This is just like "put -f" and "filter -f"
except it's up-front on the command line, so you can do something like
alias mlr='mlr --load ~/myscripts' if you like.
--mload {names} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-n Process no input files, nor standard input either. Useful
for mlr put with begin/end statements only. (Same as --from
/dev/null.) Also useful in "mlr -n put -v '...'" for
analyzing abstract syntax trees (if that's your thing).
-I Process files in-place. For each file name on the command
line, output is written to a temp file in the same
directory, which is then renamed over the original. Each
file is processed in isolation: if the output format is
CSV, CSV headers will be present in each output file;
statistics are only over each file's own records; and so on.
* Miller uses ANSI escape sequences only. This does not work on Windows except within Cygwin.
* Requires `TERM` environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable `export MLR_NO_COLOR=true` means don't color even if stdout+TTY.
* Environment variable `export MLR_ALWAYS_COLOR=true` means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to `less -r`.
* Command-line flags `--no-color` or `-M`, `--always-color` or `-C`.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* `export MLR_KEY_COLOR=208`, `MLR_VALUE_COLOR=33`, etc.:
`MLR_KEY_COLOR` `MLR_VALUE_COLOR` `MLR_PASS_COLOR` `MLR_FAIL_COLOR`
`MLR_REPL_PS1_COLOR` `MLR_REPL_PS2_COLOR` `MLR_HELP_COLOR`
* Command-line flags `--key-color 208`, `--value-color 33`, etc.:
`--key-color` `--value-color` `--pass-color` `--fail-color`
`--repl-ps1-color` `--repl-ps2-color` `--help-color`
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please run `mlr --list-color-codes` to see the available color codes (like 170), and
`mlr --list-color-names` to see available names (like `orchid`).
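The suppression rules above can be summarized in a small Ruby sketch. It is a simplified model, not Miller's implementation: the function name `colorize_output?` is hypothetical, the `TERM` requirement is left out, and checking `MLR_NO_COLOR` before `MLR_ALWAYS_COLOR` when both are set is an assumption, since the text does not say which wins.

```ruby
# Hypothetical helper (not part of Miller): decides whether to colorize.
#   flag: :always (--always-color / -C), :never (--no-color / -M), or nil
#   env:  a Hash standing in for the process environment
# Command-line flags take precedence over environment variables; with
# neither set, color only when stdout is a TTY.
def colorize_output?(flag:, env:, stdout_is_tty:)
  return true  if flag == :always
  return false if flag == :never
  return false if env['MLR_NO_COLOR'] == 'true'     # ordering is an assumption
  return true  if env['MLR_ALWAYS_COLOR'] == 'true'
  stdout_is_tty
end

puts colorize_output?(flag: nil, env: ENV.to_h, stdout_is_tty: $stdout.tty?)
```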
--always-color or -C
--fail-color
--help-color
--key-color
--list-color-codes
--list-color-names
--no-color or -M
--pass-color
--value-color
PPRINT-ONLY FLAGS
These are flags which are applicable to PPRINT output format.
--barred Prints a border around PPRINT output (not available
for input).
--right Right-justifies all fields for PPRINT output.
SEPARATOR FLAGS
Separator options:
--rs --irs --ors Record separators, e.g. 'lf' or '\r\n'
--fs --ifs --ofs --repifs Field separators, e.g. comma
--ps --ips --ops Pair separators, e.g. equals sign
TODO: auto-detect is still TBD for Miller 6
Notes about line endings:
* Default line endings (`--irs` and `--ors`) are "auto" which means autodetect from
the input file format, as long as the input file(s) have lines ending in either
LF (also known as linefeed, `\n`, `0x0a`, or Unix-style) or CRLF (also known as
carriage-return/linefeed pairs, `\r\n`, `0x0d 0x0a`, or Windows-style).
* If both `irs` and `ors` are `auto` (which is the default) then LF input will lead to LF
output and CRLF input will lead to CRLF output, regardless of the platform you're
running on.
* The line-ending autodetector triggers on the first line ending detected in the input
stream. E.g. if you specify a CRLF-terminated file on the command line followed by an
LF-terminated file then autodetected line endings will be CRLF.
* If you use `--ors {something else}` with (default or explicitly specified) `--irs auto`
then line endings are autodetected on input and set to what you specify on output.
* If you use `--irs {something else}` with (default or explicitly specified) `--ors auto`
then the output line endings used are LF on Unix/Linux/BSD/MacOSX, and CRLF on Windows.
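The line-ending rules above can be modeled with the Ruby sketch below. It only restates the documented behavior (which the TODO above notes is still pending for Miller 6) and is not Miller's code: both function names are hypothetical, and falling back to the platform default when the input contains no line ending at all is an assumption.

```ruby
# Hypothetical helpers (not part of Miller).

# Returns "\r\n" or "\n" according to the first line ending found in the
# text, or nil if the text contains no line ending.
def detect_line_ending(text)
  idx = text.index("\n")
  return nil if idx.nil?
  idx > 0 && text[idx - 1] == "\r" ? "\r\n" : "\n"
end

# Chooses the output line ending from the --irs/--ors settings.
def output_line_ending(irs:, ors:, input_text:, windows: false)
  platform_default = windows ? "\r\n" : "\n"
  return ors unless ors == 'auto'            # explicit --ors always wins
  return platform_default unless irs == 'auto'
  detect_line_ending(input_text) || platform_default  # fallback is an assumption
end

p output_line_ending(irs: 'auto', ors: 'auto', input_text: "a,b\r\n1,2\n")
```

Because detection triggers on the first line ending only, the mixed input in the usage line is treated as CRLF.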
Notes about all other separators:
* IPS/OPS are only used for DKVP and XTAB formats, since only in these formats
do key-value pairs appear juxtaposed.
* IRS/ORS are ignored for XTAB format. Nominally IFS and OFS are newlines;
XTAB records are separated by two or more consecutive IFS/OFS -- i.e.
a blank line. Everything above about `--irs/--ors/--rs auto` becomes `--ifs/--ofs/--fs auto`
for XTAB format. (XTAB's default IFS/OFS are "auto".)
* OFS must be single-character for PPRINT format. This is because it is used
with repetition for alignment; multi-character separators would make
alignment impossible.
* OPS may be multi-character for XTAB format, in which case alignment is
disabled.
* TSV is simply CSV using tab as field separator (`--fs tab`).
* FS/PS are ignored for markdown format; RS is used.
* All FS and PS options are ignored for JSON format, since they are not relevant
to the JSON format.
* You can specify separators in any of the following ways, shown by example:
- Type them out, quoting as necessary for shell escapes, e.g.
`--fs '|' --ips :`
- C-style escape sequences, e.g. `--rs '\r\n' --fs '\t'`.
- To avoid backslashing, you can use any of the following names:
TODO desc-to-chars map
* Default separators by format:
TODO default_xses
--fs {string} Specify FS for input and output.
--ifs {string} Specify FS for input.
--ips {string} Specify PS for input.
--irs {string} Specify RS for input.
--ofs {string} Specify FS for output.
--ops {string} Specify PS for output.
--ors {string} Specify RS for output.
--ps {string} Specify PS for input and output.
--repifs Let IFS be repeated: e.g. for splitting on multiple
spaces.
--rs {string} Specify RS for input and output.
AUXILIARY COMMANDS
Available subcommands:
@ -386,80 +617,6 @@ AUXILIARY COMMANDS
repl
For more information, please invoke mlr {subcommand} --help.
REPL
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
OUTPUT COLORIZATION
Things having colors:
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc
in regression-test output
* Some online-help strings
Rules for coloring:
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: mlr --csv cat foo.csv
* Example: no color: mlr --csv cat foo.csv > bar.csv
* Example: no color: mlr --csv cat foo.csv | less
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
Mechanisms for coloring:
* Miller uses ANSI escape sequences only. This does not work on Windows except on Cygwin.
* Requires TERM environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable export MLR_NO_COLOR=true means don't color even if stdout+TTY.
* Environment variable export MLR_ALWAYS_COLOR=true means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to less -r.
* Command-line flags --no-color or -M, --always-color or -C.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* export MLR_KEY_COLOR=208, MLR_VALUE_COLOR=33, etc.:
MLR_KEY_COLOR MLR_VALUE_COLOR MLR_PASS_COLOR MLR_FAIL_COLOR
MLR_REPL_PS1_COLOR MLR_REPL_PS2_COLOR MLR_HELP_COLOR
* Command-line flags --key-color 208, --value-color 33, etc.:
--key-color --value-color --pass-color --fail-color
--repl-ps1-color --repl-ps2-color --help-color
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do mlr --list-color-codes to see the available color codes (like 170), and
mlr --list-color-names to see available names (like orchid).
MLRRC
You can set up personal defaults via a $HOME/.mlrrc and/or ./.mlrrc.
For example, if you usually process CSV, then you can put "--csv" in your .mlrrc file
@ -493,6 +650,36 @@ MLRRC
See also:
https://miller.readthedocs.io/en/latest/customization.html
REPL
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
VERBS
altkv
Usage: mlr altkv [options]
@ -2526,4 +2713,4 @@ SEE ALSO
2021-09-05 MILLER(1)
2021-09-08 MILLER(1)


@ -43,62 +43,30 @@ a special case.) This manpage documents #{`mlr --version`.chomp}."""
print make_section('DATA FORMATS', [])
print make_code_block(`mlr help data-formats`)
print make_section('HELP OPTIONS', [])
print make_code_block(`mlr help topics`)
print make_section('VERB LIST', [])
print make_code_block(`mlr help list-verbs-as-paragraph`)
print make_section('FUNCTION LIST', [])
print make_code_block(`mlr help list-functions-as-paragraph`)
print make_section('HELP OPTIONS', [])
print make_code_block(`mlr help topics`)
print make_section('OPTIONS', [
"""In the following option flags, the version with \"i\" designates the input
stream, \"o\" the output stream, and the version without prefix sets the option
for both input and output stream. For example: --irs sets the input record
separator, --ors the output record separator, and --rs sets both the input and
output separator to the given value."""
])
print make_subsection('DATA-FORMAT OPTIONS', [])
print make_code_block(`mlr help data-format-options`)
print make_subsection('FORMAT-CONVERSION KEYSTROKE-SAVERS', [])
print make_code_block(`mlr help format-conversion`)
print make_subsection('SEPARATORS', [])
print make_code_block(`mlr help separator-options`)
print make_subsection('COMPRESSED I/O', [])
print make_code_block(`mlr help compressed-data`)
print make_subsection('COMMENTS IN DATA', [])
print make_code_block(`mlr help comments-in-data`)
print make_subsection('CSV-SPECIFIC OPTIONS', [])
print make_code_block(`mlr help csv-options`)
print make_subsection('DOUBLE-QUOTING FOR CSV/CSVLITE OUTPUT', [])
print make_code_block(`mlr help double-quoting`)
print make_subsection('NUMBER FORMATTING', [])
print make_code_block(`mlr help number-formatting`)
print make_subsection('OTHER OPTIONS', [])
print make_code_block(`mlr help misc`)
section_names = `mlr help list-flag-sections`.split("\n")
for section_name in section_names
print make_section(section_name.upcase, [""])
print make_code_block(`mlr help show-help-for-section '#{section_name}'`)
end
print make_section('AUXILIARY COMMANDS', [])
print make_code_block(`mlr aux-list`)
print make_section('REPL', [])
print make_code_block(`mlr repl -h`)
print make_section('OUTPUT COLORIZATION', [])
print make_code_block(`mlr help output-colorization`)
print make_section('MLRRC', [])
print make_code_block(`mlr help mlrrc`)
print make_section('REPL', [])
print make_code_block(`mlr repl -h`)
verbs = `mlr help list-verbs`
print make_section('VERBS', [
""


@ -2,12 +2,12 @@
.\" Title: mlr
.\" Author: [see the "AUTHOR" section]
.\" Generator: ./mkman.rb
.\" Date: 2021-09-05
.\" Date: 2021-09-08
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
.TH "MILLER" "1" "2021-09-05" "\ \&" "\ \&"
.TH "MILLER" "1" "2021-09-08" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Portability definitions
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -29,7 +29,7 @@
miller \- like awk, sed, cut, join, and sort for name-indexed data such as CSV and tabular JSON.
.SH "SYNOPSIS"
.sp
Usage: mlr [I/O options] {verb} [verb-dependent options ...] {zero or more file names}
Usage: mlr [flags] {verb} [verb-dependent options ...] {zero or more file names}
Output of one verb may be chained as input to another using "then", e.g.
mlr stats1 -a min,mean,max -f flag,u,v -g color then sort -f color
Please see 'mlr help topics' for more information.
@ -128,6 +128,49 @@ NIDX: implicitly numerically indexed (Unix-toolkit style)
.fi
.if n \{\
.RE
.SH "HELP OPTIONS"
.if n \{\
.RS 0
.\}
.nf
Type 'mlr help {topic}' for any of the following:
Essentials:
mlr help topics
mlr help basic-examples
mlr help data-formats
Flags:
mlr help flags
Verbs:
mlr help list-verbs
mlr help usage-verbs
mlr help verb
Functions:
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help usage-functions
mlr help usage-functions-by-class
mlr help function
Keywords:
mlr help list-keywords
mlr help usage-keywords
mlr help keyword
Other:
mlr help auxents
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
.fi
.if n \{\
.RE
.SH "VERB LIST"
.if n \{\
.RS 0
@ -173,133 +216,217 @@ version ! != !=~ % & && * ** + - . .* .+ .- ./ / // < << <= == =~ > >= >> >>>
.fi
.if n \{\
.RE
.SH "HELP OPTIONS"
.SH "COMMENTS-IN-DATA FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
Type 'mlr help {topic}' for any of the following:
mlr help topics
mlr help auxents
mlr help basic-examples
mlr help comments-in-data
mlr help compressed-data
mlr help csv-options
mlr help data-format-options
mlr help data-formats
mlr help double-quoting
mlr help format-conversion
mlr help function
mlr help keyword
mlr help list-functions
mlr help list-function-classes
mlr help list-functions-in-class
mlr help list-functions-as-paragraph
mlr help list-functions-as-table
mlr help list-keywords
mlr help list-keywords-as-paragraph
mlr help list-verbs
mlr help list-verbs-as-paragraph
mlr help misc
mlr help mlrrc
mlr help number-formatting
mlr help output-colorization
mlr help separator-options
mlr help type-arithmetic-info
mlr help usage-functions
mlr help usage-functions-by-class
mlr help usage-keywords
mlr help usage-verbs
mlr help verb
Shorthands:
mlr -l = mlr help list-verbs
mlr -L = mlr help usage-verbs
mlr -f = mlr help list-functions
mlr -F = mlr help usage-functions
mlr -k = mlr help list-keywords
mlr -K = mlr help usage-keywords
Miller lets you put comments in your data, such as
# This is a comment for a CSV file
a,b,c
1,2,3
4,5,6
Notes:
* Comments are only honored at the start of a line.
* In the absence of any of the below four options, comments are data like
any other text. (The comments-in-data feature is opt-in.)
* When `--pass-comments` is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
--pass-comments Immediately print commented lines (prefixed by `#`)
within the input.
--pass-comments-with {string}
Immediately print commented lines within input, with
specified prefix.
--skip-comments Ignore commented lines (prefixed by `#`) within the
input.
--skip-comments-with {string}
Ignore commented lines within input, with specified
prefix.
.fi
.if n \{\
.RE
.SH "OPTIONS"
.SH "COMPRESSED-DATA FLAGS"
.sp
In the following option flags, the version with "i" designates the input
stream, "o" the output stream, and the version without prefix sets the option
for both input and output stream. For example: --irs sets the input record
separator, --ors the output record separator, and --rs sets both the input and
output separator to the given value.
.SS "DATA-FORMAT OPTIONS"
.if n \{\
.RS 0
.\}
.nf
--idkvp --odkvp --dkvp Delimited key-value pairs, e.g "a=1,b=2"
(Miller's default format).
Miller offers a few different ways to handle reading data files which have been compressed.
--inidx --onidx --nidx Implicitly-integer-indexed fields (Unix-toolkit style).
-T Synonymous with "--nidx --fs tab".
* Decompression done within the Miller process itself: `--bz2in` `--gzin` `--zin`
* Decompression done outside the Miller process: `--prepipe` `--prepipex`
--icsv --ocsv --csv Comma-separated value (or tab-separated with --fs tab, etc.)
Using `--prepipe` and `--prepipex` you can specify an action to be
taken on each input file. The prepipe command must be able to read from
standard input; it will be invoked with `{command} < {filename}`. The
prepipex command must take a filename as argument; it will be invoked with
`{command} {filename}`.
--itsv --otsv --tsv Keystroke-savers for "--icsv --ifs tab",
"--ocsv --ofs tab", "--csv --fs tab".
--iasv --oasv --asv Similar but using ASCII FS 0x1f and RS 0x1e
--iusv --ousv --usv Similar but using Unicode FS U+241F (UTF-8 0xe2909f)
and RS U+241E (UTF-8 0xe2909e)
Examples:
--icsvlite --ocsvlite --csvlite Comma-separated value (or tab-separated with --fs tab, etc.).
The 'lite' CSV does not handle RFC-CSV double-quoting rules; is
slightly faster and handles heterogeneity in the input stream via
empty newline followed by new header line. See also
https://johnkerl.org/miller6/file-formats.html#csv-tsv-asv-usv-etc
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
--itsvlite --otsvlite --tsvlite Keystroke-savers for "--icsvlite --ifs tab",
"--ocsvlite --ofs tab", "--csvlite --fs tab".
-t Synonymous with --tsvlite.
--iasvlite --oasvlite --asvlite Similar to --itsvlite et al. but using ASCII FS 0x1f and RS 0x1e
--iusvlite --ousvlite --usvlite Similar to --itsvlite et al. but using Unicode FS U+241F (UTF-8 0xe2909f)
and RS U+241E (UTF-8 0xe2909e)
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
`mlr ... | {your compression command} > outputfilenamegoeshere`
--ipprint --opprint --pprint Pretty-printed tabular (produces no
output until all input is in).
--right Right-justifies all fields for PPRINT output.
--barred Prints a border around PPRINT output
(only available for output).
Lastly, note that if `--prepipe` or `--prepipex` is specified, it replaces any
decisions that might have been made based on the file suffix. Likewise,
`--gzin`/`--bz2in`/`--zin` are ignored if `--prepipe` is also specified.
--omd Markdown-tabular (only available for output).
--bz2in Uncompress bzip2 within the Miller process. Done by
default if file ends in `.bz2`.
--gzin Uncompress gzip within the Miller process. Done by
default if file ends in `.gz`.
--prepipe {decompression command}
You can, of course, already do without this for
single input files, e.g. `gunzip < myfile.csv.gz |
mlr ...`. Allowed at the command line, but not in
`.mlrrc` to avoid unexpected code execution.
--prepipe-bz2 Same as `--prepipe bz2`, except this is allowed in
`.mlrrc`.
--prepipe-gunzip Same as `--prepipe gunzip`, except this is allowed in
`.mlrrc`.
--prepipe-zcat Same as `--prepipe zcat`, except this is allowed in
`.mlrrc`.
--prepipex {decompression command}
Like `--prepipe` with one exception: doesn't insert
`<` between command and filename at runtime. Useful
for some commands like `unzip -qc` which don't read
standard input. Allowed at the command line, but not
in `.mlrrc` to avoid unexpected code execution.
--zin Uncompress zlib within the Miller process. Done by
default if file ends in `.z`.
.fi
.if n \{\
.RE
.SH "CSV-ONLY FLAGS"
.sp
--ixtab --oxtab --xtab Pretty-printed vertical-tabular.
--xvright Right-justifies values for XTAB format.
.if n \{\
.RS 0
.\}
.nf
--allow-ragged-csv-input or --ragged
If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line
has more fields than the header line, use integer
field labels as in the implicit-header case.
--headerless-csv-output Print only CSV data lines; do not print CSV header
lines.
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line
1 of input files. Tip: combine with `label` to
recreate missing headers.
--no-implicit-csv-header Opposite of `--implicit-csv-header`. This is the
default anyway -- the main use is for the flags to
`mlr join` if you have main file(s) which are
headerless but you want to join in on a file which
does have a CSV header. Then you could use `mlr --csv
--implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ...
your-headerless.csv`.
-N Keystroke-saver for `--implicit-csv-header
--headerless-csv-output`.
.fi
.if n \{\
.RE
.SH "FILE-FORMAT FLAGS"
.sp
--ijson --ojson --json JSON tabular: sequence or list of one-level
maps: {...}{...} or [{...},{...}].
--jvstack Put one key-value pair per line for JSON output.
--no-jvstack Put objects/arrays all on one line for JSON output.
--jsonx --ojsonx Keystroke-savers for --json --jvstack
and --ojson --jvstack, respectively.
--jlistwrap Wrap JSON output in outermost [ ].
--flatsep {string} Separator for flattening multi-level JSON keys,
e.g. '{"a":{"b":3}}' becomes a:b => 3 for
non-JSON formats. Defaults to ".".
.if n \{\
.RS 0
.\}
.nf
TO DO: brief list of formats w/ xref to m6 webdocs.
-p is a keystroke-saver for --nidx --fs space --repifs
Examples: --csv for CSV-formatted input and output; --icsv --opprint for
Examples: `--csv` for CSV-formatted input and output; `--icsv --opprint` for
CSV-formatted input and pretty-printed output.
Please use --iformat1 --oformat2 rather than --format1 --oformat2.
The latter sets up input and output flags for format1, not all of which
are overridden in all cases by setting output format to format2.
Please use `--iformat1 --oformat2` rather than `--format1 --oformat2`.
The latter sets up input and output flags for `format1`, not all of which
are overridden in all cases by setting output format to `format2`.
--asv or --asvlite Use ASV format for input and output data.
--csv or -c Use CSV format for input and output data.
--csvlite Use CSV-lite format for input and output data.
--dkvp Use DKVP format for input and output data.
--iasv or --iasvlite Use ASV format for input data.
--icsv Use CSV format for input data.
--icsvlite Use CSV-lite format for input data.
--idkvp Use DKVP format for input data.
--ijson Use JSON format for input data.
--inidx Use NIDX format for input data.
--io {format name} Use format name for input and output data. For
example: `--io csv` is the same as `--csv`.
--ipprint Use PPRINT format for input data.
--itsv Use TSV format for input data.
--itsvlite Use TSV-lite format for input data.
--iusv or --iusvlite Use USV format for input data.
--ixtab Use XTAB format for input data.
--json or -j Use JSON format for input and output data.
--nidx Use NIDX format for input and output data.
--oasv or --oasvlite Use ASV format for output data.
--ocsv Use CSV format for output data.
--ocsvlite Use CSV-lite format for output data.
--odkvp Use DKVP format for output data.
--ojson Use JSON format for output data.
--omd Use markdown-tabular format for output data.
--onidx Use NIDX format for output data.
--opprint Use PPRINT format for output data.
--otsv Use TSV format for output data.
--otsvlite Use TSV-lite format for output data.
--ousv or --ousvlite Use USV format for output data.
--oxtab Use XTAB format for output data.
--pprint Use PPRINT format for input and output data.
--tsv Use TSV format for input and output data.
--tsvlite or -t Use TSV-lite format for input and output data.
--usv or --usvlite Use USV format for input and output data.
--xtab Use XTAB format for input and output data.
-i {format name} Use format name for input data. For example: `-i csv`
is the same as `--icsv`.
-o {format name} Use format name for output data. For example: `-o
csv` is the same as `--ocsv`.
.fi
.if n \{\
.RE
.SS "FORMAT-CONVERSION KEYSTROKE-SAVERS"
.SH "FLATTEN-UNFLATTEN FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
--flatsep or --jflatsep or --oflatsep {string}
Separator for flattening multi-level JSON keys, e.g.
`{"a":{"b":3}}` becomes `a:b => 3` for non-JSON
formats. Defaults to `.`.
--no-auto-flatten
--no-auto-unflatten
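
As a rough illustration of what the flatten separator does, the sketch below
joins nested map keys with a separator. It is not Miller's code: the function
name `flatten` is hypothetical, and it handles nested maps only (arrays are
left as-is, which is an assumption made to keep the example short).

```python
def flatten(value, sep=".", prefix=""):
    """Flatten nested dicts into one level, joining key paths with sep."""
    flat = {}
    for key, item in value.items():
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(item, dict) and item:
            # Recurse into non-empty maps, extending the key path.
            flat.update(flatten(item, sep, path))
        else:
            flat[path] = item
    return flat


print(flatten({"a": {"b": 3}}, sep=":"))   # separator given explicitly
print(flatten({"a": {"b": 3}, "c": 4}))    # default separator
```

With `sep=":"` the nested key becomes `a:b`; with the default it becomes `a.b`.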
.fi
.if n \{\
.RE
.SH "FORMAT-CONVERSION KEYSTROKE-SAVER FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
As keystroke-savers for format-conversion you may use the following:
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--c2t --c2d --c2n --c2j --c2x --c2p --c2m
--t2c --t2d --t2n --t2j --t2x --t2p --t2m
--d2c --d2t --d2n --d2j --d2x --d2p --d2m
--n2c --n2t --n2d --n2j --n2x --n2p --n2m
@ -309,173 +436,319 @@ As keystroke-savers for format-conversion you may use the following:
The letters c t d n j x p m refer to formats CSV, TSV, DKVP, NIDX, JSON, XTAB,
PPRINT, and markdown, respectively. Note that markdown format is available for
output only.
.fi
.if n \{\
.RE
.SS "SEPARATORS"
.if n \{\
.RS 0
.\}
.nf
THIS IS STILL TBD FOR MILLER 6
.fi
.if n \{\
.RE
.SS "COMPRESSED I/O"
.if n \{\
.RS 0
.\}
.nf
Decompression done within the Miller process itself:
--gzin Uncompress gzip within the Miller process. Done by default if file ends in ".gz".
--bz2in Uncompress bzip2 within the Miller process. Done by default if file ends in ".bz2".
--zin Uncompress zlib within the Miller process. Done by default if file ends in ".z".
Decompression done outside the Miller process:
--prepipe {command} You can, of course, already do without this for single input files,
e.g. "gunzip < myfile.csv.gz | mlr ..."
--prepipex {command} Like --prepipe with one exception: doesn't insert '<' between
command and filename at runtime. Useful for some commands like 'unzip -qc'
which don't read standard input.
Using --prepipe and --prepipex you can specify an action to be taken on each
input file. This prepipe command must be able to read from standard input; it
will be invoked with {command} < {filename}.
Examples:
mlr --prepipe gunzip
mlr --prepipe zcat -cf
mlr --prepipe xz -cd
mlr --prepipe cat
Note that this feature is quite general and is not limited to decompression
utilities. You can use it to apply per-file filters of your choice. For output
compression (or other) utilities, simply pipe the output:
mlr ... | {your compression command} > outputfilenamegoeshere
Lastly, note that if --prepipe or --prepipex is specified, it replaces any
decisions that might have been made based on the file suffix. Also,
--gzin/--bz2in/--zin are ignored if --prepipe is also specified.
--c2b Use CSV for input, PPRINT with `--barred` for output.
--c2d Use CSV for input, DKVP for output.
--c2j Use CSV for input, JSON for output.
--c2m Use CSV for input, markdown-tabular for output.
--c2n Use CSV for input, NIDX for output.
--c2p Use CSV for input, PPRINT for output.
--c2t Use CSV for input, TSV for output.
--c2x Use CSV for input, XTAB for output.
--d2b Use DKVP for input, PPRINT with `--barred` for
output.
--d2c Use DKVP for input, CSV for output.
--d2j Use DKVP for input, JSON for output.
--d2m Use DKVP for input, markdown-tabular for output.
--d2n Use DKVP for input, NIDX for output.
--d2p Use DKVP for input, PPRINT for output.
--d2t Use DKVP for input, TSV for output.
--d2x Use DKVP for input, XTAB for output.
--j2b Use JSON for input, PPRINT with --barred for output.
--j2c Use JSON for input, CSV for output.
--j2d Use JSON for input, DKVP for output.
--j2m Use JSON for input, markdown-tabular for output.
--j2n Use JSON for input, NIDX for output.
--j2p Use JSON for input, PPRINT for output.
--j2t Use JSON for input, TSV for output.
--j2x Use JSON for input, XTAB for output.
--n2b Use NIDX for input, PPRINT with `--barred` for
output.
--n2c Use NIDX for input, CSV for output.
--n2d Use NIDX for input, DKVP for output.
--n2j Use NIDX for input, JSON for output.
--n2m Use NIDX for input, markdown-tabular for output.
--n2p Use NIDX for input, PPRINT for output.
--n2t Use NIDX for input, TSV for output.
--n2x Use NIDX for input, XTAB for output.
--p2c Use PPRINT for input, CSV for output.
--p2d Use PPRINT for input, DKVP for output.
--p2j Use PPRINT for input, JSON for output.
--p2m Use PPRINT for input, markdown-tabular for output.
--p2n Use PPRINT for input, NIDX for output.
--p2t Use PPRINT for input, TSV for output.
--p2x Use PPRINT for input, XTAB for output.
--t2b Use TSV for input, PPRINT with `--barred` for output.
--t2c Use TSV for input, CSV for output.
--t2d Use TSV for input, DKVP for output.
--t2j Use TSV for input, JSON for output.
--t2m Use TSV for input, markdown-tabular for output.
--t2n Use TSV for input, NIDX for output.
--t2p Use TSV for input, PPRINT for output.
--t2x Use TSV for input, XTAB for output.
--x2b Use XTAB for input, PPRINT with `--barred` for
output.
--x2c Use XTAB for input, CSV for output.
--x2d Use XTAB for input, DKVP for output.
--x2j Use XTAB for input, JSON for output.
--x2m Use XTAB for input, markdown-tabular for output.
--x2n Use XTAB for input, NIDX for output.
--x2p Use XTAB for input, PPRINT for output.
--x2t Use XTAB for input, TSV for output.
-p Keystroke-saver for `--nidx --fs space --repifs`.
-T Keystroke-saver for `--nidx --fs tab`.
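
The two-letter scheme above is regular enough to sketch as a lookup. The Python
below is illustrative only: `expand_saver` and the two tables are hypothetical
names, and the sketch does not check that a given pair is actually listed above
(for instance, `--p2b` does not appear in the list but would be accepted here).

```python
INPUT_FLAGS = {
    "c": "--icsv", "t": "--itsv", "d": "--idkvp", "n": "--inidx",
    "j": "--ijson", "x": "--ixtab", "p": "--ipprint",
}
OUTPUT_FLAGS = {
    "c": ["--ocsv"], "t": ["--otsv"], "d": ["--odkvp"], "n": ["--onidx"],
    "j": ["--ojson"], "x": ["--oxtab"], "p": ["--opprint"],
    "m": ["--omd"], "b": ["--opprint", "--barred"],
}


def expand_saver(flag):
    """Expand a flag such as --c2p into the long-form flags it stands for."""
    if len(flag) != 5 or not flag.startswith("--") or flag[3] != "2":
        raise ValueError(f"not a format-conversion keystroke-saver: {flag}")
    src, dst = flag[2], flag[4]
    if src not in INPUT_FLAGS or dst not in OUTPUT_FLAGS or src == dst:
        raise ValueError(f"unknown format letters in: {flag}")
    return [INPUT_FLAGS[src]] + OUTPUT_FLAGS[dst]


print(expand_saver("--c2p"))
print(expand_saver("--j2b"))
```

Markdown (`m`) and barred PPRINT (`b`) appear only in the output table, matching
the note that markdown is available for output only.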
.fi
.if n \{\
.RE
.SS "COMMENTS IN DATA"
.SH "JSON-ONLY FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
--skip-comments Ignore commented lines (prefixed by "#")
within the input.
--skip-comments-with {string} Ignore commented lines within input, with
specified prefix.
--pass-comments Immediately print commented lines (prefixed by "#")
within the input.
--pass-comments-with {string} Immediately print commented lines within input, with
specified prefix.
These are flags which are applicable to JSON format.
Notes:
* Comments are only honored at the start of a line.
* In the absence of any of the above four options, comments are data like
any other text.
* When pass-comments is used, comment lines are written to standard output
immediately upon being read; they are not part of the record stream. Results
may be counterintuitive. A suggestion is to place comments at the start of
data files.
--jlistwrap or --jl Wrap JSON output in outermost `[ ]`.
--jvstack Put one key-value pair per line for JSON output
(multi-line output).
--no-jvstack Put objects/arrays all on one line for JSON output.
.fi
.if n \{\
.RE
.SS "CSV-SPECIFIC OPTIONS"
.SH "LEGACY FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
--implicit-csv-header Use 1,2,3,... as field labels, rather than from line 1
of input files. Tip: combine with "label" to recreate
missing headers.
--no-implicit-csv-header Do not use --implicit-csv-header. This is the default
anyway -- the main use is for the flags to 'mlr join' if you have
main file(s) which are headerless but you want to join in on
a file which does have a CSV header. Then you could use
'mlr --csv --implicit-csv-header join --no-implicit-csv-header
-l your-join-in-with-header.csv ... your-headerless.csv'
--allow-ragged-csv-input|--ragged If a data line has fewer fields than the header line,
fill remaining keys with empty string. If a data line has more
fields than the header line, use integer field labels as in
the implicit-header case.
--headerless-csv-output Print only CSV data lines.
-N Keystroke-saver for --implicit-csv-header --headerless-csv-output.
These are flags which don't do anything in the current Miller version.
They are accepted as no-op flags in order to keep old scripts from breaking.
--jknquoteint Type information from JSON input files is now
preserved throughout the processing stream.
--jquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--json-fatal-arrays-on-input
Miller now supports arrays as of version 6.
--json-map-arrays-on-input
Miller now supports arrays as of version 6.
--json-skip-arrays-on-input
Miller now supports arrays as of version 6.
--jsonx The `--jvstack` flag is now default true in Miller 6.
--jvquoteall Type information from JSON input files is now
preserved throughout the processing stream.
--mmap Miller no longer uses memory-mapping to access data
files.
--no-fflush The current implementation of Miller does not use
buffered output, so there is no longer anything to
suppress here.
--no-mmap Miller no longer uses memory-mapping to access data
files.
--ojsonx The `--jvstack` flag is now default true in Miller 6.
.fi
.if n \{\
.RE
.SS "DOUBLE-QUOTING FOR CSV/CSVLITE OUTPUT"
.SH "MISCELLANEOUS FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
THIS IS STILL WIP FOR MILLER 6
--quote-all Wrap all fields in double quotes
--quote-none Do not wrap any fields in double quotes, even if they have
OFS or ORS in them
--quote-minimal Wrap fields in double quotes only if they have OFS or ORS
in them (default)
--quote-numeric Wrap fields in double quotes only if they have numbers
in them
--quote-original Wrap fields in double quotes if and only if they were
quoted on input. This isn't sticky for computed fields:
e.g. if fields a and b were quoted on input and you do
"put '$c = $a . $b'" then field c won't inherit a or b's
was-quoted-on-input flag.
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once.
Example: `mlr --from a.dat --from b.dat cat` is the
same as `mlr cat a.dat b.dat`.
--load {filename} Load DSL script file for all put/filter operations on
the command line. If the name following `--load` is a
directory, load all `*.mlr` files in that directory.
This is just like `put -f` and `filter -f` except
it's up-front on the command line, so you can do
something like `alias mlr='mlr --load ~/myscripts'`
if you like.
--mfrom {filenames} Use this to specify one or more input files before
the verb(s), rather than after. May be used more than
once. The list of filenames must end with `--`. This
is useful for example since `--from *.csv` doesn't do
what you might hope but `--mfrom *.csv --` does.
--mload {filenames} Like `--load` but works with more than one filename,
e.g. `--mload *.mlr --`.
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style
codes for floating-point numbers. If not specified,
default formatting is used. See also the `fmtnum`
function and the `format-values` verb.
--seed {n} with `n` of the form `12345678` or `0xcafefeed`. For
`put`/`filter` `urand`, `urandint`, and `urand32`.
-I Process files in-place. For each file name on the
command line, output is written to a temp file in the
same directory, which is then renamed over the
original. Each file is processed in isolation: if the
output format is CSV, CSV headers will be present in
each output file; statistics are only over each
file's own records; and so on.
-n Process no input files, nor standard input either.
Useful for `mlr put` with `begin`/`end` statements
only. (Same as `--from /dev/null`.) Also useful in
`mlr -n put -v '...'` for analyzing abstract syntax
trees (if that's your thing).
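
The temp-file-then-rename pattern described for `-I` can be sketched as follows.
This is a generic Python illustration of the pattern, not Miller's code; the
name `rewrite_in_place` and the `transform` callback are hypothetical.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Write transform(text) to a temp file beside path, then rename it over path."""
    with open(path, encoding="utf-8", newline="") as handle:
        text = handle.read()
    # The temp file lives in the same directory so the rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
            handle.write(transform(text))
        os.replace(temp_path, path)
    except BaseException:
        # On failure, leave the original untouched and clean up the temp file.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for name in ("a.txt", "b.txt"):
            target = os.path.join(workdir, name)
            with open(target, "w", encoding="utf-8") as handle:
                handle.write(name)
            # Each file is processed in isolation, one rename per file.
            rewrite_in_place(target, str.upper)
            with open(target, encoding="utf-8") as handle:
                print(handle.read())
```

Because each file is rewritten separately, nothing computed for one file carries
over to the next, which is the isolation the `-I` description points out.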
.fi
.if n \{\
.RE
.SS "NUMBER FORMATTING"
.SH "OUTPUT-COLORIZATION FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
THIS IS STILL WIP FOR MILLER 6
--ofmt {format} E.g. %.18f, %.0f, %9.6e. Please use sprintf-style codes for
floating-point numbers. If not specified, default formatting is used.
See also the fmtnum function within mlr put (mlr --help-all-functions);
see also the format-values verb.
Miller uses colors to highlight outputs. You can specify color preferences.
Note: output colorization does not work on Windows.
Things having colors:
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc
* PASS and FAIL in regression-test output
* Some online-help strings
Rules for coloring:
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: `mlr --csv cat foo.csv`
* Example: no color: `mlr --csv cat foo.csv > bar.csv`
* Example: no color: `mlr --csv cat foo.csv | less`
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
Mechanisms for coloring:
* Miller uses ANSI escape sequences only. This does not work on Windows except within Cygwin.
* Requires `TERM` environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable `export MLR_NO_COLOR=true` means don't color even if stdout+TTY.
* Environment variable `export MLR_ALWAYS_COLOR=true` means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to `less -r`.
* Command-line flags `--no-color` or `-M`, `--always-color` or `-C`.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* `export MLR_KEY_COLOR=208`, `MLR_VALUE_COLOR=33`, etc.:
`MLR_KEY_COLOR` `MLR_VALUE_COLOR` `MLR_PASS_COLOR` `MLR_FAIL_COLOR`
`MLR_REPL_PS1_COLOR` `MLR_REPL_PS2_COLOR` `MLR_HELP_COLOR`
* Command-line flags `--key-color 208`, `--value-color 33`, etc.:
`--key-color` `--value-color` `--pass-color` `--fail-color`
`--repl-ps1-color` `--repl-ps2-color` `--help-color`
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do `mlr --list-color-codes` to see the available color codes (like 170), and
`mlr --list-color-names` to see available names (like `orchid`).
--always-color or -C
--fail-color
--help-color
--key-color
--list-color-codes
--list-color-names
--no-color or -M
--pass-color
--value-color
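
The suppression rules above amount to a small decision procedure, sketched here
in Python. It is not Miller's code. Assumptions: `should_colorize` is a
hypothetical name; any non-empty value of either environment variable counts as
set; `MLR_NO_COLOR` wins if both variables are set; and the `TERM` requirement
is applied only in the default case.

```python
import os
import sys


def should_colorize(flag=None, environ=None, stream=None):
    """Decide whether to colorize output.

    flag is "always", "never", or None when no command-line flag was given.
    """
    environ = os.environ if environ is None else environ
    stream = sys.stdout if stream is None else stream
    # Command-line flags take precedence over environment variables.
    if flag == "always":
        return True
    if flag == "never":
        return False
    # Assumption: if both variables are set, MLR_NO_COLOR wins.
    if environ.get("MLR_NO_COLOR"):
        return False
    if environ.get("MLR_ALWAYS_COLOR"):
        return True
    # Default: color only when the stream is a TTY and TERM is non-empty.
    return stream.isatty() and bool(environ.get("TERM"))


print(should_colorize(flag="never", environ={"MLR_ALWAYS_COLOR": "true"}))
```

The example prints `False` because the command-line flag overrides the
environment variable.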
.fi
.if n \{\
.RE
.SS "OTHER OPTIONS"
.SH "PPRINT-ONLY FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
--seed {n} with n of the form 12345678 or 0xcafefeed. For put/filter
urand()/urandint()/urand32().
--nr-progress-mod {m}, with m a positive integer: print filename and record
count to os.Stderr every m input records.
--from {filename} Use this to specify an input file before the verb(s),
rather than after. May be used more than once. Example:
"mlr --from a.dat --from b.dat cat" is the same as
"mlr cat a.dat b.dat".
--mfrom {filenames} -- Use this to specify one or more input files before the verb(s),
rather than after. May be used more than once.
The list of filenames must end with "--". This is useful
for example since "--from *.csv" doesn't do what you might
hope but "--mfrom *.csv --" does.
--load {filename} Load DSL script file for all put/filter operations on the command line.
If the name following --load is a directory, load all "*.mlr" files
in that directory. This is just like "put -f" and "filter -f"
except it's up-front on the command line, so you can do something like
alias mlr='mlr --load ~/myscripts' if you like.
--mload {names} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-n Process no input files, nor standard input either. Useful
for mlr put with begin/end statements only. (Same as --from
/dev/null.) Also useful in "mlr -n put -v '...'" for
analyzing abstract syntax trees (if that's your thing).
-I Process files in-place. For each file name on the command
line, output is written to a temp file in the same
directory, which is then renamed over the original. Each
file is processed in isolation: if the output format is
CSV, CSV headers will be present in each output file;
statistics are only over each file's own records; and so on.
These are flags which are applicable to PPRINT output format.
--barred Prints a border around PPRINT output (not available
for input).
--right Right-justifies all fields for PPRINT output.
.fi
.if n \{\
.RE
.SH "SEPARATOR FLAGS"
.sp
.if n \{\
.RS 0
.\}
.nf
Separator options:
--rs --irs --ors Record separators, e.g. 'lf' or '\e\er\e\en'
--fs --ifs --ofs --repifs Field separators, e.g. comma
--ps --ips --ops Pair separators, e.g. equals sign
TODO: auto-detect is still TBD for Miller 6
Notes about line endings:
* Default line endings (`--irs` and `--ors`) are "auto" which means autodetect from
the input file format, as long as the input file(s) have lines ending in either
LF (also known as linefeed, `\en`, `0x0a`, or Unix-style) or CRLF (also known as
carriage-return/linefeed pairs, `\er\en`, `0x0d 0x0a`, or Windows-style).
* If both `irs` and `ors` are `auto` (which is the default) then LF input will lead to LF
output and CRLF input will lead to CRLF output, regardless of the platform you're
running on.
* The line-ending autodetector triggers on the first line ending detected in the input
stream. E.g. if you specify a CRLF-terminated file on the command line followed by an
LF-terminated file then autodetected line endings will be CRLF.
* If you use `--ors {something else}` with (default or explicitly specified) `--irs auto`
then line endings are autodetected on input and set to what you specify on output.
* If you use `--irs {something else}` with (default or explicitly specified) `--ors auto`
then the output line endings used are LF on Unix/Linux/BSD/MacOSX, and CRLF on Windows.
Notes about all other separators:
* IPS/OPS are only used for DKVP and XTAB formats, since only in these formats
do key-value pairs appear juxtaposed.
* IRS/ORS are ignored for XTAB format. Nominally IFS and OFS are newlines;
XTAB records are separated by two or more consecutive IFS/OFS -- i.e.
a blank line. Everything above about `--irs/--ors/--rs auto` becomes `--ifs/--ofs/--fs auto`
for XTAB format. (XTAB's default IFS/OFS are "auto".)
* OFS must be single-character for PPRINT format. This is because it is used
with repetition for alignment; multi-character separators would make
alignment impossible.
* OPS may be multi-character for XTAB format, in which case alignment is
disabled.
* TSV is simply CSV using tab as field separator (`--fs tab`).
* FS/PS are ignored for markdown format; RS is used.
* All FS and PS options are ignored for JSON format, since they are not relevant
to the JSON format.
* You can specify separators in any of the following ways, shown by example:
- Type them out, quoting as necessary for shell escapes, e.g.
`--fs '|' --ips :`
- C-style escape sequences, e.g. `--rs '\er\en' --fs '\et'`.
- To avoid backslashing, you can use any of the following names:
TODO desc-to-chars map
* Default separators by format:
TODO default_xses
--fs {string} Specify FS for input and output.
--ifs {string} Specify FS for input.
--ips {string} Specify PS for input.
--irs {string} Specify RS for input.
--ofs {string} Specify FS for output.
--ops {string} Specify PS for output.
--ors {string} Specify RS for output.
--ps {string} Specify PS for input and output.
--repifs Let IFS be repeated: e.g. for splitting on multiple
spaces.
--rs {string} Specify RS for input and output.
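
The line-ending notes above say that autodetection triggers on the first line
ending in the input. A minimal Python sketch of that rule follows. It is not
Miller's code: the function names are hypothetical, the LF fallback for input
with no line ending is an assumption, and the case of an explicit `--irs` with
`--ors auto` (platform default) is not modeled.

```python
CR = chr(0x0D)  # carriage return
LF = chr(0x0A)  # linefeed


def detect_line_ending(text, default=LF):
    """Return the first line ending found in text: CRLF or LF."""
    index = text.find(LF)
    if index < 0:
        # Assumption: fall back to LF when no line ending is seen.
        return default
    if index > 0 and text[index - 1] == CR:
        return CR + LF
    return LF


def output_line_ending(input_text, ors="auto"):
    """With ors 'auto', echo the detected input ending; otherwise use ors as given."""
    return detect_line_ending(input_text) if ors == "auto" else ors


crlf_then_lf = "a,b" + CR + LF + "1,2" + LF
print(detect_line_ending(crlf_then_lf) == CR + LF)
```

The mixed input above is reported as CRLF, since only the first line ending is
consulted.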
.fi
.if n \{\
.RE
@ -497,92 +770,6 @@ For more information, please invoke mlr {subcommand} --help.
.fi
.if n \{\
.RE
.SH "REPL"
.if n \{\
.RS 0
.\}
.nf
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
.fi
.if n \{\
.RE
.SH "OUTPUT COLORIZATION"
.if n \{\
.RS 0
.\}
.nf
Things having colors:
* Keys in CSV header lines, JSON keys, etc
* Values in CSV data lines, JSON scalar values, etc
* PASS and FAIL in regression-test output
* Some online-help strings
Rules for coloring:
* By default, colorize output only if writing to stdout and stdout is a TTY.
* Example: color: mlr --csv cat foo.csv
* Example: no color: mlr --csv cat foo.csv > bar.csv
* Example: no color: mlr --csv cat foo.csv | less
* The default colors were chosen since they look OK with white or black terminal background,
and are differentiable with common varieties of human color vision.
Mechanisms for coloring:
* Miller uses ANSI escape sequences only. This does not work on Windows except on Cygwin.
* Requires TERM environment variable to be set to non-empty string.
* Doesn't try to check to see whether the terminal is capable of 256-color
ANSI vs 16-color ANSI. Note that if colors are in the range 0..15
then 16-color ANSI escapes are used, so this is in the user's control.
How you can control colorization:
* Suppression/unsuppression:
* Environment variable export MLR_NO_COLOR=true means don't color even if stdout+TTY.
* Environment variable export MLR_ALWAYS_COLOR=true means do color even if not stdout+TTY.
For example, you might want to use this when piping mlr output to less -r.
* Command-line flags --no-color or -M, --always-color or -C.
* Color choices can be specified by using environment variables, or command-line flags,
with values 0..255:
* export MLR_KEY_COLOR=208, MLR_VALUE_COLOR=33, etc.:
MLR_KEY_COLOR MLR_VALUE_COLOR MLR_PASS_COLOR MLR_FAIL_COLOR
MLR_REPL_PS1_COLOR MLR_REPL_PS2_COLOR MLR_HELP_COLOR
* Command-line flags --key-color 208, --value-color 33, etc.:
--key-color --value-color --pass-color --fail-color
--repl-ps1-color --repl-ps2-color --help-color
* This is particularly useful if your terminal's background color clashes with current settings.
If environment-variable settings and command-line flags are both provided, the latter take precedence.
Please do mlr --list-color-codes to see the available color codes (like 170), and
mlr --list-color-names to see available names (like orchid).
.fi
.if n \{\
.RE
.SH "MLRRC"
.if n \{\
.RS 0
@ -622,6 +809,42 @@ https://miller.readthedocs.io/en/latest/customization.html
.fi
.if n \{\
.RE
.SH "REPL"
.if n \{\
.RS 0
.\}
.nf
Usage: mlr repl [options] {zero or more data-file names}
-v Prints the expression's AST (abstract syntax tree), which gives
full transparency on the precedence and associativity rules of
Miller's grammar, to stdout.
-d Like -v but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-w Show warnings about uninitialized variables
-q Don't show startup banner
-s Don't show prompts
--load {DSL script file} Load script file before presenting the prompt.
If the name following --load is a directory, load all "*.mlr" files
in that directory.
--mload {DSL script files} -- Like --load but works with more than one filename,
e.g. '--mload *.mlr --'.
-h|--help Show this message.
Or any --icsv, --ojson, etc. reader/writer options as for the main Miller command line.
Any data-file names are opened just as if you had waited and typed :open {filenames}
at the Miller REPL prompt.
.fi
.if n \{\
.RE
.SH "VERBS"
.sp