miller/test/cases/cli-help/0001/expout
John Kerl f5eaf290cf
mlr sparsify (#1498)
* mlr sparsify

* regression-test cases

* typofix

* Remove mods due to processor-architecture change
2024-02-18 10:56:26 -05:00

1372 lines
59 KiB
Text

================================================================
altkv
Usage: mlr altkv [options]
Given fields with values of the form a,b,c,d,e,f emits a=b,c=d,e=f pairs.
Options:
-h|--help Show this message.
================================================================
bar
Usage: mlr bar [options]
Replaces a numeric field with a number of asterisks, allowing for cheesy
bar plots. These align best with --opprint or --oxtab output format.
Options:
-f {a,b,c} Field names to convert to bars.
--lo {lo} Lower-limit value for min-width bar: default '0.000000'.
--hi {hi} Upper-limit value for max-width bar: default '100.000000'.
-w {n} Bar-field width: default '40'.
--auto Automatically computes limits, ignoring --lo and --hi.
Holds all records in memory before producing any output.
-c {character} Fill character: default '*'.
-x {character} Out-of-bounds character: default '#'.
-b {character} Blank character: default '.'.
Nominally the fill, out-of-bounds, and blank characters will be strings of length 1.
However you can make them all longer if you so desire.
-h|--help Show this message.
================================================================
bootstrap
Usage: mlr bootstrap [options]
Emits an n-sample, with replacement, of the input records.
See also mlr sample and mlr shuffle.
Options:
-n Number of samples to output. Defaults to number of input records.
Must be non-negative.
-h|--help Show this message.
================================================================
case
Usage: mlr case [options]
Uppercases strings in record keys and/or values.
Options:
-k Case only keys, not keys and values.
-v Case only values, not keys and values.
-f {a,b,c} Specify which field names to case (default: all)
-u Convert to uppercase
-l Convert to lowercase
-s Convert to sentence case (capitalize first letter)
-t Convert to title case (capitalize words)
-h|--help Show this message.
================================================================
cat
Usage: mlr cat [options]
Passes input records directly to output. Most useful for format conversion.
Options:
-n Prepend field "n" to each record with record-counter starting at 1.
-N {name} Prepend field {name} to each record with record-counter starting at 1.
-g {a,b,c} Optional group-by-field names for counters, e.g. a,b,c
--filename Prepend current filename to each record.
--filenum Prepend current filenum (1-up) to each record.
-h|--help Show this message.
================================================================
check
Usage: mlr check [options]
Consumes records without printing any output,
Useful for doing a well-formatted check on input data.
with the exception that warnings are printed to stderr.
Current checks are:
* Data are parseable
* If any key is the empty string
Options:
-h|--help Show this message.
================================================================
clean-whitespace
Usage: mlr clean-whitespace [options]
For each record, for each field in the record, whitespace-cleans the keys and/or
values. Whitespace-cleaning entails stripping leading and trailing whitespace,
and replacing multiple whitespace with singles. For finer-grained control,
please see the DSL functions lstrip, rstrip, strip, collapse_whitespace,
and clean_whitespace.
Options:
-k|--keys-only Do not touch values.
-v|--values-only Do not touch keys.
It is an error to specify -k as well as -v -- to clean keys and values,
leave off -k as well as -v.
-h|--help Show this message.
================================================================
count-distinct
Usage: mlr count-distinct [options]
Prints number of records having distinct values for specified field names.
Same as uniq -c.
Options:
-f {a,b,c} Field names for distinct count.
-x {a,b,c} Field names to exclude for distinct count: use each record's others instead.
-n Show only the number of distinct values. Not compatible with -u.
-o {name} Field name for output count. Default "count".
Ignored with -u.
-u Do unlashed counts for multiple field names. With -f a,b and
without -u, computes counts for distinct combinations of a
and b field values. With -f a,b and with -u, computes counts
for distinct a field values and counts for distinct b field
values separately.
================================================================
count
Usage: mlr count [options]
Prints number of records, optionally grouped by distinct values for specified field names.
Options:
-g {a,b,c} Optional group-by-field names for counts, e.g. a,b,c
-n {n} Show only the number of distinct values. Not interesting without -g.
-o {name} Field name for output-count. Default "count".
-h|--help Show this message.
================================================================
count-similar
Usage: mlr count-similar [options]
Ingests all records, then emits each record augmented by a count of
the number of other records having the same group-by field values.
Options:
-g {a,b,c} Group-by-field names for counts, e.g. a,b,c
-o {name} Field name for output-counts. Defaults to "count".
-h|--help Show this message.
================================================================
cut
Usage: mlr cut [options]
Passes through input records with specified fields included/excluded.
Options:
-f {a,b,c} Comma-separated field names for cut, e.g. a,b,c.
-o Retain fields in the order specified here in the argument list.
Default is to retain them in the order found in the input data.
-x|--complement Exclude, rather than include, field names specified by -f.
-r Treat field names as regular expressions. "ab", "a.*b" will
match any field name containing the substring "ab" or matching
"a.*b", respectively; anchors of the form "^ab$", "^a.*b$" may
be used. The -o flag is ignored when -r is present.
-h|--help Show this message.
Examples:
mlr cut -f hostname,status
mlr cut -x -f hostname,status
mlr cut -r -f '^status$,sda[0-9]'
mlr cut -r -f '^status$,"sda[0-9]"'
mlr cut -r -f '^status$,"sda[0-9]"i' (this is case-insensitive)
================================================================
decimate
Usage: mlr decimate [options]
Passes through one of every n records, optionally by category.
Options:
-b Decimate by printing first of every n.
-e Decimate by printing last of every n (default).
-g {a,b,c} Optional group-by-field names for decimate counts, e.g. a,b,c.
-n {n} Decimation factor (default 10).
-h|--help Show this message.
================================================================
fill-down
Usage: mlr fill-down [options]
If a given record has a missing value for a given field, fill that from
the corresponding value from a previous record, if any.
By default, a 'missing' field either is absent, or has the empty-string value.
With -a, a field is 'missing' only if it is absent.
Options:
--all Operate on all fields in the input.
-a|--only-if-absent If a given record has a missing value for a given field,
fill that from the corresponding value from a previous record, if any.
By default, a 'missing' field either is absent, or has the empty-string value.
With -a, a field is 'missing' only if it is absent.
-f Field names for fill-down.
-h|--help Show this message.
================================================================
fill-empty
Usage: mlr fill-empty [options]
Fills empty-string fields with specified fill-value.
Options:
-v {string} Fill-value: defaults to "N/A"
-S Don't infer type -- so '-v 0' would fill string 0 not int 0.
================================================================
filter
Usage: mlr filter [options] {DSL expression}
Options:
-f {file name} File containing a DSL expression (see examples below). If the filename
is a directory, all *.mlr files in that directory are loaded.
-e {expression} You can use this after -f to add an expression. Example use
case: define functions/subroutines in a file you specify with -f, then call
them with an expression you specify with -e.
(If you mix -e and -f then the expressions are evaluated in the order encountered.
Since the expression pieces are simply concatenated, please be sure to use intervening
semicolons to separate expressions.)
-s name=value: Predefines out-of-stream variable @name to have
Thus mlr put -s foo=97 '$column += @foo' is like
mlr put 'begin {@foo = 97} $column += @foo'.
The value part is subject to type-inferencing.
May be specified more than once, e.g. -s name1=value1 -s name2=value2.
Note: the value may be an environment variable, e.g. -s sequence=$SEQUENCE
-x (default false) Prints records for which {expression} evaluates to false, not true,
i.e. invert the sense of the filter expression.
-q Does not include the modified record in the output stream.
Useful for when all desired output is in begin and/or end blocks.
-S and -F: There are no-ops in Miller 6 and above, since now type-inferencing is done
by the record-readers before filter/put is executed. Supported as no-op pass-through
flags for backward compatibility.
-h|--help Show this message.
Parser-info options:
-w Print warnings about things like uninitialized variables.
-W Same as -w, but exit the process if there are any warnings.
-p Prints the expressions's AST (abstract syntax tree), which gives full
transparency on the precedence and associativity rules of Miller's grammar,
to stdout.
-d Like -p but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-E Echo DSL expression before printing parse-tree
-v Same as -E -p.
-X Exit after parsing but before stream-processing. Useful with -v/-d/-D, if you
only want to look at parser information.
Records will pass the filter depending on the last bare-boolean statement in
the DSL expression. That can be the result of <, ==, >, etc., the return value of a function call
which returns boolean, etc.
Examples:
mlr --csv --from example.csv filter '$color == "red"'
mlr --csv --from example.csv filter '$color == "red" && flag == true'
More example filter expressions:
First record in each file:
'FNR == 1'
Subsampling:
'urand() < 0.001'
Compound booleans:
'$color != "blue" && $value > 4.2'
'($x < 0.5 && $y < 0.5) || ($x > 0.5 && $y > 0.5)'
Regexes with case-insensitive flag
'($name =~ "^sys.*east$") || ($name =~ "^dev.[0-9]+"i)'
Assignments, then bare-boolean filter statement:
'$ab = $a+$b; $cd = $c+$d; $ab != $cd'
Bare-boolean filter statement within a conditional:
'if (NR < 100) {
$x > 0.3;
} else {
$x > 0.002;
}
'
Using 'any' higher-order function to see if $index is 10, 20, or 30:
'any([10,20,30], func(e) {return $index == e})'
See also https://miller.readthedocs.io/reference-dsl for more context.
================================================================
flatten
Usage: mlr flatten [options]
Flattens multi-level maps to single-level ones. Example: field with name 'a'
and value '{"b": { "c": 4 }}' becomes name 'a.b.c' and value 4.
Options:
-f Comma-separated list of field names to flatten (default all).
-s Separator, defaulting to mlr --flatsep value.
-h|--help Show this message.
================================================================
format-values
Usage: mlr format-values [options]
Applies format strings to all field values, depending on autodetected type.
* If a field value is detected to be integer, applies integer format.
* Else, if a field value is detected to be float, applies float format.
* Else, applies string format.
Note: this is a low-keystroke way to apply formatting to many fields. To get
finer control, please see the fmtnum function within the mlr put DSL.
Note: this verb lets you apply arbitrary format strings, which can produce
undefined behavior and/or program crashes. See your system's "man printf".
Options:
-i {integer format} Defaults to "%d".
Examples: "%06lld", "%08llx".
Note that Miller integers are long long so you must use
formats which apply to long long, e.g. with ll in them.
Undefined behavior results otherwise.
-f {float format} Defaults to "%f".
Examples: "%8.3lf", "%.6le".
Note that Miller floats are double-precision so you must
use formats which apply to double, e.g. with l[efg] in them.
Undefined behavior results otherwise.
-s {string format} Defaults to "%s".
Examples: "_%s", "%08s".
Note that you must use formats which apply to string, e.g.
with s in them. Undefined behavior results otherwise.
-n Coerce field values autodetected as int to float, and then
apply the float format.
================================================================
fraction
Usage: mlr fraction [options]
For each record's value in specified fields, computes the ratio of that
value to the sum of values in that field over all input records.
E.g. with input records x=1 x=2 x=3 and x=4, emits output records
x=1,x_fraction=0.1 x=2,x_fraction=0.2 x=3,x_fraction=0.3 and x=4,x_fraction=0.4
Note: this is internally a two-pass algorithm: on the first pass it retains
input records and accumulates sums; on the second pass it computes quotients
and emits output records. This means it produces no output until all input is read.
Options:
-f {a,b,c} Field name(s) for fraction calculation
-g {d,e,f} Optional group-by-field name(s) for fraction counts
-p Produce percents [0..100], not fractions [0..1]. Output field names
end with "_percent" rather than "_fraction"
-c Produce cumulative distributions, i.e. running sums: each output
value folds in the sum of the previous for the specified group
E.g. with input records x=1 x=2 x=3 and x=4, emits output records
x=1,x_cumulative_fraction=0.1 x=2,x_cumulative_fraction=0.3
x=3,x_cumulative_fraction=0.6 and x=4,x_cumulative_fraction=1.0
================================================================
gap
Usage: mlr gap [options]
Emits an empty record every n records, or when certain values change.
Options:
Emits an empty record every n records, or when certain values change.
-g {a,b,c} Print a gap whenever values of these fields (e.g. a,b,c) changes.
-n {n} Print a gap every n records.
One of -f or -g is required.
-n is ignored if -g is present.
-h|--help Show this message.
================================================================
grep
Usage: mlr grep [options] {regular expression}
Passes through records which match the regular expression.
Options:
-i Use case-insensitive search.
-v Invert: pass through records which do not match the regex.
-a Only grep for values, not keys and values.
-h|--help Show this message.
Note that "mlr filter" is more powerful, but requires you to know field names.
By contrast, "mlr grep" allows you to regex-match the entire record. It does this
by formatting each record in memory as DKVP (or NIDX, if -a is supplied), using
OFS "," and OPS "=", and matching the resulting line against the regex specified
here. In particular, the regex is not applied to the input stream: if you have
CSV with header line "x,y,z" and data line "1,2,3" then the regex will be
matched, not against either of these lines, but against the DKVP line
"x=1,y=2,z=3". Furthermore, not all the options to system grep are supported,
and this command is intended to be merely a keystroke-saver. To get all the
features of system grep, you can do
"mlr --odkvp ... | grep ... | mlr --idkvp ..."
================================================================
group-by
Usage: mlr group-by [options] {comma-separated field names}
Outputs records in batches having identical values at specified field names.Options:
-h|--help Show this message.
================================================================
group-like
Usage: mlr group-like [options]
Outputs records in batches having identical field names.
Options:
-h|--help Show this message.
================================================================
gsub
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
================================================================
having-fields
Usage: mlr having-fields [options]
Conditionally passes through records depending on each record's field names.
Options:
--at-least {comma-separated names}
--which-are {comma-separated names}
--at-most {comma-separated names}
--all-matching {regular expression}
--any-matching {regular expression}
--none-matching {regular expression}
Examples:
mlr having-fields --which-are amount,status,owner
mlr having-fields --any-matching 'sda[0-9]'
mlr having-fields --any-matching '"sda[0-9]"'
mlr having-fields --any-matching '"sda[0-9]"i' (this is case-insensitive)
================================================================
head
Usage: mlr head [options]
Passes through the first n records, optionally by category.
Without -g, ceases consuming more input (i.e. is fast) when n records have been read.
Options:
-g {a,b,c} Optional group-by-field names for head counts, e.g. a,b,c.
-n {n} Head-count to print. Default 10.
-h|--help Show this message.
================================================================
histogram
Just a histogram. Input values < lo or > hi are not counted.
Usage: mlr histogram [options]
-f {a,b,c} Value-field names for histogram counts
--lo {lo} Histogram low value
--hi {hi} Histogram high value
--nbins {n} Number of histogram bins. Defaults to 20.
--auto Automatically computes limits, ignoring --lo and --hi.
Holds all values in memory before producing any output.
-o {prefix} Prefix for output field name. Default: no prefix.
-h|--help Show this message.
================================================================
json-parse
Usage: mlr json-parse [options]
Tries to convert string field values to parsed JSON, e.g. "[1,2,3]" -> [1,2,3].
Options:
-f {...} Comma-separated list of field names to json-parse (default all).
-k If supplied, then on parse fail for any cell, keep the (unparsable)
input value for the cell.
-h|--help Show this message.
================================================================
json-stringify
Usage: mlr json-stringify [options]
Produces string field values from field-value data, e.g. [1,2,3] -> "[1,2,3]".
Options:
-f {...} Comma-separated list of field names to json-parse (default all).
--jvstack Produce multi-line JSON output.
--no-jvstack Produce single-line JSON output per record (default).
-h|--help Show this message.
================================================================
join
Usage: mlr join [options]
Joins records from specified left file name with records from all file names
at the end of the Miller argument list.
Functionality is essentially the same as the system "join" command, but for
record streams.
Options:
-f {left file name}
-j {a,b,c} Comma-separated join-field names for output
-l {a,b,c} Comma-separated join-field names for left input file;
defaults to -j values if omitted.
-r {a,b,c} Comma-separated join-field names for right input file(s);
defaults to -j values if omitted.
--lk|--left-keep-field-names {a,b,c} If supplied, this means keep only the specified field
names from the left file. Automatically includes the join-field name(s). Helpful
for when you only want a limited subset of information from the left file.
Tip: you can use --lk "": this means the left file becomes solely a row-selector
for the input files.
--lp {text} Additional prefix for non-join output field names from
the left file
--rp {text} Additional prefix for non-join output field names from
the right file(s)
--np Do not emit paired records
--ul Emit unpaired records from the left file
--ur Emit unpaired records from the right file(s)
-s|--sorted-input Require sorted input: records must be sorted
lexically by their join-field names, else not all records will
be paired. The only likely use case for this is with a left
file which is too big to fit into system memory otherwise.
-u Enable unsorted input. (This is the default even without -u.)
In this case, the entire left file will be loaded into memory.
--prepipe {command} As in main input options; see mlr --help for details.
If you wish to use a prepipe command for the main input as well
as here, it must be specified there as well as here.
--prepipex {command} Likewise.
File-format options default to those for the right file names on the Miller
argument list, but may be overridden for the left file as follows. Please see
the main "mlr --help" for more information on syntax for these arguments:
-i {one of csv,dkvp,nidx,pprint,xtab}
--irs {record-separator character}
--ifs {field-separator character}
--ips {pair-separator character}
--repifs
--implicit-csv-header
--implicit-tsv-header
--no-implicit-csv-header
--no-implicit-tsv-header
For example, if you have 'mlr --csv ... join -l foo ... ' then the left-file format will
be specified CSV as well unless you override with 'mlr --csv ... join --ijson -l foo' etc.
Likewise, if you have 'mlr --csv --implicit-csv-header ...' then the join-in file will be
expected to be headerless as well unless you put '--no-implicit-csv-header' after 'join'.
Please use "mlr --usage-separator-options" for information on specifying separators.
Please see https://miller.readthedocs.io/en/latest/reference-verbs.html#join for more information
including examples.
================================================================
label
Usage: mlr label [options] {new1,new2,new3,...}
Given n comma-separated names, renames the first n fields of each record to
have the respective name. (Fields past the nth are left with their original
names.) Particularly useful with --inidx or --implicit-csv-header, to give
useful names to otherwise integer-indexed fields.
Options:
-h|--help Show this message.
================================================================
latin1-to-utf8
Usage: mlr latin1-to-utf8, with no options.
Recursively converts record strings from Latin-1 to UTF-8.
For field-level control, please see the latin1_to_utf8 DSL function.
Options:
-h|--help Show this message.
================================================================
least-frequent
Usage: mlr least-frequent [options]
Shows the least frequently occurring distinct values for specified field names.
The first entry is the statistical anti-mode; the remaining are runners-up.
Options:
-f {one or more comma-separated field names}. Required flag.
-n {count}. Optional flag defaulting to 10.
-b Suppress counts; show only field values.
-o {name} Field name for output count. Default "count".
See also "mlr most-frequent".
================================================================
merge-fields
Usage: mlr merge-fields [options]
Computes univariate statistics for each input record, accumulated across
specified fields.
Options:
-a {sum,count,...} Names of accumulators. One or more of:
count Count instances of fields
null_count Count number of empty-string/JSON-null instances per field
distinct_count Count number of distinct values per field
mode Find most-frequently-occurring values for fields; first-found wins tie
antimode Find least-frequently-occurring values for fields; first-found wins tie
sum Compute sums of specified fields
mean Compute averages (sample means) of specified fields
var Compute sample variance of specified fields
stddev Compute sample standard deviation of specified fields
meaneb Estimate error bars for averages (assuming no sample autocorrelation)
skewness Compute sample skewness of specified fields
kurtosis Compute sample kurtosis of specified fields
min Compute minimum values of specified fields
max Compute maximum values of specified fields
minlen Compute minimum string-lengths of specified fields
maxlen Compute maximum string-lengths of specified fields
-f {a,b,c} Value-field names on which to compute statistics. Requires -o.
-r {a,b,c} Regular expressions for value-field names on which to compute
statistics. Requires -o.
-c {a,b,c} Substrings for collapse mode. All fields which have the same names
after removing substrings will be accumulated together. Please see
examples below.
-i Use interpolated percentiles, like R's type=7; default like type=1.
Not sensical for string-valued fields.
-o {name} Output field basename for -f/-r.
-k Keep the input fields which contributed to the output statistics;
the default is to omit them.
String-valued data make sense unless arithmetic on them is required,
e.g. for sum, mean, interpolated percentiles, etc. In case of mixed data,
numbers are less than strings.
Example input data: "a_in_x=1,a_out_x=2,b_in_y=4,b_out_x=8".
Example: mlr merge-fields -a sum,count -f a_in_x,a_out_x -o foo
produces "b_in_y=4,b_out_x=8,foo_sum=3,foo_count=2" since "a_in_x,a_out_x" are
summed over.
Example: mlr merge-fields -a sum,count -r in_,out_ -o bar
produces "bar_sum=15,bar_count=4" since all four fields are summed over.
Example: mlr merge-fields -a sum,count -c in_,out_
produces "a_x_sum=3,a_x_count=2,b_y_sum=4,b_y_count=1,b_x_sum=8,b_x_count=1"
since "a_in_x" and "a_out_x" both collapse to "a_x", "b_in_y" collapses to
"b_y", and "b_out_x" collapses to "b_x".
================================================================
most-frequent
Usage: mlr most-frequent [options]
Shows the most frequently occurring distinct values for specified field names.
The first entry is the statistical mode; the remaining are runners-up.
Options:
-f {one or more comma-separated field names}. Required flag.
-n {count}. Optional flag defaulting to 10.
-b Suppress counts; show only field values.
-o {name} Field name for output count. Default "count".
See also "mlr least-frequent".
================================================================
nest
Usage: mlr nest [options]
Explodes specified field values into separate fields/records, or reverses this.
Options:
--explode,--implode One is required.
--values,--pairs One is required.
--across-records,--across-fields One is required.
-f {field name} Required.
--nested-fs {string} Defaults to ";". Field separator for nested values.
--nested-ps {string} Defaults to ":". Pair separator for nested key-value pairs.
--evar {string} Shorthand for --explode --values --across-records --nested-fs {string}
--ivar {string} Shorthand for --implode --values --across-records --nested-fs {string}
Please use "mlr --usage-separator-options" for information on specifying separators.
Examples:
mlr nest --explode --values --across-records -f x
with input record "x=a;b;c,y=d" produces output records
"x=a,y=d"
"x=b,y=d"
"x=c,y=d"
Use --implode to do the reverse.
mlr nest --explode --values --across-fields -f x
with input record "x=a;b;c,y=d" produces output records
"x_1=a,x_2=b,x_3=c,y=d"
Use --implode to do the reverse.
mlr nest --explode --pairs --across-records -f x
with input record "x=a:1;b:2;c:3,y=d" produces output records
"a=1,y=d"
"b=2,y=d"
"c=3,y=d"
mlr nest --explode --pairs --across-fields -f x
with input record "x=a:1;b:2;c:3,y=d" produces output records
"a=1,b=2,c=3,y=d"
Notes:
* With --pairs, --implode doesn't make sense since the original field name has
been lost.
* The combination "--implode --values --across-records" is non-streaming:
no output records are produced until all input records have been read. In
particular, this means it won't work in `tail -f` contexts. But all other flag
combinations result in streaming (`tail -f` friendly) data processing.
If input is coming from `tail -f`, be sure to use `--records-per-batch 1`.
* It's up to you to ensure that the nested-fs is distinct from your data's IFS:
e.g. by default the former is semicolon and the latter is comma.
See also mlr reshape.
================================================================
nothing
Usage: mlr nothing [options]
Drops all input records. Useful for testing, or after tee/print/etc. have
produced other output.
Options:
-h|--help Show this message.
================================================================
put
Usage: mlr put [options] {DSL expression}
Options:
-f {file name} File containing a DSL expression (see examples below). If the filename
is a directory, all *.mlr files in that directory are loaded.
-e {expression} You can use this after -f to add an expression. Example use
case: define functions/subroutines in a file you specify with -f, then call
them with an expression you specify with -e.
(If you mix -e and -f then the expressions are evaluated in the order encountered.
Since the expression pieces are simply concatenated, please be sure to use intervening
semicolons to separate expressions.)
-s name=value: Predefines out-of-stream variable @name to have
Thus mlr put -s foo=97 '$column += @foo' is like
mlr put 'begin {@foo = 97} $column += @foo'.
The value part is subject to type-inferencing.
May be specified more than once, e.g. -s name1=value1 -s name2=value2.
Note: the value may be an environment variable, e.g. -s sequence=$SEQUENCE
-x (default false) Prints records for which {expression} evaluates to false, not true,
i.e. invert the sense of the filter expression.
-q Does not include the modified record in the output stream.
Useful for when all desired output is in begin and/or end blocks.
-S and -F: There are no-ops in Miller 6 and above, since now type-inferencing is done
by the record-readers before filter/put is executed. Supported as no-op pass-through
flags for backward compatibility.
-h|--help Show this message.
Parser-info options:
-w Print warnings about things like uninitialized variables.
-W Same as -w, but exit the process if there are any warnings.
-p Prints the expressions's AST (abstract syntax tree), which gives full
transparency on the precedence and associativity rules of Miller's grammar,
to stdout.
-d Like -p but uses a parenthesized-expression format for the AST.
-D Like -d but with output all on one line.
-E Echo DSL expression before printing parse-tree
-v Same as -E -p.
-X Exit after parsing but before stream-processing. Useful with -v/-d/-D, if you
only want to look at parser information.
Examples:
mlr --from example.csv put '$qr = $quantity * $rate'
More example put expressions:
If-statements:
'if ($flag == true) { $quantity *= 10}'
'if ($x > 0.0) { $y=log10($x); $z=sqrt($y) } else {$y = 0.0; $z = 0.0}'
Newly created fields can be read after being written:
'$new_field = $index**2; $qn = $quantity * $new_field'
Regex-replacement:
'$name = sub($name, "http.*com"i, "")'
Regex-capture:
'if ($a =~ "([a-z]+)_([0-9]+)") { $b = "left_\1"; $c = "right_\2" }'
Built-in variables:
'$filename = FILENAME'
Aggregations (use mlr put -q):
'@sum += $x; end {emit @sum}'
'@sum[$shape] += $quantity; end {emit @sum, "shape"}'
'@sum[$shape][$color] += $x; end {emit @sum, "shape", "color"}'
'
@min = min(@min,$x);
@max=max(@max,$x);
end{emitf @min, @max}
'
See also https://miller.readthedocs.io/reference-dsl for more context.
================================================================
regularize
Usage: mlr regularize [options]
Outputs records sorted lexically ascending by keys.
Options:
-h|--help Show this message.
================================================================
remove-empty-columns
Usage: mlr remove-empty-columns [options]
Omits fields which are empty on every input row. Non-streaming.
Options:
-h|--help Show this message.
================================================================
rename
Usage: mlr rename [options] {old1,new1,old2,new2,...}
Renames specified fields.
Options:
-r Treat old field names as regular expressions. "ab", "a.*b"
will match any field name containing the substring "ab" or
matching "a.*b", respectively; anchors of the form "^ab$",
"^a.*b$" may be used. New field names may be plain strings,
or may contain capture groups of the form "\1" through
"\9". Wrapping the regex in double quotes is optional, but
is required if you wish to follow it with 'i' to indicate
case-insensitivity.
-g Do global replacement within each field name rather than
first-match replacement.
-h|--help Show this message.
Examples:
mlr rename old_name,new_name'
mlr rename old_name_1,new_name_1,old_name_2,new_name_2'
mlr rename -r 'Date_[0-9]+,Date,' Rename all such fields to be "Date"
mlr rename -r '"Date_[0-9]+",Date' Same
mlr rename -r 'Date_([0-9]+).*,\1' Rename all such fields to be of the form 20151015
mlr rename -r '"name"i,Name' Rename "name", "Name", "NAME", etc. to "Name"
================================================================
reorder
Usage: mlr reorder [options]
Moves specified names to start of record, or end of record.
Options:
-e Put specified field names at record end: default is to put them at record start.
-f {a,b,c} Field names to reorder.
-b {x} Put field names specified with -f before field name specified by {x},
if any. If {x} isn't present in a given record, the specified fields
will not be moved.
-a {x} Put field names specified with -f after field name specified by {x},
if any. If {x} isn't present in a given record, the specified fields
will not be moved.
-h|--help Show this message.
Examples:
mlr reorder -f a,b sends input record "d=4,b=2,a=1,c=3" to "a=1,b=2,d=4,c=3".
mlr reorder -e -f a,b sends input record "d=4,b=2,a=1,c=3" to "d=4,c=3,a=1,b=2".
================================================================
repeat
Usage: mlr repeat [options]
Copies input records to output records multiple times.
Options must be exactly one of the following:
-n {repeat count} Repeat each input record this many times.
-f {field name} Same, but take the repeat count from the specified
field name of each input record.
-h|--help Show this message.
Example:
echo x=0 | mlr repeat -n 4 then put '$x=urand()'
produces:
x=0.488189
x=0.484973
x=0.704983
x=0.147311
Example:
echo a=1,b=2,c=3 | mlr repeat -f b
produces:
a=1,b=2,c=3
a=1,b=2,c=3
Example:
echo a=1,b=2,c=3 | mlr repeat -f c
produces:
a=1,b=2,c=3
a=1,b=2,c=3
a=1,b=2,c=3
================================================================
reshape
Usage: mlr reshape [options]
Wide-to-long options:
-i {input field names} -o {key-field name,value-field name}
-r {input field regex} -o {key-field name,value-field name}
These pivot/reshape the input data such that the input fields are removed
and separate records are emitted for each key/value pair.
Note: if you have multiple regexes, please specify them using multiple -r,
since regexes can contain commas within them.
Note: this works with tail -f and produces output records for each input
record seen. If input is coming from `tail -f`, be sure to use
`--records-per-batch 1`.
Long-to-wide options:
-s {key-field name,value-field name}
These pivot/reshape the input data to undo the wide-to-long operation.
Note: this does not work with tail -f; it produces output records only after
all input records have been read.
Examples:
Input file "wide.txt":
time X Y
2009-01-01 0.65473572 2.4520609
2009-01-02 -0.89248112 0.2154713
2009-01-03 0.98012375 1.3179287
mlr --pprint reshape -i X,Y -o item,value wide.txt
time item value
2009-01-01 X 0.65473572
2009-01-01 Y 2.4520609
2009-01-02 X -0.89248112
2009-01-02 Y 0.2154713
2009-01-03 X 0.98012375
2009-01-03 Y 1.3179287
mlr --pprint reshape -r '[A-Z]' -o item,value wide.txt
time item value
2009-01-01 X 0.65473572
2009-01-01 Y 2.4520609
2009-01-02 X -0.89248112
2009-01-02 Y 0.2154713
2009-01-03 X 0.98012375
2009-01-03 Y 1.3179287
Input file "long.txt":
time item value
2009-01-01 X 0.65473572
2009-01-01 Y 2.4520609
2009-01-02 X -0.89248112
2009-01-02 Y 0.2154713
2009-01-03 X 0.98012375
2009-01-03 Y 1.3179287
mlr --pprint reshape -s item,value long.txt
time X Y
2009-01-01 0.65473572 2.4520609
2009-01-02 -0.89248112 0.2154713
2009-01-03 0.98012375 1.3179287
See also mlr nest.
================================================================
sample
Usage: mlr sample [options]
Reservoir sampling (subsampling without replacement), optionally by category.
See also mlr bootstrap and mlr shuffle.
Options:
-g {a,b,c} Optional: group-by-field names for samples, e.g. a,b,c.
-k {k} Required: number of records to output in total, or by group if using -g.
-h|--help Show this message.
================================================================
sec2gmtdate
Usage: ../c/mlr sec2gmtdate {comma-separated list of field names}
Replaces a numeric field representing seconds since the epoch with the
corresponding GMT year-month-day timestamp; leaves non-numbers as-is.
This is nothing more than a keystroke-saver for the sec2gmtdate function:
../c/mlr sec2gmtdate time1,time2
is the same as
../c/mlr put '$time1=sec2gmtdate($time1);$time2=sec2gmtdate($time2)'
================================================================
sec2gmt
Usage: mlr sec2gmt [options] {comma-separated list of field names}
Replaces a numeric field representing seconds since the epoch with the
corresponding GMT timestamp; leaves non-numbers as-is. This is nothing
more than a keystroke-saver for the sec2gmt function:
mlr sec2gmt time1,time2
is the same as
mlr put '$time1 = sec2gmt($time1); $time2 = sec2gmt($time2)'
Options:
-1 through -9: format the seconds using 1..9 decimal places, respectively.
--millis Input numbers are treated as milliseconds since the epoch.
--micros Input numbers are treated as microseconds since the epoch.
--nanos Input numbers are treated as nanoseconds since the epoch.
-h|--help Show this message.
================================================================
seqgen
Usage: mlr seqgen [options]
Passes input records directly to output. Most useful for format conversion.
Produces a sequence of counters. Discards the input record stream. Produces
output as specified by the options
Options:
-f {name} (default "i") Field name for counters.
--start {value} (default 1) Inclusive start value.
--step {value} (default 1) Step value.
--stop {value} (default 100) Inclusive stop value.
-h|--help Show this message.
Start, stop, and/or step may be floating-point. Output is integer if start,
stop, and step are all integers. Step may be negative. It may not be zero
unless start == stop.
================================================================
shuffle
Usage: mlr shuffle [options]
Outputs records randomly permuted. No output records are produced until
all input records are read. See also mlr bootstrap and mlr sample.
Options:
-h|--help Show this message.
================================================================
skip-trivial-records
Usage: mlr skip-trivial-records [options]
Passes through all records except those with zero fields,
or those for which all fields have empty value.
Options:
-h|--help Show this message.
================================================================
sort
Usage: mlr sort {flags}
Sorts records primarily by the first specified field, secondarily by the second
field, and so on. (Any records not having all specified sort keys will appear
at the end of the output, in the order they were encountered, regardless of the
specified sort order.) The sort is stable: records that compare equal will sort
in the order they were encountered in the input record stream.
Options:
-f {comma-separated field names} Lexical ascending
-r {comma-separated field names} Lexical descending
-c {comma-separated field names} Case-folded lexical ascending
-cr {comma-separated field names} Case-folded lexical descending
-n {comma-separated field names} Numerical ascending; nulls sort last
-nf {comma-separated field names} Same as -n
-nr {comma-separated field names} Numerical descending; nulls sort first
-t {comma-separated field names} Natural ascending
-tr|-rt {comma-separated field names} Natural descending
-h|--help Show this message.
Example:
mlr sort -f a,b -nr x,y,z
which is the same as:
mlr sort -f a -f b -nr x -nr y -nr z
================================================================
sort-within-records
Usage: mlr sort-within-records [options]
Outputs records sorted lexically ascending by keys.
Options:
-r Recursively sort subobjects/submaps, e.g. for JSON input.
-h|--help Show this message.
================================================================
sparsify
Usage: mlr sparsify [options]
Unsets fields for which the key is the empty string (or, optionally, another
specified value). Only makes sense with output format not being CSV or TSV.
Options:
-s {filler string} What values to remove. Defaults to the empty string.
-f {a,b,c} Specify field names to be operated on; any other fields won't be
modified. The default is to modify all fields.
-h|--help Show this message.
Example: if input is a=1,b=,c=3 then output is a=1,c=3.
================================================================
split
Usage: mlr split [options] {filename}
Options:
-n {n}: Cap file sizes at N records.
-m {m}: Produce M files, round-robining records among them.
-g {a,b,c}: Write separate files with records having distinct values for fields named a,b,c.
Exactly one of -m, -n, or -g must be supplied.
--prefix {p} Specify filename prefix; default "split".
--suffix {s} Specify filename suffix; default is from mlr output format, e.g. "csv".
-a Append to existing file(s), if any, rather than overwriting.
-v Send records along to downstream verbs as well as splitting to files.
-e Do NOT URL-escape names of output files.
-j {J} Use string J to join filename parts; default "_".
-h|--help Show this message.
Any of the output-format command-line flags (see mlr -h). For example, using
mlr --icsv --from myfile.csv split --ojson -n 1000
the input is CSV, but the output files are JSON.
Examples: Suppose myfile.csv has 1,000,000 records.
100 output files, 10,000 records each. First 10,000 records in split_1.csv, next in split_2.csv, etc.
mlr --csv --from myfile.csv split -n 10000
10 output files, 100,000 records each. Records 1,11,21,etc in split_1.csv, records 2,12,22, etc in split_2.csv, etc.
mlr --csv --from myfile.csv split -m 10
Same, but with JSON output.
mlr --csv --from myfile.csv split -m 10 -o json
Same but instead of split_1.csv, split_2.csv, etc. there are test_1.dat, test_2.dat, etc.
mlr --csv --from myfile.csv split -m 10 --prefix test --suffix dat
Same, but written to the /tmp/ directory.
mlr --csv --from myfile.csv split -m 10 --prefix /tmp/test --suffix dat
If the shape field has values triangle and square, then there will be split_triangle.csv and split_square.csv.
mlr --csv --from myfile.csv split -g shape
If the color field has values yellow and green, and the shape field has values triangle and square,
then there will be split_yellow_triangle.csv, split_yellow_square.csv, etc.
mlr --csv --from myfile.csv split -g color,shape
See also the "tee" DSL function which lets you do more ad-hoc customization.
================================================================
ssub
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
================================================================
stats1
Usage: mlr stats1 [options]
Computes univariate statistics for one or more given fields, accumulated across
the input record stream.
Options:
-a {sum,count,...} Names of accumulators: one or more of:
median This is the same as p50
p10 p25.2 p50 p98 p100 etc.
count Count instances of fields
null_count Count number of empty-string/JSON-null instances per field
distinct_count Count number of distinct values per field
mode Find most-frequently-occurring values for fields; first-found wins tie
antimode Find least-frequently-occurring values for fields; first-found wins tie
sum Compute sums of specified fields
mean Compute averages (sample means) of specified fields
var Compute sample variance of specified fields
stddev Compute sample standard deviation of specified fields
meaneb Estimate error bars for averages (assuming no sample autocorrelation)
skewness Compute sample skewness of specified fields
kurtosis Compute sample kurtosis of specified fields
min Compute minimum values of specified fields
max Compute maximum values of specified fields
minlen Compute minimum string-lengths of specified fields
maxlen Compute maximum string-lengths of specified fields
-f {a,b,c} Value-field names on which to compute statistics
--fr {regex} Regex for value-field names on which to compute statistics
(compute statistics on values in all field names matching regex
--fx {regex} Inverted regex for value-field names on which to compute statistics
(compute statistics on values in all field names not matching regex)
-g {d,e,f} Optional group-by-field names
--gr {regex} Regex for optional group-by-field names
(group by values in field names matching regex)
--gx {regex} Inverted regex for optional group-by-field names
(group by values in field names not matching regex)
--grfx {regex} Shorthand for --gr {regex} --fx {that same regex}
-i Use interpolated percentiles, like R's type=7; default like type=1.
Not sensical for string-valued fields.\n");
-s Print iterative stats. Useful in tail -f contexts, in which
case please avoid pprint-format output since end of input
stream will never be seen. Likewise, if input is coming from `tail -f`
be sure to use `--records-per-batch 1`.
-h|--help Show this message.
Example: mlr stats1 -a min,p10,p50,p90,max -f value -g size,shape
Example: mlr stats1 -a count,mode -f size
Example: mlr stats1 -a count,mode -f size -g shape
Example: mlr stats1 -a count,mode --fr '^[a-h].*$' -gr '^k.*$'
This computes count and mode statistics on all field names beginning
with a through h, grouped by all field names starting with k.
Notes:
* p50 and median are synonymous.
* min and max output the same results as p0 and p100, respectively, but use
less memory.
* String-valued data make sense unless arithmetic on them is required,
e.g. for sum, mean, interpolated percentiles, etc. In case of mixed data,
numbers are less than strings.
* count and mode allow text input; the rest require numeric input.
In particular, 1 and 1.0 are distinct text for count and mode.
* When there are mode ties, the first-encountered datum wins.
================================================================
stats2
Usage: mlr stats2 [options]
Computes bivariate statistics for one or more given field-name pairs,
accumulated across the input record stream.
-a {linreg-ols,corr,...} Names of accumulators: one or more of:
linreg-ols Linear regression using ordinary least squares
linreg-pca Linear regression using principal component analysis
r2 Quality metric for linreg-ols (linreg-pca emits its own)
logireg Logistic regression
corr Sample correlation
cov Sample covariance
covx Sample-covariance matrix
-f {a,b,c,d} Value-field name-pairs on which to compute statistics.
There must be an even number of names.
-g {e,f,g} Optional group-by-field names.
-v Print additional output for linreg-pca.
-s Print iterative stats. Useful in tail -f contexts, in which
case please avoid pprint-format output since end of input
stream will never be seen. Likewise, if input is coming from
`tail -f`, be sure to use `--records-per-batch 1`.
--fit Rather than printing regression parameters, applies them to
the input data to compute new fit fields. All input records are
held in memory until end of input stream. Has effect only for
linreg-ols, linreg-pca, and logireg.
Only one of -s or --fit may be used.
Example: mlr stats2 -a linreg-pca -f x,y
Example: mlr stats2 -a linreg-ols,r2 -f x,y -g size,shape
Example: mlr stats2 -a corr -f x,y
================================================================
step
Usage: mlr step [options]
Computes values dependent on earlier/later records, optionally grouped by category.
Options:
-a {delta,rsum,...} Names of steppers: comma-separated, one or more of:
counter Count instances of field(s) between successive records
delta Compute differences in field(s) between successive records
ewma Exponentially weighted moving average over successive records
from-first Compute differences in field(s) from first record
ratio Compute ratios in field(s) between successive records
rprod Compute running products of field(s) between successive records
rsum Compute running sums of field(s) between successive records
shift Alias for shift_lag
shift_lag Include value(s) in field(s) from the previous record, if any
shift_lead Include value(s) in field(s) from the next record, if any
slwin Sliding-window averages over m records back and n forward. E.g. slwin_7_2 for 7 back and 2 forward.
-f {a,b,c} Value-field names on which to compute statistics
-g {d,e,f} Optional group-by-field names
-F Computes integerable things (e.g. counter) in floating point.
As of Miller 6 this happens automatically, but the flag is accepted
as a no-op for backward compatibility with Miller 5 and below.
-d {x,y,z} Weights for EWMA. 1 means current sample gets all weight (no
smoothing), near under 1 is light smoothing, near over 0 is
heavy smoothing. Multiple weights may be specified, e.g.
"mlr step -a ewma -f sys_load -d 0.01,0.1,0.9". Default if omitted
is "-d 0.5".
-o {a,b,c} Custom suffixes for EWMA output fields. If omitted, these default to
the -d values. If supplied, the number of -o values must be the same
as the number of -d values.
-h|--help Show this message.
Examples:
mlr step -a rsum -f request_size
mlr step -a delta -f request_size -g hostname
mlr step -a ewma -d 0.1,0.9 -f x,y
mlr step -a ewma -d 0.1,0.9 -o smooth,rough -f x,y
mlr step -a ewma -d 0.1,0.9 -o smooth,rough -f x,y -g group_name
mlr step -a slwin_9_0,slwin_0_9 -f x
Please see https://miller.readthedocs.io/en/latest/reference-verbs.html#filter or
https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
for more information on EWMA.
================================================================
sub
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
================================================================
summary
Usage: mlr summary [options]
Show summary statistics about the input data.
All summarizers:
field_type string, int, etc. -- if a column has mixed types, all encountered types are printed
count +1 for every instance of the field across all records in the input record stream
null_count count of field values either empty string or JSON null
distinct_count count of distinct values for the field
mode most-frequently-occurring value for the field
sum sum of field values
mean mean of the field values
stddev standard deviation of the field values
var variance of the field values
skewness skewness of the field values
minlen length of shortest string representation for the field
maxlen length of longest string representation for the field
min minimum field value
p25 first-quartile field value
median median field value
p75 third-quartile field value
max maximum field value
iqr interquartile range: p75 - p25
lof lower outer fence: p25 - 3.0 * iqr
lif lower inner fence: p25 - 1.5 * iqr
uif upper inner fence: p75 + 1.5 * iqr
uof upper outer fence: p75 + 3.0 * iqr
Default summarizers:
field_type count mean min max null_count distinct_count
Notes:
* min, p25, median, p75, and max work for strings as well as numbers
* Distinct-counts are computed on string representations -- so 4.1 and 4.10 are counted as distinct here.
* If the mode is not unique in the input data, the first-encountered value is reported as the mode.
Options:
-a {mean,sum,etc.} Use only the specified summarizers.
-x {mean,sum,etc.} Use all summarizers, except the specified ones.
--all Use all available summarizers.
-h|--help Show this message.
================================================================
tac
Usage: mlr tac [options]
Prints records in reverse order from the order in which they were encountered.
Options:
-h|--help Show this message.
================================================================
tail
Usage: mlr tail [options]
Passes through the last n records, optionally by category.
Options:
-g {a,b,c} Optional group-by-field names for head counts, e.g. a,b,c.
-n {n} Head-count to print. Default 10.
-h|--help Show this message.
================================================================
tee
Usage: mlr tee [options] {filename}
Options:
-a Append to existing file, if any, rather than overwriting.
-p Treat filename as a pipe-to command.
Any of the output-format command-line flags (see mlr -h). Example: using
mlr --icsv --opprint put '...' then tee --ojson ./mytap.dat then stats1 ...
the input is CSV, the output is pretty-print tabular, but the tee-file output
is written in JSON format.
-h|--help Show this message.
================================================================
template
Usage: mlr template [options]
Places input-record fields in the order specified by list of column names.
If the input record is missing a specified field, it will be filled with the fill-with.
If the input record possesses an unspecified field, it will be discarded.
Options:
-f {a,b,c} Comma-separated field names for template, e.g. a,b,c.
-t {filename} CSV file whose header line will be used for template.
--fill-with {filler string} What to fill absent fields with. Defaults to the empty string.
-h|--help Show this message.
Example:
* Specified fields are a,b,c.
* Input record is c=3,a=1,f=6.
* Output record is a=1,b=,c=3.
================================================================
top
Usage: mlr top [options]
-f {a,b,c} Value-field names for top counts.
-g {d,e,f} Optional group-by-field names for top counts.
-n {count} How many records to print per category; default 1.
-a Print all fields for top-value records; default is
to print only value and group-by fields. Requires a single
value-field name only.
--min Print top smallest values; default is top largest values.
-F Keep top values as floats even if they look like integers.
-o {name} Field name for output indices. Default "top_idx".
This is ignored if -a is used.
Prints the n records with smallest/largest values at specified fields,
optionally by category. If -a is given, then the top records are emitted
with the same fields as they appeared in the input. Without -a, only fields
from -f, fields from -g, and the top-index field are emitted. For more information
please see https://miller.readthedocs.io/en/latest/reference-verbs#top
================================================================
utf8-to-latin1
Usage: mlr utf8-to-latin1, with no options.
Recursively converts record strings from Latin-1 to UTF-8.
For field-level control, please see the utf8_to_latin1 DSL function.
Options:
-h|--help Show this message.
================================================================
unflatten
Usage: mlr unflatten [options]
Reverses flatten. Example: field with name 'a.b.c' and value 4
becomes name 'a' and value '{"b": { "c": 4 }}'.
Options:
-f {a,b,c} Comma-separated list of field names to unflatten (default all).
-s {string} Separator, defaulting to mlr --flatsep value.
-h|--help Show this message.
================================================================
uniq
Usage: mlr uniq [options]
Prints distinct values for specified field names. With -c, same as
count-distinct. For uniq, -f is a synonym for -g.
Options:
-g {d,e,f} Group-by-field names for uniq counts.
-x {a,b,c} Field names to exclude for uniq: use each record's others instead.
-c Show repeat counts in addition to unique values.
-n Show only the number of distinct values.
-o {name} Field name for output count. Default "count".
-a Output each unique record only once. Incompatible with -g.
With -c, produces unique records, with repeat counts for each.
With -n, produces only one record which is the unique-record count.
With neither -c nor -n, produces unique records.
================================================================
unspace
Usage: mlr unspace [options]
Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
Options:
-f {x} Replace spaces with specified filler character.
-k Unspace only keys, not keys and values.
-v Unspace only values, not keys and values.
-h|--help Show this message.
================================================================
unsparsify
Usage: mlr unsparsify [options]
Prints records with the union of field names over all input records.
For field names absent in a given record but present in others, fills in
a value. This verb retains all input before producing any output.
Options:
--fill-with {filler string} What to fill absent fields with. Defaults to
the empty string.
-f {a,b,c} Specify field names to be operated on. Any other fields won't be
modified, and operation will be streaming.
-h|--help Show this message.
Example: if the input is two records, one being 'a=1,b=2' and the other
being 'b=3,c=4', then the output is the two records 'a=1,b=2,c=' and
'a=,b=3,c=4'.
================================================================