miller/docs6/docs/reference-main-overview.md.in
John Kerl 4f1424789e
Doc6 proofreads 3 (#638)
* Docs6 proofreads batch 3

* BUild-everything script for local development

* Start of glossary

* Put quicklinks atop every page, not just the base-index page

* Expanded record-heterogeneity page

* streaming page

* separators page

* vimrc doc

* separators page
2021-09-03 23:19:32 -04:00


# Miller command structure
## Overview
The outline of an invocation of Miller is:
* The program name `mlr`.
* Options controlling input/output formatting, etc. (see [I/O options](reference-main-io-options.md)).
* One or more verbs -- such as `cut`, `sort`, etc. (see [Verbs Reference](reference-verbs.md)) -- chained together using [then](reference-main-then-chaining.md). You use these to transform your data.
* Zero or more filenames, with input taken from standard input if there are no filenames present. (You can place the filenames up front using `--from` or `--mfrom` as described on the [keystroke-savers page](keystroke-savers.md#file-names-up-front-including-from).)
For example, reading from a file:
GENMD_RUN_COMMAND
mlr --icsv --opprint head -n 2 then sort -f shape example.csv
GENMD_EOF
GENMD_RUN_COMMAND
mlr --from example.csv --icsv --opprint head -n 2 then sort -f shape
GENMD_EOF
Reading from standard input:
GENMD_RUN_COMMAND
cat example.csv | mlr --icsv --opprint head -n 2 then sort -f shape
GENMD_EOF
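As a rough illustration of what `then` chaining does, the sketch below compares a single `mlr` process running two chained verbs against two `mlr` processes connected by a shell pipe. The file `sample.csv` and its contents are hypothetical stand-ins written to a temporary directory; they are not the `example.csv` used above.

```shell
# Hypothetical sample data, written to a temporary directory for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
color,shape,index
yellow,triangle,1
red,square,2
red,circle,3
EOF

# One process: two verbs chained with `then`.
chained=$(mlr --icsv --opprint head -n 2 then sort -f shape "$tmpdir/sample.csv")

# Two processes: the first writes CSV, which the second reads back in.
piped=$(mlr --csv head -n 2 "$tmpdir/sample.csv" | mlr --icsv --opprint sort -f shape)

echo "$chained"
if [ "$chained" = "$piped" ]; then
  echo "then-chain and pipe produced the same output"
fi

rm -r "$tmpdir"
```

The chained form keeps the records inside one process, so the intermediate CSV never has to be written out and parsed again.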
The rest of this reference section gives you full information on each of these parts of the command line.
See also the [Glossary](glossary.md) for more about terms such as
[record](glossary.md#record), [field](glossary.md#field),
[key](glossary.md#key), [streaming](glossary.md#streaming), and more.
## Verbs vs DSL
When you type `mlr {something} myfile.dat`, the `{something}` part is called a **verb**. It specifies how you want to transform your data. Most of the verbs are counterparts of built-in system tools like `cut` and `sort` -- but with file-format awareness, and with the ability to refer to fields by name.
The verbs `put` and `filter` are special in that they have a rich expression language (domain-specific language, or "DSL"). More information about them can be found on the [Intro to Miller's programming language page](programming-language.md); see also [DSL reference](reference-dsl.md) for more details.
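For a first feel of the two DSL verbs, here is a minimal sketch. It assumes a small hypothetical file `sample.csv` with fields `a` and `x`, created in a temporary directory; the field names and values are made up for illustration.

```shell
# Hypothetical sample data for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
a,x
pan,3
eks,7
pan,5
EOF

# filter keeps only the records for which the expression is true.
filtered=$(mlr --csv filter '$x > 4' "$tmpdir/sample.csv")
echo "$filtered"

# put assigns to fields; here it adds a field y computed from x.
doubled=$(mlr --csv put '$y = $x * 2' "$tmpdir/sample.csv")
echo "$doubled"

rm -r "$tmpdir"
```

The single quotes keep the shell from interpreting `$x` and `$y`, so the expression reaches Miller unchanged.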
Here's a comparison of verbs and `put`/`filter` DSL expressions:
Example of using a verb for data processing:
GENMD_RUN_COMMAND
mlr stats1 -a sum -f x -g a data/small
GENMD_EOF
* Verbs are coded in Go
* They run a bit faster
* They take fewer keystrokes
* There's less to learn
* Their customization is limited to each verb's options
Example of doing the same thing using a DSL expression:
GENMD_RUN_COMMAND
mlr put -q '@x_sum[$a] += $x; end{emit @x_sum, "a"}' data/small
GENMD_EOF
* You get to write your own expressions in Miller's programming language
* They run a bit slower
* They take more keystrokes
* There's more to learn
* They're highly customizable
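To make the trade-off concrete, the following sketch runs both forms on the same input and then changes what the DSL version accumulates. The file `sample.csv`, its field values, and the variable name `@x_sumsq` are hypothetical; the two commands being compared are the ones shown above, with CSV I/O flags added.

```shell
# Hypothetical sample data for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
a,x
pan,3
eks,7
pan,5
EOF

# Verb form: per-group sum of x.
via_verb=$(mlr --csv stats1 -a sum -f x -g a "$tmpdir/sample.csv")

# DSL form: the same sum, accumulated by hand and emitted at end of stream.
via_dsl=$(mlr --csv put -q '@x_sum[$a] += $x; end{emit @x_sum, "a"}' "$tmpdir/sample.csv")

if [ "$via_verb" = "$via_dsl" ]; then
  echo "verb and DSL agree"
fi

# A variation on the DSL form: accumulate squares of x instead of x itself.
custom=$(mlr --csv put -q '@x_sumsq[$a] += $x * $x; end{emit @x_sumsq, "a"}' "$tmpdir/sample.csv")
echo "$custom"

rm -r "$tmpdir"
```

Changing the accumulated quantity is a one-token edit in the DSL form, whereas the verb form is limited to the accumulators that `stats1` offers.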