miller/docs/src/scripting.md.in
John Kerl 8b22708c27
Support comments in mlr -s files (#1359)
* Support comments in `mlr -s` files

* doc mods

* artifacts from `make dev`

* neaten
2023-08-19 13:32:09 -04:00

122 lines
3.3 KiB
Markdown

# Scripting with Miller
Suppose you are often doing something like
GENMD-RUN-COMMAND
mlr --icsv --opprint \
filter '$quantity != 20' \
then count-distinct -f shape \
then fraction -f count \
example.csv
GENMD-EOF
Typing this out can get a bit old, if the only thing that changes for you is the filename. Some options include:
* On Linux/Mac/etc you can make a script with `#!/bin/sh` which invokes Miller as part of the shell-script body.
* On Linux/Mac/etc you can make a script with `#!/usr/bin/env mlr -s` which invokes Miller.
* On any platform you can put the reusable part of your command line into a text file (say `myflags.txt`), then `mlr -s myflags-txt filename-which-varies.csv`.
Let's look at examples of each.
## Shell scripts
A shell-script option:
GENMD-RUN-COMMAND
cat example-shell-script
GENMD-EOF
Key points here:
* Use `--` before `"$@"` at the end so that main-flags like `--json` won't be confused for options to the `fraction` verb.
* Use `"$@"` at the end which means "all remaining arguments to the script".
* Use `chmod +x example-shell-script` after you create one of these.
Then you can do
GENMD-RUN-COMMAND
./example-shell-script example.csv
GENMD-EOF
GENMD-RUN-COMMAND
cat example.csv | ./example-shell-script
GENMD-EOF
GENMD-RUN-COMMAND
./example-shell-script --ojson example.csv
GENMD-EOF
GENMD-RUN-COMMAND
./example-shell-script --ojson then filter '$count == 3' example.csv
GENMD-EOF
etc.
## Miller scripts
Here instead of putting `#!/bin/bash` on the first line, we can put `mlr` directly:
GENMD-RUN-COMMAND
cat ./example-mlr-s-script
GENMD-EOF
Points:
* This is largely the same as a shell script.
* Use `chmod +x example-mlr-s-script` after you create one of these.
* You leave off the initial `mlr` since that's present on line 1.
* You don't need all the backslashing for line-continuations.
* You don't need the explicit `--` or `"$@"`.
* All text from `#` to end of line is stripped out. If for any reason you need to suppress this, please use `mlr --s-no-comment-strip` in place of `mlr -s`.
Then you can do
GENMD-RUN-COMMAND
./example-mlr-s-script example.csv
GENMD-EOF
GENMD-RUN-COMMAND
cat example.csv | ./example-mlr-s-script
GENMD-EOF
GENMD-RUN-COMMAND
./example-mlr-s-script --ojson example.csv
GENMD-EOF
GENMD-RUN-COMMAND
./example-mlr-s-script --ojson then filter '$count == 3' example.csv
GENMD-EOF
## Miller scripts on Windows
Both the previous options require executable mode with `chmod`, and a _shebang
line_ with `#!...`, which are unixisms.
One of the nice features of `mlr -s` is it can be done without a shebang line,
and this works fine on Windows. For example:
GENMD-RUN-COMMAND
cat example-mlr-s-script-no-shebang
GENMD-EOF
Points:
* Same as above, where the `#!` line isn't needed. (But you can include a `#!` line; `mlr -s` will simply see it as a comment line.).
* As above, you don't need all the backslashing for line-continuations.
* As above, you don't need the explicit `--` or `"$@"`.
Then you can do
GENMD-RUN-COMMAND
mlr -s example-mlr-s-script-no-shebang example.csv
GENMD-EOF
GENMD-RUN-COMMAND
mlr -s example-mlr-s-script-no-shebang --ojson example.csv
GENMD-EOF
GENMD-RUN-COMMAND
mlr -s example-mlr-s-script-no-shebang --ojson then filter '$count == 3' example.csv
GENMD-EOF
and so on. See also [Miller on Windows](miller-on-windows.md).