mirror of
https://github.com/johnkerl/miller.git
synced 2026-01-23 02:14:13 +00:00
First pass at converting Miller 6 docs from Sphinx to Mkdocs (#616)
* Accept more passing emit cases
* Port docs from sphinx to mkdocs
* iterating
* rephrase internal-link syntax using mkdocs
* iterating
This commit is contained in:
parent
86f31f2f9b
commit
11eac853d2
1056 changed files with 611281 additions and 50 deletions
2
.gitignore
vendored
|
|
@ -119,3 +119,5 @@ experiments/dsl-parser/two/src
|
|||
experiments/dsl-parser/two/main
|
||||
experiments/cli-parser/cliparse
|
||||
experiments/cli-parser/cliparse.exe
|
||||
|
||||
docs6b/site/
|
||||
|
|
|
|||
2
docs6b/docs/.vimrc
Normal file
|
|
@ -0,0 +1,2 @@
|
|||
map \d :w<C-m>:!clear;build-one %<C-m>
|
||||
map \f :w<C-m>:!clear;make html<C-m>
|
||||
2
docs6b/docs/10-1.sh
Executable file
|
|
@ -0,0 +1,2 @@
|
|||
grep op=cache log.txt \
|
||||
| mlr --idkvp --opprint stats1 -a mean -f hit -g type then sort -f type
|
||||
4
docs6b/docs/10-2.sh
Executable file
|
|
@ -0,0 +1,4 @@
|
|||
mlr --from log.txt --opprint \
|
||||
filter 'is_present($batch_size)' \
|
||||
then step -a delta -f time,num_filtered \
|
||||
then sec2gmt time
|
||||
608
docs6b/docs/10min.md
Normal file
|
|
@ -0,0 +1,608 @@
|
|||
<!--- PLEASE DO NOT EDIT DIRECTLY. EDIT THE .md.in FILE PLEASE. --->
|
||||
# Miller in 10 minutes
|
||||
|
||||
## Obtaining Miller
|
||||
|
||||
You can install Miller for various platforms as follows:
|
||||
|
||||
* Linux: ``yum install miller`` or ``apt-get install miller`` depending on your flavor of Linux
|
||||
* MacOS: ``brew install miller`` or ``port install miller`` depending on your preference of [Homebrew](https://brew.sh) or [MacPorts](https://macports.org).
|
||||
* Windows: ``choco install miller`` using [Chocolatey](https://chocolatey.org).
|
||||
* You can get the latest builds for Linux, MacOS, and Windows by visiting https://github.com/johnkerl/miller/actions, selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)
|
||||
* See also the [build page](build.md) if you prefer -- in particular, if your platform's package manager doesn't have the latest release.
|
||||
|
||||
As a first check, you should be able to run ``mlr --version`` at your system's command prompt and see something like the following:
|
||||
|
||||
<pre>
|
||||
<b>mlr --version</b>
|
||||
Miller v6.0.0-dev
|
||||
</pre>
|
||||
|
||||
As a second check, given [example.csv](./example.csv) you should be able to run the following:
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv cat example.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint cat example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre>
|
||||
|
||||
If you run into issues on these checks, please check out the resources on the [community page](community.md) for help.
|
||||
|
||||
## Miller verbs
|
||||
|
||||
Let's take a quick look at some of the most useful Miller verbs -- file-format-aware, name-index-empowered equivalents of standard system commands.
|
||||
|
||||
``mlr cat`` is like system ``cat`` (or ``type`` on Windows) -- it passes the data through unmodified:
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv cat example.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre>
|
||||
|
||||
But ``mlr cat`` can also do format conversion -- for example, you can pretty-print in tabular format:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint cat example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre>
|
||||
|
||||
``mlr head`` and ``mlr tail`` count records rather than lines. Whether you're getting the first few records or the last few, the CSV header is included either way:
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv head -n 4 example.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv tail -n 4 example.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --ojson tail -n 2 example.csv</b>
|
||||
{
|
||||
"color": "yellow",
|
||||
"shape": "circle",
|
||||
"flag": true,
|
||||
"index": 87,
|
||||
"quantity": 63.5058,
|
||||
"rate": 8.3350
|
||||
}
|
||||
{
|
||||
"color": "purple",
|
||||
"shape": "square",
|
||||
"flag": false,
|
||||
"index": 91,
|
||||
"quantity": 72.3735,
|
||||
"rate": 8.2430
|
||||
}
|
||||
</pre>
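To make "records rather than lines" concrete, here is a minimal Python sketch of record-counting ``head`` and ``tail``. It is an illustration under stated assumptions, not Miller's implementation: the helper names (``read_records``, ``head_records``, ``tail_records``, ``to_csv``) are hypothetical, and the inline data is just the first four rows of example.csv.

```python
import csv
import io

# First four data rows of example.csv, inlined so the sketch is self-contained.
CSV_TEXT = """color,shape,flag,index,quantity,rate
yellow,triangle,true,11,43.6498,9.8870
red,square,true,15,79.2778,0.0130
red,circle,true,16,13.8103,2.9010
red,square,false,48,77.5542,7.4670
"""

def read_records(text):
    # Each record is a dict keyed by the header names; the header is not a record.
    return list(csv.DictReader(io.StringIO(text)))

def head_records(records, n):
    # Hypothetical helper: the first n records.
    return records[:n]

def tail_records(records, n):
    # Hypothetical helper: the last n records.
    return records[-n:] if n > 0 else []

def to_csv(records):
    # The header is regenerated from the record keys on output,
    # which is why it appears for both head and tail.
    if not records:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

records = read_records(CSV_TEXT)
print(to_csv(head_records(records, 2)), end="")
print(to_csv(tail_records(records, 2)), end="")
```

Because the header is derived from the keys of the records rather than counted as a line, both outputs start with the same header row.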
|
||||
|
||||
You can sort on a single field:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint sort -f shape example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
red circle true 16 13.8103 2.9010
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
red square true 15 79.2778 0.0130
|
||||
red square false 48 77.5542 7.4670
|
||||
red square false 64 77.1991 9.5310
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
</pre>
|
||||
|
||||
Or, you can sort primarily alphabetically on one field, then secondarily numerically descending on another field, and so on:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint sort -f shape -nr index example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
red circle true 16 13.8103 2.9010
|
||||
purple square false 91 72.3735 8.2430
|
||||
red square false 64 77.1991 9.5310
|
||||
red square false 48 77.5542 7.4670
|
||||
red square true 15 79.2778 0.0130
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
</pre>
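The two-key ordering above can be sketched in Python with a tuple sort key. This is only an illustration of the ordering, not of how Miller sorts internally; the function name ``sort_shape_then_index_desc`` is hypothetical, and the data keeps only the ``shape`` and ``index`` fields of example.csv.

```python
# (shape, index) pairs from example.csv, in file order.
ROWS = [
    ("triangle", 11), ("square", 15), ("circle", 16), ("square", 48),
    ("triangle", 51), ("square", 64), ("triangle", 65), ("circle", 73),
    ("circle", 87), ("square", 91),
]

def sort_shape_then_index_desc(rows):
    # Primary key: shape, ascending and lexical.
    # Secondary key: index, numeric and descending (hence the negation).
    return sorted(rows, key=lambda row: (row[0], -row[1]))

for shape, index in sort_shape_then_index_desc(ROWS):
    print(shape, index)
```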
|
||||
|
||||
If there are fields you don't want to see in your data, you can use ``cut`` to keep only the ones you want, in the same order they appeared in the input data:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint cut -f flag,shape example.csv</b>
|
||||
shape flag
|
||||
triangle true
|
||||
square true
|
||||
circle true
|
||||
square false
|
||||
triangle false
|
||||
square false
|
||||
triangle false
|
||||
circle true
|
||||
circle true
|
||||
square false
|
||||
</pre>
|
||||
|
||||
You can also use ``cut -o`` to keep specified fields, but in your preferred order:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint cut -o -f flag,shape example.csv</b>
|
||||
flag shape
|
||||
true triangle
|
||||
true square
|
||||
true circle
|
||||
false square
|
||||
false triangle
|
||||
false square
|
||||
false triangle
|
||||
true circle
|
||||
true circle
|
||||
false square
|
||||
</pre>
|
||||
|
||||
You can use ``cut -x`` to omit fields you don't care about:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint cut -x -f flag,shape example.csv</b>
|
||||
color index quantity rate
|
||||
yellow 11 43.6498 9.8870
|
||||
red 15 79.2778 0.0130
|
||||
red 16 13.8103 2.9010
|
||||
red 48 77.5542 7.4670
|
||||
purple 51 81.2290 8.5910
|
||||
red 64 77.1991 9.5310
|
||||
purple 65 80.1405 5.8240
|
||||
yellow 73 63.9785 4.2370
|
||||
yellow 87 63.5058 8.3350
|
||||
purple 91 72.3735 8.2430
|
||||
</pre>
|
||||
|
||||
You can use ``filter`` to keep only records you care about:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint filter '$color == "red"' example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
red square false 64 77.1991 9.5310
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint filter '$color == "red" && $flag == true' example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
</pre>
|
||||
|
||||
You can use ``put`` to create new fields which are computed from other fields:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint put '</b>
|
||||
<b> $ratio = $quantity / $rate;</b>
|
||||
<b> $color_shape = $color . "_" . $shape</b>
|
||||
<b>' example.csv</b>
|
||||
color shape flag index quantity rate ratio color_shape
|
||||
yellow triangle true 11 43.6498 9.8870 4.414868008496004 yellow_triangle
|
||||
red square true 15 79.2778 0.0130 6098.292307692308 red_square
|
||||
red circle true 16 13.8103 2.9010 4.760530851430541 red_circle
|
||||
red square false 48 77.5542 7.4670 10.386259541984733 red_square
|
||||
purple triangle false 51 81.2290 8.5910 9.455127458968688 purple_triangle
|
||||
red square false 64 77.1991 9.5310 8.099790158430384 red_square
|
||||
purple triangle false 65 80.1405 5.8240 13.760388049450551 purple_triangle
|
||||
yellow circle true 73 63.9785 4.2370 15.09995279679018 yellow_circle
|
||||
yellow circle true 87 63.5058 8.3350 7.619172165566886 yellow_circle
|
||||
purple square false 91 72.3735 8.2430 8.779995147397793 purple_square
|
||||
</pre>
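As a rough analogy for what the ``put`` expression above does to each record, here is a small Python sketch. It assumes a record is a mapping from field names to string values; the function name ``put_ratio_and_color_shape`` is hypothetical and this is not Miller's own evaluation logic.

```python
def put_ratio_and_color_shape(record):
    # Copy the record, then append two computed fields at the end,
    # mirroring the two assignments in the put expression.
    out = dict(record)
    out["ratio"] = float(record["quantity"]) / float(record["rate"])
    out["color_shape"] = record["color"] + "_" + record["shape"]
    return out

# First data row of example.csv.
first = {
    "color": "yellow", "shape": "triangle", "flag": "true",
    "index": "11", "quantity": "43.6498", "rate": "9.8870",
}
print(put_ratio_and_color_shape(first))
```

New fields are appended after the existing ones, which matches the column order in the output above.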
|
||||
|
||||
Even though Miller's main selling point is name-indexing, sometimes you really want to refer to a field name by its positional index. Use ``$[[3]]`` to access the name of field 3 or ``$[[[3]]]`` to access the value of field 3:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint put '$[[3]] = "NEW"' example.csv</b>
|
||||
color shape NEW index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint put '$[[[3]]] = "NEW"' example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
yellow triangle NEW 11 43.6498 9.8870
|
||||
red square NEW 15 79.2778 0.0130
|
||||
red circle NEW 16 13.8103 2.9010
|
||||
red square NEW 48 77.5542 7.4670
|
||||
purple triangle NEW 51 81.2290 8.5910
|
||||
red square NEW 64 77.1991 9.5310
|
||||
purple triangle NEW 65 80.1405 5.8240
|
||||
yellow circle NEW 73 63.9785 4.2370
|
||||
yellow circle NEW 87 63.5058 8.3350
|
||||
purple square NEW 91 72.3735 8.2430
|
||||
</pre>
|
||||
|
||||
You can find the full list of verbs at the [Verbs Reference](reference-verbs.md) page.
|
||||
|
||||
## Multiple input files
|
||||
|
||||
Miller takes all the files from the command line as an input stream. But it's format-aware, so it doesn't repeat CSV header lines. For example, with input files [data/a.csv](data/a.csv) and [data/b.csv](data/b.csv), the system ``cat`` command will repeat header lines:
|
||||
|
||||
<pre>
|
||||
<b>cat data/a.csv</b>
|
||||
a,b,c
|
||||
1,2,3
|
||||
4,5,6
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>cat data/b.csv</b>
|
||||
a,b,c
|
||||
7,8,9
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>cat data/a.csv data/b.csv</b>
|
||||
a,b,c
|
||||
1,2,3
|
||||
4,5,6
|
||||
a,b,c
|
||||
7,8,9
|
||||
</pre>
|
||||
|
||||
However, ``mlr cat`` will not:
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv cat data/a.csv data/b.csv</b>
|
||||
a,b,c
|
||||
1,2,3
|
||||
4,5,6
|
||||
7,8,9
|
||||
</pre>
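The difference comes from treating the inputs as one stream of records instead of one stream of lines. A minimal Python sketch of that idea follows; it assumes every input shares the same header, the function name ``cat_csv`` is hypothetical, and the two strings stand in for data/a.csv and data/b.csv.

```python
import csv
import io

# Stand-ins for data/a.csv and data/b.csv.
A_CSV = "a,b,c\n1,2,3\n4,5,6\n"
B_CSV = "a,b,c\n7,8,9\n"

def cat_csv(texts):
    # Read every input as records, and write the header only once,
    # when the first record is seen.
    # Assumption: all inputs have the same field names.
    out = io.StringIO()
    writer = None
    for text in texts:
        for record in csv.DictReader(io.StringIO(text)):
            if writer is None:
                writer = csv.DictWriter(
                    out, fieldnames=list(record.keys()), lineterminator="\n"
                )
                writer.writeheader()
            writer.writerow(record)
    return out.getvalue()

print(cat_csv([A_CSV, B_CSV]), end="")
```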
|
||||
|
||||
## Chaining verbs together
|
||||
|
||||
Often we want to chain queries together -- for example, sorting by a field and taking the top few values. We can do this using pipes:
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv sort -nr index example.csv | mlr --icsv --opprint head -n 3</b>
|
||||
color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre>
|
||||
|
||||
This works fine -- but Miller also lets you chain verbs together using the word ``then``. Think of this as a Miller-internal pipe that lets you use fewer keystrokes:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint sort -nr index then head -n 3 example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre>
|
||||
|
||||
As another convenience, you can put the filename first using ``--from``. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint --from example.csv sort -nr index then head -n 3</b>
|
||||
color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint --from example.csv \</b>
|
||||
<b> sort -nr index \</b>
|
||||
<b> then head -n 3 \</b>
|
||||
<b> then cut -f shape,quantity</b>
|
||||
shape quantity
|
||||
square 72.3735
|
||||
circle 63.5058
|
||||
circle 63.9785
|
||||
</pre>
|
||||
|
||||
## Sorts and stats
|
||||
|
||||
Now suppose you want to sort the data on a given column, *and then* take the top few in that ordering. You can use Miller's ``then`` feature to pipe commands together.
|
||||
|
||||
Here are the records with the top three ``index`` values:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint sort -nr index then head -n 3 example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre>
|
||||
|
||||
Lots of Miller commands take a ``-g`` option for group-by: here, ``head -n 1 -g shape`` outputs the first record for each distinct value of the ``shape`` field. Since the records are already sorted by descending ``index`` within each shape, this means we're finding the record with the highest ``index`` value for each distinct ``shape`` value:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint sort -f shape -nr index then head -n 1 -g shape example.csv</b>
|
||||
color shape flag index quantity rate
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
</pre>
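The sort-then-grouped-head pattern can be sketched in Python as follows. This is an illustration of the idea rather than Miller's implementation; ``head_per_group`` is a hypothetical helper, and only the ``shape`` and ``index`` fields of example.csv are kept.

```python
# (shape, index) pairs from example.csv, in file order.
ROWS = [
    ("triangle", 11), ("square", 15), ("circle", 16), ("square", 48),
    ("triangle", 51), ("square", 64), ("triangle", 65), ("circle", 73),
    ("circle", 87), ("square", 91),
]

def head_per_group(rows, n):
    # Keep the first n rows seen for each distinct shape,
    # preserving the order in which rows arrive.
    seen = {}
    kept = []
    for shape, index in rows:
        seen[shape] = seen.get(shape, 0) + 1
        if seen[shape] <= n:
            kept.append((shape, index))
    return kept

# Sort by shape ascending, then index descending, then take one row per shape.
ordered = sorted(ROWS, key=lambda row: (row[0], -row[1]))
top_per_shape = head_per_group(ordered, 1)
print(top_per_shape)
```

The grouped head only picks the maximum because of the preceding sort; on unsorted input it would return whichever record of each shape happened to come first.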
|
||||
|
||||
Statistics can be computed with or without group-by field(s):
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint --from example.csv \</b>
|
||||
<b> stats1 -a count,min,mean,max -f quantity -g shape</b>
|
||||
shape quantity_count quantity_min quantity_mean quantity_max
|
||||
triangle 3 43.6498 68.33976666666666 81.229
|
||||
square 4 72.3735 76.60114999999999 79.2778
|
||||
circle 3 13.8103 47.0982 63.9785
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --opprint --from example.csv \</b>
|
||||
<b> stats1 -a count,min,mean,max -f quantity -g shape,color</b>
|
||||
shape color quantity_count quantity_min quantity_mean quantity_max
|
||||
triangle yellow 1 43.6498 43.6498 43.6498
|
||||
square red 3 77.1991 78.01036666666666 79.2778
|
||||
circle red 1 13.8103 13.8103 13.8103
|
||||
triangle purple 2 80.1405 80.68475000000001 81.229
|
||||
circle yellow 2 63.5058 63.742149999999995 63.9785
|
||||
square purple 1 72.3735 72.3735 72.3735
|
||||
</pre>
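For readers who want to check the grouped numbers by hand, here is a small Python sketch of count, min, mean, and max grouped by ``shape``. It is an analogy and not Miller's ``stats1`` code; the function name ``stats_by_group`` is hypothetical, and only the ``shape`` and ``quantity`` fields of example.csv are kept.

```python
# (shape, quantity) pairs from example.csv, in file order.
ROWS = [
    ("triangle", 43.6498), ("square", 79.2778), ("circle", 13.8103),
    ("square", 77.5542), ("triangle", 81.2290), ("square", 77.1991),
    ("triangle", 80.1405), ("circle", 63.9785), ("circle", 63.5058),
    ("square", 72.3735),
]

def stats_by_group(rows):
    # Collect values per group; dicts keep first-seen order,
    # so groups come out in the order they first appear in the input.
    groups = {}
    for key, value in rows:
        groups.setdefault(key, []).append(value)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "mean": sum(values) / len(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }

for shape, stats in stats_by_group(ROWS).items():
    print(shape, stats)
```

The last digits of the means may differ slightly from the output above, since floating-point sums depend on the order of accumulation.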
|
||||
|
||||
If your output has a lot of columns, you can use XTAB format to line things up vertically for you instead:
|
||||
|
||||
<pre>
|
||||
<b>mlr --icsv --oxtab --from example.csv \</b>
|
||||
<b> stats1 -a p0,p10,p25,p50,p75,p90,p99,p100 -f rate</b>
|
||||
rate_p0 0.0130
|
||||
rate_p10 2.9010
|
||||
rate_p25 4.2370
|
||||
rate_p50 8.2430
|
||||
rate_p75 8.5910
|
||||
rate_p90 9.8870
|
||||
rate_p99 9.8870
|
||||
rate_p100 9.8870
|
||||
</pre>
|
||||
|
||||
|
||||
## File formats and format conversion
|
||||
|
||||
Miller supports the following formats:
|
||||
|
||||
* CSV (comma-separated values)
|
||||
* TSV (tab-separated values)
|
||||
* JSON (JavaScript Object Notation)
|
||||
* PPRINT (pretty-printed tabular)
|
||||
* XTAB (vertical-tabular or sideways-tabular)
|
||||
* NIDX (numerically indexed, label-free, with implicit labels ``"1"``, ``"2"``, etc.)
|
||||
* DKVP (delimited key-value pairs).
|
||||
|
||||
What's a CSV file, really? It's an array of rows, or *records*, each being a list of key-value pairs, or *fields*: for CSV it so happens that all the keys are shared in the header line and the values vary from one data line to another.
|
||||
|
||||
For example, if you have:
|
||||
|
||||
<pre>
|
||||
shape,flag,index
|
||||
circle,1,24
|
||||
square,0,36
|
||||
</pre>
|
||||
|
||||
then that's a way of saying:
|
||||
|
||||
<pre>
|
||||
shape=circle,flag=1,index=24
|
||||
shape=square,flag=0,index=36
|
||||
</pre>
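The correspondence between the two listings above can be sketched in a few lines of Python: read the CSV as a list of records, then print each record as comma-delimited key=value pairs. The function name ``to_dkvp`` is hypothetical, and this is only an illustration of the data model, not Miller's reader or writer.

```python
import csv
import io

CSV_TEXT = "shape,flag,index\ncircle,1,24\nsquare,0,36\n"

def to_dkvp(text):
    # Each CSV data line becomes one record: a mapping from the header
    # keys to that line's values. Emit each record as key=value pairs.
    records = csv.DictReader(io.StringIO(text))
    return [",".join(f"{key}={value}" for key, value in record.items())
            for record in records]

for line in to_dkvp(CSV_TEXT):
    print(line)
```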
|
||||
|
||||
Other ways to write the same data:
|
||||
|
||||
<pre>
|
||||
CSV PPRINT
|
||||
shape,flag,index shape flag index
|
||||
circle,1,24 circle 1 24
|
||||
square,0,36 square 0 36
|
||||
|
||||
JSON XTAB
|
||||
{ shape circle
|
||||
"shape": "circle", flag 1
|
||||
"flag": 1, index 24
|
||||
"index": 24 .
|
||||
} shape square
|
||||
{ flag 0
|
||||
"shape": "square", index 36
|
||||
"flag": 0,
|
||||
"index": 36
|
||||
}
|
||||
|
||||
DKVP
|
||||
shape=circle,flag=1,index=24
|
||||
shape=square,flag=0,index=36
|
||||
</pre>
|
||||
|
||||
Anything we can do with CSV input data, we can do with any other format input data. And you can read from one format, do any record-processing, and output to the same format as the input, or to a different output format.
|
||||
|
||||
How to specify these to Miller:
|
||||
|
||||
* If you use ``--csv`` or ``--json`` or ``--pprint``, etc., then Miller will use that format for input and output.
|
||||
* If you use ``--icsv`` and ``--ojson`` (note the extra ``i`` and ``o``) then Miller will use CSV for input and JSON for output, etc. See also [Keystroke Savers](keystroke-savers.md) for even shorter options like ``--c2j``.
|
||||
|
||||
You can read more about this at the [File Formats](file-formats.md) page.
|
||||
|
||||
|
||||
|
||||
## Choices for printing to files
|
||||
|
||||
Often we want to print output to the screen. Miller does this by default, as we've seen in the previous examples.
|
||||
|
||||
Sometimes, though, we want to print output to another file. Just use **> outputfilenamegoeshere** at the end of your command:
|
||||
|
||||
<pre>
<b>mlr --icsv --opprint cat example.csv > newfile.csv</b>
# Output goes to the new file;
# nothing is printed to the screen.
</pre>
|
||||
|
||||
<pre>
<b>cat newfile.csv</b>
color shape flag index quantity rate
yellow triangle true 11 43.6498 9.8870
red square true 15 79.2778 0.0130
red circle true 16 13.8103 2.9010
red square false 48 77.5542 7.4670
purple triangle false 51 81.2290 8.5910
red square false 64 77.1991 9.5310
purple triangle false 65 80.1405 5.8240
yellow circle true 73 63.9785 4.2370
yellow circle true 87 63.5058 8.3350
purple square false 91 72.3735 8.2430
</pre>
|
||||
|
||||
Other times we want our files to be **changed in-place**; for that, use **mlr -I**:
|
||||
|
||||
<pre>
<b>cp example.csv newfile.txt</b>
</pre>
|
||||
|
||||
<pre>
<b>cat newfile.txt</b>
color,shape,flag,index,quantity,rate
yellow,triangle,true,11,43.6498,9.8870
red,square,true,15,79.2778,0.0130
red,circle,true,16,13.8103,2.9010
red,square,false,48,77.5542,7.4670
purple,triangle,false,51,81.2290,8.5910
red,square,false,64,77.1991,9.5310
purple,triangle,false,65,80.1405,5.8240
yellow,circle,true,73,63.9785,4.2370
yellow,circle,true,87,63.5058,8.3350
purple,square,false,91,72.3735,8.2430
</pre>
|
||||
|
||||
<pre>
<b>mlr -I --csv sort -f shape newfile.txt</b>
</pre>
|
||||
|
||||
<pre>
<b>cat newfile.txt</b>
color,shape,flag,index,quantity,rate
red,circle,true,16,13.8103,2.9010
yellow,circle,true,73,63.9785,4.2370
yellow,circle,true,87,63.5058,8.3350
red,square,true,15,79.2778,0.0130
red,square,false,48,77.5542,7.4670
red,square,false,64,77.1991,9.5310
purple,square,false,91,72.3735,8.2430
yellow,triangle,true,11,43.6498,9.8870
purple,triangle,false,51,81.2290,8.5910
purple,triangle,false,65,80.1405,5.8240
</pre>
|
||||
|
||||
Also, using ``mlr -I`` you can bulk-operate on lots of files. For example:
|
||||
|
||||
<pre>
<b>mlr -I --csv cut -x -f unwanted_column_name *.csv</b>
</pre>
|
||||
|
||||
If you like, you can first copy off your original data somewhere else, before doing in-place operations.
|
||||
|
||||
Lastly, using ``tee`` within ``put``, you can split your input data into separate files, one per distinct value of one or more fields:
|
||||
|
||||
<pre>
|
||||
<b>mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'</b>
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>cat circle.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>cat square.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,square,false,48,77.5542,7.4670
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre>
|
||||
|
||||
<pre>
|
||||
<b>cat triangle.csv</b>
|
||||
color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
</pre>
|
||||
370
docs6b/docs/10min.md.in
Normal file
|
|
@ -0,0 +1,370 @@
|
|||
# Miller in 10 minutes
|
||||
|
||||
## Obtaining Miller
|
||||
|
||||
You can install Miller for various platforms as follows:
|
||||
|
||||
* Linux: ``yum install miller`` or ``apt-get install miller`` depending on your flavor of Linux
|
||||
* MacOS: ``brew install miller`` or ``port install miller`` depending on your preference of [Homebrew](https://brew.sh) or [MacPorts](https://macports.org).
|
||||
* Windows: ``choco install miller`` using [Chocolatey](https://chocolatey.org).
|
||||
* You can get the latest builds for Linux, MacOS, and Windows by visiting https://github.com/johnkerl/miller/actions, selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)
|
||||
* See also the [build page](build.md) if you prefer -- in particular, if your platform's package manager doesn't have the latest release.
|
||||
|
||||
As a first check, you should be able to run ``mlr --version`` at your system's command prompt and see something like the following:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --version
|
||||
GENMD_EOF
|
||||
|
||||
As a second check, given [example.csv](./example.csv) you should be able to run the following:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv cat example.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint cat example.csv
|
||||
GENMD_EOF
|
||||
|
||||
If you run into issues on these checks, please check out the resources on the [community page](community.md) for help.
|
||||
|
||||
## Miller verbs
|
||||
|
||||
Let's take a quick look at some of the most useful Miller verbs -- file-format-aware, name-index-empowered equivalents of standard system commands.
|
||||
|
||||
``mlr cat`` is like system ``cat`` (or ``type`` on Windows) -- it passes the data through unmodified:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv cat example.csv
|
||||
GENMD_EOF
|
||||
|
||||
But ``mlr cat`` can also do format conversion -- for example, you can pretty-print in tabular format:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint cat example.csv
|
||||
GENMD_EOF
|
||||
|
||||
``mlr head`` and ``mlr tail`` count records rather than lines. Whether you're getting the first few records or the last few, the CSV header is included either way:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv head -n 4 example.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv tail -n 4 example.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --ojson tail -n 2 example.csv
|
||||
GENMD_EOF
|
||||
|
||||
You can sort on a single field:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint sort -f shape example.csv
|
||||
GENMD_EOF
|
||||
|
||||
Or, you can sort primarily alphabetically on one field, then secondarily numerically descending on another field, and so on:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint sort -f shape -nr index example.csv
|
||||
GENMD_EOF
|
||||
|
||||
If there are fields you don't want to see in your data, you can use ``cut`` to keep only the ones you want, in the same order they appeared in the input data:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint cut -f flag,shape example.csv
|
||||
GENMD_EOF
|
||||
|
||||
You can also use ``cut -o`` to keep specified fields, but in your preferred order:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint cut -o -f flag,shape example.csv
|
||||
GENMD_EOF
|
||||
|
||||
You can use ``cut -x`` to omit fields you don't care about:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint cut -x -f flag,shape example.csv
|
||||
GENMD_EOF
|
||||
|
||||
You can use ``filter`` to keep only records you care about:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint filter '$color == "red"' example.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint filter '$color == "red" && $flag == true' example.csv
|
||||
GENMD_EOF
|
||||
|
||||
You can use ``put`` to create new fields which are computed from other fields:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint put '
|
||||
$ratio = $quantity / $rate;
|
||||
$color_shape = $color . "_" . $shape
|
||||
' example.csv
|
||||
GENMD_EOF
|
||||
|
||||
Even though Miller's main selling point is name-indexing, sometimes you really want to refer to a field name by its positional index. Use ``$[[3]]`` to access the name of field 3 or ``$[[[3]]]`` to access the value of field 3:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint put '$[[3]] = "NEW"' example.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint put '$[[[3]]] = "NEW"' example.csv
|
||||
GENMD_EOF
|
||||
|
||||
You can find the full list of verbs at the [Verbs Reference](reference-verbs.md) page.
|
||||
|
||||
## Multiple input files
|
||||
|
||||
Miller takes all the files from the command line as an input stream. But it's format-aware, so it doesn't repeat CSV header lines. For example, with input files [data/a.csv](data/a.csv) and [data/b.csv](data/b.csv), the system ``cat`` command will repeat header lines:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
cat data/a.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
cat data/b.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
cat data/a.csv data/b.csv
|
||||
GENMD_EOF
|
||||
|
||||
However, ``mlr cat`` will not:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv cat data/a.csv data/b.csv
|
||||
GENMD_EOF
|
||||
|
||||
## Chaining verbs together
|
||||
|
||||
Often we want to chain queries together -- for example, sorting by a field and taking the top few values. We can do this using pipes:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv sort -nr index example.csv | mlr --icsv --opprint head -n 3
|
||||
GENMD_EOF
|
||||
|
||||
This works fine -- but Miller also lets you chain verbs together using the word ``then``. Think of this as a Miller-internal pipe that lets you use fewer keystrokes:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint sort -nr index then head -n 3 example.csv
|
||||
GENMD_EOF
|
||||
|
||||
As another convenience, you can put the filename first using ``--from``. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint --from example.csv sort -nr index then head -n 3
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint --from example.csv \
|
||||
sort -nr index \
|
||||
then head -n 3 \
|
||||
then cut -f shape,quantity
|
||||
GENMD_EOF
|
||||
|
||||
## Sorts and stats
|
||||
|
||||
Now suppose you want to sort the data on a given column, *and then* take the top few in that ordering. You can use Miller's ``then`` feature to pipe commands together.
|
||||
|
||||
Here are the records with the top three ``index`` values:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint sort -nr index then head -n 3 example.csv
|
||||
GENMD_EOF
|
||||
|
||||
Lots of Miller commands take a ``-g`` option for group-by: here, ``head -n 1 -g shape`` outputs the first record for each distinct value of the ``shape`` field. Since the records are already sorted by descending ``index`` within each shape, this means we're finding the record with the highest ``index`` value for each distinct ``shape`` value:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint sort -f shape -nr index then head -n 1 -g shape example.csv
|
||||
GENMD_EOF
|
||||
|
||||
Statistics can be computed with or without group-by field(s):
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint --from example.csv \
|
||||
stats1 -a count,min,mean,max -f quantity -g shape
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --opprint --from example.csv \
|
||||
stats1 -a count,min,mean,max -f quantity -g shape,color
|
||||
GENMD_EOF
|
||||
|
||||
If your output has a lot of columns, you can use XTAB format to line things up vertically for you instead:
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --icsv --oxtab --from example.csv \
|
||||
stats1 -a p0,p10,p25,p50,p75,p90,p99,p100 -f rate
|
||||
GENMD_EOF
|
||||
|
||||
|
||||
## File formats and format conversion
|
||||
|
||||
Miller supports the following formats:
|
||||
|
||||
* CSV (comma-separated values)
|
||||
* TSV (tab-separated values)
|
||||
* JSON (JavaScript Object Notation)
|
||||
* PPRINT (pretty-printed tabular)
|
||||
* XTAB (vertical-tabular or sideways-tabular)
|
||||
* NIDX (numerically indexed, label-free, with implicit labels ``"1"``, ``"2"``, etc.)
|
||||
* DKVP (delimited key-value pairs).
|
||||
|
||||
What's a CSV file, really? It's an array of rows, or *records*, each being a list of key-value pairs, or *fields*: for CSV it so happens that all the keys are shared in the header line and the values vary from one data line to another.
|
||||
|
||||
For example, if you have:
|
||||
|
||||
GENMD_CARDIFY
|
||||
shape,flag,index
|
||||
circle,1,24
|
||||
square,0,36
|
||||
GENMD_EOF
|
||||
|
||||
then that's a way of saying:
|
||||
|
||||
GENMD_CARDIFY
|
||||
shape=circle,flag=1,index=24
|
||||
shape=square,flag=0,index=36
|
||||
GENMD_EOF
|
||||
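To make the header-plus-rows reading concrete, here is a minimal shell sketch that pairs each header name with the value in the same position. It uses plain ``awk`` rather than Miller, the function name ``csv_to_dkvp`` is made up for illustration, and it assumes simple CSV with no quoted or embedded commas:

```shell
# Hypothetical helper (not part of Miller): read simple CSV on stdin and
# write one key=value record per data line.
csv_to_dkvp() {
  awk -F, '
    # Header line: remember the field names by position.
    NR == 1 { for (i = 1; i <= NF; i++) key[i] = $i; next }
    # Data line: pair each value with the header name at the same position.
    {
      line = ""
      for (i = 1; i <= NF; i++)
        line = line (i > 1 ? "," : "") key[i] "=" $i
      print line
    }'
}

printf 'shape,flag,index\ncircle,1,24\nsquare,0,36\n' | csv_to_dkvp
```

The point is only that the header supplies the keys once and every data line supplies the values. Miller does this pairing for you and also handles the cases this sketch ignores.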
|
||||
Other ways to write the same data:
|
||||
|
||||
GENMD_CARDIFY
|
||||
CSV PPRINT
|
||||
shape,flag,index shape flag index
|
||||
circle,1,24 circle 1 24
|
||||
square,0,36 square 0 36
|
||||
|
||||
JSON XTAB
|
||||
{ shape circle
|
||||
"shape": "circle", flag 1
|
||||
"flag": 1, index 24
|
||||
"index": 24 .
|
||||
} shape square
|
||||
{ flag 0
|
||||
"shape": "square", index 36
|
||||
"flag": 0,
|
||||
"index": 36
|
||||
}
|
||||
|
||||
DKVP
|
||||
shape=circle,flag=1,index=24
|
||||
shape=square,flag=0,index=36
|
||||
GENMD_EOF
|
||||
|
||||
Anything we can do with CSV input, we can do with input in any other format. You can read one format, do any record processing, and write the output in the same format as the input or in a different one.
|
||||
|
||||
How to specify these to Miller:
|
||||
|
||||
* If you use ``--csv`` or ``--json`` or ``--pprint``, etc., then Miller will use that format for input and output.
|
||||
* If you use ``--icsv`` and ``--ojson`` (note the extra ``i`` and ``o``) then Miller will use CSV for input and JSON for output, etc. See also [Keystroke Savers](keystroke-savers.md) for even shorter options like ``--c2j``.
|
||||
|
||||
You can read more about this at the [File Formats](file-formats.md) page.
|
||||
|
||||
.. _10min-choices-for-printing-to-files:
|
||||
|
||||
## Choices for printing to files
|
||||
|
||||
Often we want to print output to the screen. Miller does this by default, as we've seen in the previous examples.
|
||||
|
||||
Sometimes, though, we want to print output to another file. Just use **> outputfilenamegoeshere** at the end of your command:
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr --icsv --opprint cat example.csv > newfile.csv
|
||||
# Output goes to the new file;
|
||||
# nothing is printed to the screen.
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
cat newfile.csv
|
||||
color shape flag index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
|
||||
Other times we just want our files to be **changed in-place**: just use **mlr -I**:
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
cp example.csv newfile.txt
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
cat newfile.txt
|
||||
color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr -I --csv sort -f shape newfile.txt
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
cat newfile.txt
|
||||
color,shape,flag,index,quantity,rate
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,square,false,48,77.5542,7.4670
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
|
||||
You can also use ``mlr -I`` to bulk-operate on many files at once, for example:
|
||||
|
||||
.. code-block:: none
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr -I --csv cut -x -f unwanted_column_name *.csv
|
||||
|
||||
Since in-place mode overwrites the input files, you may want to copy your original data somewhere else before doing in-place operations.
|
||||
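One possible way to take that copy is sketched below in plain shell. The function name ``backup_csvs`` and the directory name ``backup`` are hypothetical, and the sketch only covers ``*.csv`` files in the current directory:

```shell
# Hypothetical helper: copy every CSV file in the current directory into a
# backup directory (default name "backup") before modifying files in place.
backup_csvs() {
  dest=${1:-backup}
  mkdir -p "$dest" || return 1
  for f in *.csv; do
    [ -e "$f" ] || continue      # the glob matched nothing
    cp -p "$f" "$dest/" || return 1
  done
}

# Usage sketch: run the in-place command only if the backup succeeded.
# backup_csvs backup && mlr -I --csv cut -x -f unwanted_column_name *.csv
```

Chaining with ``&&`` means the in-place edit is skipped when the copy fails, so the originals are never overwritten without a backup in place.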
|
||||
Lastly, using ``tee`` within ``put``, you can split your input data into separate files, one per distinct value of a field (or of a combination of fields):
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
cat circle.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
cat square.csv
|
||||
GENMD_EOF
|
||||
|
||||
GENMD_RUN_COMMAND
|
||||
cat triangle.csv
|
||||
GENMD_EOF
|
||||
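For comparison, the same kind of per-value split can be sketched with plain ``awk``. This is not how Miller implements ``tee``. The function name ``split_by_column`` is hypothetical, the column is chosen by position rather than by name, and the sketch assumes simple CSV without quoted commas:

```shell
# Hypothetical helper: read simple CSV on stdin and write one file per
# distinct value of the given 1-based column, each with the header line.
split_by_column() {
  awk -F, -v col="$1" '
    NR == 1 { header = $0; next }          # keep the header for every output file
    {
      out = $col ".csv"                    # e.g. circle.csv, square.csv
      if (!(out in seen)) {                # first record for this value
        print header > out
        seen[out] = 1
      }
      print > out                          # write the record to its file
    }'
}

# Usage sketch: shape is the second column of example.csv.
# split_by_column 2 < example.csv
```

The Miller version is shorter and refers to the field by name, which keeps working if the column order changes.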
28
docs6b/docs/Makefile
Normal file
|
|
@ -0,0 +1,28 @@
|
|||
# Minimal makefile for Sphinx documentation
|
||||
#
|
||||
# Note: run this after make in the ../c directory and make in the ../man directory
|
||||
# since ../c/mlr is used to autogenerate ../man/manpage.txt which is used in this directory.
|
||||
# See also https://miller.readthedocs.io/en/latest/build.html#creating-a-new-release-for-developers
|
||||
|
||||
# You can set these variables from the command line, and also
|
||||
# from the environment for the first two.
|
||||
SPHINXOPTS ?=
|
||||
SPHINXBUILD ?= sphinx-build
|
||||
SOURCEDIR = .
|
||||
BUILDDIR = _build
|
||||
|
||||
# Respective MANPATH entries would include /usr/local/share/man or $HOME/man.
|
||||
INSTALLDIR=/usr/local/share/man/man1
|
||||
INSTALLHOME=$(HOME)/man/man1
|
||||
|
||||
# Put it first so that "make" without argument is like "make help".
|
||||
help:
|
||||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
|
||||
.PHONY: help Makefile
|
||||
|
||||
# Catch-all target: route all unknown targets to Sphinx using the new
|
||||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
|
||||
%: Makefile
|
||||
./genmds
|
||||
$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
40
docs6b/docs/README.md
Normal file
|
|
@ -0,0 +1,40 @@
|
|||
# Miller Sphinx docs
|
||||
|
||||
## Why use Sphinx
|
||||
|
||||
* Connects to https://miller.readthedocs.io so people can get their docmods onto the web instead of the self-hosted https://johnkerl.org/miller/doc. Thanks to @pabloab for the great advice!
|
||||
* More standard look and feel -- lots of people use readthedocs for other things so this should feel familiar
|
||||
* We get a Search feature for free
|
||||
|
||||
## Contributing
|
||||
|
||||
* You need `pip install sphinx` (or `pip3 install sphinx`)
|
||||
* The docs include lots of live code examples which will be invoked using `mlr` which must be somewhere in your `$PATH`
|
||||
* Clone https://github.com/johnkerl/miller and cd into `docs/` within your clone
|
||||
* Editing loop:
|
||||
* Edit `*.md.in`
|
||||
* Run `make html`
|
||||
* Either `open _build/html/index.html` (MacOS) or point your browser to `file:///path/to/your/clone/of/miller/docs/_build/html/index.html`
|
||||
* Submitting:
|
||||
* `git add` your modified files, `git commit`, `git push`, and submit a PR at https://github.com/johnkerl/miller
|
||||
* A nice markup reference: https://www.sphinx-doc.org/en/1.8/usage/restructuredtext/basics.html
|
||||
|
||||
## Notes
|
||||
|
||||
* CSS:
|
||||
* I used the Sphinx Classic theme which I like a lot except the colors -- it's a blue scheme and Miller has never been blue.
|
||||
* Files are in `docs/_static/*.css` where I marked my mods with `/* CHANGE ME */`.
|
||||
* If you modify the CSS you must run `make clean html` (not just `make html`) then reload in your browser.
|
||||
* Live code:
|
||||
* I didn't find a way to include non-Python live-code examples within Sphinx so I adapted the pre-Sphinx Miller-doc strategy which is to have a generator script read a template file (here, `foo.md.in`), run the marked lines, and generate the output file (`foo.md`).
|
||||
* Edit the `*.md.in` files, not `*.md` directly.
|
||||
* Within the `*.md.in` files are lines like `GENMD_RUN_COMMAND`. These will be run, and their output included, by `make html` which calls the `genmds` script for you.
|
||||
* readthedocs:
|
||||
* https://readthedocs.org/
|
||||
* https://readthedocs.org/projects/miller/
|
||||
* https://readthedocs.org/projects/miller/builds/
|
||||
* https://miller.readthedocs.io/en/latest/
|
||||
|
||||
## To do
|
||||
|
||||
* Let's all discuss if/how we want the v2 docs to be structured better than the v1 docs.
|
||||
BIN
docs6b/docs/_build/doctrees/10min.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/build.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/community.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/contact.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/contributing.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/cookbook.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/cookbook2.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/cookbook3.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/cookbook4.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/csv-with-and-without-headers.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/customization.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/data-cleaning-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/data-diving-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/data-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/data-sharing.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/dates-and-times.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/dkvp-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/environment.pickle
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/etymology.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/ex-fields-not-selected.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/ex-no-output-at-all.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/faq.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/feature-comparison.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/features.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/file-formats.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/foo.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/getting-started.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/index.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/install.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/installation.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/internationalization.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/introduction.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/joins.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/keystroke-savers.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/log-processing-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/manpage.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/miller-by-example.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/miller-on-windows.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/misc-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/new-in-miller-6.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/old-10min.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/operating-on-all-fields.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/originality.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/output-colorization.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/performance.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/programming-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/programming-language.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/quick-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/randomizing-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/record-heterogeneity.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-arrays.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-builtin-functions.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-complexity.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-complexlity.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-control-structures.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-errors.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-filter-statements.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-operators.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-output-statements.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-overview.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-syntax.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-tbf.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-unset-statements.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-user-defined-functions.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl-variables.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-dsl.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-arithmetic.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-auxiliary-commands.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-data-types.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-env-vars.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-io-options.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-null-data.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-online-help.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-overview.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-regular-expressions.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-main-then-chaining.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference-verbs.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/reference.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/release-docs.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/repl.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/shapes-of-data.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/shell-commands.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/special-symbols-and-formatting.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/sql-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/statistics-examples.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/then-chaining.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/two-pass-algorithms.doctree
vendored
Normal file
Binary file not shown.
BIN
docs6b/docs/_build/doctrees/why.doctree
vendored
Normal file
Binary file not shown.
4
docs6b/docs/_build/html/.buildinfo
vendored
Normal file
|
|
@ -0,0 +1,4 @@
|
|||
# Sphinx build info version 1
|
||||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
|
||||
config: 4993596e7a3406ca4625604594270a0e
|
||||
tags: 645f666f9bcd5a90fca523b33c5a78b7
|
||||
2
docs6b/docs/_build/html/10-1.sh
vendored
Normal file
|
|
@ -0,0 +1,2 @@
|
|||
grep op=cache log.txt \
|
||||
| mlr --idkvp --opprint stats1 -a mean -f hit -g type then sort -f type
|
||||
4
docs6b/docs/_build/html/10-2.sh
vendored
Normal file
|
|
@ -0,0 +1,4 @@
|
|||
mlr --from log.txt --opprint \
|
||||
filter 'is_present($batch_size)' \
|
||||
then step -a delta -f time,num_filtered \
|
||||
then sec2gmt time
|
||||
588
docs6b/docs/_build/html/10min.html
vendored
Normal file
|
|
@ -0,0 +1,588 @@
|
|||
|
||||
<!DOCTYPE html>
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||
<title>Miller in 10 minutes — Miller 6.0.0-alpha documentation</title>
|
||||
|
||||
<link rel="stylesheet" href="_static/scrolls.css" type="text/css" />
|
||||
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
|
||||
<link rel="stylesheet" href="_static/print.css" type="text/css" />
|
||||
|
||||
<script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
|
||||
<script src="_static/jquery.js"></script>
|
||||
<script src="_static/underscore.js"></script>
|
||||
<script src="_static/doctools.js"></script>
|
||||
<script src="_static/language_data.js"></script>
|
||||
<script src="_static/theme_extras.js"></script>
|
||||
<link rel="index" title="Index" href="genindex.html" />
|
||||
<link rel="search" title="Search" href="search.html" />
|
||||
<link rel="next" title="Keystroke-savers" href="keystroke-savers.html" />
|
||||
<link rel="prev" title="Introduction" href="introduction.html" />
|
||||
</head><body>
|
||||
<div id="content">
|
||||
<div class="header">
|
||||
<h1 class="heading"><a href="index.html"
|
||||
title="back to the documentation overview"><span>Miller in 10 minutes</span></a></h1>
|
||||
</div>
|
||||
<div class="relnav" role="navigation" aria-label="related navigation">
|
||||
<a href="introduction.html">« Introduction</a> |
|
||||
<a href="#">Miller in 10 minutes</a>
|
||||
| <a href="keystroke-savers.html">Keystroke-savers »</a>
|
||||
</div>
|
||||
<div id="contentwrapper">
|
||||
<div id="toc" role="navigation" aria-label="table of contents navigation">
|
||||
<h3>Table of Contents</h3>
|
||||
<ul>
|
||||
<li><a class="reference internal" href="#">Miller in 10 minutes</a><ul>
|
||||
<li><a class="reference internal" href="#obtaining-miller">Obtaining Miller</a></li>
|
||||
<li><a class="reference internal" href="#miller-verbs">Miller verbs</a></li>
|
||||
<li><a class="reference internal" href="#multiple-input-files">Multiple input files</a></li>
|
||||
<li><a class="reference internal" href="#chaining-verbs-together">Chaining verbs together</a></li>
|
||||
<li><a class="reference internal" href="#sorts-and-stats">Sorts and stats</a></li>
|
||||
<li><a class="reference internal" href="#file-formats-and-format-conversion">File formats and format conversion</a></li>
|
||||
<li><a class="reference internal" href="#choices-for-printing-to-files">Choices for printing to files</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
</div>
|
||||
<div role="main">
|
||||
|
||||
<div class="section" id="miller-in-10-minutes">
|
||||
<h1>Miller in 10 minutes<a class="headerlink" href="#miller-in-10-minutes" title="Permalink to this headline">¶</a></h1>
|
||||
<div class="section" id="obtaining-miller">
|
||||
<h2>Obtaining Miller<a class="headerlink" href="#obtaining-miller" title="Permalink to this headline">¶</a></h2>
|
||||
<p>You can install Miller for various platforms as follows:</p>
|
||||
<ul class="simple">
|
||||
<li><p>Linux: <code class="docutils literal notranslate"><span class="pre">yum</span> <span class="pre">install</span> <span class="pre">miller</span></code> or <code class="docutils literal notranslate"><span class="pre">apt-get</span> <span class="pre">install</span> <span class="pre">miller</span></code> depending on your flavor of Linux</p></li>
|
||||
<li><p>MacOS: <code class="docutils literal notranslate"><span class="pre">brew</span> <span class="pre">install</span> <span class="pre">miller</span></code> or <code class="docutils literal notranslate"><span class="pre">port</span> <span class="pre">install</span> <span class="pre">miller</span></code> depending on your preference of <a class="reference external" href="https://brew.sh">Homebrew</a> or <a class="reference external" href="https://macports.org">MacPorts</a>.</p></li>
|
||||
<li><p>Windows: <code class="docutils literal notranslate"><span class="pre">choco</span> <span class="pre">install</span> <span class="pre">miller</span></code> using <a class="reference external" href="https://chocolatey.org">Chocolatey</a>.</p></li>
|
||||
<li><p>You can get latest builds for Linux, MacOS, and Windows by visiting <a class="reference external" href="https://github.com/johnkerl/miller/actions">https://github.com/johnkerl/miller/actions</a>, selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)</p></li>
|
||||
<li><p>See also <a class="reference internal" href="build.html"><span class="doc">Building from source</span></a> if you prefer – in particular, if your platform’s package manager doesn’t have the latest release.</p></li>
|
||||
</ul>
|
||||
<p>As a first check, you should be able to run <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">--version</span></code> at your system’s command prompt and see something like the following:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --version
|
||||
</span> Miller v6.0.0-dev
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>As a second check, given <a class="reference external" href="./example.csv">example.csv</a>, you should be able to run the following:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv cat example.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint cat example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>If you run into issues on these checks, please check out the resources on the <a class="reference internal" href="community.html"><span class="doc">Community</span></a> page for help.</p>
|
||||
</div>
|
||||
<div class="section" id="miller-verbs">
|
||||
<h2>Miller verbs<a class="headerlink" href="#miller-verbs" title="Permalink to this headline">¶</a></h2>
|
||||
<p>Let’s take a quick look at some of the most useful Miller verbs – file-format-aware, name-index-empowered equivalents of standard system commands.</p>
|
||||
<p><code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">cat</span></code> is like system <code class="docutils literal notranslate"><span class="pre">cat</span></code> (or <code class="docutils literal notranslate"><span class="pre">type</span></code> on Windows) – it passes the data through unmodified:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv cat example.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>But <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">cat</span></code> can also do format conversion – for example, you can pretty-print in tabular format:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint cat example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<p><code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">head</span></code> and <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">tail</span></code> count records rather than lines. Whether you’re getting the first few records or the last few, the CSV header is included either way:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv head -n 4 example.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv tail -n 4 example.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --ojson tail -n 2 example.csv
|
||||
</span> {
|
||||
"color": "yellow",
|
||||
"shape": "circle",
|
||||
"flag": true,
|
||||
"index": 87,
|
||||
"quantity": 63.5058,
|
||||
"rate": 8.3350
|
||||
}
|
||||
{
|
||||
"color": "purple",
|
||||
"shape": "square",
|
||||
"flag": false,
|
||||
"index": 91,
|
||||
"quantity": 72.3735,
|
||||
"rate": 8.2430
|
||||
}
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>You can sort on a single field:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint sort -f shape example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
red circle true 16 13.8103 2.9010
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
red square true 15 79.2778 0.0130
|
||||
red square false 48 77.5542 7.4670
|
||||
red square false 64 77.1991 9.5310
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>Or, you can sort primarily alphabetically on one field, then secondarily numerically descending on another field, and so on:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint sort -f shape -nr index example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
red circle true 16 13.8103 2.9010
|
||||
purple square false 91 72.3735 8.2430
|
||||
red square false 64 77.1991 9.5310
|
||||
red square false 48 77.5542 7.4670
|
||||
red square true 15 79.2778 0.0130
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>If there are fields you don’t want to see in your data, you can use <code class="docutils literal notranslate"><span class="pre">cut</span></code> to keep only the ones you want, in the same order they appeared in the input data:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint cut -f flag,shape example.csv
|
||||
</span> shape flag
|
||||
triangle true
|
||||
square true
|
||||
circle true
|
||||
square false
|
||||
triangle false
|
||||
square false
|
||||
triangle false
|
||||
circle true
|
||||
circle true
|
||||
square false
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>You can also use <code class="docutils literal notranslate"><span class="pre">cut</span> <span class="pre">-o</span></code> to keep specified fields, but in your preferred order:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint cut -o -f flag,shape example.csv
|
||||
</span> flag shape
|
||||
true triangle
|
||||
true square
|
||||
true circle
|
||||
false square
|
||||
false triangle
|
||||
false square
|
||||
false triangle
|
||||
true circle
|
||||
true circle
|
||||
false square
|
||||
</pre></div>
|
||||
</div>
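<p>The difference between <code class="docutils literal notranslate"><span class="pre">cut</span> <span class="pre">-f</span></code> and <code class="docutils literal notranslate"><span class="pre">cut</span> <span class="pre">-o</span> <span class="pre">-f</span></code> can be sketched in a few lines of Python. This is only an illustration of the ordering rule described above, not Miller's implementation; the function name <code class="docutils literal notranslate"><span class="pre">cut</span></code> and its <code class="docutils literal notranslate"><span class="pre">ordered</span></code> parameter are hypothetical.</p>

```python
def cut(record, fields, ordered=False):
    """Keep only the named fields of one record (a dict).

    ordered=False mimics `cut -f`: kept fields stay in input order.
    ordered=True mimics `cut -o -f`: kept fields follow the order given.
    (Hypothetical helper for illustration; not Miller's code.)
    """
    if ordered:
        return {name: record[name] for name in fields if name in record}
    wanted = set(fields)
    return {name: value for name, value in record.items() if name in wanted}


record = {"color": "yellow", "shape": "triangle", "flag": "true", "index": "11"}

print(list(cut(record, ["flag", "shape"])))                # ['shape', 'flag']
print(list(cut(record, ["flag", "shape"], ordered=True)))  # ['flag', 'shape']
```

<p>The sketch relies on Python dictionaries preserving insertion order, which stands in for the field order of a record.</p>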
|
||||
<p>You can use <code class="docutils literal notranslate"><span class="pre">cut</span> <span class="pre">-x</span></code> to omit fields you don’t care about:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint cut -x -f flag,shape example.csv
|
||||
</span> color index quantity rate
|
||||
yellow 11 43.6498 9.8870
|
||||
red 15 79.2778 0.0130
|
||||
red 16 13.8103 2.9010
|
||||
red 48 77.5542 7.4670
|
||||
purple 51 81.2290 8.5910
|
||||
red 64 77.1991 9.5310
|
||||
purple 65 80.1405 5.8240
|
||||
yellow 73 63.9785 4.2370
|
||||
yellow 87 63.5058 8.3350
|
||||
purple 91 72.3735 8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>You can use <code class="docutils literal notranslate"><span class="pre">filter</span></code> to keep only records you care about:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint filter '$color == "red"' example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
red square false 64 77.1991 9.5310
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint filter '$color == "red" && $flag == true' example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>You can use <code class="docutils literal notranslate"><span class="pre">put</span></code> to create new fields which are computed from other fields:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint put '
|
||||
</span><span class="hll"> $ratio = $quantity / $rate;
|
||||
</span><span class="hll"> $color_shape = $color . "_" . $shape
|
||||
</span><span class="hll"> ' example.csv
|
||||
</span> color shape flag index quantity rate ratio color_shape
|
||||
yellow triangle true 11 43.6498 9.8870 4.414868008496004 yellow_triangle
|
||||
red square true 15 79.2778 0.0130 6098.292307692308 red_square
|
||||
red circle true 16 13.8103 2.9010 4.760530851430541 red_circle
|
||||
red square false 48 77.5542 7.4670 10.386259541984733 red_square
|
||||
purple triangle false 51 81.2290 8.5910 9.455127458968688 purple_triangle
|
||||
red square false 64 77.1991 9.5310 8.099790158430384 red_square
|
||||
purple triangle false 65 80.1405 5.8240 13.760388049450551 purple_triangle
|
||||
yellow circle true 73 63.9785 4.2370 15.09995279679018 yellow_circle
|
||||
yellow circle true 87 63.5058 8.3350 7.619172165566886 yellow_circle
|
||||
purple square false 91 72.3735 8.2430 8.779995147397793 purple_square
|
||||
</pre></div>
|
||||
</div>
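<p>For readers who think in terms of a general-purpose language, the <code class="docutils literal notranslate"><span class="pre">put</span></code> expression above behaves roughly like the following Python sketch applied to each record. The function name is hypothetical, and the sketch assumes field values arrive as strings that must be converted to numbers before dividing.</p>

```python
def put_ratio_and_label(record):
    """Return a copy of the record with two computed fields appended.

    Mirrors `$ratio = $quantity / $rate` and
    `$color_shape = $color . "_" . $shape` (illustrative only).
    """
    out = dict(record)
    out["ratio"] = float(record["quantity"]) / float(record["rate"])
    out["color_shape"] = record["color"] + "_" + record["shape"]
    return out


record = {
    "color": "yellow", "shape": "triangle", "flag": "true",
    "index": "11", "quantity": "43.6498", "rate": "9.8870",
}
new = put_ratio_and_label(record)
print(list(new))            # new fields appear after the existing ones
print(new["color_shape"])   # yellow_triangle
```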
|
||||
<p>Even though Miller’s main selling point is name-indexing, sometimes you really want to refer to a field name by its positional index. Use <code class="docutils literal notranslate"><span class="pre">$[[3]]</span></code> to access the name of field 3 or <code class="docutils literal notranslate"><span class="pre">$[[[3]]]</span></code> to access the value of field 3:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint put '$[[3]] = "NEW"' example.csv
|
||||
</span> color shape NEW index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint put '$[[[3]]] = "NEW"' example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
yellow triangle NEW 11 43.6498 9.8870
|
||||
red square NEW 15 79.2778 0.0130
|
||||
red circle NEW 16 13.8103 2.9010
|
||||
red square NEW 48 77.5542 7.4670
|
||||
purple triangle NEW 51 81.2290 8.5910
|
||||
red square NEW 64 77.1991 9.5310
|
||||
purple triangle NEW 65 80.1405 5.8240
|
||||
yellow circle NEW 73 63.9785 4.2370
|
||||
yellow circle NEW 87 63.5058 8.3350
|
||||
purple square NEW 91 72.3735 8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>You can find the full list of verbs at the <a class="reference internal" href="reference-verbs.html"><span class="doc">Reference: list of verbs</span></a> page.</p>
|
||||
</div>
|
||||
<div class="section" id="multiple-input-files">
|
||||
<h2>Multiple input files<a class="headerlink" href="#multiple-input-files" title="Permalink to this headline">¶</a></h2>
|
||||
<p>Miller takes all the files from the command line as an input stream. But it’s format-aware, so it doesn’t repeat CSV header lines. For example, with input files (<a class="reference external" href="data/a.csv">data/a.csv</a>) and (<a class="reference external" href="data/b.csv">data/b.csv</a>), the system <code class="docutils literal notranslate"><span class="pre">cat</span></code> command will repeat header lines:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/a.csv
|
||||
</span> a,b,c
|
||||
1,2,3
|
||||
4,5,6
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/b.csv
|
||||
</span> a,b,c
|
||||
7,8,9
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/a.csv data/b.csv
|
||||
</span> a,b,c
|
||||
1,2,3
|
||||
4,5,6
|
||||
a,b,c
|
||||
7,8,9
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>However, <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">cat</span></code> will not:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv cat data/a.csv data/b.csv
|
||||
</span> a,b,c
|
||||
1,2,3
|
||||
4,5,6
|
||||
7,8,9
|
||||
</pre></div>
|
||||
</div>
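<p>The contrast between text-level and record-level concatenation can be sketched in Python. The two strings below stand in for <code class="docutils literal notranslate"><span class="pre">data/a.csv</span></code> and <code class="docutils literal notranslate"><span class="pre">data/b.csv</span></code>; the helper names are hypothetical, and the sketch assumes both inputs share the same header.</p>

```python
import csv
import io

A_CSV = "a,b,c\n1,2,3\n4,5,6\n"
B_CSV = "a,b,c\n7,8,9\n"


def read_records(text):
    """Parse CSV text into a list of records (dicts keyed by header name)."""
    return list(csv.DictReader(io.StringIO(text)))


def write_csv(records):
    """Write records as CSV text, emitting the header once at the top."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Text-level concatenation repeats the header, like the system `cat`.
print(A_CSV + B_CSV, end="")

# Record-level concatenation keeps a single header, like `mlr --csv cat`.
print(write_csv(read_records(A_CSV) + read_records(B_CSV)), end="")
```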
|
||||
</div>
|
||||
<div class="section" id="chaining-verbs-together">
|
||||
<h2>Chaining verbs together<a class="headerlink" href="#chaining-verbs-together" title="Permalink to this headline">¶</a></h2>
|
||||
<p>Often we want to chain queries together – for example, sorting by a field and taking the top few values. We can do this using pipes:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv sort -nr index example.csv | mlr --icsv --opprint head -n 3
|
||||
</span> color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>This works fine – but Miller also lets you chain verbs together using the word <code class="docutils literal notranslate"><span class="pre">then</span></code>. Think of this as a Miller-internal pipe that lets you use fewer keystrokes:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint sort -nr index then head -n 3 example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>As another convenience, you can put the filename first using <code class="docutils literal notranslate"><span class="pre">--from</span></code>. When you’re interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint --from example.csv sort -nr index then head -n 3
|
||||
</span> color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint --from example.csv \
|
||||
</span><span class="hll"> sort -nr index \
|
||||
</span><span class="hll"> then head -n 3 \
|
||||
</span><span class="hll"> then cut -f shape,quantity
|
||||
</span> shape quantity
|
||||
square 72.3735
|
||||
circle 63.5058
|
||||
circle 63.9785
|
||||
</pre></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="sorts-and-stats">
|
||||
<h2>Sorts and stats<a class="headerlink" href="#sorts-and-stats" title="Permalink to this headline">¶</a></h2>
|
||||
<p>Now suppose you want to sort the data on a given column, <em>and then</em> take the top few in that ordering. You can use Miller’s <code class="docutils literal notranslate"><span class="pre">then</span></code> feature to pipe commands together.</p>
|
||||
<p>Here are the records with the top three <code class="docutils literal notranslate"><span class="pre">index</span></code> values:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint sort -nr index then head -n 3 example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
purple square false 91 72.3735 8.2430
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>Lots of Miller commands take a <code class="docutils literal notranslate"><span class="pre">-g</span></code> option for group-by: here, <code class="docutils literal notranslate"><span class="pre">head</span> <span class="pre">-n</span> <span class="pre">1</span> <span class="pre">-g</span> <span class="pre">shape</span></code> outputs the first record for each distinct value of the <code class="docutils literal notranslate"><span class="pre">shape</span></code> field. Because the records have already been sorted by descending <code class="docutils literal notranslate"><span class="pre">index</span></code> within each shape, this finds the record with the highest <code class="docutils literal notranslate"><span class="pre">index</span></code> value for each distinct <code class="docutils literal notranslate"><span class="pre">shape</span></code> value:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint sort -f shape -nr index then head -n 1 -g shape example.csv
|
||||
</span> color shape flag index quantity rate
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
</pre></div>
|
||||
</div>
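<p>The sort-then-head-per-group pattern above can be sketched in Python as follows. The helper names are hypothetical and the code is an illustration of the idea, not Miller's implementation; the data is the same <code class="docutils literal notranslate"><span class="pre">example.csv</span></code> content shown earlier.</p>

```python
import csv
import io

EXAMPLE_CSV = """\
color,shape,flag,index,quantity,rate
yellow,triangle,true,11,43.6498,9.8870
red,square,true,15,79.2778,0.0130
red,circle,true,16,13.8103,2.9010
red,square,false,48,77.5542,7.4670
purple,triangle,false,51,81.2290,8.5910
red,square,false,64,77.1991,9.5310
purple,triangle,false,65,80.1405,5.8240
yellow,circle,true,73,63.9785,4.2370
yellow,circle,true,87,63.5058,8.3350
purple,square,false,91,72.3735,8.2430
"""

records = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))


def sort_shape_then_index_desc(recs):
    """Like `sort -f shape -nr index`: shape ascending, index numerically descending."""
    return sorted(recs, key=lambda r: (r["shape"], -int(r["index"])))


def head_per_group(recs, n, group_field):
    """Like `head -n N -g FIELD`: keep the first n records seen per group value."""
    seen = {}
    kept = []
    for r in recs:
        count = seen.get(r[group_field], 0)
        if count >= n:
            continue
        seen[r[group_field]] = count + 1
        kept.append(r)
    return kept


top = head_per_group(sort_shape_then_index_desc(records), 1, "shape")
for r in top:
    print(r["shape"], r["index"])
# circle 87
# square 91
# triangle 65
```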
|
||||
<p>Statistics can be computed with or without group-by field(s):</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint --from example.csv \
|
||||
</span><span class="hll"> stats1 -a count,min,mean,max -f quantity -g shape
|
||||
</span> shape quantity_count quantity_min quantity_mean quantity_max
|
||||
triangle 3 43.6498 68.33976666666666 81.229
|
||||
square 4 72.3735 76.60114999999999 79.2778
|
||||
circle 3 13.8103 47.0982 63.9785
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint --from example.csv \
|
||||
</span><span class="hll"> stats1 -a count,min,mean,max -f quantity -g shape,color
|
||||
</span> shape color quantity_count quantity_min quantity_mean quantity_max
|
||||
triangle yellow 1 43.6498 43.6498 43.6498
|
||||
square red 3 77.1991 78.01036666666666 79.2778
|
||||
circle red 1 13.8103 13.8103 13.8103
|
||||
triangle purple 2 80.1405 80.68475000000001 81.229
|
||||
circle yellow 2 63.5058 63.742149999999995 63.9785
|
||||
square purple 1 72.3735 72.3735 72.3735
|
||||
</pre></div>
|
||||
</div>
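<p>As a rough mental model, grouped statistics amount to bucketing values by the group-by field and then summarizing each bucket. The Python sketch below reproduces the first <code class="docutils literal notranslate"><span class="pre">stats1</span></code> example (grouping <code class="docutils literal notranslate"><span class="pre">quantity</span></code> by <code class="docutils literal notranslate"><span class="pre">shape</span></code>); the function name is hypothetical and this is not how Miller is implemented.</p>

```python
import csv
import io
from collections import defaultdict

EXAMPLE_CSV = """\
color,shape,flag,index,quantity,rate
yellow,triangle,true,11,43.6498,9.8870
red,square,true,15,79.2778,0.0130
red,circle,true,16,13.8103,2.9010
red,square,false,48,77.5542,7.4670
purple,triangle,false,51,81.2290,8.5910
red,square,false,64,77.1991,9.5310
purple,triangle,false,65,80.1405,5.8240
yellow,circle,true,73,63.9785,4.2370
yellow,circle,true,87,63.5058,8.3350
purple,square,false,91,72.3735,8.2430
"""

records = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))


def stats_by_group(recs, value_field, group_field):
    """Count, min, mean and max of value_field per distinct group_field value.

    Groups are reported in order of first appearance in the input.
    """
    groups = defaultdict(list)
    for r in recs:
        groups[r[group_field]].append(float(r[value_field]))
    return {
        key: {
            "count": len(vals),
            "min": min(vals),
            "mean": sum(vals) / len(vals),
            "max": max(vals),
        }
        for key, vals in groups.items()
    }


stats = stats_by_group(records, "quantity", "shape")
for shape, s in stats.items():
    print(shape, s["count"], s["min"], s["mean"], s["max"])
```

<p>Grouping by two fields, as in the second example, would use a tuple such as <code class="docutils literal notranslate"><span class="pre">(shape,</span> <span class="pre">color)</span></code> as the bucket key instead of a single value.</p>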
|
||||
<p>If your output has a lot of columns, you can use XTAB format to line things up vertically for you instead:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --oxtab --from example.csv \
|
||||
</span><span class="hll"> stats1 -a p0,p10,p25,p50,p75,p90,p99,p100 -f rate
|
||||
</span> rate_p0 0.0130
|
||||
rate_p10 2.9010
|
||||
rate_p25 4.2370
|
||||
rate_p50 8.2430
|
||||
rate_p75 8.5910
|
||||
rate_p90 9.8870
|
||||
rate_p99 9.8870
|
||||
rate_p100 9.8870
|
||||
</pre></div>
|
||||
</div>
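<p>Note that <code class="docutils literal notranslate"><span class="pre">p90</span></code>, <code class="docutils literal notranslate"><span class="pre">p99</span></code>, and <code class="docutils literal notranslate"><span class="pre">p100</span></code> all report the maximum value in the output above. One rule that reproduces every value shown is a non-interpolated percentile: sort the values, then pick the element at index floor(p × n / 100), clamped to the last element. The Python sketch below uses that rule. It is an assumption inferred from the output, not a statement of how Miller computes percentiles, and the function name is hypothetical.</p>

```python
# The ten `rate` values from example.csv, in file order.
RATES = [9.8870, 0.0130, 2.9010, 7.4670, 8.5910,
         9.5310, 5.8240, 4.2370, 8.3350, 8.2430]


def percentile(values, p):
    """Non-interpolated percentile for an integer p in 0..100.

    Picks the sorted value at index floor(p * n / 100), clamped to the
    last element. Assumed rule: chosen because it matches the output shown.
    """
    ordered = sorted(values)
    n = len(ordered)
    index = min((p * n) // 100, n - 1)  # integer arithmetic avoids float rounding
    return ordered[index]


for p in (0, 10, 25, 50, 75, 90, 99, 100):
    print(f"rate_p{p}", percentile(RATES, p))
```

<p>With only ten values, indices 9 (for p90 and p99) and the clamped index for p100 all land on the largest element, which is why the last three rows coincide.</p>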
|
||||
</div>
|
||||
<div class="section" id="file-formats-and-format-conversion">
|
||||
<h2>File formats and format conversion<a class="headerlink" href="#file-formats-and-format-conversion" title="Permalink to this headline">¶</a></h2>
|
||||
<p>Miller supports the following formats:</p>
|
||||
<ul class="simple">
|
||||
<li><p>CSV (comma-separated values)</p></li>
|
||||
<li><p>TSV (tab-separated values)</p></li>
|
||||
<li><p>JSON (JavaScript Object Notation)</p></li>
|
||||
<li><p>PPRINT (pretty-printed tabular)</p></li>
|
||||
<li><p>XTAB (vertical-tabular or sideways-tabular)</p></li>
|
||||
<li><p>NIDX (numerically indexed, label-free, with implicit labels <code class="docutils literal notranslate"><span class="pre">"1"</span></code>, <code class="docutils literal notranslate"><span class="pre">"2"</span></code>, etc.)</p></li>
|
||||
<li><p>DKVP (delimited key-value pairs).</p></li>
|
||||
</ul>
|
||||
<p>What’s a CSV file, really? It’s an array of rows, or <em>records</em>, each being a list of key-value pairs, or <em>fields</em>: for CSV it so happens that all the keys are shared in the header line and the values vary from one data line to another.</p>
|
||||
<p>For example, if you have:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span>shape,flag,index
|
||||
circle,1,24
|
||||
square,0,36
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>then that’s a way of saying:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span>shape=circle,flag=1,index=24
|
||||
shape=square,flag=0,index=36
|
||||
</pre></div>
|
||||
</div>
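<p>The same idea in Python terms: a CSV file can be read as a list of records, each a mapping from field name to value, and then written back out as key-value pairs. The sketch below uses only the standard library; the helper names are hypothetical.</p>

```python
import csv
import io

CSV_TEXT = "shape,flag,index\ncircle,1,24\nsquare,0,36\n"


def csv_to_records(text):
    """Read CSV text as a list of records: one dict of field name to value per data line."""
    return list(csv.DictReader(io.StringIO(text)))


def to_dkvp(records):
    """Render each record as comma-delimited key=value pairs."""
    return [",".join(f"{key}={value}" for key, value in r.items()) for r in records]


records = csv_to_records(CSV_TEXT)
for line in to_dkvp(records):
    print(line)
# shape=circle,flag=1,index=24
# shape=square,flag=0,index=36
```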
|
||||
<p>Other ways to write the same data:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span>CSV PPRINT
|
||||
shape,flag,index shape flag index
|
||||
circle,1,24 circle 1 24
|
||||
square,0,36 square 0 36
|
||||
|
||||
JSON XTAB
|
||||
{ shape circle
|
||||
"shape": "circle", flag 1
|
||||
"flag": 1, index 24
|
||||
"index": 24 .
|
||||
} shape square
|
||||
{ flag 0
|
||||
"shape": "square", index 36
|
||||
"flag": 0,
|
||||
"index": 36
|
||||
}
|
||||
|
||||
DKVP
|
||||
shape=circle,flag=1,index=24
|
||||
shape=square,flag=0,index=36
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>Anything we can do with CSV input data, we can do with input data in any other format. You can read from one format, do any record-processing, and write to the same format as the input or to a different one.</p>
|
||||
<p>How to specify these to Miller:</p>
|
||||
<ul class="simple">
|
||||
<li><p>If you use <code class="docutils literal notranslate"><span class="pre">--csv</span></code> or <code class="docutils literal notranslate"><span class="pre">--json</span></code> or <code class="docutils literal notranslate"><span class="pre">--pprint</span></code>, etc., then Miller will use that format for input and output.</p></li>
|
||||
<li><p>If you use <code class="docutils literal notranslate"><span class="pre">--icsv</span></code> and <code class="docutils literal notranslate"><span class="pre">--ojson</span></code> (note the extra <code class="docutils literal notranslate"><span class="pre">i</span></code> and <code class="docutils literal notranslate"><span class="pre">o</span></code>) then Miller will use CSV for input and JSON for output, etc. See also <a class="reference internal" href="keystroke-savers.html"><span class="doc">Keystroke-savers</span></a> for even shorter options like <code class="docutils literal notranslate"><span class="pre">--c2j</span></code>.</p></li>
|
||||
</ul>
|
||||
<p>You can read more about this at the <a class="reference internal" href="file-formats.html"><span class="doc">File formats</span></a> page.</p>
|
||||
</div>
|
||||
<div class="section" id="choices-for-printing-to-files">
|
||||
<span id="min-choices-for-printing-to-files"></span><h2>Choices for printing to files<a class="headerlink" href="#choices-for-printing-to-files" title="Permalink to this headline">¶</a></h2>
|
||||
<p>Often we want to print output to the screen. Miller does this by default, as we’ve seen in the previous examples.</p>
|
||||
<p>Sometimes, though, we want to print output to another file. Just use <strong>> outputfilenamegoeshere</strong> at the end of your command:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint cat example.csv > newfile.csv
|
||||
</span> # Output goes to the new file;
|
||||
# nothing is printed to the screen.
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat newfile.csv
|
||||
</span> color shape flag index quantity rate
|
||||
yellow triangle true 11 43.6498 9.8870
|
||||
red square true 15 79.2778 0.0130
|
||||
red circle true 16 13.8103 2.9010
|
||||
red square false 48 77.5542 7.4670
|
||||
purple triangle false 51 81.2290 8.5910
|
||||
red square false 64 77.1991 9.5310
|
||||
purple triangle false 65 80.1405 5.8240
|
||||
yellow circle true 73 63.9785 4.2370
|
||||
yellow circle true 87 63.5058 8.3350
|
||||
purple square false 91 72.3735 8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>Other times we just want our files to be <strong>changed in-place</strong>: just use <strong>mlr -I</strong>:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cp example.csv newfile.txt
|
||||
</span></pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat newfile.txt
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
red,square,false,48,77.5542,7.4670
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr -I --csv sort -f shape newfile.txt
|
||||
</span></pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat newfile.txt
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,square,false,48,77.5542,7.4670
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
</pre></div>
|
||||
</div>
|
||||
<p>You can also use <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">-I</span></code> to bulk-operate on many files at once, for example:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr -I --csv cut -x -f unwanted_column_name *.csv
|
||||
</span></pre></div>
|
||||
</div>
|
||||
<p>In-place mode overwrites its input files, so you may want to copy your original data somewhere else before doing in-place operations.</p>
|
||||
<p>Lastly, using <code class="docutils literal notranslate"><span class="pre">tee</span></code> within <code class="docutils literal notranslate"><span class="pre">put</span></code>, you can split your input data into separate files, one per distinct value of one or more fields:</p>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'
|
||||
</span></pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat circle.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
red,circle,true,16,13.8103,2.9010
|
||||
yellow,circle,true,73,63.9785,4.2370
|
||||
yellow,circle,true,87,63.5058,8.3350
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat square.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
red,square,true,15,79.2778,0.0130
|
||||
red,square,false,48,77.5542,7.4670
|
||||
red,square,false,64,77.1991,9.5310
|
||||
purple,square,false,91,72.3735,8.2430
|
||||
</pre></div>
|
||||
</div>
|
||||
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat triangle.csv
|
||||
</span> color,shape,flag,index,quantity,rate
|
||||
yellow,triangle,true,11,43.6498,9.8870
|
||||
purple,triangle,false,51,81.2290,8.5910
|
||||
purple,triangle,false,65,80.1405,5.8240
|
||||
</pre></div>
|
||||
</div>
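<p>The splitting behavior shown above can be sketched in Python as grouping records by a field and giving each group its own CSV output with its own header. To stay self-contained, the sketch collects each output as a string keyed by filename rather than writing to disk; the function name is hypothetical and this is not Miller's implementation.</p>

```python
import csv
import io

EXAMPLE_CSV = """\
color,shape,flag,index,quantity,rate
yellow,triangle,true,11,43.6498,9.8870
red,square,true,15,79.2778,0.0130
red,circle,true,16,13.8103,2.9010
red,square,false,48,77.5542,7.4670
purple,triangle,false,51,81.2290,8.5910
red,square,false,64,77.1991,9.5310
purple,triangle,false,65,80.1405,5.8240
yellow,circle,true,73,63.9785,4.2370
yellow,circle,true,87,63.5058,8.3350
purple,square,false,91,72.3735,8.2430
"""

records = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))


def split_by_field(recs, field):
    """Return {filename: csv_text}, one entry per distinct value of `field`.

    Filenames follow the pattern used above: the field value plus ".csv".
    """
    buffers = {}
    writers = {}
    for r in recs:
        name = r[field] + ".csv"
        if name not in writers:
            buffers[name] = io.StringIO()
            writers[name] = csv.DictWriter(
                buffers[name], fieldnames=list(r), lineterminator="\n"
            )
            writers[name].writeheader()  # each output gets its own header
        writers[name].writerow(r)
    return {name: buf.getvalue() for name, buf in buffers.items()}


files = split_by_field(records, "shape")
print(sorted(files))
print(files["circle.csv"], end="")
```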
|
||||
</div>
|
||||
</div>
|
||||
|
||||
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="footer" role="contentinfo">
|
||||
© Copyright 2021, John Kerl.
|
||||
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 3.2.1.
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||