Mirror of https://github.com/johnkerl/miller.git (synced 2026-01-23 02:14:13 +00:00)

Commit c7556cda26 (parent 5da252172b): Attempt to unbreak readthedocs build

29 changed files with 201 additions and 201 deletions
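The change is the same one-line edit repeated throughout the Sphinx sources: each bare ``.. code-block::`` directive gains an explicit ``bash`` lexer, and one commented-out line in the docs Makefile is tweaked. A representative before/after, taken from the hunks below:

    -.. code-block::
    +.. code-block:: bash
         :emphasize-lines: 1,1

         % mlr --icsv --opprint cat example.csv > newfile.csv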
@@ -344,14 +344,14 @@ Often we want to print output to the screen. Miller does this by default, as we'

Sometimes we want to print output to another file: just use **> outputfilenamegoeshere** at the end of your command:

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % mlr --icsv --opprint cat example.csv > newfile.csv
    # Output goes to the new file;
    # nothing is printed to the screen.

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cat newfile.csv

@@ -369,12 +369,12 @@ Sometimes we want to print output to another file: just use **> outputfilenamego

Other times we just want our files to be **changed in-place**: just use **mlr -I**:

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cp example.csv newfile.txt

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cat newfile.txt

@@ -390,12 +390,12 @@ Other times we just want our files to be **changed in-place**: just use **mlr -I

    yellow,circle,1,87,63.5058,8.3350
    purple,square,0,91,72.3735,8.2430

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % mlr -I --icsv --opprint cat newfile.txt

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cat newfile.txt

@@ -413,7 +413,7 @@ Other times we just want our files to be **changed in-place**: just use **mlr -I

Also using ``mlr -I`` you can bulk-operate on lots of files: e.g.:

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    mlr -I --csv cut -x -f unwanted_column_name *.csv

@@ -462,7 +462,7 @@ What's a CSV file, really? It's an array of rows, or *records*, each being a lis

For example, if you have:

-.. code-block::
+.. code-block:: bash

    shape,flag,index
    circle,1,24

@@ -470,7 +470,7 @@ For example, if you have:

then that's a way of saying:

-.. code-block::
+.. code-block:: bash

    shape=circle,flag=1,index=24
    shape=square,flag=0,index=36

@@ -479,7 +479,7 @@ Data written this way are called **DKVP**, for *delimited key-value pairs*.

We've also already seen other ways to write the same data:

-.. code-block::
+.. code-block:: bash

    CSV              PPRINT                 JSON
    shape,flag,index shape flag index      [

@@ -97,14 +97,14 @@ Often we want to print output to the screen. Miller does this by default, as we'

Sometimes we want to print output to another file: just use **> outputfilenamegoeshere** at the end of your command:

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % mlr --icsv --opprint cat example.csv > newfile.csv
    # Output goes to the new file;
    # nothing is printed to the screen.

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cat newfile.csv

@@ -122,12 +122,12 @@ Sometimes we want to print output to another file: just use **> outputfilenamego

Other times we just want our files to be **changed in-place**: just use **mlr -I**:

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cp example.csv newfile.txt

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cat newfile.txt

@@ -143,12 +143,12 @@ Other times we just want our files to be **changed in-place**: just use **mlr -I

    yellow,circle,1,87,63.5058,8.3350
    purple,square,0,91,72.3735,8.2430

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % mlr -I --icsv --opprint cat newfile.txt

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    % cat newfile.txt

@@ -166,7 +166,7 @@ Other times we just want our files to be **changed in-place**: just use **mlr -I

Also using ``mlr -I`` you can bulk-operate on lots of files: e.g.:

-.. code-block::
+.. code-block:: bash
    :emphasize-lines: 1,1

    mlr -I --csv cut -x -f unwanted_column_name *.csv

@@ -190,7 +190,7 @@ What's a CSV file, really? It's an array of rows, or *records*, each being a lis

For example, if you have:

-.. code-block::
+.. code-block:: bash

    shape,flag,index
    circle,1,24

@@ -198,7 +198,7 @@ For example, if you have:

then that's a way of saying:

-.. code-block::
+.. code-block:: bash

    shape=circle,flag=1,index=24
    shape=square,flag=0,index=36

@@ -207,7 +207,7 @@ Data written this way are called **DKVP**, for *delimited key-value pairs*.

We've also already seen other ways to write the same data:

-.. code-block::
+.. code-block:: bash

    CSV              PPRINT                 JSON
    shape,flag,index shape flag index      [

@@ -24,5 +24,5 @@ help:

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
-##### temp test ./genrst
+#### temp ./genrst
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

@ -72,7 +72,7 @@ Miller has been built on Windows using MSYS2: http://www.msys2.org/. You can in
|
|||
|
||||
You will first need to install MSYS2: http://www.msys2.org/. Then, start an MSYS2 shell, e.g. (supposing you installed MSYS2 to ``C:\msys2\``) run ``C:\msys2\mingw64.exe``. Within the MSYS2 shell, you can run the following to install dependent packages:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
pacman -Syu
|
||||
pacman -Su
|
||||
|
|
@ -90,13 +90,13 @@ There is a unit-test false-negative issue involving the semantics of the ``mkste
|
|||
|
||||
Within MSYS2 you can run ``mlr``: simply copy it from the ``c`` subdirectory to your desired location somewhere within your MSYS2 ``$PATH``. To run ``mlr`` outside of MSYS2, just as with precompiled binaries as described above, you'll need ``msys-2.0.dll``. One way to do this is to augment your path:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
C:\> set PATH=%PATH%;\msys64\mingw64\bin
|
||||
|
||||
Another way to do it is to copy the Miller executable and the DLL to the same directory:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
C:\> mkdir \mbin
|
||||
C:\> copy \msys64\mingw64\bin\msys-2.0.dll \mbin
|
||||
|
|
@ -181,7 +181,7 @@ In this example I am using version 3.4.0; of course that will change for subsequ
|
|||
* Similarly for ``macports``: https://github.com/macports/macports-ports/blob/master/textproc/miller/Portfile.
|
||||
* Social-media updates.
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
git remote add upstream https://github.com/Homebrew/homebrew-core # one-time setup only
|
||||
git fetch upstream
|
||||
|
|
|
|||
|
|
@ -69,7 +69,7 @@ Miller has been built on Windows using MSYS2: http://www.msys2.org/. You can in
|
|||
|
||||
You will first need to install MSYS2: http://www.msys2.org/. Then, start an MSYS2 shell, e.g. (supposing you installed MSYS2 to ``C:\msys2\``) run ``C:\msys2\mingw64.exe``. Within the MSYS2 shell, you can run the following to install dependent packages:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
pacman -Syu
|
||||
pacman -Su
|
||||
|
|
@ -87,13 +87,13 @@ There is a unit-test false-negative issue involving the semantics of the ``mkste
|
|||
|
||||
Within MSYS2 you can run ``mlr``: simply copy it from the ``c`` subdirectory to your desired location somewhere within your MSYS2 ``$PATH``. To run ``mlr`` outside of MSYS2, just as with precompiled binaries as described above, you'll need ``msys-2.0.dll``. One way to do this is to augment your path:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
C:\> set PATH=%PATH%;\msys64\mingw64\bin
|
||||
|
||||
Another way to do it is to copy the Miller executable and the DLL to the same directory:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
C:\> mkdir \mbin
|
||||
C:\> copy \msys64\mingw64\bin\msys-2.0.dll \mbin
|
||||
|
|
@ -178,7 +178,7 @@ In this example I am using version 3.4.0; of course that will change for subsequ
|
|||
* Similarly for ``macports``: https://github.com/macports/macports-ports/blob/master/textproc/miller/Portfile.
|
||||
* Social-media updates.
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
git remote add upstream https://github.com/Homebrew/homebrew-core # one-time setup only
|
||||
git fetch upstream
|
||||
|
|
|
|||
|
|
@ -1080,13 +1080,13 @@ Parsing log-file output
|
|||
|
||||
This, of course, depends highly on what's in your log files. But, as an example, suppose you have log-file lines such as
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
2015-10-08 08:29:09,445 INFO com.company.path.to.ClassName @ [sometext] various/sorts/of data {& punctuation} hits=1 status=0 time=2.378
|
||||
|
||||
I prefer to pre-filter with ``grep`` and/or ``sed`` to extract the structured text, then hand that to Miller. Example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
grep 'various sorts' *.log | sed 's/.*} //' | mlr --fs space --repifs --oxtab stats1 -a min,p10,p50,p90,max -f time -g status
|
||||
|
||||
|
|
@ -1118,7 +1118,7 @@ The recursive function for the Fibonacci sequence is famous for its computationa
|
|||
|
||||
produces output like this:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
i o fcount seconds_delta
|
||||
1 1 1 0
|
||||
|
|
@ -1175,7 +1175,7 @@ Note that the time it takes to evaluate the function is blowing up exponentially
|
|||
|
||||
with output like this:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
i o fcount seconds_delta
|
||||
1 1 1 0
|
||||
|
|
|
|||
|
|
@ -323,13 +323,13 @@ Parsing log-file output
|
|||
|
||||
This, of course, depends highly on what's in your log files. But, as an example, suppose you have log-file lines such as
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
2015-10-08 08:29:09,445 INFO com.company.path.to.ClassName @ [sometext] various/sorts/of data {& punctuation} hits=1 status=0 time=2.378
|
||||
|
||||
I prefer to pre-filter with ``grep`` and/or ``sed`` to extract the structured text, then hand that to Miller. Example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
grep 'various sorts' *.log | sed 's/.*} //' | mlr --fs space --repifs --oxtab stats1 -a min,p10,p50,p90,max -f time -g status
|
||||
|
||||
|
|
@ -344,7 +344,7 @@ POKI_INCLUDE_ESCAPED(data/fibo-uncached.sh)HERE
|
|||
|
||||
produces output like this:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
i o fcount seconds_delta
|
||||
1 1 1 0
|
||||
|
|
@ -382,7 +382,7 @@ POKI_INCLUDE_ESCAPED(data/fibo-cached.sh)HERE
|
|||
|
||||
with output like this:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
i o fcount seconds_delta
|
||||
1 1 1 0
|
||||
|
|
|
|||
|
|
@ -9,7 +9,7 @@ Randomly selecting words from a list
|
|||
|
||||
Given this `word list <https://github.com/johnkerl/miller/blob/master/docs/data/english-words.txt>`_, first take a look to see what the first few lines look like:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ head data/english-words.txt
|
||||
|
|
@ -26,7 +26,7 @@ Given this `word list <https://github.com/johnkerl/miller/blob/master/docs/data/
|
|||
|
||||
Then the following will randomly sample ten words with four to eight characters in them:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --from data/english-words.txt --nidx filter -S 'n=strlen($1);4<=n&&n<=8' then sample -k 10
|
||||
|
|
@ -48,7 +48,7 @@ These are simple *n*-grams as `described here <http://johnkerl.org/randspell/ran
|
|||
|
||||
The idea is that words from the input file are consumed, then taken apart and pasted back together in ways which imitate the letter-to-letter transitions found in the word list -- giving us automatically generated words in the same vein as *bromance* and *spork*:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --nidx --from ./ngrams/gsl-2000.txt put -q -f ./ngrams/ngfuncs.mlr -f ./ngrams/ng5.mlr
|
||||
|
|
@ -526,7 +526,7 @@ At standard resolution this makes a nice little ASCII plot:
|
|||
|
||||
But using a very small font size (as small as my Mac will let me go), and by choosing the coordinates to zoom in on a particular part of the complex plane, we can get a nice little picture:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
#!/bin/bash
|
||||
# Get the number of rows and columns from the terminal window dimensions
|
||||
|
|
|
|||
|
|
@ -6,7 +6,7 @@ Randomly selecting words from a list
|
|||
|
||||
Given this `word list <https://github.com/johnkerl/miller/blob/master/docs/data/english-words.txt>`_, first take a look to see what the first few lines look like:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ head data/english-words.txt
|
||||
|
|
@ -23,7 +23,7 @@ Given this `word list <https://github.com/johnkerl/miller/blob/master/docs/data/
|
|||
|
||||
Then the following will randomly sample ten words with four to eight characters in them:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --from data/english-words.txt --nidx filter -S 'n=strlen($1);4<=n&&n<=8' then sample -k 10
|
||||
|
|
@ -45,7 +45,7 @@ These are simple *n*-grams as `described here <http://johnkerl.org/randspell/ran
|
|||
|
||||
The idea is that words from the input file are consumed, then taken apart and pasted back together in ways which imitate the letter-to-letter transitions found in the word list -- giving us automatically generated words in the same vein as *bromance* and *spork*:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --nidx --from ./ngrams/gsl-2000.txt put -q -f ./ngrams/ngfuncs.mlr -f ./ngrams/ng5.mlr
|
||||
|
|
@ -135,7 +135,7 @@ POKI_RUN_COMMAND{{mlr -n put -f ./programs/mand.mlr}}HERE
|
|||
|
||||
But using a very small font size (as small as my Mac will let me go), and by choosing the coordinates to zoom in on a particular part of the complex plane, we can get a nice little picture:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
#!/bin/bash
|
||||
# Get the number of rows and columns from the terminal window dimensions
|
||||
|
|
|
|||
|
|
@ -9,30 +9,30 @@ How to use .mlrrc
|
|||
|
||||
Suppose you always use CSV files. Then instead of always having to type ``--csv`` as in
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr --csv cut -x -f extra mydata.csv
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr --csv sort -n id mydata.csv
|
||||
|
||||
and so on, you can instead put the following into your ``$HOME/.mlrrc``:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--csv
|
||||
|
||||
Then you can just type things like
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr cut -x -f extra mydata.csv
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr sort -n id mydata.csv
|
||||
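As a hedged aside on the .mlrrc hunk above: the same mechanism should accept more than one flag, one per line, so a file such as the following (hypothetical contents, not part of this commit) would make CSV input and pretty-printed output the defaults:

    # $HOME/.mlrrc
    --icsv
    --opprint

after which a bare ``mlr cut -x -f extra mydata.csv`` reads CSV and emits PPRINT with no format flags on the command line.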
|
|
|
|||
|
|
@ -6,30 +6,30 @@ How to use .mlrrc
|
|||
|
||||
Suppose you always use CSV files. Then instead of always having to type ``--csv`` as in
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr --csv cut -x -f extra mydata.csv
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr --csv sort -n id mydata.csv
|
||||
|
||||
and so on, you can instead put the following into your ``$HOME/.mlrrc``:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--csv
|
||||
|
||||
Then you can just type things like
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr cut -x -f extra mydata.csv
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mlr sort -n id mydata.csv
|
||||
|
|
|
|||
|
|
@ -307,7 +307,7 @@ Note that running a subprocess on every record takes a non-trivial amount of tim
|
|||
..
|
||||
hard-coded, not live-code, since %N doesn't exist on all platforms
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint put '$t=system("date +%s.%N")' then step -a delta -f t data/small
|
||||
a b i x y t t_delta
|
||||
|
|
@ -317,7 +317,7 @@ Note that running a subprocess on every record takes a non-trivial amount of tim
|
|||
eks wye 4 0.38139939387114097 0.13418874328430463 1568774318.516547441 0.000929
|
||||
wye pan 5 0.5732889198020006 0.8636244699032729 1568774318.517518828 0.000971
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint put '$t=systime()' then step -a delta -f t data/small
|
||||
a b i x y t t_delta
|
||||
|
|
|
|||
|
|
@ -67,7 +67,7 @@ Note that running a subprocess on every record takes a non-trivial amount of tim
|
|||
..
|
||||
hard-coded, not live-code, since %N doesn't exist on all platforms
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint put '$t=system("date +%s.%N")' then step -a delta -f t data/small
|
||||
a b i x y t t_delta
|
||||
|
|
@ -77,7 +77,7 @@ Note that running a subprocess on every record takes a non-trivial amount of tim
|
|||
eks wye 4 0.38139939387114097 0.13418874328430463 1568774318.516547441 0.000929
|
||||
wye pan 5 0.5732889198020006 0.8636244699032729 1568774318.517518828 0.000971
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint put '$t=systime()' then step -a delta -f t data/small
|
||||
a b i x y t t_delta
|
||||
|
|
|
|||
|
|
@ -670,13 +670,13 @@ XML, JSON, etc. are, by contrast, all **recursive** or **nested** data structure
|
|||
|
||||
Now, you can put tabular data into these formats -- since list-of-key-value-pairs is one of the things representable in XML or JSON. Example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# DKVP
|
||||
x=1,y=2
|
||||
z=3
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# XML
|
||||
<table>
|
||||
|
|
@ -695,7 +695,7 @@ Now, you can put tabular data into these formats -- since list-of-key-value-pair
|
|||
</record>
|
||||
</table>
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# JSON
|
||||
[{"x":1,"y":2},{"z":3}]
|
||||
|
|
|
|||
|
|
@ -278,13 +278,13 @@ XML, JSON, etc. are, by contrast, all **recursive** or **nested** data structure
|
|||
|
||||
Now, you can put tabular data into these formats -- since list-of-key-value-pairs is one of the things representable in XML or JSON. Example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# DKVP
|
||||
x=1,y=2
|
||||
z=3
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# XML
|
||||
<table>
|
||||
|
|
@ -303,7 +303,7 @@ Now, you can put tabular data into these formats -- since list-of-key-value-pair
|
|||
</record>
|
||||
</table>
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# JSON
|
||||
[{"x":1,"y":2},{"z":3}]
|
||||
|
|
|
|||
|
|
@ -130,21 +130,21 @@ Miller's default file format is DKVP, for **delimited key-value pairs**. Example
|
|||
|
||||
Such data are easy to generate, e.g. in Ruby with
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
puts "host=#{hostname},seconds=#{t2-t1},message=#{msg}"
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
puts mymap.collect{|k,v| "#{k}=#{v}"}.join(',')
|
||||
|
||||
or ``print`` statements in various languages, e.g.
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
echo "type=3,user=$USER,date=$date\n";
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
logger.log("type=3,user=$USER,date=$date\n");
|
||||
|
||||
|
|
@ -152,7 +152,7 @@ Fields lacking an IPS will have positional index (starting at 1) used as the key
|
|||
|
||||
As discussed in :doc:`record-heterogeneity`, Miller handles changes of field names within the same data stream. But using DKVP format this is particularly natural. One of my favorite use-cases for Miller is in application/server logs, where I log all sorts of lines such as
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
resource=/path/to/file,loadsec=0.45,ok=true
|
||||
record_count=100, resource=/path/to/file
|
||||
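The hunk header just above notes that DKVP fields lacking an IPS get their positional index (starting at 1) as the key. A quick sketch of what that looks like (illustrative only, not part of this commit; exact output may vary by Miller version):

    $ echo 'a,b=2,c=3' | mlr cat
    1=a,b=2,c=3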
|
|
|
|||
|
|
@ -57,21 +57,21 @@ POKI_RUN_COMMAND{{mlr cat data/small}}HERE
|
|||
|
||||
Such data are easy to generate, e.g. in Ruby with
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
puts "host=#{hostname},seconds=#{t2-t1},message=#{msg}"
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
puts mymap.collect{|k,v| "#{k}=#{v}"}.join(',')
|
||||
|
||||
or ``print`` statements in various languages, e.g.
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
echo "type=3,user=$USER,date=$date\n";
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
logger.log("type=3,user=$USER,date=$date\n");
|
||||
|
||||
|
|
@ -79,7 +79,7 @@ Fields lacking an IPS will have positional index (starting at 1) used as the key
|
|||
|
||||
As discussed in :doc:`record-heterogeneity`, Miller handles changes of field names within the same data stream. But using DKVP format this is particularly natural. One of my favorite use-cases for Miller is in application/server logs, where I log all sorts of lines such as
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
resource=/path/to/file,loadsec=0.45,ok=true
|
||||
record_count=100, resource=/path/to/file
|
||||
|
|
|
|||
|
|
@ -9,38 +9,38 @@ Prebuilt executables via package managers
|
|||
|
||||
`Homebrew <https://brew.sh/>`_ installation support for OSX is available via
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
brew update && brew install miller
|
||||
|
||||
...and also via `MacPorts <https://www.macports.org/>`_:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo port selfupdate && sudo port install miller
|
||||
|
||||
You may already have the ``mlr`` executable available in your platform's package manager on NetBSD, Debian Linux, Ubuntu Xenial and upward, Arch Linux, or perhaps other distributions. For example, on various Linux distributions you might do one of the following:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo apt-get install miller
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo apt install miller
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo yum install miller
|
||||
|
||||
On Windows, Miller is available via `Chocolatey <https://chocolatey.org/>`_:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
choco install miller
|
||||
|
|
|
|||
|
|
@ -6,38 +6,38 @@ Prebuilt executables via package managers
|
|||
|
||||
`Homebrew <https://brew.sh/>`_ installation support for OSX is available via
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
brew update && brew install miller
|
||||
|
||||
...and also via `MacPorts <https://www.macports.org/>`_:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo port selfupdate && sudo port install miller
|
||||
|
||||
You may already have the ``mlr`` executable available in your platform's package manager on NetBSD, Debian Linux, Ubuntu Xenial and upward, Arch Linux, or perhaps other distributions. For example, on various Linux distributions you might do one of the following:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo apt-get install miller
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo apt install miller
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
sudo yum install miller
|
||||
|
||||
On Windows, Miller is available via `Chocolatey <https://chocolatey.org/>`_:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
choco install miller
|
||||
|
|
|
|||
|
|
@ -6,70 +6,70 @@ Quick examples
|
|||
|
||||
Column select:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --csv cut -f hostname,uptime mydata.csv
|
||||
|
||||
Add new columns as function of other columns:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --nidx put '$sum = $7 < 0.0 ? 3.5 : $7 + 2.1*$8' *.dat
|
||||
|
||||
Row filter:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --csv filter '$status != "down" && $upsec >= 10000' *.csv
|
||||
|
||||
Apply column labels and pretty-print:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% grep -v '^#' /etc/group | mlr --ifs : --nidx --opprint label group,pass,gid,member then sort -f group
|
||||
|
||||
Join multiple data sources on key columns:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr join -j account_id -f accounts.dat then group-by account_name balances.dat
|
||||
|
||||
Multiple formats including JSON:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --json put '$attr = sub($attr, "([0-9]+)_([0-9]+)_.*", "\1:\2")' data/*.json
|
||||
|
||||
Aggregate per-column statistics:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr stats1 -a min,mean,max,p10,p50,p90 -f flag,u,v data/*
|
||||
|
||||
Linear regression:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr stats2 -a linreg-pca -f u,v -g shape data/*
|
||||
|
||||
Aggregate custom per-column statistics:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr put -q '@sum[$a][$b] += $x; end {emit @sum, "a", "b"}' data/*
|
||||
|
||||
Iterate over data using DSL expressions:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from estimates.tbl put '
|
||||
|
|
@ -83,35 +83,35 @@ Iterate over data using DSL expressions:
|
|||
|
||||
Run DSL expressions from a script file:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put -f analyze.mlr
|
||||
|
||||
Split/reduce output to multiple filenames:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put 'tee > "./taps/data-".$a."-".$b, $*'
|
||||
|
||||
Compressed I/O:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put 'tee | "gzip > ./taps/data-".$a."-".$b.".gz", $*'
|
||||
|
||||
Interoperate with other data-processing tools using standard pipes:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put -q '@v=$*; dump | "jq .[]"'
|
||||
|
||||
Tap/trace:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put '(NR % 1000 == 0) { print > stderr, "Checkpoint ".NR}'
|
||||
|
|
|
|||
|
|
@ -3,70 +3,70 @@ Quick examples
|
|||
|
||||
Column select:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --csv cut -f hostname,uptime mydata.csv
|
||||
|
||||
Add new columns as function of other columns:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --nidx put '$sum = $7 < 0.0 ? 3.5 : $7 + 2.1*$8' *.dat
|
||||
|
||||
Row filter:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --csv filter '$status != "down" && $upsec >= 10000' *.csv
|
||||
|
||||
Apply column labels and pretty-print:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% grep -v '^#' /etc/group | mlr --ifs : --nidx --opprint label group,pass,gid,member then sort -f group
|
||||
|
||||
Join multiple data sources on key columns:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr join -j account_id -f accounts.dat then group-by account_name balances.dat
|
||||
|
||||
Multiple formats including JSON:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --json put '$attr = sub($attr, "([0-9]+)_([0-9]+)_.*", "\1:\2")' data/*.json
|
||||
|
||||
Aggregate per-column statistics:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr stats1 -a min,mean,max,p10,p50,p90 -f flag,u,v data/*
|
||||
|
||||
Linear regression:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr stats2 -a linreg-pca -f u,v -g shape data/*
|
||||
|
||||
Aggregate custom per-column statistics:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr put -q '@sum[$a][$b] += $x; end {emit @sum, "a", "b"}' data/*
|
||||
|
||||
Iterate over data using DSL expressions:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from estimates.tbl put '
|
||||
|
|
@ -80,35 +80,35 @@ Iterate over data using DSL expressions:
|
|||
|
||||
Run DSL expressions from a script file:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put -f analyze.mlr
|
||||
|
||||
Split/reduce output to multiple filenames:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put 'tee > "./taps/data-".$a."-".$b, $*'
|
||||
|
||||
Compressed I/O:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put 'tee | "gzip > ./taps/data-".$a."-".$b.".gz", $*'
|
||||
|
||||
Interoperate with other data-processing tools using standard pipes:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put -q '@v=$*; dump | "jq .[]"'
|
||||
|
||||
Tap/trace:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
% mlr --from infile.dat put '(NR % 1000 == 0) { print > stderr, "Checkpoint ".NR}'
|
||||
|
|
|
|||
|
|
@ -286,17 +286,17 @@ Semicolons are required between statements even if those statements are on separ
|
|||
|
||||
Bodies for all compound statements must be enclosed in **curly braces**, even if the body is a single statement:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if ($x == 1) $y = 2' # Syntax error
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if ($x == 1) { $y = 2 }' # This is OK
|
||||
|
||||
Bodies for compound statements may be empty:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if ($x == 1) { }' # This no-op is syntactically acceptable
|
||||
|
||||
|
|
@ -956,7 +956,7 @@ Local variables can be defined either untyped as in ``x = 1``, or typed as in ``
|
|||
|
||||
The reason for ``num`` is that ``int`` and ``float`` typedecls are very precise:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
float a = 0; # Runtime error since 0 is int not float
|
||||
int b = 1.0; # Runtime error since 1.0 is float not int
|
||||
|
|
@ -967,7 +967,7 @@ A suggestion is to use ``num`` for general use when you want numeric content, an
|
|||
|
||||
The ``var`` type declaration indicates no type restrictions, e.g. ``var x = 1`` has the same type restrictions on ``x`` as ``x = 1``. The difference is in intentional shadowing: if you have ``x = 1`` in outer scope and ``x = 2`` in inner scope (e.g. within a for-loop or an if-statement) then outer-scope ``x`` has value 2 after the second assignment. But if you have ``var x = 2`` in the inner scope, then you are declaring a variable scoped to the inner block.) For example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
x = 1;
|
||||
if (NR == 4) {
|
||||
|
|
@ -975,7 +975,7 @@ The ``var`` type declaration indicates no type restrictions, e.g. ``var x = 1``
|
|||
}
|
||||
print x; # Value of x is now two
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
x = 1;
|
||||
if (NR == 4) {
|
||||
|
|
@ -985,7 +985,7 @@ The ``var`` type declaration indicates no type restrictions, e.g. ``var x = 1``
|
|||
|
||||
Likewise function arguments can optionally be typed, with type enforced when the function is called:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
func f(map m, int i) {
|
||||
...
|
||||
|
|
@ -1000,7 +1000,7 @@ Likewise function arguments can optionally be typed, with type enforced when the
|
|||
|
||||
Thirdly, function return values can be type-checked at the point of ``return`` using ``:`` and a typedecl after the parameter list:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
func f(map m, int i): bool {
|
||||
...
|
||||
|
|
@ -1395,7 +1395,7 @@ Operator precedence
|
|||
|
||||
Operators are listed in order of decreasing precedence, highest first.
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
Operators Associativity
|
||||
--------- -------------
|
||||
|
|
@ -1498,11 +1498,11 @@ If-statements
|
|||
|
||||
These are again reminiscent of ``awk``. Pattern-action blocks are a special case of ``if`` with no ``elif`` or ``else`` blocks, no ``if`` keyword, and parentheses optional around the boolean expression:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'NR == 4 {$foo = "bar"}'
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if (NR == 4) {$foo = "bar"}'
|
||||
|
||||
|
|
|
|||
|
|
@ -121,17 +121,17 @@ POKI_INCLUDE_AND_RUN_ESCAPED(data/trailing-commas.sh)HERE
|
|||
|
||||
Bodies for all compound statements must be enclosed in **curly braces**, even if the body is a single statement:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if ($x == 1) $y = 2' # Syntax error
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if ($x == 1) { $y = 2 }' # This is OK
|
||||
|
||||
Bodies for compound statements may be empty:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if ($x == 1) { }' # This no-op is syntactically acceptable
|
||||
|
||||
|
|
@ -360,7 +360,7 @@ Local variables can be defined either untyped as in ``x = 1``, or typed as in ``
|
|||
|
||||
The reason for ``num`` is that ``int`` and ``float`` typedecls are very precise:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
float a = 0; # Runtime error since 0 is int not float
|
||||
int b = 1.0; # Runtime error since 1.0 is float not int
|
||||
|
|
@ -371,7 +371,7 @@ A suggestion is to use ``num`` for general use when you want numeric content, an
|
|||
|
||||
The ``var`` type declaration indicates no type restrictions, e.g. ``var x = 1`` has the same type restrictions on ``x`` as ``x = 1``. The difference is in intentional shadowing: if you have ``x = 1`` in outer scope and ``x = 2`` in inner scope (e.g. within a for-loop or an if-statement) then outer-scope ``x`` has value 2 after the second assignment. But if you have ``var x = 2`` in the inner scope, then you are declaring a variable scoped to the inner block.) For example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
x = 1;
|
||||
if (NR == 4) {
|
||||
|
|
@ -379,7 +379,7 @@ The ``var`` type declaration indicates no type restrictions, e.g. ``var x = 1``
|
|||
}
|
||||
print x; # Value of x is now two
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
x = 1;
|
||||
if (NR == 4) {
|
||||
|
|
@ -389,7 +389,7 @@ The ``var`` type declaration indicates no type restrictions, e.g. ``var x = 1``
|
|||
|
||||
Likewise function arguments can optionally be typed, with type enforced when the function is called:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
func f(map m, int i) {
|
||||
...
|
||||
|
|
@ -404,7 +404,7 @@ Likewise function arguments can optionally be typed, with type enforced when the
|
|||
|
||||
Thirdly, function return values can be type-checked at the point of ``return`` using ``:`` and a typedecl after the parameter list:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
func f(map m, int i): bool {
|
||||
...
|
||||
|
|
@ -463,7 +463,7 @@ Operator precedence
|
|||
|
||||
Operators are listed in order of decreasing precedence, highest first.
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
Operators Associativity
|
||||
--------- -------------
|
||||
|
|
@ -524,11 +524,11 @@ If-statements
|
|||
|
||||
These are again reminiscent of ``awk``. Pattern-action blocks are a special case of ``if`` with no ``elif`` or ``else`` blocks, no ``if`` keyword, and parentheses optional around the boolean expression:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'NR == 4 {$foo = "bar"}'
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put 'if (NR == 4) {$foo = "bar"}'
|
||||
|
||||
|
|
|
|||
|
|
@ -158,7 +158,7 @@ bootstrap
|
|||
|
||||
The canonical use for bootstrap sampling is to put error bars on statistical quantities, such as mean. For example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
@ -169,7 +169,7 @@ The canonical use for bootstrap sampling is to put error bars on statistical qua
|
|||
blue 0.517717 1470
|
||||
orange 0.490532 303
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint bootstrap then stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
@ -180,7 +180,7 @@ The canonical use for bootstrap sampling is to put error bars on statistical qua
|
|||
blue 0.512529 1496
|
||||
orange 0.521030 321
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint bootstrap then stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
@ -191,7 +191,7 @@ The canonical use for bootstrap sampling is to put error bars on statistical qua
|
|||
green 0.496803 1075
|
||||
purple 0.486337 1199
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint bootstrap then stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
|
|||
|
|
@ -69,7 +69,7 @@ POKI_RUN_COMMAND{{mlr bootstrap --help}}HERE
|
|||
|
||||
The canonical use for bootstrap sampling is to put error bars on statistical quantities, such as mean. For example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
@ -80,7 +80,7 @@ The canonical use for bootstrap sampling is to put error bars on statistical qua
|
|||
blue 0.517717 1470
|
||||
orange 0.490532 303
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint bootstrap then stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
@ -91,7 +91,7 @@ The canonical use for bootstrap sampling is to put error bars on statistical qua
|
|||
blue 0.512529 1496
|
||||
orange 0.521030 321
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint bootstrap then stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
@ -102,7 +102,7 @@ The canonical use for bootstrap sampling is to put error bars on statistical qua
|
|||
green 0.496803 1075
|
||||
purple 0.486337 1199
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
$ mlr --opprint bootstrap then stats1 -a mean,count -f u -g color data/colored-shapes.dkvp
|
||||
color u_mean u_count
|
||||
|
|
|
|||
|
|
@ -39,7 +39,7 @@ Formats
|
|||
|
||||
Options:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--dkvp --idkvp --odkvp
|
||||
--nidx --inidx --onidx
|
||||
|
|
@ -97,14 +97,14 @@ Compression
|
|||
|
||||
Options:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--prepipe {command}
|
||||
|
||||
|
||||
The prepipe command is anything which reads from standard input and produces data acceptable to Miller. Nominally this allows you to use whichever decompression utilities you have installed on your system, on a per-file basis. If the command has flags, quote them: e.g. ``mlr --prepipe 'zcat -cf'``. Examples:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# These two produce the same output:
|
||||
$ gunzip < myfile1.csv.gz | mlr cut -f hostname,uptime
|
||||
|
|
@ -113,14 +113,14 @@ The prepipe command is anything which reads from standard input and produces dat
|
|||
$ mlr --prepipe gunzip cut -f hostname,uptime myfile1.csv.gz myfile2.csv.gz
|
||||
$ mlr --prepipe gunzip --idkvp --oxtab cut -f hostname,uptime myfile1.dat.gz myfile2.dat.gz
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# Similar to the above, but with compressed output as well as input:
|
||||
$ gunzip < myfile1.csv.gz | mlr cut -f hostname,uptime | gzip > outfile.csv.gz
|
||||
$ mlr --prepipe gunzip cut -f hostname,uptime myfile1.csv.gz | gzip > outfile.csv.gz
|
||||
$ mlr --prepipe gunzip cut -f hostname,uptime myfile1.csv.gz myfile2.csv.gz | gzip > outfile.csv.gz
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# Similar to the above, but with different compression tools for input and output:
|
||||
$ gunzip < myfile1.csv.gz | mlr cut -f hostname,uptime | xz -z > outfile.csv.xz
|
||||
|
|
@ -136,7 +136,7 @@ Miller has record separators ``IRS`` and ``ORS``, field separators ``IFS`` and `
|
|||
|
||||
Options:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--rs --irs --ors
|
||||
--fs --ifs --ofs --repifs
|
||||
|
|
@ -157,7 +157,7 @@ Number formatting
|
|||
|
||||
The command-line option ``--ofmt {format string}`` is the global number format for commands which generate numeric output, e.g. ``stats1``, ``stats2``, ``histogram``, and ``step``, as well as ``mlr put``. Examples:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--ofmt %.9le --ofmt %.6lf --ofmt %.0lf
|
||||
|
||||
|
|
@ -200,13 +200,13 @@ then-chaining
|
|||
|
||||
In accord with the `Unix philosophy <http://en.wikipedia.org/wiki/Unix_philosophy>`_, you can pipe data into or out of Miller. For example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr cut --complement -f os_version *.dat | mlr sort -f hostname,uptime
|
||||
|
||||
You can, if you like, instead simply chain commands together using the ``then`` keyword:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr cut --complement -f os_version then sort -f hostname,uptime *.dat
|
||||
|
||||
|
|
@ -602,25 +602,25 @@ Regex captures of the form ``\0`` through ``\9`` are supported as
|
|||
|
||||
* Captures have in-function context for ``sub`` and ``gsub``. For example, the first ``\1,\2`` pair belong to the first ``sub`` and the second ``\1,\2`` pair belong to the second ``sub``:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put '$b = sub($a, "(..)_(...)", "\2-\1"); $c = sub($a, "(..)_(.)(..)", ":\1:\2:\3")'
|
||||
|
||||
* Captures endure for the entirety of a ``put`` for the ``=~`` and ``!=~`` operators. For example, here the ``\1,\2`` are set by the ``=~`` operator and are used by both subsequent assignment statements:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put '$a =~ "(..)_(....); $b = "left_\1"; $c = "right_\2"'
|
||||
|
||||
* The captures are not retained across multiple puts. For example, here the ``\1,\2`` won't be expanded from the regex capture:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put '$a =~ "(..)_(....)' then {... something else ...} then put '$b = "left_\1"; $c = "right_\2"'
|
||||
|
||||
* Captures are ignored in ``filter`` for the ``=~`` and ``!=~`` operators. For example, there is no mechanism provided to refer to the first ``(..)`` as ``\1`` or to the second ``(....)`` as ``\2`` in the following filter statement:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr filter '$a =~ "(..)_(....)'
|
||||
|
||||
|
|
@ -650,7 +650,7 @@ The short of it is that Miller does this transparently for you so you needn't th
|
|||
|
||||
Implementation details of this, for the interested: integer adds and subtracts overflow by at most one bit so it suffices to check sign-changes. Thus, Miller allows you to add and subtract arbitrary 64-bit signed integers, converting only to float precisely when the result is less than -2\ :sup:`63` or greater than 2\ :sup:`63`\ -1. Multiplies, on the other hand, can overflow by a word size and a sign-change technique does not suffice to detect overflow. Instead Miller tests whether the floating-point product exceeds the representable integer range. Now, 64-bit integers have 64-bit precision while IEEE-doubles have only 52-bit mantissas -- so, there are 53 bits including implicit leading one. The following experiment explicitly demonstrates the resolution at this range:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
64-bit integer 64-bit integer Casted to double Back to 64-bit
|
||||
in hex in decimal integer
|
||||
|
|
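A one-liner sketch of the 53-bit resolution limit that the table above tabulates (illustrative only, not part of this commit; output formatting differs by Miller version):

    $ mlr -n put 'end {
      print 2**53;          # 9007199254740992 -- exactly representable as a double
      print (2**53) + 1.0;  # forcing a float add loses the low-order bit
    }'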
|
|||
|
|
@ -32,7 +32,7 @@ Formats
|
|||
|
||||
Options:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--dkvp --idkvp --odkvp
|
||||
--nidx --inidx --onidx
|
||||
|
|
@ -72,14 +72,14 @@ Compression
|
|||
|
||||
Options:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--prepipe {command}
|
||||
|
||||
|
||||
The prepipe command is anything which reads from standard input and produces data acceptable to Miller. Nominally this allows you to use whichever decompression utilities you have installed on your system, on a per-file basis. If the command has flags, quote them: e.g. ``mlr --prepipe 'zcat -cf'``. Examples:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# These two produce the same output:
|
||||
$ gunzip < myfile1.csv.gz | mlr cut -f hostname,uptime
|
||||
|
|
@ -88,14 +88,14 @@ The prepipe command is anything which reads from standard input and produces dat
|
|||
$ mlr --prepipe gunzip cut -f hostname,uptime myfile1.csv.gz myfile2.csv.gz
|
||||
$ mlr --prepipe gunzip --idkvp --oxtab cut -f hostname,uptime myfile1.dat.gz myfile2.dat.gz
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# Similar to the above, but with compressed output as well as input:
|
||||
$ gunzip < myfile1.csv.gz | mlr cut -f hostname,uptime | gzip > outfile.csv.gz
|
||||
$ mlr --prepipe gunzip cut -f hostname,uptime myfile1.csv.gz | gzip > outfile.csv.gz
|
||||
$ mlr --prepipe gunzip cut -f hostname,uptime myfile1.csv.gz myfile2.csv.gz | gzip > outfile.csv.gz
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
# Similar to the above, but with different compression tools for input and output:
|
||||
$ gunzip < myfile1.csv.gz | mlr cut -f hostname,uptime | xz -z > outfile.csv.xz
|
||||
|
|
@ -111,7 +111,7 @@ Miller has record separators ``IRS`` and ``ORS``, field separators ``IFS`` and `
|
|||
|
||||
Options:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--rs --irs --ors
|
||||
--fs --ifs --ofs --repifs
|
||||
|
|
@ -132,7 +132,7 @@ Number formatting
|
|||
|
||||
The command-line option ``--ofmt {format string}`` is the global number format for commands which generate numeric output, e.g. ``stats1``, ``stats2``, ``histogram``, and ``step``, as well as ``mlr put``. Examples:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
--ofmt %.9le --ofmt %.6lf --ofmt %.0lf
|
||||
|
||||
|
|
@ -163,13 +163,13 @@ then-chaining
|
|||
|
||||
In accord with the `Unix philosophy <http://en.wikipedia.org/wiki/Unix_philosophy>`_, you can pipe data into or out of Miller. For example:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr cut --complement -f os_version *.dat | mlr sort -f hostname,uptime
|
||||
|
||||
You can, if you like, instead simply chain commands together using the ``then`` keyword:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr cut --complement -f os_version then sort -f hostname,uptime *.dat
|
||||
|
||||
|
|
@ -365,25 +365,25 @@ Regex captures of the form ``\0`` through ``\9`` are supported as
|
|||
|
||||
* Captures have in-function context for ``sub`` and ``gsub``. For example, the first ``\1,\2`` pair belong to the first ``sub`` and the second ``\1,\2`` pair belong to the second ``sub``:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put '$b = sub($a, "(..)_(...)", "\2-\1"); $c = sub($a, "(..)_(.)(..)", ":\1:\2:\3")'
|
||||
|
||||
* Captures endure for the entirety of a ``put`` for the ``=~`` and ``!=~`` operators. For example, here the ``\1,\2`` are set by the ``=~`` operator and are used by both subsequent assignment statements:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put '$a =~ "(..)_(....); $b = "left_\1"; $c = "right_\2"'
|
||||
|
||||
* The captures are not retained across multiple puts. For example, here the ``\1,\2`` won't be expanded from the regex capture:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr put '$a =~ "(..)_(....)' then {... something else ...} then put '$b = "left_\1"; $c = "right_\2"'
|
||||
|
||||
* Captures are ignored in ``filter`` for the ``=~`` and ``!=~`` operators. For example, there is no mechanism provided to refer to the first ``(..)`` as ``\1`` or to the second ``(....)`` as ``\2`` in the following filter statement:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
mlr filter '$a =~ "(..)_(....)'
|
||||
|
||||
|
|
@ -413,7 +413,7 @@ The short of it is that Miller does this transparently for you so you needn't th
|
|||
|
||||
Implementation details of this, for the interested: integer adds and subtracts overflow by at most one bit so it suffices to check sign-changes. Thus, Miller allows you to add and subtract arbitrary 64-bit signed integers, converting only to float precisely when the result is less than -2\ :sup:`63` or greater than 2\ :sup:`63`\ -1. Multiplies, on the other hand, can overflow by a word size and a sign-change technique does not suffice to detect overflow. Instead Miller tests whether the floating-point product exceeds the representable integer range. Now, 64-bit integers have 64-bit precision while IEEE-doubles have only 52-bit mantissas -- so, there are 53 bits including implicit leading one. The following experiment explicitly demonstrates the resolution at this range:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
|
||||
64-bit integer 64-bit integer Casted to double Back to 64-bit
|
||||
in hex in decimal integer
|
||||
|
|
|
|||
|
|
@ -13,7 +13,7 @@ I like to produce SQL-query output with header-column and tab delimiter: this is
|
|||
|
||||
For example, using default output formatting in ``mysql`` we get formatting like Miller's ``--opprint --barred``:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -e 'show columns in mytable'
|
||||
|
|
@ -29,7 +29,7 @@ For example, using default output formatting in ``mysql`` we get formatting like
|
|||
|
||||
Using ``mysql``'s ``-B`` we get TSV output:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -B -e 'show columns in mytable' | mlr --itsvlite --opprint cat
|
||||
|
|
@ -42,7 +42,7 @@ Using ``mysql``'s ``-B`` we get TSV output:
|
|||
|
||||
Since Miller handles TSV output, we can do as much or as little processing as we want in the SQL query, then send the rest on to Miller. This includes outputting as JSON, doing further selects/joins in Miller, doing stats, etc. etc.:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -B -e 'show columns in mytable' | mlr --itsvlite --ojson --jlistwrap --jvstack cat
|
||||
|
|
@ -89,12 +89,12 @@ Since Miller handles TSV output, we can do as much or as little processing as we
|
|||
}
|
||||
]
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -B -e 'select * from mytable' > query.tsv
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --from query.tsv --t2p stats1 -a count -f id -g category,assigned_to
|
||||
|
|
@ -118,7 +118,7 @@ One use of NIDX (value-only, no keys) format is for loading up SQL tables.
|
|||
|
||||
Create and load SQL table:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> CREATE TABLE abixy(
|
||||
|
|
@ -130,19 +130,19 @@ Create and load SQL table:
|
|||
);
|
||||
Query OK, 0 rows affected (0.01 sec)
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
bash$ mlr --onidx --fs comma cat data/medium > medium.nidx
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> LOAD DATA LOCAL INFILE 'medium.nidx' REPLACE INTO TABLE abixy FIELDS TERMINATED BY ',' ;
|
||||
Query OK, 10000 rows affected (0.07 sec)
|
||||
Records: 10000 Deleted: 0 Skipped: 0 Warnings: 0
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> SELECT COUNT(*) AS count FROM abixy;
|
||||
|
|
@ -153,7 +153,7 @@ Create and load SQL table:
|
|||
+-------+
|
||||
1 row in set (0.00 sec)
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> SELECT * FROM abixy LIMIT 10;
|
||||
|
|
@ -174,7 +174,7 @@ Create and load SQL table:
|
|||
|
||||
Aggregate counts within SQL:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> SELECT a, b, COUNT(*) AS count FROM abixy GROUP BY a, b ORDER BY COUNT DESC;
|
||||
|
|
@ -211,7 +211,7 @@ Aggregate counts within SQL:
|
|||
|
||||
Aggregate counts within Miller:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --opprint uniq -c -g a,b then sort -nr count data/medium
|
||||
|
|
@ -234,7 +234,7 @@ Aggregate counts within Miller:
|
|||
|
||||
Pipe SQL output to aggregate counts within Miller:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql -D miller -B -e 'select * from abixy' | mlr --itsv --opprint uniq -c -g a,b then sort -nr count
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ I like to produce SQL-query output with header-column and tab delimiter: this is
|
|||
|
||||
For example, using default output formatting in ``mysql`` we get formatting like Miller's ``--opprint --barred``:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -e 'show columns in mytable'
|
||||
|
|
@ -26,7 +26,7 @@ For example, using default output formatting in ``mysql`` we get formatting like
|
|||
|
||||
Using ``mysql``'s ``-B`` we get TSV output:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -B -e 'show columns in mytable' | mlr --itsvlite --opprint cat
|
||||
|
|
@ -39,7 +39,7 @@ Using ``mysql``'s ``-B`` we get TSV output:
|
|||
|
||||
Since Miller handles TSV output, we can do as much or as little processing as we want in the SQL query, then send the rest on to Miller. This includes outputting as JSON, doing further selects/joins in Miller, doing stats, etc. etc.:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -B -e 'show columns in mytable' | mlr --itsvlite --ojson --jlistwrap --jvstack cat
|
||||
|
|
@ -86,12 +86,12 @@ Since Miller handles TSV output, we can do as much or as little processing as we
|
|||
}
|
||||
]
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql --database=mydb -B -e 'select * from mytable' > query.tsv
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --from query.tsv --t2p stats1 -a count -f id -g category,assigned_to
|
||||
|
|
@ -115,7 +115,7 @@ One use of NIDX (value-only, no keys) format is for loading up SQL tables.
|
|||
|
||||
Create and load SQL table:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> CREATE TABLE abixy(
|
||||
|
|
@ -127,19 +127,19 @@ Create and load SQL table:
|
|||
);
|
||||
Query OK, 0 rows affected (0.01 sec)
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
bash$ mlr --onidx --fs comma cat data/medium > medium.nidx
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> LOAD DATA LOCAL INFILE 'medium.nidx' REPLACE INTO TABLE abixy FIELDS TERMINATED BY ',' ;
|
||||
Query OK, 10000 rows affected (0.07 sec)
|
||||
Records: 10000 Deleted: 0 Skipped: 0 Warnings: 0
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> SELECT COUNT(*) AS count FROM abixy;
|
||||
|
|
@ -150,7 +150,7 @@ Create and load SQL table:
|
|||
+-------+
|
||||
1 row in set (0.00 sec)
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> SELECT * FROM abixy LIMIT 10;
|
||||
|
|
@ -171,7 +171,7 @@ Create and load SQL table:
|
|||
|
||||
Aggregate counts within SQL:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
mysql> SELECT a, b, COUNT(*) AS count FROM abixy GROUP BY a, b ORDER BY COUNT DESC;
|
||||
|
|
@ -208,7 +208,7 @@ Aggregate counts within SQL:
|
|||
|
||||
Aggregate counts within Miller:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mlr --opprint uniq -c -g a,b then sort -nr count data/medium
|
||||
|
|
@ -231,7 +231,7 @@ Aggregate counts within Miller:
|
|||
|
||||
Pipe SQL output to aggregate counts within Miller:
|
||||
|
||||
.. code-block::
|
||||
.. code-block:: bash
|
||||
:emphasize-lines: 1,1
|
||||
|
||||
$ mysql -D miller -B -e 'select * from abixy' | mlr --itsv --opprint uniq -c -g a,b then sort -nr count
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue