Clarify locations of performance info

John Kerl 2022-11-26 18:32:16 -05:00
parent 6282bf1dcb
commit 36f3c3cb0f
62 changed files with 2076 additions and 33 deletions

View file

@ -19,8 +19,8 @@
* Running `make` within the `docs` directory handles both of those steps.
* TL;DR just `make docs` from the Miller base directory
* Quick-editing loop:
* In one terminal, cd to this directory and leave `mkdocs serve` running.
* In another terminal, cd to the `src` subdirectory of `docs` and edit `*.md.in`.
* In one terminal, cd to the `docs` directory and leave `mkdocs serve` running.
* In another terminal, cd to the `docs/src` subdirectory and edit `*.md.in`.
* Run `genmds` to re-create all the `*.md` files, or `genmds foo.md.in` to just re-create the `foo.md.in` file you just edited, or (simplest) just `make` within the `docs/src` subdirectory.
* In your browser, visit http://127.0.0.1:8000
* This doesn't write HTML in `docs/site`; HTML is served up directly in the browser -- this is nice for previewing interactive edits.

View file

@ -14,8 +14,8 @@ import (
func main() {
var mvs [2]mlrval.Mlrval
mvs[0] = *mlrval.FromString("hello")
mvs[1] = *mlrval.FromString("world")
mvs[0] = *mlrval.FromString("h")
mvs[1] = *mlrval.FromString("abcdefghijklmnopqrstuvwzyx")
mvs[0].ShowSizes()
fmt.Println()
mvs[1].ShowSizes()

View file

@ -76,6 +76,7 @@ nav:
- "CPU/multicore usage": "cpu.md"
- "Scripting with Miller": "scripting.md"
- "Miller environment variables": "reference-main-env-vars.md"
- "Performance": "performance.md"
- 'Types reference':
- "Data types": "reference-main-data-types.md"
- "Strings": "reference-main-strings.md"
@ -104,7 +105,6 @@ nav:
- "Why?": "why.md"
- "Why call it Miller?": "etymology.md"
- "How original is Miller?": "originality.md"
- "Performance": "performance.md"
- 'Misc. reference':
- "Auxiliary commands": "reference-main-auxiliary-commands.md"
- "Manual page": "manpage.md"

View file

@ -39,6 +39,9 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
But `mlr cat` can also do format conversion -- for example, you can pretty-print in tabular format:
@ -58,6 +61,9 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
`mlr head` and `mlr tail` count records rather than lines. Whether you're getting the first few records or the last few, the CSV header is included either way:
@ -71,6 +77,9 @@ yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
red,square,false,4,48,77.5542,7.4670
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -82,6 +91,9 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -108,6 +120,9 @@ purple,square,false,10,91,72.3735,8.2430
"rate": 8.2430
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can sort on a single field:
@ -127,6 +142,9 @@ purple square false 10 91 72.3735 8.2430
yellow triangle true 1 11 43.6498 9.8870
purple triangle false 5 51 81.2290 8.5910
purple triangle false 7 65 80.1405 5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Or, you can sort primarily alphabetically on one field, then secondarily numerically descending on another field, and so on:
@ -146,6 +164,9 @@ red square true 2 15 79.2778 0.0130
purple triangle false 7 65 80.1405 5.8240
purple triangle false 5 51 81.2290 8.5910
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If there are fields you don't want to see in your data, you can use `cut` to keep only the ones you want, in the same order they appeared in the input data:
@ -165,6 +186,9 @@ triangle false
circle true
circle true
square false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can also use `cut -o` to keep specified fields, but in your preferred order:
@ -184,6 +208,9 @@ false triangle
true circle
true circle
false square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can use `cut -x` to omit fields you don't care about:
@ -203,6 +230,9 @@ purple 7 65 80.1405 5.8240
yellow 8 73 63.9785 4.2370
yellow 9 87 63.5058 8.3350
purple 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Even though Miller's main selling point is name-indexing, sometimes you really want to refer to a field name by its positional index. Use `$[[3]]` to access the name of field 3 or `$[[[3]]]` to access the value of field 3:
@ -222,6 +252,9 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -239,6 +272,9 @@ purple triangle NEW 7 65 80.1405 5.8240
yellow circle NEW 8 73 63.9785 4.2370
yellow circle NEW 9 87 63.5058 8.3350
purple square NEW 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can find the full list of verbs at the [Verbs Reference](reference-verbs.md) page.
@ -256,6 +292,9 @@ red square true 2 15 79.2778 0.0130
red circle true 3 16 13.8103 2.9010
red square false 4 48 77.5542 7.4670
red square false 6 64 77.1991 9.5310
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -265,6 +304,9 @@ red square false 6 64 77.1991 9.5310
color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
red circle true 3 16 13.8103 2.9010
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Computing new fields
@ -289,6 +331,9 @@ purple triangle false 7 65 80.1405 5.8240 13.760388049450551 purple_triangl
yellow circle true 8 73 63.9785 4.2370 15.09995279679018 yellow_circle
yellow circle true 9 87 63.5058 8.3350 7.619172165566886 yellow_circle
purple square false 10 91 72.3735 8.2430 8.779995147397793 purple_square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
When you create a new field, it can immediately be used in subsequent statements:
@ -311,6 +356,9 @@ purple triangle false 7 65 80.1405 5.8240 66 4363
yellow circle true 8 73 63.9785 4.2370 74 5484
yellow circle true 9 87 63.5058 8.3350 88 7753
purple square false 10 91 72.3735 8.2430 92 8474
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
For `put` and `filter` we were able to type out expressions using a programming-language syntax.
@ -331,6 +379,9 @@ Zone,Total MWh
17,39.8
24,7.4
30,50.5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -342,6 +393,9 @@ Zone Total MWh
17 39.8
14 27.2
24 7.4
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
For `put` and `filter` expressions, use `${...}`:
@ -355,6 +409,9 @@ Zone Total MWh Total KWh
17 39.8 39800
24 7.4 7400
30 50.5 50500
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also the [section on field names](reference-dsl-variables.md#field-names).
@ -401,6 +458,9 @@ a,b,c
1,2,3
4,5,6
7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Chaining verbs together
@ -415,6 +475,12 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This works fine -- but Miller also lets you chain verbs together using the word `then`. Think of this as a Miller-internal pipe that lets you use fewer keystrokes:
@ -427,6 +493,9 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
As another convenience, you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
@ -439,6 +508,9 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -452,6 +524,9 @@ shape quantity
square 72.3735
circle 63.5058
circle 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Sorts and stats
@ -468,6 +543,9 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Lots of Miller commands take a `-g` option for group-by: here, `head -n 1 -g shape` outputs the first record for each distinct value of the `shape` field. This means we're finding the record with highest `index` field for each distinct `shape` field:
@ -480,6 +558,9 @@ color shape flag k index quantity rate
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
purple triangle false 7 65 80.1405 5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Statistics can be computed with or without group-by field(s):
@ -493,6 +574,9 @@ shape quantity_count quantity_min quantity_mean quantity_max
triangle 3 43.6498 68.33976666666666 81.229
square 4 72.3735 76.60114999999999 79.2778
circle 3 13.8103 47.0982 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -507,6 +591,9 @@ circle red 1 13.8103 13.8103 13.8103
triangle purple 2 80.1405 80.68475000000001 81.229
circle yellow 2 63.5058 63.742149999999995 63.9785
square purple 1 72.3735 72.3735 72.3735
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If your output has a lot of columns, you can use XTAB format to line things up vertically for you instead:
@ -524,6 +611,9 @@ rate_p75 8.5910
rate_p90 9.8870
rate_p99 9.8870
rate_p100 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Unicode and internationalization
@ -556,6 +646,9 @@ UTF-8 data. For example:
κόκκινο κύκλος αληθινό 3 16 13.8103 2.9010
κίτρινο κύκλος αληθινό 8 73 63.9785 4.2370
κίτρινο κύκλος αληθινό 9 87 63.5058 8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -573,6 +666,9 @@ UTF-8 data. For example:
κόκκινο τετράγωνο ψευδές 6 64 77.1991 9.5310
μοβ τρίγωνο ψευδές 7 65 80.1405 5.8240
μοβ τετράγωνο ψευδές 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -590,6 +686,9 @@ UTF-8 data. For example:
желтый КРУГ истина 8 73 63.9785 4.2370 6
желтый КРУГ истина 9 87 63.5058 8.3350 6
фиолетовый КВАДРАТ ложь 10 91 72.3735 8.2430 10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## File formats and format conversion
@ -689,6 +788,9 @@ a matter of specifying input-format and output-format flags:
"rate": 0.0130
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -698,6 +800,9 @@ a matter of specifying input-format and output-format flags:
color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
However, if JSON data has map-valued or array-valued fields, Miller gives you choices on how to
@ -738,6 +843,9 @@ We can convert this to CSV, or other tabular formats:
<pre class="pre-non-highlight-in-pair">
hostname,pid,req.id,req.method,req.path,req.host,req.headers.host,req.headers.user-agent,res.status_code,res.header.content-type,res.header.content-encoding
localhost,12345,6789,GET,api/check,foo.bar,bar.baz,browser,200,text,plain
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -755,6 +863,9 @@ req.headers.user-agent browser
res.status_code 200
res.header.content-type text
res.header.content-encoding plain
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
These transformations are reversible:
@ -786,6 +897,12 @@ These transformations are reversible:
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See the [flatten/unflatten page](flatten-unflatten.md) for more information.
@ -875,9 +992,14 @@ If you like, you can first copy off your original data somewhere else, before do
Lastly, using `tee` within `put`, you can split your input data into separate files per one or more field names:
<pre class="pre-highlight-non-pair">
<pre class="pre-highlight-in-pair">
<b>mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>cat circle.csv</b>

View file

@ -41,6 +41,9 @@ John,23,present
Fred,34,present
Alice,56,missing
Carol,45,present
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Following that, you can rename the positionally indexed labels to names with meaning for your context. For example:
@ -54,6 +57,9 @@ John,23,present
Fred,34,present
Alice,56,missing
Carol,45,present
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Likewise, if you need to produce CSV which is lacking its header, you can pipe Miller's output to the system command `sed 1d`, or you can use Miller's `--headerless-csv-output` option:
@ -68,6 +74,9 @@ red,square,1,80,0.219668,0.001257,0.792778,2.944117
red,circle,1,84,0.209017,0.290052,0.138103,5.065034
red,square,0,243,0.956274,0.746720,0.775542,7.117831
purple,triangle,0,257,0.435535,0.859129,0.812290,5.753095
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -79,6 +88,9 @@ red,square,1,80,0.219668,0.001257,0.792778,2.944117
red,circle,1,84,0.209017,0.290052,0.138103,5.065034
red,square,0,243,0.956274,0.746720,0.775542,7.117831
purple,triangle,0,257,0.435535,0.859129,0.812290,5.753095
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Lastly, often we say "CSV" or "TSV" when we have positionally indexed data in columns which are separated by commas or tabs, respectively. In this case it's perhaps simpler to **just use NIDX format** which was designed for this purpose. (See also [File Formats](file-formats.md).) For example:
@ -98,6 +110,9 @@ Lastly, often we say "CSV" or "TSV" when we have positionally indexed data in co
1 Carol
3 present
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Headerless CSV with duplicate field values
@ -134,6 +149,9 @@ see something happened:
-331268.59231736,4537221.43295653,22,1,13.1,1,0.978,0.978,0.962
-330341.96688431,4537221.43295653,23,1,13.1,1,0.978,0.978,0.962
-326635.46515209,4537221.43295653,27,1,13.1,2,0.978,0.972,0.958
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
What happened?
@ -162,6 +180,9 @@ One solution is to use `--implicit-csv-header`, or its shorter alias `--hi`:
-331268.59231736,4537221.43295653,22,1,13.1,1,0.978,0.978,0.962
-330341.96688431,4537221.43295653,23,1,13.1,1,0.978,0.978,0.962
-326635.46515209,4537221.43295653,27,1,13.1,2,0.978,0.972,0.958
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Another solution is to use [NIDX format](file-formats.md#nidx-index-numbered-toolkit-style):
@ -178,6 +199,9 @@ Another solution is to use [NIDX format](file-formats.md#nidx-index-numbered-too
-331268.59231736,4537221.43295653,22,1,13.1,1,0.978,0.978,0.962
-330341.96688431,4537221.43295653,23,1,13.1,1,0.978,0.978,0.962
-326635.46515209,4537221.43295653,27,1,13.1,2,0.978,0.972,0.958
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Either way, since there is no explicit header, fields are named `1` through `9`. We can use the
@ -195,6 +219,9 @@ xsn,ysn,x,y,t,a,e29,e31,e32
-331268.59231736,4537221.43295653,22,1,13.1,1,0.978,0.978,0.962
-330341.96688431,4537221.43295653,23,1,13.1,1,0.978,0.978,0.962
-326635.46515209,4537221.43295653,27,1,13.1,2,0.978,0.972,0.958
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -209,6 +236,9 @@ xsn,ysn,x,y,t,a,e29,e31,e32
-331268.59231736,4537221.43295653,22,1,13.1,1,0.978,0.978,0.962
-330341.96688431,4537221.43295653,23,1,13.1,1,0.978,0.978,0.962
-326635.46515209,4537221.43295653,27,1,13.1,2,0.978,0.972,0.958
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Regularizing ragged CSV
@ -240,6 +270,9 @@ a,b,c
1,2,3
4,5,
6,7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
or, more simply,
@ -257,6 +290,9 @@ a,b,c
1,2,3
4,5,
6,7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also the [record-heterogeneity page](record-heterogeneity.md).

View file

@ -40,6 +40,9 @@ barney false
betty true
fred true
wilma true
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -51,6 +54,9 @@ barney 0
betty 1
fred 1
wilma 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
A second option is to flag badly formatted data within the output stream:
@ -64,6 +70,9 @@ barney false true
betty true true
fred true true
wilma 1 false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Or perhaps to flag badly formatted data outside the output stream:
@ -80,6 +89,9 @@ betty true
fred true
wilma 1
Malformed at NR=4
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
A third way is to abort the process on first instance of bad data:

View file

@ -70,6 +70,9 @@ point_longitude -81.707664
line Residential
construction Masonry
point_granularity 3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
A few simple queries:
@ -88,6 +91,9 @@ BAKER COUNTY 70
BRADFORD COUNTY 31
HAMILTON COUNTY 35
UNION COUNTY 15
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -97,6 +103,9 @@ UNION COUNTY 15
line count
Residential 30838
Commercial 5796
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Categorization of total insured value:
@ -108,6 +117,9 @@ Categorization of total insured value:
tiv_2012_min 73.37
tiv_2012_mean 2571004.0973420837
tiv_2012_max 1701000000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -121,6 +133,9 @@ Wood Residential 73.37 113493.01704925536 649046.12
Reinforced Concrete Commercial 6416016.01 20212428.681839883 60570000
Reinforced Masonry Commercial 1287817.34 4621372.981117158 16650000
Steel Frame Commercial 29790000 133492500 1701000000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -135,6 +150,9 @@ hu_site_deductible_p90 76.5
hu_site_deductible_p95 6829.2
hu_site_deductible_p99 126270
hu_site_deductible_p100 7380000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -153,6 +171,7 @@ BROWARD COUNTY 0 148500 3258900
CALHOUN COUNTY 0 33339.6 33339.6
CHARLOTTE COUNTY 5400 52650 250994.7
CITRUS COUNTY 1332.9 79974.9 483785.1
Memory profile started.
</pre>
<pre class="pre-highlight-in-pair">
@ -165,6 +184,9 @@ tiv_2011_tiv_2012_ols_m 0.9835583980337723
tiv_2011_tiv_2012_ols_b 433854.6428968317
tiv_2011_tiv_2012_ols_n 36634
tiv_2011_tiv_2012_r2 0.9468258417320189
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -207,6 +229,9 @@ tiv_2011_tiv_2012_ols_m 1.2301
tiv_2011_tiv_2012_ols_b -596.6239
tiv_2011_tiv_2012_ols_n 657
tiv_2011_tiv_2012_r2 0.9335
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Color/shape data
@ -241,6 +266,9 @@ red circle 1 84 0.209017 0.290052 0.138103 5.065034
red square 0 243 0.956274 0.746720 0.775542 7.117831
purple triangle 0 257 0.435535 0.859129 0.812290 5.753095
red square 0 322 0.201551 0.953110 0.771991 5.612050
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Look at uncategorized stats (using [creach](https://github.com/johnkerl/scripts/blob/master/fundam/creach) for spacing).
@ -263,6 +291,9 @@ v_min -0.092709
v_mean 0.49778696586624427
v_max 1.0725
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The histogram shows the different distribution of 0/1 flags:
@ -284,6 +315,9 @@ bin_lo bin_hi flag_count u_count v_count
0.8900000000000002 0.9900000000000002 0 995 993
0.9900000000000002 1.0900000000000003 4020 1013 939
1.0900000000000003 1.1900000000000002 0 0 25
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Look at univariate stats by color and shape. In particular, color-dependent flag probabilities pop out, aligning with their original Bernoulli probabilities from the data-generator script:
@ -301,6 +335,9 @@ orange 0 0.5214521452145214 1 0.001235 0.49053241584158375 0.9988
purple 0 0.09019264448336252 1 0.000266 0.49400496322241666 0.999647 0.000364 0.4970507127845888 0.999975
red 0 0.3031674208144796 1 0.000671 0.49255964641241273 0.999882 -0.092709 0.4965350941607402 1.0725
yellow 0 0.8924274593064402 1 0.0013 0.4971291160651098 0.999923 0.000711 0.5106265987261144 0.999919
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -313,6 +350,9 @@ shape flag_min flag_mean flag_max u_min u_mean u_ma
circle 0 0.3998456194519491 1 0.000044 0.498554505982246 0.999923 -0.092709 0.49552416171362396 1.0725
square 0 0.39611178614823817 1 0.000188 0.4993854558930749 0.999969 0.000089 0.49653825929526124 0.999975
triangle 0 0.4015421115065243 1 0.000881 0.49685854240806604 0.999661 0.000717 0.5010495260972719 0.999995
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Look at bivariate stats by color and shape. In particular, `u,v` pairwise correlation for red circles pops out:
@ -323,6 +363,9 @@ Look at bivariate stats by color and shape. In particular, `u,v` pairwise correl
<pre class="pre-non-highlight-in-pair">
u_v_corr w_x_corr
0.1334180491027861 -0.011319841199866178
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -350,4 +393,7 @@ orange triangle -0.030456661186085785 -0.1318699981926352
yellow circle -0.06477331572781474 0.07369449819706045
blue circle -0.10234761901929677 -0.030528539069837757
green triangle -0.10901825107358765 -0.04848782060162929
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -40,6 +40,9 @@ we can use [strptime](reference-verbs.md#strptime) to parse the date field into
<pre class="pre-non-highlight-in-pair">
date,event
2018-03-07,discovery
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Caveat: localtime-handling in timezones with DST is still a work in progress; see [https://github.com/johnkerl/miller/issues/170](https://github.com/johnkerl/miller/issues/170) . See also [https://github.com/johnkerl/miller/issues/208](https://github.com/johnkerl/miller/issues/208) -- thanks @aborruso!
@ -105,6 +108,9 @@ Then, filter for adjacent difference not being 86400 (the number of seconds in a
<pre class="pre-non-highlight-in-pair">
n=774,date=2014-04-19,qoh=130140,datestamp=1397865600,datestamp_delta=259200
n=1119,date=2015-03-31,qoh=181625,datestamp=1427760000,datestamp_delta=172800
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Given this, it's now easy to see where the gaps are:
@ -124,6 +130,9 @@ n=777,1=2014-04-21,2=130368
n=778,1=2014-04-22,2=130368
n=779,1=2014-04-23,2=130849
n=780,1=2014-04-24,2=131026
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -141,4 +150,7 @@ n=1122,1=2015-04-02,2=181718
n=1123,1=2015-04-03,2=181835
n=1124,1=2015-04-04,2=182104
n=1125,1=2015-04-05,2=182528
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -147,6 +147,9 @@ eks pan 2 0.522151 ekspan 2.522151 str str int float str float
wye wye 3 0.338318 wyewye 3.338318 str str int float str float
eks wye 4 0.134188 ekswye 4.134188 str str int float str float
wye pan 5 0.863624 wyepan 5.863624 str str int float str float
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## DKVP I/O in Ruby
@ -265,4 +268,7 @@ eks pan 2 0.522151 ekspan 2.522151 String String Integer Float String Float
wye wye 3 0.338318 wyewye 3.338318 String String Integer Float String Float
eks wye 4 0.134188 ekswye 4.134188 String String Integer Float String Float
wye pan 5 0.863624 wyepan 5.863624 String String Integer Float String Float
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -172,6 +172,9 @@ An **array of single-level objects** is, quite simply, **a table**:
"shape": "square"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -190,6 +193,9 @@ An **array of single-level objects** is, quite simply, **a table**:
"v": 0.001257
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Single-level JSON data goes back and forth between JSON and tabular formats
@ -202,6 +208,9 @@ in the direct way:
color u v
yellow 0.632170 0.988721
red 0.219668 0.001257
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -219,6 +228,9 @@ purple triangle 0 65 0.684281 0.582372 0.801405 5.805148
yellow circle 1 73 0.603365 0.423708 0.639785 7.006414
yellow circle 1 87 0.285656 0.833516 0.635058 6.350036
purple square 0 91 0.259926 0.824322 0.723735 6.854221
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Nested JSON objects
@ -260,6 +272,9 @@ input as well as output in JSON format, JSON structure is preserved throughout t
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
But if the input format is JSON and the output format is not (or vice versa) then key-concatenation applies:
@ -273,6 +288,9 @@ flag i attributes.color attributes.shape values.u values.v values.w values.x
1 15 red square 0.219668 0.001257 0.792778 2.944117
1 16 red circle 0.209017 0.290052 0.138103 5.065034
0 48 red square 0.956274 0.746720 0.775542 7.117831
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This is discussed in more detail on the page [Flatten/unflatten: JSON vs. tabular formats](flatten-unflatten.md).
@ -319,6 +337,9 @@ Miller handles this:
"rate": 0.0130
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -327,6 +348,9 @@ Miller handles this:
<pre class="pre-non-highlight-in-pair">
{"color": "yellow", "shape": "triangle", "flag": "true", "k": 1, "index": 11, "quantity": 43.6498, "rate": 9.8870}
{"color": "red", "shape": "square", "flag": "true", "k": 2, "index": 15, "quantity": 79.2778, "rate": 0.0130}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that for _input_ data, either is acceptable: whether you use `--ijson` or `--ijsonl`, Miller
@ -348,6 +372,9 @@ eks,pan,2,0.758679,0.522151
wye,wye,3,0.204603,0.338318
eks,wye,4,0.381399,0.134188
wye,pan,5,0.573288,0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -360,6 +387,9 @@ eks pan 2 0.758679 0.522151
wye wye 3 0.204603 0.338318
eks wye 4 0.381399 0.134188
wye pan 5 0.573288 0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that while Miller is a line-at-a-time processor and retains input lines in memory only where necessary (e.g. for sort), pretty-print output requires it to accumulate all input lines (so that it can compute maximum column widths) before producing any output. This has two consequences: (a) pretty-print output won't work on `tail -f` contexts, where Miller will be waiting for an end-of-file marker which never arrives; (b) pretty-print output for large files is constrained by available machine memory.
@ -381,6 +411,9 @@ For output only (this isn't supported in the input-scanner as of 5.0.0) you can
| eks | wye | 4 | 0.381399 | 0.134188 |
| wye | pan | 5 | 0.573288 | 0.863624 |
+-----+-----+---+----------+----------+
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Markdown tabular
@ -398,6 +431,9 @@ Markdown format looks like this:
| wye | wye | 3 | 0.204603 | 0.338318 |
| eks | wye | 4 | 0.381399 | 0.134188 |
| wye | pan | 5 | 0.573288 | 0.863624 |
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
which renders like this when dropped into various web tools (e.g. github comments):
@ -486,6 +522,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Such data are easy to generate, e.g. in Ruby with
@ -551,6 +590,9 @@ eks pan 2 0.758679 0.522151
wye wye 3 0.204603 0.338318
eks wye 4 0.381399 0.134188
wye pan 5 0.573288 0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Example with index-numbered input:
@ -571,6 +613,9 @@ early light
1=oh,2=say,3=can,4=you,5=see
1=by,2=the,3=dawn's
1=early,2=light
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Example with index-numbered input and output:
@ -591,6 +636,9 @@ early light
say can
the dawn's
light
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Data-conversion keystroke-savers
@ -681,6 +729,9 @@ type quantity
green 678.12
purple 456.78
orange 123.45
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -692,4 +743,7 @@ type quantity
green 678.12
purple 456.78
orange 123.45
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -103,6 +103,9 @@ Flattened to CSV format:
a,b.x,b.y
1,2,3
4,5,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Flattened to pretty-print format:
@ -114,6 +117,9 @@ Flattened to pretty-print format:
a b.x b.y
1 2 3
4 5 6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Using flatten-separator `:` instead of the default `.`:
@ -125,6 +131,9 @@ Using flatten-separator `:` instead of the default `.`:
a b:x b:y
1 2 3
4 5 6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If the maps are more deeply nested, each level of map keys is joined in:
@ -150,6 +159,9 @@ If the maps are more deeply nested, each level of map keys is joined in:
a b.s.w b.s.x b.t.y b.t.z
1 2 3 4 5
6 7 8 9 10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
**Unflattening** is simply the reverse -- from non-JSON back to JSON:
@ -175,6 +187,9 @@ a b.s.w b.s.x b.t.y b.t.z
a,b.x,b.y
1,2,3
4,5,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -197,6 +212,12 @@ a,b.x,b.y
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Converting arrays between JSON and non-JSON
@ -226,6 +247,9 @@ If the input data contains arrays, these are also flattened similarly: the
a b.1 b.2
1 2 3
4 5 6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If the arrays are more deeply nested, each level of array keys is joined in:
@ -251,6 +275,9 @@ If the arrays are more deeply nested, each level of array keys is joined in:
a b.1.1 b.1.2 b.2.1 b.2.2
1 2 3 4 5
6 7 8 9 10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
In the nested-data examples shown here, nested map values are shown containing
@ -280,6 +307,9 @@ though not shown here) nested map values can contain arrays, and vice versa.
a,b.1,b.2
1,2,3
4,5,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -296,6 +326,12 @@ a,b.1,b.2
"b": [5, 6]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Auto-inferencing of arrays on unflatten
@ -323,6 +359,9 @@ a.1,a.2,a.3
"a": [4, 5, 6]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -346,6 +385,9 @@ a.1,a.3,a.5
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Manual control
@ -393,6 +435,9 @@ Using JSON output, we can see that `splita` has produced an array-valued field n
"components": ["nadir", "west", "our", "org"]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Using CSV output, with default auto-flatten, we get `components.1` through `components.4`:
@ -404,6 +449,9 @@ Using CSV output, with default auto-flatten, we get `components.1` through `comp
host,status,components.1,components.2,components.3,components.4
apoapsis.east.our.org,up,apoapsis,east,our,org
nadir.west.our.org,down,nadir,west,our,org
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Using CSV output, without default auto-flatten, we get a JSON-stringified encoding of the `components` field:
@ -415,6 +463,9 @@ Using CSV output, without default auto-flatten, we get a JSON-stringified encodi
host,status,components
apoapsis.east.our.org,up,"[""apoapsis"", ""east"", ""our"", ""org""]"
nadir.west.our.org,down,"[""nadir"", ""west"", ""our"", ""org""]"
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Now suppose we ran this
@ -435,6 +486,9 @@ host nadir.west.our.org
status down
a ["nadir", "west", "our", "org"]
b ["nadir", "west", "our", "org"]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
into a file [data/hostnames.xtab](./data/hostnames.xtab):
@ -476,6 +530,9 @@ leave `b` JSON-stringified:
"b": "[\"nadir\", \"west\", \"our\", \"org\"]"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also the

View file

@ -70,6 +70,9 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -87,6 +90,9 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If you run into issues on these checks, please check out the resources on the [community page](community.md) for help.

View file

@ -50,6 +50,9 @@ Support for internationalization includes:
κόκκινο κύκλος αληθινό 3 16 13.8103 2.9010
κίτρινο κύκλος αληθινό 8 73 63.9785 4.2370
κίτρινο κύκλος αληθινό 9 87 63.5058 8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -67,6 +70,9 @@ Support for internationalization includes:
κόκκινο τετράγωνο ψευδές 6 64 77.1991 9.5310
μοβ τρίγωνο ψευδές 7 65 80.1405 5.8240
μοβ τετράγωνο ψευδές 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -84,4 +90,7 @@ Support for internationalization includes:
желтый КРУГ истина 8 73 63.9785 4.2370 6
желтый КРУГ истина 9 87 63.5058 8.3350 6
фиолетовый КВАДРАТ ложь 10 91 72.3735 8.2430 10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -27,6 +27,9 @@ In our examples so far we've often made use of `mlr --icsv --opprint` or `mlr --
color shape flag k index quantity rate
yellow triangle true 1 11 43.6498 9.8870
red square true 2 15 79.2778 0.0130
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -53,6 +56,9 @@ red square true 2 15 79.2778 0.0130
"rate": 0.0130
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can get the full list [here](file-formats.md#data-conversion-keystroke-savers).
@ -69,6 +75,9 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -79,6 +88,9 @@ shape quantity
square 72.3735
circle 63.5058
circle 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If there's more than one input file, you can use `--mfrom`, then however many file names, then `--` to indicate the end of your input-file-name list:

View file

@ -86,6 +86,9 @@ type hit_mean
A1 0.8571428571428571
A4 0.7142857142857143
A9 0.09090909090909091
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -102,6 +105,9 @@ time batch_size num_filtered time_delta num_filtered_delta
2016-09-02T12:35:20Z 100 554 7 61
2016-09-02T12:35:36Z 100 612 16 58
2016-09-02T12:35:42Z 100 728 6 116
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Alternatively, we can simply group the similar data for a better look:
@ -158,6 +164,9 @@ time batch_size num_filtered
1472819720 100 554
1472819736 100 612
1472819742 100 728
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -212,6 +221,9 @@ time batch_size num_filtered
2016-09-02T12:35:20Z 100 554
2016-09-02T12:35:36Z 100 612
2016-09-02T12:35:42Z 100 728
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Parsing log-file output

View file

@ -41,6 +41,9 @@ purple triangle false 7 65 80.1405 5.8240 466.738272
yellow circle true 8 73 63.9785 4.2370 271.0769045
yellow circle true 9 87 63.5058 8.3350 529.3208430000001
purple square false 10 91 72.3735 8.2430 596.5747605000001
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
When we type that, a few things are happening:
@ -69,6 +72,9 @@ purple triangle false 7 6500 80.1405 5.8240 466.738272
yellow circle true 8 7300 63.9785 4.2370 271.0769045
yellow circle true 9 8700 63.5058 8.3350 529.3208430000001
purple square false 10 9100 72.3735 8.2430 596.5747605000001
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -89,6 +95,9 @@ purple triangle false 7 6500 80.1405 5.8240 466.738272
yellow circle true 8 7300 63.9785 4.2370 271.0769045
yellow circle true 9 8700 63.5058 8.3350 529.3208430000001
purple square false 10 9100 72.3735 8.2430 596.5747605000001
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
One of Miller's key features is the ability to express data-transformation right there at the keyboard, interactively. But if you find yourself using expressions repeatedly, you can put everything between the single quotes into a file and refer to that using `put -f`:
@ -116,6 +125,9 @@ purple triangle false 7 6500 80.1405 5.8240 466.738272
yellow circle true 8 7300 63.9785 4.2370 271.0769045
yellow circle true 9 8700 63.5058 8.3350 529.3208430000001
purple square false 10 9100 72.3735 8.2430 596.5747605000001
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This becomes particularly important on Windows. Quite a bit of effort was put into making Miller on Windows be able to handle the kinds of single-quoted expressions we're showing here, but if you get syntax-error messages on Windows using examples in this documentation, you can put the parts between single quotes into a file and refer to that using `mlr put -f` -- or, use the triple-double-quote trick as described in the [Miller on Windows page](miller-on-windows.md).
@ -146,6 +158,9 @@ purple square false 10 91 72.3735 8.2430
sum
652.7185
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If you want the end-block output to be the only output, and not include the records from the input data, you can use `mlr put -q`:
@ -156,6 +171,9 @@ If you want the end-block output to be the only output, and not include the reco
<pre class="pre-non-highlight-in-pair">
sum
652.7185
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -167,6 +185,9 @@ sum
"sum": 652.7185
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -184,6 +205,9 @@ sum
"sum": 652.7185
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
We'll see in the documentation for [stats1](reference-verbs.md#stats1) that there's a lower-keystroking way to get counts and sums of things:
@ -198,6 +222,9 @@ We'll see in the documentation for [stats1](reference-verbs.md#stats1) that ther
"quantity_count": 10
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
So, take this sum/count example as an indication of the kinds of things you can do using Miller's programming language.
@ -249,6 +276,9 @@ a b c nf nr fnr filename filenum newnf
1 2 3 3 1 1 data/a.csv 1 8
4 5 6 3 2 2 data/a.csv 1 8
7 8 9 3 3 1 data/b.csv 2 8
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Functions and local variables
@ -283,6 +313,9 @@ purple triangle false 7 65 80.1405 5.8240 5040
yellow circle true 8 73 63.9785 4.2370 40320
yellow circle true 9 87 63.5058 8.3350 362880
purple square false 10 91 72.3735 8.2430 3628800
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that here we used the `-f` flag to `put` to load our function
@ -320,6 +353,9 @@ end {
<pre class="pre-non-highlight-in-pair">
count_of_red sum_of_red
4 247.84139999999996
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Miller's else-if is spelled `elif`.
@ -350,6 +386,9 @@ print
a,b,c
1,2,3
4,5,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -364,6 +403,9 @@ KEY IS a VALUE IS 4
KEY IS b VALUE IS 5
KEY IS c VALUE IS 6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Here we used the local variables `k` and `v`. Now we've seen four kinds of variables:
@ -416,6 +458,9 @@ For example, you can sum up all the `$a` values across records without having to
"b": 5
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -438,4 +483,7 @@ For example, you can sum up all the `$a` values across records without having to
"sum_of_a": 5
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -177,9 +177,14 @@ And, suppose you want to compute the differences in the counters between adjacen
First, rename counter columns to make them distinct:
<pre class="pre-highlight-non-pair">
<pre class="pre-highlight-in-pair">
<b>mlr --csv rename count,previous_count data/previous_counters.csv > data/prevtemp.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>cat data/prevtemp.csv</b>
@ -192,9 +197,14 @@ orange,694
purple,12
</pre>
<pre class="pre-highlight-non-pair">
<pre class="pre-highlight-in-pair">
<b>mlr --csv rename count,current_count data/current_counters.csv > data/currtemp.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>cat data/currtemp.csv</b>
@ -223,6 +233,9 @@ orange 694 670 -24
yellow 0 27 (error)
blue 6838 6944 106
purple 12 0 (error)
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also the [record-heterogeneity page](record-heterogeneity.md).

View file

@ -165,6 +165,9 @@ purple,square,false,10,91,72.3735,8.2430
yellow,triangle,true,1,11,43.6498,9.8870
purple,triangle,false,5,51,81.2290,8.5910
purple,triangle,false,7,65,80.1405,5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Data processing
@ -226,6 +229,9 @@ For example (see [https://github.com/johnkerl/miller/issues/178](https://github.
"a": "0123"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -238,6 +244,9 @@ For example (see [https://github.com/johnkerl/miller/issues/178](https://github.
"y": 1.230000000
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Deduping of repeated field names
@ -339,6 +348,9 @@ This works in Miller 6 (and worked in Miller 5 as well) and is supported:
</pre>
<pre class="pre-non-highlight-in-pair">
input=1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Please see the [section on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf)

View file

@ -40,6 +40,9 @@ a_b_c,def,g_h_i
123,4567,890
2468,1357,3579
9987,3312,4543
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -50,6 +53,9 @@ a_b_c def g_h_i
123 4567 890
2468 1357 3579
9987 3312 4543
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can also do this with a for-loop:
@ -73,6 +79,9 @@ a_b_c def g_h_i
123 4567 890
2468 1357 3579
9987 3312 4543
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Bulk rename of fields with carriage returns
@ -106,6 +115,9 @@ field A,field B
1,2
3,3
6,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Search-and-replace over all fields
@ -137,6 +149,9 @@ for (k in $*) {
a,b,c
thX quick,brown fox,jumpXd
ovXr,thX,lazy dogs
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Full field renames and reassigns
@ -177,4 +192,7 @@ z=0.758679,KEYFIELD=eks,i=3,b=pan,y=0.758679,x=0.522151
z=0.204603,KEYFIELD=wye,i=6,b=wye,y=0.204603,x=0.338318
z=0.381399,KEYFIELD=eks,i=10,b=wye,y=0.381399,x=0.134188
z=0.573288,KEYFIELD=wye,i=15,b=pan,y=0.573288,x=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -86,6 +86,9 @@ after all the input is read.
"sum": 119
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
And if all we want is the final output and not the input data, we can use `put
@ -111,6 +114,9 @@ And if all we want is the final output and not the input data, we can use `put
"sum": 119
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
As discussed a bit more on the page on [streaming processing and memory
@ -173,6 +179,9 @@ cat,54
"sum": 119
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The downside to this, of course, is that this retains all records (plus data-structure overhead) in memory, so you're limited to processing files that fit in your computer's memory. The upside, though, is that you can do random access over the records using things like
@ -232,6 +241,9 @@ The third option is to retain records in an [array](reference-main-arrays.md), t
"sum": 119
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Just as with the retain-as-map approach, the downside is the overhead of
@ -276,6 +288,9 @@ array will have [null-gaps](reference-main-arrays.md) in it:
]
[
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can index `@records` by `@count` rather than `NR` to get a contiguous array:
@ -319,6 +334,9 @@ You can index `@records` by `@count` rather than `NR` to get a contiguous array:
"sum": 91
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If you use a map to retain records, then this is a non-issue: maps can retain whatever values you like:
@ -360,6 +378,9 @@ If you use a map to retain records, then this is a non-issue: maps can retain wh
"sum": 91
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Do note that Miller [maps](reference-main-maps.md) preserve insertion order, so
@ -404,6 +425,9 @@ interested in:
"sum": 91
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Sorting

View file

@ -53,6 +53,9 @@ Robert,"Bob,Bobby,Biker","2,4,6"
"codes": "2,4,6"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Then we can use the [`splita`](reference-dsl-builtin-functions.md#splita) function to split the
@ -74,6 +77,9 @@ Then we can use the [`splita`](reference-dsl-builtin-functions.md#splita) functi
"codes": "2,4,6"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Likewise we can split the `codes` field. Since these look like numbers, we can again use `splita`
@ -97,6 +103,9 @@ substrings, with no type inference:
"codes": [2, 4, 6]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -115,6 +124,9 @@ substrings, with no type inference:
"codes": ["2", "4", "6"]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
We can do operations on the array, then use [joinv](reference-dsl-builtin-functions.md#joinv) to put them
@ -140,6 +152,9 @@ back together:
"codes": "200,400,600"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -153,6 +168,9 @@ back together:
name,nicknames,codes
Alice,"Allie,Skater","100,300,500"
Robert,"Bob,Bobby,Biker","200,400,600"
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The full list of split functions includes
@ -195,6 +213,9 @@ host,status
xy01.east,up
ab02.west,down
ac91.west,up
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Flatten/unflatten: representing arrays in CSV
@ -219,6 +240,9 @@ _flatten/unflatten strategy_: array-valued fields are turned into multiple CSV c
"codes": ["2", "4", "6"]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -228,6 +252,9 @@ _flatten/unflatten strategy_: array-valued fields are turned into multiple CSV c
name,nicknames,codes.1,codes.2,codes.3
Alice,"Allie,Skater",1,3,5
Robert,"Bob,Bobby,Biker",2,4,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See the [flatten/unflatten: converting between JSON and tabular formats](flatten-unflatten.md)
@ -279,6 +306,9 @@ stamp,event
"pieces": [5, 19, "07", 56]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -293,6 +323,9 @@ stamp event description
5-18:53:22 close 5 day(s) 18 hour(s) 53 minute(s) 22 seconds(s)
5-19:07:34 open 5 day(s) 19 hour(s) 07 minute(s) 34 seconds(s)
5-19:07:56 close 5 day(s) 19 hour(s) 07 minute(s) 56 seconds(s)
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Using regular expressions and capture groups
@ -312,6 +345,9 @@ stamp event description
5-18:53:22 close 5 day(s) 18 hour(s) 53 minute(s) 22 seconds(s)
5-19:07:34 open 5 day(s) 19 hour(s) 07 minute(s) 34 seconds(s)
5-19:07:56 close 5 day(s) 19 hour(s) 07 minute(s) 56 seconds(s)
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Special case: timestamps
@ -337,6 +373,9 @@ sec dhms
100 1m40s
10000 2h46m40s
1000000 11d13h46m40s
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Please see
@ -353,6 +392,9 @@ One way to handle currencies is to sub out the currency marker (like `$`) as wel
</pre>
<pre class="pre-non-highlight-in-pair">
d=1234.56
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Nesting and unnesting fields
@ -368,6 +410,9 @@ For example:
name nicknames codes
Alice Allie,Skater 1,3,5
Robert Bob,Bobby,Biker 2,4,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -380,6 +425,9 @@ Alice Skater 1,3,5
Robert Bob 2,4,6
Robert Bobby 2,4,6
Robert Biker 2,4,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See [documentation on the nest verb](reference-verbs.md#nest) for general information on how to do this.

View file

@ -16,6 +16,8 @@ Quick links:
</div>
# Performance
See also the [performance-benchmarks section](new-in-miller-6.md#performance-benchmarks).
## Disclaimer
In a previous version of this page, I compared Miller to some items in the Unix toolkit in terms of run time. But such comparisons are very much not apples-to-apples:

View file

@ -1,5 +1,7 @@
# Performance
See also the [performance-benchmarks section](new-in-miller-6.md#performance-benchmarks).
## Disclaimer
In a previous version of this page, I compared Miller to some items in the Unix toolkit in terms of run time. But such comparisons are very much not apples-to-apples:

View file

@ -89,6 +89,9 @@ end {
83
89
97
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Mandelbrot-set generator
@ -228,6 +231,9 @@ CHARS = @X*o-.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
But using a very small font size (as small as my Mac will let me go), and by choosing the coordinates to zoom in on a particular part of the complex plane, we can get a nice little picture:

View file

@ -30,6 +30,9 @@ hostname ipaddr
nadir.east.our.org 10.3.1.18
zenith.west.our.org 10.3.1.27
apoapsis.east.our.org 10.4.5.94
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -46,6 +49,9 @@ ipaddr timestamp bytes
10.3.1.27 1448762599 0
10.3.1.18 1448762598 73425
10.4.5.94 1448762599 12200
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -57,6 +63,9 @@ ipaddr hostname timestamp bytes
10.4.5.94 apoapsis.east.our.org 1448762579 17445
10.4.5.94 apoapsis.east.our.org 1448762589 8899
10.4.5.94 apoapsis.east.our.org 1448762599 12200
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The issue is that Miller's `join`, by default (before 5.1.0), took input sorted (lexically ascending) by the sort keys on both the left and right files. This design decision was made intentionally to parallel the Unix/Linux system `join` command, which has the same semantics. The benefit of this default is that the joiner program can stream through the left and right files, needing to load neither entirely into memory. The drawback, of course, is that it requires sorted input.
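
To make the streaming argument concrete, here is a minimal Go sketch of a sorted merge join. It is an illustration of the general technique, not Miller's code: the `rec` and `pair` types and the `mergeJoin` function are hypothetical, and slices stand in for the two input files. The loop only ever looks at the current position in each input (plus the run of right-hand records sharing the current key), which is why neither side needs to be held in memory in full, and also why unsorted input silently produces missed pairings.

```go
package main

import "fmt"

// rec is one input record reduced to its join key and a payload.
type rec struct{ key, val string }

// pair is one joined output row.
type pair struct{ key, left, right string }

// mergeJoin assumes both inputs are sorted lexically ascending by key.
// It advances whichever side has the smaller key and emits pairs on a match.
func mergeJoin(left, right []rec) []pair {
	var out []pair
	i, j := 0, 0
	for i < len(left) && j < len(right) {
		switch {
		case left[i].key < right[j].key:
			i++ // left record has no partner yet; move on
		case left[i].key > right[j].key:
			j++ // right record has no partner; move on
		default:
			key := left[i].key
			// Find the run of right records sharing this key.
			jEnd := j
			for jEnd < len(right) && right[jEnd].key == key {
				jEnd++
			}
			// Pair every left record with this key against that run.
			for ; i < len(left) && left[i].key == key; i++ {
				for k := j; k < jEnd; k++ {
					out = append(out, pair{key, left[i].val, right[k].val})
				}
			}
			j = jEnd
		}
	}
	return out
}

func main() {
	left := []rec{
		{"10.3.1.18", "nadir.east.our.org"},
		{"10.4.5.94", "apoapsis.east.our.org"},
	}
	right := []rec{
		{"10.3.1.18", "73425"},
		{"10.3.1.27", "0"},
		{"10.4.5.94", "17445"},
		{"10.4.5.94", "8899"},
	}
	for _, p := range mergeJoin(left, right) {
		fmt.Println(p.key, p.left, p.right)
	}
}
```

If the right-hand slice were not sorted, the comparison step would skip past keys that appear later out of order, which matches the kind of missing output rows shown in the example above.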
@ -77,6 +86,9 @@ ipaddr hostname timestamp bytes
10.3.1.27 zenith.west.our.org 1448762599 0
10.3.1.18 nadir.east.our.org 1448762598 73425
10.4.5.94 apoapsis.east.our.org 1448762599 12200
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
General advice is to make sure the left-file is relatively small, e.g. containing name-to-number mappings, while saving large amounts of data for the right file.
@ -107,6 +119,9 @@ Joining on color the results are as expected:
id,code,color
4,ff0000,red
2,00ff00,green
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
However, if we ask for left-unpaireds, since there's no `color` column, we get a row not having the same column names as the other:
@ -121,6 +136,9 @@ id,code,color
id,code
3,0000ff
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
To fix this, we can use **unsparsify**:
@ -135,6 +153,9 @@ id,code,color
4,ff0000,red
2,00ff00,green
3,0000ff,
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Thanks to @aborruso for the tip!
@ -199,4 +220,7 @@ id status name task
20 idle Carol mix
10 idle Bob knead
30 occupied Alice clean
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -45,6 +45,9 @@ paid cash 2
pending debit 1
pending credit 1
paid debit 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
After that, run it with the next `then` step included:
@ -59,6 +62,9 @@ paid cash 2
pending debit 1
pending credit 1
paid debit 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Now if you use `then` to include another verb after that, the columns `Status`, `Payment_Type`, and `count` will be the input to that verb.
@ -75,6 +81,12 @@ paid cash 2
pending debit 1
pending credit 1
paid debit 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## NR is not consecutive after then-chaining
@ -100,6 +112,9 @@ why don't I see `NR=1` and `NR=2` here??
<pre class="pre-non-highlight-in-pair">
a=eks,b=pan,i=2,x=0.758679,y=0.522151,NR=2
a=wye,b=pan,i=5,x=0.573288,y=0.863624,NR=5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The reason is that `NR` is computed for the original input records and isn't dynamically updated. By contrast, `NF` is dynamically updated: it's the number of fields in the current record, and if you add/remove a field, the value of `NF` will change:
@ -109,6 +124,9 @@ The reason is that `NR` is computed for the original input records and isn't dyn
</pre>
<pre class="pre-non-highlight-in-pair">
nf1=3,u=4,nf2=5,nf3=3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
`NR`, by contrast (and `FNR` as well), retains the value from the original input stream, and records may be dropped by a `filter` within a `then`-chain. To recover consecutive record numbers, you can use out-of-stream variables as follows:
@ -130,6 +148,9 @@ nf1=3,u=4,nf2=5,nf3=3
a b i x y nr1 nr2
eks pan 2 0.758679 0.522151 2 1
wye pan 5 0.573288 0.863624 5 2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Or, simply use `mlr cat -n`:
@ -140,4 +161,7 @@ Or, simply use `mlr cat -n`:
<pre class="pre-non-highlight-in-pair">
n=1,a=eks,b=pan,i=2,x=0.758679,y=0.522151
n=2,a=wye,b=pan,i=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -117,6 +117,9 @@ bin_lo bin_hi u_count s_count
1.88 1.92 [64]#...................[9554] [326]#...................[3703]
1.92 1.96 [64]#...................[9554] [326]#...................[3703]
1.96 2 [64]#...................[9554] [326]#...................[3703]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Randomly selecting words from a list

View file

@ -41,6 +41,9 @@ a,b,c
1,2,3
4,5,6
7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
It has three records (written here using JSON Lines formatting):
@ -52,6 +55,9 @@ It has three records (written here using JSON Lines formatting):
{"a": 1, "b": 2, "c": 3}
{"a": 4, "b": 5, "c": 6}
{"a": 7, "b": 8, "c": 9}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Here every row has the same keys, in the same order: `a,b,c`.
@ -66,6 +72,9 @@ a b c
1 2 3
4 5 6
7 8 9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Fillable data
@ -80,6 +89,9 @@ a,b,c
1,2,3
4,,6
,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -89,6 +101,9 @@ a,b,c
{"a": 1, "b": 2, "c": 3}
{"a": 4, "b": "", "c": 6}
{"a": "", "b": 8, "c": 9}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This example is still homogeneous, though: every row has the same keys, in the same order: `a,b,c`.
@ -105,6 +120,9 @@ a b c
1 2 3
4 filler 6
filler 8 9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Ragged data
@ -162,6 +180,9 @@ with 1) for too-long rows:
"4": 10
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Irregular data
@ -199,6 +220,9 @@ the keys:
{"a": 1, "b": 2, "c": 3}
{"a": 4, "b": 5, "c": 6}
{"a": 7, "b": 8, "c": 9}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The `regularize` verb tries to re-order subsequent rows to look like the first
@ -232,6 +256,9 @@ data for items which are present, but won't log data for items which aren't.
"reimaged": true
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This data is called **sparse** (from the [data-storage term](https://en.wikipedia.org/wiki/Sparse_matrix)).
@ -266,6 +293,9 @@ every record has the same keys:
"reimaged": true
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Since this data is now homogeneous (rectangular), it pretty-prints nicely:
@ -278,6 +308,9 @@ host status volume purpose reimaged
xy01.east running /dev/sda1 - -
xy92.west running - - -
xy55.east - /dev/sda1 failover true
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Reading and writing heterogeneous data
@ -317,6 +350,9 @@ For these formats, record-heterogeneity comes naturally:
xy01.east running /dev/sda1
xy92.west running
failover xy55.east /dev/sda1 true
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -334,6 +370,9 @@ purpose failover
host xy55.east
volume /dev/sda1
reimaged true
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -343,6 +382,9 @@ reimaged true
host=xy01.east,status=running,volume=/dev/sda1
host=xy92.west,status=running
purpose=failover,host=xy55.east,volume=/dev/sda1,reimaged=true
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Even then, we may wish to put like with like, using the [`group-like`](reference-verbs.md#group-like) verb:
@ -356,6 +398,9 @@ record_count=100,resource=/path/to/file
resource=/path/to/second/file,loadsec=0.32,ok=true
record_count=150,resource=/path/to/second/file
resource=/some/other/path,loadsec=0.97,ok=false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -367,6 +412,9 @@ resource=/path/to/second/file,loadsec=0.32,ok=true
resource=/some/other/path,loadsec=0.97,ok=false
record_count=100,resource=/path/to/file
record_count=150,resource=/path/to/second/file
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
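The grouping idea can be sketched in Python as below. This is an assumption-laden model rather than Miller's code: "like" is taken to mean "same field names in the same order", and only the four records visible above are used as input, so the group order here differs from the full example.

```python
# Four of the records shown above; the complete input file is not reproduced.
records = [
    {"record_count": 100, "resource": "/path/to/file"},
    {"resource": "/path/to/second/file", "loadsec": 0.32, "ok": True},
    {"record_count": 150, "resource": "/path/to/second/file"},
    {"resource": "/some/other/path", "loadsec": 0.97, "ok": False},
]

# Bucket records by their ordered field names; dicts preserve first-seen order,
# so groups come out in the order their first member appeared.
groups = {}
for rec in records:
    groups.setdefault(tuple(rec), []).append(rec)

grouped = [rec for group in groups.values() for rec in group]
for rec in grouped:
    print(rec)
```

Within each group the original record order is preserved, which matches the output above where the two `record_count` records stay in input order.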
### Rectangular file formats: CSV and pretty-print
@ -429,6 +477,9 @@ record_count resource
resource loadsec ok
/some/other/path 0.97 false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -443,6 +494,9 @@ resource loadsec ok
record_count resource
100 /path/to/file
150 /path/to/second/file
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Miller handles explicit header changes as just shown. If your CSV input contains ragged data -- if there are implicit header changes (no intervening blank line and new header line) as seen above -- you can use `--allow-ragged-csv-input` (or keystroke-saver `--ragged`).
@ -457,6 +511,9 @@ a,b,c
a,b,c,4
7,8,9,10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Processing heterogeneous data
@ -493,4 +550,7 @@ count=300,color=blue
count=450
count=500,color=green
count=600
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -37,6 +37,9 @@ PURPLE tr**ngl* false 7 65 80.1405 5.8240
YELLOW c*rcl* true 8 73 63.9785 4.2370
YELLOW c*rcl* true 9 87 63.5058 8.3350
PURPLE sq**r* false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
the `toupper` and `gsub` bits are _functions_.

View file

@ -29,6 +29,9 @@ x=0
x=1
x=2
x=3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -40,6 +43,9 @@ x=0
x=1,y=0,z=0
x=2,y=0.3010299956639812,z=0.5486620049392715
x=3,y=0.4771212547196624,z=0.6907396432228734
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -49,6 +55,9 @@ x=3,y=0.4771212547196624,z=0.6907396432228734
a=abc_123
a=some other name
a=xyz_789
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -62,6 +71,9 @@ a=xyz_789
a=abc_123,b=left_abc,c=right_123
a=some other name
a=xyz_789,b=left_xyz,c=right_789
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This produces heterogeneous output which Miller, of course, has no problems with (see [Record Heterogeneity](record-heterogeneity.md)). But if you want homogeneous output, the curly braces can be replaced with a semicolon between the expression and the body statements. This causes `put` to evaluate the boolean expression (along with any side effects, namely, regex-captures `\1`, `\2`, etc.) but doesn't use it as a criterion for whether subsequent assignments should be executed. Instead, subsequent assignments are done unconditionally:
@ -78,6 +90,9 @@ a b c
abc_123 left_abc right_123
some other name left_ right_
xyz_789 left_xyz right_789
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
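The two behaviors can be compared side by side in a small Python model. The regex `^(.*)_(.*)$` is an assumption, since the original expression is not shown here, and the function names are hypothetical. The sketch treats captures from a failed match as empty strings, which is what the `left_` and `right_` values above suggest.

```python
import re

PATTERN = re.compile(r"^(.*)_(.*)$")  # assumed regex, not taken from the source

def guarded(rec):
    """Curly-brace form: assign b and c only when the pattern matches."""
    m = PATTERN.match(rec["a"])
    if m:
        return {**rec, "b": "left_" + m.group(1), "c": "right_" + m.group(2)}
    return dict(rec)

def unconditional(rec):
    """Semicolon form: evaluate the match, then assign regardless of the result."""
    m = PATTERN.match(rec["a"])
    first, second = (m.group(1), m.group(2)) if m else ("", "")
    return {**rec, "b": "left_" + first, "c": "right_" + second}

rows = [{"a": "abc_123"}, {"a": "some other name"}, {"a": "xyz_789"}]
print([guarded(r) for r in rows])        # middle record has only the a field
print([unconditional(r) for r in rows])  # every record has a, b and c
```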
Note that pattern-action blocks are just a syntactic variation of if-statements. The following do the same thing:
@ -136,6 +151,9 @@ Miller's `while` and `do-while` are unsurprising in comparison to various langua
</pre>
<pre class="pre-non-highlight-in-pair">
x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -151,6 +169,9 @@ x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
</pre>
<pre class="pre-non-highlight-in-pair">
x=1,y=2,3=,4=,5=,foo=bar
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
A `break` or `continue` within nested conditional blocks or if-statements will,
@ -219,6 +240,9 @@ NR = 5
key: i value: 5
key: x value: 0.573288
key: y value: 0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -234,6 +258,9 @@ NR = 5
<pre class="pre-non-highlight-in-pair">
key: a valuetype: int
key: b valuetype: map
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that the value corresponding to a given key may be obtained through a **computed field name** using square brackets as in `$[e]` for stream records, or by indexing the looped-over variable using square brackets.
@ -256,6 +283,9 @@ value: 20 valuetype: string
value: {} valuetype: map
value: four valuetype: string
value: true valuetype: bool
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Key-value for-loops
@ -294,6 +324,9 @@ label1 label2 f1 f2 f3 sum1 sum2 sum3
blue green 100 240 350 690 690 690
red green 120 11 195 326 326 326
yellow blue 140 0 240 380 380 380
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -306,6 +339,9 @@ eks pan 2 0.758679 0.522151 string string int float float
wye wye 3 0.204603 0.338318 string string int float float
eks wye 4 0.381399 0.134188 string string int float float
wye pan 5 0.573288 0.863624 string string int float float
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that the value of the current field in the for-loop can be gotten either using the bound variable `value`, or through a **computed field name** using square brackets as in `$[key]`.
@ -331,6 +367,9 @@ eks pan 2 0.758679 0.522151 3.28083 13.12332
wye wye 3 0.204603 0.338318 3.542921 14.171684
eks wye 4 0.381399 0.134188 4.515587 18.062348
wye pan 5 0.573288 0.863624 6.4369119999999995 25.747647999999998
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
It can be confusing to modify the stream record while iterating over a copy of it, so instead you might find it simpler to use a local variable in the loop and only update the stream record after the loop:
@ -353,6 +392,9 @@ eks pan 2 0.758679 0.522151 3.28083
wye wye 3 0.204603 0.338318 3.542921
eks wye 4 0.381399 0.134188 4.515587
wye pan 5 0.573288 0.863624 6.4369119999999995
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
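The accumulate-then-assign pattern looks like this in a Python sketch. The record is one of the rows above; the output field name `sum` is an assumption, since the header row is not shown here.

```python
record = {"a": "eks", "b": "pan", "i": 2, "x": 0.758679, "y": 0.522151}

total = 0  # local accumulator; the record is left untouched during the loop
for key, value in record.items():
    # Only numeric fields contribute; bool is excluded because it subclasses int.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        total += value

record["sum"] = total  # single update after the loop (field name is hypothetical)
print(record)
```

Because the record changes only once, after iteration has finished, there is no question of whether the loop sees the newly added field.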
You can also start iterating on sub-maps of an out-of-stream or local variable; you can loop over nested keys; you can loop over all out-of-stream variables. The bound variables are bound to a copy of the sub-map as it was before the loop started. The sub-map is specified by square-bracketed indices after `in`, and additional deeper indices are bound to loop key-variables. The terminal values are bound to the loop value-variable whenever the keys are not too shallow. The value-variable may refer to a terminal (string, number) or it may be map-valued if the map goes deeper. Example indexing is as follows:
@ -396,6 +438,9 @@ That's confusing in the abstract, so a concrete example is in order. Suppose the
}
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Then we can get at various values as follows:
@ -422,6 +467,9 @@ Then we can get at various values as follows:
key=1,valuetype=int
key=3,valuetype=map
key=6,valuetype=map
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -446,6 +494,9 @@ key=6,valuetype=map
<pre class="pre-non-highlight-in-pair">
key1=3,key2=4,valuetype=int
key1=6,key2=7,valuetype=map
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -469,6 +520,9 @@ key1=6,key2=7,valuetype=map
</pre>
<pre class="pre-non-highlight-in-pair">
key1=7,key2=8,valuetype=int
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### C-style triple-for loops
@ -491,6 +545,9 @@ eks pan 2 0.758679 0.522151 3
wye wye 3 0.204603 0.338318 6
eks wye 4 0.381399 0.134188 10
wye pan 5 0.573288 0.863624 15
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -512,6 +569,9 @@ eks pan 2 0.758679 0.522151 3 3
wye wye 3 0.204603 0.338318 6 7
eks wye 4 0.381399 0.134188 10 15
wye pan 5 0.573288 0.863624 15 31
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Notes:
@ -544,6 +604,9 @@ a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
x_sum=2.26476
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Since uninitialized out-of-stream variables default to 0 for addition/subtraction and 1 for multiplication when they appear on expression right-hand sides (not quite as in `awk`, where they'd default to 0 either way), the above can be written more succinctly as
@ -561,6 +624,9 @@ a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
x_sum=2.26476
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
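One way to picture these defaults is an "absent" placeholder that yields to the other operand, which behaves like 0 under addition and like 1 under multiplication. The Python below is a sketch of that idea under stated assumptions, not Miller's implementation; `ABSENT`, `add`, and `mul` are hypothetical names.

```python
ABSENT = object()  # stands in for an out-of-stream variable that was never assigned

def add(left, right):
    """Addition where an absent operand acts like 0."""
    if left is ABSENT:
        return right
    if right is ABSENT:
        return left
    return left + right

def mul(left, right):
    """Multiplication where an absent operand acts like 1."""
    if left is ABSENT:
        return right
    if right is ABSENT:
        return left
    return left * right

xs = [0.346791, 0.758679, 0.204603, 0.381399, 0.573288]  # x values from the examples
oosvars = {}
for x in xs:
    # No explicit initialization of x_sum is needed before the first record.
    oosvars["x_sum"] = add(oosvars.get("x_sum", ABSENT), x)

print(round(oosvars["x_sum"], 6))
```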
The **put -q** option suppresses printing of each output record, with only `emit` statements being output. So to get only summary outputs, you could write
@ -573,6 +639,9 @@ The **put -q** option suppresses printing of each output record, with only `emit
</pre>
<pre class="pre-non-highlight-in-pair">
x_sum=2.26476
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
We can do similarly with multiple out-of-stream variables:
@ -590,6 +659,9 @@ We can do similarly with multiple out-of-stream variables:
<pre class="pre-non-highlight-in-pair">
x_count=5
x_sum=2.26476
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This is of course (see also [here](reference-dsl.md#verbs-compared-to-dsl)) not much different than
@ -599,6 +671,9 @@ This is of course (see also [here](reference-dsl.md#verbs-compared-to-dsl)) not
</pre>
<pre class="pre-non-highlight-in-pair">
x_count=5,x_sum=2.26476
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that it's a syntax error for begin/end blocks to refer to field names (beginning with `$`), since begin/end blocks execute outside the context of input records.

View file

@ -44,7 +44,7 @@ semicolon where one is needed . The parser tries to remind you about semicolons
whenever there's a chance a missing semicolon might be involved in a parse
error.
<pre class="pre-highlight-non-pair">
<pre class="pre-highlight-in-pair">
<b>mlr --csv --from example.csv put -q '</b>
<b> begin {</b>
<b> @count = 0 # No semicolon required -- before closing curly brace</b>
@ -52,6 +52,11 @@ error.
<b> $x=1 # No semicolon required -- at end of expression</b>
<b>'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --csv --from example.csv put -q '</b>
@ -171,6 +176,9 @@ avoid this, use the dot operator for string-concatenation instead.
<pre class="pre-non-highlight-in-pair">
[ a b c ]
[abc]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Similarly, a final newline is printed for you; use [`printn`](reference-dsl-output-statements.md#print-statements) to avoid this.
@ -222,6 +230,9 @@ word,value
apple,37
ball,28
cat,54
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -238,6 +249,9 @@ cat,54
Record 1 has word apple
Record 2 has word ball
Record 3 has word cat
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Also, slices for arrays and strings are _doubly inclusive_: `x[3:5]` gets you

View file

@ -25,6 +25,9 @@ You can use the `filter` DSL keyword within the `put` verb. In fact, the followi
color,shape,flag,k,index,quantity,rate
red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -34,6 +37,9 @@ red,circle,true,3,16,13.8103,2.9010
color,shape,flag,k,index,quantity,rate
red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The former, of course, is a little easier to type. For another example:
@ -46,6 +52,9 @@ color,shape,flag,k,index,quantity,rate
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -56,4 +65,7 @@ color,shape,flag,k,index,quantity,rate
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -78,6 +78,9 @@ Evens:
Odds:
[9, 3, 1, 5, 7]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Map examples:
@ -119,6 +122,9 @@ Values with last digit >= 5:
"apple": 199,
"bottle": 107
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## apply
@ -169,6 +175,9 @@ Cubes:
Sorted cubes:
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -228,6 +237,9 @@ Same, with upcased keys:
"DALE": 2197,
"EMBER": 6967871
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## reduce
@ -292,6 +304,9 @@ Product of values:
Concatenation of values:
2,9,10,3,1,4,5,8,7,6
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -366,6 +381,9 @@ String-join of values:
{
"joined": "823,13,199,191,107"
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## fold
@ -409,6 +427,9 @@ Sum with fold and 0 initial value:
Sum with fold and 1000000 initial value:
1000055
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -465,6 +486,9 @@ Sum of values with fold and 1000000 initial value:
{
"sum": 1001333
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## sort
@ -519,6 +543,9 @@ Ascending:
Descending:
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Map examples:
@ -610,6 +637,9 @@ Descending by value:
"bottle": 107,
"dale": 13
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Please see the [sorting page](sorting.md) for more examples.
@ -633,6 +663,9 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -645,6 +678,9 @@ red circle true 3 16 13.8103 2.9010
red square false 4 48 77.5542 7.4670
red square false 6 64 77.1991 9.5310
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -655,6 +691,9 @@ color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
red square false 4 48 77.5542 7.4670
red square false 6 64 77.1991 9.5310
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -672,6 +711,9 @@ purple triangle false 7 65 80.1405 5.8240 false
yellow circle true 8 73 63.9785 4.2370 false
yellow circle true 9 87 63.5058 8.3350 false
purple square false 10 91 72.3735 8.2430 false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -682,6 +724,9 @@ color shape flag k index quantity rate
red circle true 3 16 13.8103 2.9010
purple triangle false 5 51 81.2290 8.5910
red square false 6 64 77.1991 9.5310
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This last example could also be done using a map:
@ -699,6 +744,9 @@ color shape flag k index quantity rate
red circle true 3 16 13.8103 2.9010
purple triangle false 5 51 81.2290 8.5910
red square false 6 64 77.1991 9.5310
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Combined examples
@ -722,6 +770,9 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -770,6 +821,9 @@ Sorted, then cubed:
Sorted, then cubed, then summed:
2589905
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Caveats
@ -792,6 +846,9 @@ instead of
</pre>
<pre class="pre-non-highlight-in-pair">
[3, 4, 5]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### No IIFEs
@ -831,6 +888,9 @@ but this does:
</pre>
<pre class="pre-non-highlight-in-pair">
2187
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
### Built-in functions currently unsupported as arguments
@ -871,4 +931,7 @@ but this does:
</pre>
<pre class="pre-non-highlight-in-pair">
[1, 0.9238795325112867, 0.7071067811865476, 0.38268343236508984]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -111,6 +111,9 @@ bar.baz
bar.baz
[
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This also works on the left-hand sides of assignment statements:
@ -144,6 +147,9 @@ This also works on the left-hand sides of assignment statements:
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
A few caveats:
@ -159,6 +165,9 @@ A few caveats:
6989
[
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* However (awkwardly), if you want to use `.` for map-traversal as well as string-concatenation in the same statement, you'll need to insert parentheses, as the default associativity is left-to-right:
@ -172,6 +181,9 @@ A few caveats:
(error)
[
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -183,4 +195,7 @@ A few caveats:
GET -- api/check
[
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -102,11 +102,19 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-non-pair">
<pre class="pre-highlight-in-pair">
<b>mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --csv cat circle.csv</b>
@ -116,6 +124,9 @@ color,shape,flag,k,index,quantity,rate
red,circle,true,3,16,13.8103,2.9010
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -127,6 +138,9 @@ red,square,true,2,15,79.2778,0.0130
red,square,false,4,48,77.5542,7.4670
red,square,false,6,64,77.1991,9.5310
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -137,6 +151,9 @@ color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
purple,triangle,false,5,51,81.2290,8.5910
purple,triangle,false,7,65,80.1405,5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also [Redirected-output statements](reference-dsl-output-statements.md#redirected-output-statements) for examples.
@ -384,6 +401,9 @@ id color shape flag k index quantity rate
8 yellow circle true 8 73 63.9785 4.2370
9 yellow circle true 9 87 63.5058 8.3350
10 purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
And if you want indexing, redirects, etc., just assign to a temporary variable and use one of the other emit variants:
@ -406,6 +426,9 @@ id color shape flag k index quantity rate
8 yellow circle true 8 73 63.9785 4.2370
9 yellow circle true 9 87 63.5058 8.3350
10 purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Emitf statements
@ -422,6 +445,9 @@ Use **emitf** to output several out-of-stream variables side-by-side in the same
</pre>
<pre class="pre-non-highlight-in-pair">
count=5,x_sum=2.26476,y_sum=2.585083
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Emit statements
@ -446,6 +472,9 @@ a=wye,b=pan,i=5,x=0.573288,y=0.863624
{
"sum": 2.26476
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -453,6 +482,9 @@ a=wye,b=pan,i=5,x=0.573288,y=0.863624
</pre>
<pre class="pre-non-highlight-in-pair">
sum=2.26476
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If it's indexed then use as many names after `emit` as there are indices:
@ -468,6 +500,9 @@ If it's indexed then use as many names after `emit` as there are indices:
"wye": 0.777891
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -477,6 +512,9 @@ If it's indexed then use as many names after `emit` as there are indices:
a=pan,sum=0.346791
a=eks,sum=1.140078
a=wye,sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -498,6 +536,9 @@ a=wye,sum=0.777891
}
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -509,6 +550,9 @@ a=eks,b=pan,sum=0.758679
a=eks,b=wye,sum=0.381399
a=wye,b=wye,sum=0.204603
a=wye,b=pan,sum=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -540,6 +584,9 @@ a=wye,b=pan,sum=0.573288
}
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -554,6 +601,9 @@ a=eks,b=pan,i=2,sum=0.758679
a=eks,b=wye,i=4,sum=0.381399
a=wye,b=wye,i=3,sum=0.204603
a=wye,b=pan,i=5,sum=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Now for **emitp**: if you have as many names following `emit` as there are levels in the out-of-stream variable's map, then `emit` and `emitp` do the same thing. Where they differ is when you don't specify as many names as there are map levels. In this case, Miller needs to flatten multiple map indices down to output-record keys: `emitp` includes full prefixing (hence the `p` in `emitp`) while `emit` takes the deepest map key as the output-record key:
@ -577,6 +627,9 @@ Now for **emitp**: if you have as many names following `emit` as there are level
}
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -586,6 +639,9 @@ Now for **emitp**: if you have as many names following `emit` as there are level
a=pan,pan=0.346791
a=eks,pan=0.758679,wye=0.381399
a=wye,wye=0.204603,pan=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -595,6 +651,9 @@ a=wye,wye=0.204603,pan=0.573288
pan=0.346791
pan=0.758679,wye=0.381399
wye=0.204603,pan=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -604,6 +663,9 @@ wye=0.204603,pan=0.573288
a=pan,sum.pan=0.346791
a=eks,sum.pan=0.758679,sum.wye=0.381399
a=wye,sum.wye=0.204603,sum.pan=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -611,6 +673,9 @@ a=wye,sum.wye=0.204603,sum.pan=0.573288
</pre>
<pre class="pre-non-highlight-in-pair">
sum.pan.pan=0.346791,sum.eks.pan=0.758679,sum.eks.wye=0.381399,sum.wye.wye=0.204603,sum.wye.pan=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -622,6 +687,9 @@ sum.eks.pan 0.758679
sum.eks.wye 0.381399
sum.wye.wye 0.204603
sum.wye.pan 0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
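The difference between the `emit` and `emitp` outputs above can be modeled with a short Python sketch. It is illustrative only: the helper names `flatten` and `emit_by` are hypothetical, and the separator `.` is chosen to match the output shown above.

```python
# Nested sums keyed by a, then b, using the values shown above.
sums = {
    "pan": {"pan": 0.346791},
    "eks": {"pan": 0.758679, "wye": 0.381399},
    "wye": {"wye": 0.204603, "pan": 0.573288},
}

def flatten(value, prefix, sep):
    """Join nested keys onto the prefix, as in the fully prefixed output."""
    if not isinstance(value, dict):
        return {prefix: value}
    flat = {}
    for key, sub in value.items():
        flat.update(flatten(sub, f"{prefix}{sep}{key}", sep))
    return flat

def emit_by(name, nested, by, prefixed, sep="."):
    """Split on the first map level; flatten the rest with or without the prefix."""
    out = []
    for first_key, rest in nested.items():
        rec = {by: first_key}
        if prefixed:
            rec.update(flatten(rest, name, sep))
        else:
            rec.update(rest)  # only one level remains here, so keys are used as-is
        out.append(rec)
    return out

print(emit_by("sum", sums, "a", prefixed=False))  # like emit @sum, "a"
print(emit_by("sum", sums, "a", prefixed=True))   # like emitp @sum, "a"
print(flatten(sums, "sum", "."))                  # like emitp @sum with no names
```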
Use **--flatsep** to specify the character which joins multilevel
@ -634,6 +702,9 @@ keys for `emitp` (it defaults to a dot, `.`):
a=pan,sum/pan=0.346791
a=eks,sum/pan=0.758679,sum/wye=0.381399
a=wye,sum/wye=0.204603,sum/pan=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -641,6 +712,9 @@ a=wye,sum/wye=0.204603,sum/pan=0.573288
</pre>
<pre class="pre-non-highlight-in-pair">
sum/pan/pan=0.346791,sum/eks/pan=0.758679,sum/eks/wye=0.381399,sum/wye/wye=0.204603,sum/wye/pan=0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -655,6 +729,9 @@ sum/eks/pan 0.758679
sum/eks/wye 0.381399
sum/wye/wye 0.204603
sum/wye/pan 0.573288
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Multi-emit statements
@ -701,6 +778,9 @@ hat zee 196.3494502965293 385 0.5099985721987774
hat eks 189.0067933716193 389 0.48587864619953547
hat hat 182.8535323148762 381 0.47993053101017374
hat pan 168.5538067327806 363 0.4643355557376876
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
What this does is walk through the first out-of-stream variable (`@x_sum` in this example) as usual, then for each keylist found (e.g. `pan,wye`), include the values for the remaining out-of-stream variables (here, `@x_count` and `@x_mean`). You should use this when all out-of-stream variables in the emit statement have **the same shape and the same keylists**.
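The walk described above can be sketched in Python as follows. This is a model of the idea rather than Miller's code: `walk`, `lookup`, and `lashed` are hypothetical helpers, and the data uses only the sums and counts that appear in this document's smaller examples.

```python
x_sum = {
    "pan": {"pan": 0.346791},
    "eks": {"pan": 0.758679, "wye": 0.381399},
    "wye": {"wye": 0.204603, "pan": 0.573288},
}
x_count = {
    "pan": {"pan": 1},
    "eks": {"pan": 1, "wye": 1},
    "wye": {"wye": 1, "pan": 1},
}

def walk(nested, keys=()):
    """Yield (keylist, terminal value) pairs in insertion order."""
    for key, value in nested.items():
        if isinstance(value, dict):
            yield from walk(value, keys + (key,))
        else:
            yield keys + (key,), value

def lookup(nested, keylist):
    """Follow a keylist into a map; raises KeyError if the shapes differ."""
    for key in keylist:
        nested = nested[key]
    return nested

def lashed(named_maps, by):
    """Walk the first map, then pull matching values from the remaining maps."""
    (lead_name, lead_map), *others = named_maps
    rows = []
    for keylist, value in walk(lead_map):
        row = dict(zip(by, keylist))
        row[lead_name] = value
        for name, other in others:
            row[name] = lookup(other, keylist)
        rows.append(row)
    return rows

rows = lashed([("x_sum", x_sum), ("x_count", x_count)], by=("a", "b"))
for row in rows:
    print(row)
```

The `KeyError` from `lookup` when a keylist is missing in a later map is this sketch's way of showing why the variables need the same shape and keylists.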
@ -723,6 +803,9 @@ eks pan 0.758679 1
eks wye 0.381399 1
wye wye 0.204603 1
wye pan 0.573288 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -746,6 +829,9 @@ eks pan 1
eks wye 1
wye wye 1
wye pan 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -762,4 +848,7 @@ eks pan 0.758679 1
eks wye 0.381399 1
wye wye 0.204603 1
wye pan 0.573288 1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -35,6 +35,9 @@ i j k
7 8 15
8 9 17
9 10 19
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Newlines within the expression are ignored, which can help increase legibility of complex expressions:
@ -60,6 +63,9 @@ wye eks 10000 0.734806020620654365 0.884788571337605134 5 7 2 2 data/s
pan wye 10001 0.870530722602517626 0.009854780514656930 5 8 3 2 data/small2
hat wye 10002 0.321507044286237609 0.568893318795083758 5 9 4 2 data/small2
pan zee 10003 0.272054845593895200 0.425789896597056627 5 10 5 2 data/small2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -70,6 +76,9 @@ pan zee 10003 0.272054845593895200 0.425789896597056627 5 10 5 2 data/s
<pre class="pre-non-highlight-in-pair">
x_y_corr
-0.7479940285189345
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Expressions from files
@ -85,6 +94,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,xy=0.9209970096813562
a=wye,b=wye,i=3,x=0.204603,y=0.338318,xy=0.3953750836016352
a=eks,b=wye,i=4,x=0.381399,y=0.134188,xy=0.40431623334340655
a=wye,b=pan,i=5,x=0.573288,y=0.863624,xy=1.036583592538489
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -96,6 +108,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,xy=0.9209970096813562
a=wye,b=wye,i=3,x=0.204603,y=0.338318,xy=0.3953750836016352
a=eks,b=wye,i=4,x=0.381399,y=0.134188,xy=0.40431623334340655
a=wye,b=pan,i=5,x=0.573288,y=0.863624,xy=1.036583592538489
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You may, though, find it convenient to put expressions into files for reuse, and read them
@ -120,6 +135,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,xy=0.9209970096813562
a=wye,b=wye,i=3,x=0.204603,y=0.338318,xy=0.3953750836016352
a=eks,b=wye,i=4,x=0.381399,y=0.134188,xy=0.40431623334340655
a=wye,b=pan,i=5,x=0.573288,y=0.863624,xy=1.036583592538489
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If you have some of the logic in a file and you want to write the rest on the command line, you can **use the -f and -e options together**:
@ -142,6 +160,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,xy=0.9209970096813562
a=wye,b=wye,i=3,x=0.204603,y=0.338318,xy=0.3953750836016352
a=eks,b=wye,i=4,x=0.381399,y=0.134188,xy=0.40431623334340655
a=wye,b=pan,i=5,x=0.573288,y=0.863624,xy=1.036583592538489
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
A suggested use-case here is defining functions in files, and calling them from command-line expressions.
@ -168,6 +189,9 @@ Semicolons are optional after closing curly braces (which close conditionals and
</pre>
<pre class="pre-non-highlight-in-pair">
x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -175,6 +199,9 @@ x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
</pre>
<pre class="pre-non-highlight-in-pair">
x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Semicolons are required between statements even if those statements are on separate lines. **Newlines** are for your convenience but have no syntactic meaning: line endings do not terminate statements. For example, adjacent assignment statements must be separated by semicolons even if those statements are on separate lines:
@ -216,6 +243,9 @@ mlr put '
s,t,u,v
3,-1,5,1
9,-1,41,2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Bodies for all compound statements must be enclosed in **curly braces**, even if the body is a single statement:


@ -56,6 +56,9 @@ treating epoch-milliseconds as epoch-seconds.
<pre class="pre-non-highlight-in-pair">
2017-07-14T02:40:00Z
49503-02-10T02:40:00Z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can get the current system time, as epoch-seconds, using the
@ -113,6 +116,9 @@ We also have [sec2gmtdate](reference-dsl-builtin-functions.md#sec2gmtdate) DSL f
1970-01-01
2009-02-13
1930-11-18
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Local times with standard format; specifying timezones
@ -145,6 +151,9 @@ mlr : unknown time zone This/Is/A/Typo
</pre>
<pre class="pre-non-highlight-in-pair">
1970-01-01 02:00:00
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -152,6 +161,9 @@ mlr : unknown time zone This/Is/A/Typo
</pre>
<pre class="pre-non-highlight-in-pair">
1969-12-31 21:00:00
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -175,6 +187,9 @@ mlr : unknown time zone This/Is/A/Typo
1969-12-31 21:00:00
1969-12-31
946789445
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -196,6 +211,9 @@ mlr : unknown time zone This/Is/A/Typo
1969-12-31 21:00:00
1969-12-31
946789445
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that for local times, Miller omits the `T` and the `Z` you see in GMT times.
@ -214,6 +232,9 @@ We also have the
<pre class="pre-non-highlight-in-pair">
1970-01-01 02:00:00
1969-12-31T22:00:00Z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -229,6 +250,9 @@ We also have the
1970-01-01 02:00:00
1970-01-01T03:00:00Z
1969-12-31T22:00:00Z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Custom formats: strptime and strftime
@ -322,6 +346,9 @@ Examples:
<pre class="pre-non-highlight-in-pair">
1970-01-01T00:00:00Z
1970-01-01T00:00:00Z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -340,6 +367,9 @@ Examples:
1970-01-01 00:00:00 +0000
Thursday, January 1, 1970
09:33 PM
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Unfortunately, names from `%A` and `%B` are only available in English, as an artifact of a design
@ -376,6 +406,9 @@ For historical reasons, Miller's `strftime` and `strptime` use different format
1970-01-02 10:17:36.789000
(error)
123456.789
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## strptime_local and strftime_local
@ -409,6 +442,9 @@ Wednesday, December 31, 1969
1970-01-01 08:00:00 +0800
Thursday, January 1, 1970
1582992000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -434,6 +470,9 @@ Wednesday, December 31, 1969
1970-01-01 08:00:00 +0800
Thursday, January 1, 1970
1582992000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Relative times


@ -38,6 +38,9 @@ b=pan,i=2,y=0.522151
b=wye,i=3,y=0.338318
b=wye,i=4,y=0.134188
b=pan,i=5,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This can also be done, of course, using `mlr cut -x`. You can also clear out-of-stream or local variables, at the base name level, or at an indexed sublevel:
@ -62,6 +65,9 @@ This can also be done, of course, using `mlr cut -x`. You can also clear out-of-
}
}
{}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -94,6 +100,9 @@ This can also be done, of course, using `mlr cut -x`. You can also clear out-of-
}
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If you use `unset all` (or `unset @*` which is synonymous), that will unset all out-of-stream variables which have been assigned up to that point.


@ -45,6 +45,9 @@ eks pan 2 0.758679 0.522151 3.6808304227112796 2
wye wye 3 0.204603 0.338318 1.7412477437471126 6
eks wye 4 0.381399 0.134188 18.588317372151177 24
wye pan 5 0.573288 0.863624 211.38663947090302 120
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Properties of user-defined functions:
@ -99,6 +102,9 @@ NR=4
numcalls=10
NR=5
numcalls=15
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Properties of user-defined subroutines:
@ -179,6 +185,9 @@ purple triangle false 7 65 80.1405 5.8240 purple:triangle
yellow circle true 8 73 63.9785 4.2370 yellow:circle
yellow circle true 9 87 63.5058 8.3350 yellow:circle
purple square false 10 91 72.3735 8.2430 purple:square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -205,6 +214,9 @@ purple triangle false 7 65 80.1405 5.8240 purple:triangle above
yellow circle true 8 73 63.9785 4.2370 yellow:circle above
yellow circle true 9 87 63.5058 8.3350 yellow:circle above
purple square false 10 91 72.3735 8.2430 purple:square above
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that you need a semicolon after the closing curly brace of the function literal.
@ -238,6 +250,9 @@ purple triangle false 7 65 80.1405 5.8240 purple:triangle above
yellow circle true 8 73 63.9785 4.2370 yellow:circle above
yellow circle true 9 87 63.5058 8.3350 yellow:circle above
purple square false 10 91 72.3735 8.2430 purple:square above
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See the [page on higher-order functions](reference-dsl-higher-order-functions.md) for more.


@ -36,15 +36,23 @@ If field names have **special characters** such as `.` then you can use braces,
You may also use a **computed field name** in square brackets, e.g.
<pre class="pre-highlight-non-pair">
<pre class="pre-highlight-in-pair">
<b>echo a=3,b=4 | mlr filter '$["x"] < 0.5'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>echo s=green,t=blue,a=3,b=4 | mlr put '$[$s."_".$t] = $a * $b'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
s=green,t=blue,a=3,b=4,green_blue=12
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Notes:
@ -74,6 +82,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -85,6 +96,9 @@ a=eks,b=pan,NEW=2,x=0.758679,y=0.522151
a=wye,b=wye,NEW=3,x=0.204603,y=0.338318
a=eks,b=wye,NEW=4,x=0.381399,y=0.134188
a=wye,b=pan,NEW=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -96,6 +110,9 @@ a=eks,b=pan,i=NEW,x=0.758679,y=0.522151
a=wye,b=wye,i=NEW,x=0.204603,y=0.338318
a=eks,b=wye,i=NEW,x=0.381399,y=0.134188
a=wye,b=pan,i=NEW,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -107,6 +124,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,NEW=b
a=wye,b=wye,i=3,x=0.204603,y=0.338318,NEW=i
a=eks,b=wye,i=4,x=0.381399,y=0.134188,NEW=x
a=wye,b=pan,i=5,x=0.573288,y=0.863624,NEW=y
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -118,6 +138,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,NEW=pan
a=wye,b=wye,i=3,x=0.204603,y=0.338318,NEW=3
a=eks,b=wye,i=4,x=0.381399,y=0.134188,NEW=0.381399
a=wye,b=pan,i=5,x=0.573288,y=0.863624,NEW=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -129,6 +152,9 @@ a=eks,b=NEW,i=2,x=0.758679,y=0.522151
a=wye,b=wye,i=NEW,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=NEW,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=NEW
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Right-hand side accesses to non-existent fields -- i.e. with index less than 1 or greater than `NF` -- return an absent value. Likewise, left-hand side accesses only refer to fields which already exist. For example, if a record has 5 fields then assigning the name or value of the 6th (or 600th) field results in a no-op.
@ -142,6 +168,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -153,6 +182,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Out-of-stream variables
@ -170,6 +202,9 @@ You may use a **computed key** in square brackets, e.g.
</pre>
<pre class="pre-non-highlight-in-pair">
green_blue=12
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Out-of-stream variables are **scoped** to the `put` command in which they appear. In particular, if you have two or more `put` commands separated by `then`, each put will have its own set of out-of-stream variables:
@ -192,6 +227,9 @@ a=10,b=2,c=3
a=40,b=5,c=6
sum=5
sum=50
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Out-of-stream variables' **extent** is from the start to the end of the record stream, i.e. every time the `put` or `filter` statement referring to them is executed.
@ -219,6 +257,9 @@ a=wye,x_count=2
a=pan,x_sum=0.346791
a=eks,x_sum=1.140078
a=wye,x_sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -228,6 +269,9 @@ a=wye,x_sum=0.777891
a=pan,x_count=1,x_sum=0.346791
a=eks,x_count=2,x_sum=1.140078
a=wye,x_count=2,x_sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Indices can be arbitrarily deep -- here there are two or more of them:
@ -267,6 +311,9 @@ a=hat,b=zee,x_count=385,x_sum=196.3494502965293
a=hat,b=eks,x_count=389,x_sum=189.0067933716193
a=hat,b=hat,x_count=381,x_sum=182.8535323148762
a=hat,b=pan,x_count=363,x_sum=168.5538067327806
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The idea is that `stats1`, and other Miller verbs, encapsulate frequently-used patterns with a minimum of keystroking (and run a little faster), whereas using out-of-stream variables you have more flexibility and control in what you do.
@ -296,6 +343,9 @@ x=1,y=0,z=0
x=2,y=0.3010299956639812,z=0.5486620049392715
x=3,y=0.4771212547196624,z=0.6907396432228734
num_total=5,num_positive=3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Local variables
@ -333,6 +383,9 @@ i=7,o=13.966128063060479
i=8,o=13.99248245928659
i=9,o=15.784270485515197
i=10,o=15.37686787628025
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Things which are completely unsurprising, resembling many other languages:
@ -424,6 +477,9 @@ inner_d 70
outer_a 10
outer_b 50
outer_c 60
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
And this example demonstrates the type-declaration rules:
@ -494,6 +550,9 @@ a i y
3 wye 3.3831800000000003
4 eks 1.34188
5 wye 8.636239999999999
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Likewise, you can assign map literals to out-of-stream variables or local variables; pass them as arguments to user-defined functions, return them from functions, and so on:
@ -513,6 +572,9 @@ a=eks,x=151.7358
a=wye,x=40.9206
a=eks,x=76.2798
a=wye,x=114.6576
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Like out-of-stream and local variables, map literals can be multi-level:
@ -546,6 +608,9 @@ Like out-of-stream and local variables, map literals can be multi-level:
"non-numeric": 10
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also the [Maps page](reference-main-maps.md).
@ -573,6 +638,9 @@ read/write access to environment variables, e.g. `ENV["HOME"]` or
a=eks,b=pan,i=2,x=0.758679,y=0.522151
1=pan,2=pan,3=1,4=0.3467901443380824,5=0.7268028627434533
a=wye,b=eks,i=10000,x=0.734806020620654365,y=0.884788571337605134
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -595,6 +663,9 @@ a=wye,b=eks,i=10000,x=0.734806020620654365,y=0.884788571337605134,fnr=2
a=pan,b=wye,i=10001,x=0.870530722602517626,y=0.009854780514656930,fnr=3
a=hat,b=wye,i=10002,x=0.321507044286237609,y=0.568893318795083758,fnr=4
a=pan,b=zee,i=10003,x=0.272054845593895200,y=0.425789896597056627,fnr=5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Their values of `NF`, `NR`, `FNR`, `FILENUM`, and `FILENAME` change from one
@ -613,6 +684,9 @@ Their **scope is global**: you can refer to them in any `filter` or `put` statem
a,b,c,nr
1,2,3,1
4,5,6,2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -626,6 +700,9 @@ a,b,c,nr
4,5,6,2
4,5,6,2
4,5,6,2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The **extent** is for the duration of the put/filter: in a `begin` statement (which executes before the first input record is consumed) you will find `NR=1` and in an `end` statement (which is executed after the last input record is consumed) you will find `NR` to be the total number of records ingested.
@ -839,6 +916,9 @@ Example recursive copy of out-of-stream variables:
"count": 5
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Example of out-of-stream variable assigned to full stream record, where the 2nd record is stashed, and the 4th record is overwritten with that:
@ -852,6 +932,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=wye,b=wye,i=3,x=0.204603,y=0.338318
a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=wye,b=pan,i=5,x=0.573288,y=0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Example of full stream record assigned to an out-of-stream variable, finding the record for which the `x` field has the largest value in the input stream:
@ -876,6 +959,9 @@ a=wye,b=pan,i=5,x=0.573288,y=0.863624
<pre class="pre-non-highlight-in-pair">
a b i x y
eks pan 2 0.758679 0.522151
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Keywords for filter and put


@ -39,6 +39,9 @@ Example:
a=pan,x_sum=0.346791
a=eks,x_sum=1.140078
a=wye,x_sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* Verbs are coded in Go
@ -56,6 +59,9 @@ Example:
a=pan,x_sum=0.346791
a=eks,x_sum=1.140078
a=wye,x_sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* You get to write your own DSL expressions
@ -120,6 +126,9 @@ apple,37,1
ball,28,2
cat,54,3
end
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The `print` statements for `begin` and `end` went out before the first record
@ -159,6 +168,9 @@ you might retain only the records whose `a` field has value `eks`:
<pre class="pre-non-highlight-in-pair">
a=eks,b=pan,i=2,x=0.758679,y=0.522151
a=eks,b=wye,i=4,x=0.381399,y=0.134188
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
or you might add a new field which is a function of existing fields:
@ -172,6 +184,9 @@ a=eks,b=pan,i=2,x=0.758679,y=0.522151,ab=eks_pan
a=wye,b=wye,i=3,x=0.204603,y=0.338318,ab=wye_wye
a=eks,b=wye,i=4,x=0.381399,y=0.134188,ab=eks_wye
a=wye,b=pan,i=5,x=0.573288,y=0.863624,ab=wye_pan
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Differences between put and filter
@ -206,6 +221,9 @@ purple triangle false 5 51 81.2290 8.5910 high rate
red square false 6 64 77.1991 9.5310 high rate
purple triangle false 7 65 80.1405 5.8240 low rate
purple square false 10 91 72.3735 8.2430 high rate
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -227,6 +245,9 @@ red square false 6 64 77.1991 9.5310 squ are
yellow circle true 8 73 63.9785 4.2370 cir cle
yellow circle true 9 87 63.5058 8.3350 cir cle
purple square false 10 91 72.3735 8.2430 squ are
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>


@ -46,6 +46,9 @@ Array literals are written in square brackets braces with integer indices. Array
99,
true
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
As with maps and argument-lists, trailing commas are supported:
@ -64,6 +67,9 @@ As with maps and argument-lists, trailing commas are supported:
</pre>
<pre class="pre-non-highlight-in-pair">
["a", "b", "c"]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Also note that several [built-in functions](reference-dsl-builtin-functions.md) operate on arrays and/or return arrays.
@ -108,6 +114,9 @@ while positive indices read forward from the start. If an array has length `n` t
50
[10, 20]
[40, 50]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Slicing
@ -135,6 +144,9 @@ x[4], x[5]]`.
[30, 40, 50]
[10, 20, 30, 40, 50]
[20, 30, 40]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Out-of-bounds indexing
@ -157,6 +169,9 @@ behavior intentionally imitates Python.)
10
50
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -173,6 +188,9 @@ behavior intentionally imitates Python.)
[10, 20]
[10, 20, 30, 40, 50]
[]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Auto-create results in maps
@ -197,6 +215,9 @@ as-yet-assigned local variable or out-of-stream variable results in
"square": 8.2430,
"circle": 8.3350
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
*This also means that auto-create results in maps, not arrays, even if keys are integers.*
@ -224,6 +245,9 @@ If you want to auto-extend an [array](reference-main-arrays.md), initialize it e
"4": 7.4670
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Auto-extend and null-gaps
@ -262,6 +286,9 @@ are called **null-gaps**.
<pre class="pre-non-highlight-in-pair">
["a", "b"]
["a", null, null, null, "e"]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Unset as shift
@ -281,6 +308,9 @@ Unsetting an array index results in shifting all higher-index elements down by o
<pre class="pre-non-highlight-in-pair">
["a", "b", "c", "d", "e"]
["a", "c", "d", "e"]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
More generally, you can get shift and pop operations by unsetting indices 1 and -1:


@ -46,6 +46,9 @@ red,square,false,6,64,77.1991,9.5310
yellow,triangle,true,1,11,43.6498,9.8870
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
This will decompress the input data on the fly, while leaving the disk file unmodified. This helps you save disk space, at the cost of some additional runtime CPU usage to decompress the data.
@ -81,6 +84,9 @@ red,square,false,6,64,77.1991,9.5310
yellow,triangle,true,1,11,43.6498,9.8870
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The benefit of `--prepipe` is that Miller will run the specified program once per


@ -76,6 +76,9 @@ Examples:
a,b,c
1.2,3,true
4,5.6,buongiorno
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -123,6 +126,9 @@ f 8.9
tf float
g 15.9
tg float
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
On input, string values representable as boolean (e.g. `"true"`, `"false"`)
@ -153,6 +159,9 @@ or the
id,blob
100,"{""a"":1,""b"":[2,3,4]}"
105,"{""a"":6,""b"":[7,8,9]}"
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -169,6 +178,9 @@ id,blob
"blob": "{\"a\":6,\"b\":[7,8,9]}"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -191,6 +203,9 @@ id,blob
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -213,6 +228,9 @@ id,blob
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
These have their respective operations to convert back to string: the


@ -33,6 +33,9 @@ Here are flags you can use when invoking Miller. For example, when you type
"rate": 9.8870
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
the `--icsv` and `--ojson` bits are _flags_. See the [Miller command
@ -373,6 +376,7 @@ These are flags for profiling Miller performance.
**Flags:**
* `--cpuprofile {CPU-profile file name}`: Create a CPU-profile file for performance analysis. Instructions will be printed to stderr. This flag must be the very first thing after 'mlr' on the command line.
* `--memprofile {Memory-profile file name}`: Create a memory-profile file for performance analysis. Instructions will be printed to stderr. This flag must be the very first thing after 'mlr' on the command line.
* `--time`: Print elapsed execution time in seconds to stderr at the end of the execution of the program.
* `--traceprofile`: Create a trace-profile file for performance analysis. Instructions will be printed to stderr. This flag must be the very first thing after 'mlr' on the command line.


@ -48,6 +48,9 @@ _Map literals_ are written in curly braces with string keys any [Miller data typ
}
true
true
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
As with arrays and argument-lists, trailing commas are supported:
@ -70,6 +73,9 @@ As with arrays and argument-lists, trailing commas are supported:
"b": 2,
"c": 3
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The current record, accessible using `$*`, is a map.
@ -101,6 +107,9 @@ Color is yellow
"rate": 0.0130
}
Color is red
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The collection of all [out-of-stream variables](reference-dsl-variables.md#out-of-stream-variables), `@*`, is a map.
@ -126,6 +135,9 @@ The collection of all [out-of-stream variables](reference-dsl-variables.md#out-o
},
"last_color": "purple"
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Also note that several [built-in functions](reference-dsl-builtin-functions.md) operate on maps and/or return maps.
@ -165,6 +177,9 @@ in **auto-create** of that variable as a map variable:
"square": 8.2430,
"circle": 8.3350
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
*This also means that auto-create results in maps, not arrays, even if keys are integers.*
@ -192,6 +207,9 @@ If you want to auto-extend an [array](reference-main-arrays.md), initialize it e
"4": 7.4670
}
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Auto-deepen
@ -217,6 +235,9 @@ red square 17.011
red circle 2.9010
purple triangle 14.415
purple square 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Looping


@ -69,6 +69,9 @@ a=1,b=8
a=,b=4
x=9,b=10
a=5,b=7
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -80,6 +83,9 @@ a=3,b=2
a=5,b=7
a=,b=4
x=9,b=10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -91,6 +97,9 @@ a=5,b=7
a=3,b=2
a=1,b=8
x=9,b=10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* Functions/operators which have one or more *empty* arguments produce empty output: e.g.
@ -100,6 +109,9 @@ x=9,b=10
</pre>
<pre class="pre-non-highlight-in-pair">
x=2,y=3,a=5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -107,6 +119,9 @@ x=2,y=3,a=5
</pre>
<pre class="pre-non-highlight-in-pair">
x=,y=3,a=
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -114,6 +129,9 @@ x=,y=3,a=
</pre>
<pre class="pre-non-highlight-in-pair">
x=,y=3,a=,b=1.0986122886681096
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
with the exception that the `min` and `max` functions are special: if one argument is non-null, it wins:
@ -123,6 +141,9 @@ with the exception that the `min` and `max` functions are special: if one argume
</pre>
<pre class="pre-non-highlight-in-pair">
x=,y=3,a=3,b=
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* Functions of *absent* variables (e.g. `mlr put '$y = log10($nonesuch)'`) evaluate to absent, and arithmetic/bitwise/boolean operators with both operands being absent evaluate to absent. Arithmetic operators with one absent operand return the other operand. More specifically, absent values act like zero for addition/subtraction, and one for multiplication. Furthermore, **any expression which evaluates to absent is not stored in the left-hand side of an assignment statement**:
@ -132,6 +153,9 @@ x=,y=3,a=3,b=
</pre>
<pre class="pre-non-highlight-in-pair">
x=2,y=3,b=3,c=5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -139,6 +163,9 @@ x=2,y=3,b=3,c=5
</pre>
<pre class="pre-non-highlight-in-pair">
x=2,y=3,a=2,b=3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* Likewise, for assignment to maps, **absent-valued keys or values result in a skipped assignment**.
@ -166,6 +193,9 @@ record_count=100,resource=/path/to/file
resource=/path/to/second/file,loadsec=0.32,ok=true
record_count=150,resource=/path/to/second/file
resource=/some/other/path,loadsec=0.97,ok=false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -177,6 +207,9 @@ record_count=100,resource=/path/to/file
resource=/path/to/second/file,loadsec=0.32,ok=true,loadmillis=320
record_count=150,resource=/path/to/second/file
resource=/some/other/path,loadsec=0.97,ok=false,loadmillis=970
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -188,6 +221,9 @@ record_count=100,resource=/path/to/file,loadmillis=0
resource=/path/to/second/file,loadsec=0.32,ok=true,loadmillis=320
record_count=150,resource=/path/to/second/file,loadmillis=0
resource=/some/other/path,loadsec=0.97,ok=false,loadmillis=970
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Arithmetic rules

@ -35,6 +35,9 @@ pipe the output to something else, particularly CSV. I use Miller's pretty-print
</pre>
<pre class="pre-non-highlight-in-pair">
x= 3.100,y= 4.300
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -42,6 +45,9 @@ x= 3.100,y= 4.300
</pre>
<pre class="pre-non-highlight-in-pair">
x=3.10000000e+00,y=4.30000000e+00
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## The format-values verb
@ -60,6 +66,9 @@ put`. For example:
</pre>
<pre class="pre-non-highlight-in-pair">
x=3.1,y=4.3,z=13.330000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -67,6 +76,9 @@ x=3.1,y=4.3,z=13.330000
</pre>
<pre class="pre-non-highlight-in-pair">
x=0xffff,y=0xff,z=00feff01
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Input conversion from hexadecimal is done automatically on fields handled by `mlr put` and `mlr filter` as long as the field value begins with `0x`. To apply output conversion to hexadecimal on a single column, you may use `fmtnum`, or the keystroke-saving [`hexfmt`](reference-dsl-builtin-functions.md#hexfmt) function. Example:
@ -76,6 +88,9 @@ Input conversion from hexadecimal is done automatically on fields handled by `ml
</pre>
<pre class="pre-non-highlight-in-pair">
x=0xffff,y=0xff,z=16711425
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -83,4 +98,7 @@ x=0xffff,y=0xff,z=16711425
</pre>
<pre class="pre-non-highlight-in-pair">
x=0xffff,y=0xff,z=0xfeff01
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

@ -34,6 +34,9 @@ For example, reading from a file:
color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -43,6 +46,9 @@ yellow triangle true 1 11 43.6498 9.8870
color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Reading from standard input:
@ -54,6 +60,9 @@ Reading from standard input:
color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The rest of this reference section gives you full information on each of these parts of the command line.
@ -79,6 +88,9 @@ Example of using a verb for data processing:
a=pan,x_sum=0.346791
a=eks,x_sum=1.140078
a=wye,x_sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* Verbs are coded in Go
@ -96,6 +108,9 @@ Example of doing the same thing using a DSL expression:
a=pan,x_sum=0.346791
a=eks,x_sum=1.140078
a=wye,x_sum=0.777891
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
* You get to write your own expressions in Miller's programming language

@ -59,6 +59,9 @@ name=bull,regex=^b[ou]ll$
<pre class="pre-non-highlight-in-pair">
name=jane,regex=^j.*e$
name=bull,regex=^b[ou]ll$
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Regex captures
@ -95,13 +98,11 @@ Regular expressions are those supported by the [Go regexp package](https://pkg.g
<pre class="pre-non-highlight-in-pair">
package syntax // import "regexp/syntax"
Package syntax parses regular expressions into parse trees and compiles
parse trees into programs. Most clients of regular expressions will use the
facilities of package regexp (such as Compile and Match) instead of this
package.
Package syntax parses regular expressions into parse trees and compiles parse
trees into programs. Most clients of regular expressions will use the facilities
of package regexp (such as Compile and Match) instead of this package.
Syntax
# Syntax
The regular expression syntax understood by this package when parsing with
the Perl flag is as follows. Parts of the syntax can be disabled by passing
@ -141,9 +142,9 @@ Repetitions:
x{n,}? n or more x, prefer fewer
x{n}? exactly n x
Implementation restriction: The counting forms x{n,m}, x{n,}, and x{n}
reject forms that create a minimum or maximum repetition count above 1000.
Unlimited repetitions are not subject to this restriction.
Implementation restriction: The counting forms x{n,m}, x{n,}, and x{n} reject
forms that create a minimum or maximum repetition count above 1000. Unlimited
repetitions are not subject to this restriction.
Grouping:
@ -229,8 +230,7 @@ ASCII character classes:
[[:word:]] word characters (== [0-9A-Za-z_])
[[:xdigit:]] hex digit (== [0-9A-Fa-f])
Unicode character classes are those in unicode.Categories and
unicode.Scripts.
Unicode character classes are those in unicode.Categories and unicode.Scripts.
func IsWordChar(r rune) bool
type EmptyOp uint8

@ -74,6 +74,9 @@ a=4,b=5,c=6
<pre class="pre-non-highlight-in-pair">
c:3;a:1;b:2
c:6;a:4;b:5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -83,6 +86,9 @@ c:6;a:4;b:5
color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -92,6 +98,9 @@ red,square,true,2,15,79.2778,0.0130
color|shape|flag|k|index|quantity|rate
yellow|triangle|true|1|11|43.6498|9.8870
red|square|true|2|15|79.2778|0.0130
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If your data has non-default separators and you don't want to change those
@ -112,6 +121,9 @@ a:4;b:5;c:6
<pre class="pre-non-highlight-in-pair">
c:3;a:1;b:2
c:6;a:4;b:5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Multi-character separators
@ -126,6 +138,9 @@ restrictions), IRS must be `\n` and IFS must be a single character.
<pre class="pre-non-highlight-in-pair">
c:=3;;;a:=1;;;b:=2
c:=6;;;a:=4;;;b:=5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
If your data has field separators which are one or more consecutive spaces, you
@ -166,6 +181,9 @@ early light what so
2 light
3 what
4 so
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Regular-expression separators
@ -255,6 +273,9 @@ their values indicate what you specified at the command line -- so their use is
<pre class="pre-non-highlight-in-pair">
a:1;b:2;c:3;d:>>>,|||;<<<
a:4;b:5;c:6;d:>>>,|||;<<<
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Which separators apply to which file formats

@ -41,6 +41,9 @@ purple triangle false 7 65 80.1405 5.8240 purple:triangle
yellow circle true 8 73 63.9785 4.2370 yellow:circle
yellow circle true 9 87 63.5058 8.3350 yellow:circle
purple square false 10 91 72.3735 8.2430 purple:square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Also see the [list of string-related built-in functions](reference-dsl-builtin-functions.md#string-functions).
@ -92,6 +95,9 @@ a
e
ab
de
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Slicing
@ -118,6 +124,9 @@ ab
cde
abcde
bcd
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Out-of-bounds indexing
@ -140,6 +149,9 @@ accesses result in trimming the indices, resulting in a short string or even the
a
e
(error)
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -156,6 +168,9 @@ e
"ab"
"abcde"
""
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Escape sequences for string literals

File diff suppressed because it is too large

@ -91,6 +91,9 @@ HELLO
}GOODBYE
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Using Miller with the REPL

@ -30,6 +30,9 @@ shape count count_fraction
triangle 3 0.3
square 4 0.4
circle 3 0.3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Typing this out can get a bit old, if the only thing that changes for you is the filename. Some options include:
@ -72,6 +75,9 @@ shape count count_fraction
triangle 3 0.3
square 4 0.4
circle 3 0.3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -82,6 +88,9 @@ shape count count_fraction
triangle 3 0.3
square 4 0.4
circle 3 0.3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -105,6 +114,9 @@ circle 3 0.3
"count_fraction": 0.3
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -123,6 +135,9 @@ circle 3 0.3
"count_fraction": 0.3
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
etc.
@ -160,6 +175,9 @@ shape count count_fraction
triangle 3 0.3
square 4 0.4
circle 3 0.3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -170,6 +188,9 @@ shape count count_fraction
triangle 3 0.3
square 4 0.4
circle 3 0.3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -193,6 +214,9 @@ circle 3 0.3
"count_fraction": 0.3
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -211,6 +235,9 @@ circle 3 0.3
"count_fraction": 0.3
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Miller scripts on Windows
@ -247,6 +274,9 @@ shape count count_fraction
triangle 3 0.3
square 4 0.4
circle 3 0.3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -270,6 +300,9 @@ circle 3 0.3
"count_fraction": 0.3
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -288,6 +321,9 @@ circle 3 0.3
"count_fraction": 0.3
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
and so on. See also [Miller on Windows](miller-on-windows.md).

@ -118,6 +118,9 @@ Miller records are ordered lists of key-value pairs. For NIDX format, DKVP forma
</pre>
<pre class="pre-non-highlight-in-pair">
1=x,2=y,3=z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -125,6 +128,9 @@ Miller records are ordered lists of key-value pairs. For NIDX format, DKVP forma
</pre>
<pre class="pre-non-highlight-in-pair">
1=x,2=y,3=z,6=a,4=b,55=cde
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -132,6 +138,9 @@ Miller records are ordered lists of key-value pairs. For NIDX format, DKVP forma
</pre>
<pre class="pre-non-highlight-in-pair">
x,y,z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -140,6 +149,9 @@ x,y,z
<pre class="pre-non-highlight-in-pair">
1,2,3
x,y,z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -147,6 +159,9 @@ x,y,z
</pre>
<pre class="pre-non-highlight-in-pair">
1=x,999=y,3=z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -154,6 +169,9 @@ x,y,z
</pre>
<pre class="pre-non-highlight-in-pair">
1=x,newname=y,3=z
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -162,6 +180,9 @@ x,y,z
<pre class="pre-non-highlight-in-pair">
3,1,2
z,x,y
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Why doesn't mlr cut put fields in the order I want?
@ -200,6 +221,9 @@ triangle,false,5.8240
circle,true,4.2370
circle,true,8.3350
square,false,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The issue is that Miller's `cut`, by default, outputs cut fields in the order they appear in the input data. This design decision was made intentionally to parallel the Unix/Linux system `cut` command, which has the same semantics.
@ -221,6 +245,9 @@ rate,shape,flag
4.2370,circle,true
8.3350,circle,true
8.2430,square,false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Numbering and renumbering records
@ -259,6 +286,9 @@ purple,triangle,false,7,65,80.1405,5.8240,7
yellow,circle,true,8,73,63.9785,4.2370,8
yellow,circle,true,9,87,63.5058,8.3350,9
purple,square,false,10,91,72.3735,8.2430,10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
However, this is the record number within the original input stream -- not after any filtering you may have done:
@ -271,6 +301,9 @@ color,shape,flag,k,index,quantity,rate,nr
yellow,triangle,true,1,11,43.6498,9.8870,1
yellow,circle,true,8,73,63.9785,4.2370,8
yellow,circle,true,9,87,63.5058,8.3350,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
There are two good options here. One is to use the `cat` verb with `-n`:
@ -283,6 +316,9 @@ n,color,shape,flag,k,index,quantity,rate
1,yellow,triangle,true,1,11,43.6498,9.8870
2,yellow,circle,true,8,73,63.9785,4.2370
3,yellow,circle,true,9,87,63.5058,8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The other is to keep your own counter within the `put` DSL:
@ -295,6 +331,9 @@ color,shape,flag,k,index,quantity,rate,n
yellow,triangle,true,1,11,43.6498,9.8870,1
yellow,circle,true,8,73,63.9785,4.2370,2
yellow,circle,true,9,87,63.5058,8.3350,3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The difference is a matter of taste (although `mlr cat -n` puts the counter first).
@ -383,6 +422,9 @@ outer=2,middle=21,inner1=210,inner2=211
outer=3,middle=30,inner1=300,inner2=301
outer=3,middle=31,inner1=312,inner2=301
outer=3,middle=31,inner1=313,inner2=314
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
See also the [record-heterogeneity page](record-heterogeneity.md); see in

@ -30,6 +30,9 @@ eks pan 2 0.758679 0.522151 hello world
wye wye 3 0.204603 0.338318 hello world
eks wye 4 0.381399 0.134188 hello world
wye pan 5 0.573288 0.863624 hello world
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -42,6 +45,9 @@ eks pan 2 0.758679 0.522151 {2}
wye wye 3 0.204603 0.338318 {3}
eks wye 4 0.381399 0.134188 {4}
wye pan 5 0.573288 0.863624 {5}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -54,6 +60,9 @@ eks pan 2 0.758679 0.522151 585d25a8ff04840f77779eeff61167dc
wye wye 3 0.204603 0.338318 fb6361a373147c163e65ada94719fa16
eks wye 4 0.381399 0.134188 585d25a8ff04840f77779eeff61167dc
wye pan 5 0.573288 0.863624 fb6361a373147c163e65ada94719fa16
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Note that running a subprocess on every record takes a non-trivial amount of time. Compare asking the system `date` command for the current time in nanoseconds with computing it in-process:

@ -49,6 +49,9 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Sorted numerically ascending by rate:
@ -68,6 +71,9 @@ yellow circle true 9 87 63.5058 8.3350
purple triangle false 5 51 81.2290 8.5910
red square false 6 64 77.1991 9.5310
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Sorted lexically ascending by color; then, within each color, numerically descending by quantity:
@ -87,6 +93,9 @@ red circle true 3 16 13.8103 2.9010
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Example of natural sort, adapted from [https://github.com/facette/natsort](https://github.com/facette/natsort):
@ -123,6 +132,9 @@ n name
25 Xiph Xlater 40
26 Allegia 6R Clasteron
27 Callisto Morphamax 5000
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -157,6 +169,9 @@ n name
3 Xiph Xlater 58
21 Xiph Xlater 300
14 Xiph Xlater 500
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Sorting fields within records: the sort-within-records verb
@ -200,6 +215,9 @@ b a c
c b a
7 8 9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -210,6 +228,9 @@ a b c
1 2 3
5 4 6
9 8 7
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## The sort function by example
@ -228,6 +249,9 @@ a b c
</pre>
<pre class="pre-non-highlight-in-pair">
[1, 2, 3, 4, 5]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -240,6 +264,9 @@ a b c
</pre>
<pre class="pre-non-highlight-in-pair">
[5, 4, 3, 2, 1]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -252,6 +279,9 @@ a b c
</pre>
<pre class="pre-non-highlight-in-pair">
[1, 2, 3, 4, 5]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -264,6 +294,9 @@ a b c
</pre>
<pre class="pre-non-highlight-in-pair">
[5, 4, 3, 2, 1]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -280,6 +313,9 @@ a b c
"b": 1,
"c": 2
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -296,6 +332,9 @@ a b c
"b": 1,
"a": 3
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -319,6 +358,9 @@ a b c
"c": 2,
"a": 3
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -342,6 +384,9 @@ a b c
"c": 2,
"b": 1
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -354,6 +399,9 @@ a b c
</pre>
<pre class="pre-non-highlight-in-pair">
["a1", "a2", "a10", "a20", "a100", "a200"]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
In the rest of this page we'll look more closely at these variants.
@ -397,6 +445,9 @@ key values
alpha 1;4;5;6
beta 7;8;9;9
gamma 1;2;11;12
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Use the `"r"` flag for reverse, which is numerical descending:
@ -413,6 +464,9 @@ key values
alpha 6;5;4;1
beta 9;9;8;7
gamma 12;11;2;1
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Use the `"f"` flag for lexical ascending sort (and `"fr"` would be lexical descending):
@ -429,6 +483,9 @@ key values
alpha 1;4;5;6
beta 7;8;9;9
gamma 1;11;12;2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Without and with case-folding:
@ -457,6 +514,9 @@ alpha,cat;bat;Australia;Bavaria;apple;Colombia
key values
alpha Australia;Bavaria;Colombia;apple;bat;cat
alpha apple;Australia;bat;Bavaria;cat;Colombia
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Simple sorting of maps within records
@ -528,6 +588,9 @@ Also note that, unlike the `sort-within-record` verb with its `-r` flag,
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Simple sorting of maps across records
@ -570,6 +633,9 @@ red square false 6 64 77.1991 9.5310 6
purple triangle false 7 65 80.1405 5.8240 7
yellow circle true 8 73 63.9785 4.2370 8
yellow circle true 9 87 63.5058 8.3350 9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Custom sorting of arrays within records
@ -638,6 +704,9 @@ recapitulate (for reference) what `sort` with default flags already does; the th
"even_then_odd": [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Custom sorting of arrays across records
@ -691,6 +760,9 @@ red square true 2 15 79.2778 0.0130
purple triangle false 7 65 80.1405 5.8240
purple triangle false 5 51 81.2290 8.5910
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Custom sorting of maps within records
@ -754,6 +826,9 @@ For example, we can sort ascending or descending by map key or map value:
"b": 2,
"c": 1
}
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Custom sorting of maps across records
@ -796,4 +871,7 @@ red square false 4 48 77.5542 7.4670
red circle true 3 16 13.8103 2.9010
red square true 2 15 79.2778 0.0130
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

@ -45,6 +45,9 @@ Likewise [JSON](file-formats.md#json):
"Role": "tester"
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
For Miller's [XTAB](file-formats.md#xtab-vertical-tabular) there is no escaping for carriage returns, but commas work fine:
@ -58,6 +61,9 @@ Role administrator
Name Khavari, Darius
Role tester
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
But for [key-value-pairs](file-formats.md#dkvp-key-value-pairs) and [index-numbered](file-formats.md#nidx-index-numbered-toolkit-style) formats, commas are the default field separator. And -- as of Miller 5.4.0 anyway -- there is no CSV-style double-quote-handling like there is for CSV. So commas within the data look like delimiters:
@ -68,6 +74,9 @@ But for [key-value-pairs](file-formats.md#dkvp-key-value-pairs) and [index-numbe
<pre class="pre-non-highlight-in-pair">
Name=Xiao, Lin,Role=administrator
Name=Khavari, Darius,Role=tester
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
One solution is to use a different delimiter, such as a pipe character:
@ -78,6 +87,9 @@ One solution is to use a different delimiter, such as a pipe character:
<pre class="pre-non-highlight-in-pair">
Name=Xiao, Lin|Role=administrator
Name=Khavari, Darius|Role=tester
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
To be extra-sure to avoid data/delimiter clashes, you can also use control
@ -89,6 +101,9 @@ characters as delimiters -- here, control-A:
<pre class="pre-non-highlight-in-pair">
Name=Xiao, Lin^ARole=administrator
Name=Khavari, Darius^ARole=tester
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## How can I handle field names with special symbols in them?
@ -100,6 +115,9 @@ Simply surround the field names with curly braces:
</pre>
<pre class="pre-non-highlight-in-pair">
x.a=3,y:b=4,z/c=5,product.all=60
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## How can I put single quotes into strings?
@ -115,6 +133,9 @@ $a = "It's OK, I said, then 'for now'."
</pre>
<pre class="pre-non-highlight-in-pair">
a=It's OK, I said, then 'for now'.
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
So: Miller's DSL uses double quotes for strings, and you can put single quotes (or backslash-escaped double-quotes) inside strings, no problem.
@ -126,6 +147,9 @@ Without putting the update expression in a file, it's messier:
</pre>
<pre class="pre-non-highlight-in-pair">
a=It's OK, I said, 'for now'.
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The idea is that the outermost single-quotes are to protect the `put` expression from the shell, and the double quotes within them are for Miller. To get a single quote in the middle there, you need to actually put it *outside* the single-quoting for the shell. The pieces are the following, all concatenated together:
@ -155,6 +179,9 @@ a=is it?,b=it is!
a is it?
b it is!
c is it ...
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --oxtab put '$c = ssub($a, "?"," ...")' data/question.dat</b>
@ -163,6 +190,9 @@ c is it ...
a is it?
b it is!
c is it ...
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The
@ -186,6 +216,9 @@ The `ssub` and `gssub` functions are also handy for dealing with non-UTF-8 strin
</pre>
<pre class="pre-non-highlight-in-pair">
Kaðlín og Þormundr
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
More generally, though, we have the DSL functions
@ -219,4 +252,7 @@ See also the [page on regular expressions](reference-main-regular-expressions.md
</pre>
<pre class="pre-non-highlight-in-pair">
a=14°45',degrees=14.75
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

@ -29,6 +29,9 @@ For one or more specified field names, simply compute p25 and p75, then write th
x_p25 0.24667037823231752
x_p75 0.7481860062358446
x_iqr 0.5015156280035271
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
For wildcarded field names, first compute p25 and p75, then loop over field names with `p25` in them:
@ -52,6 +55,9 @@ y_p75 0.7640028449996572
i_iqr 5000
x_iqr 0.5015156280035271
y_iqr 0.5118661397595003
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Computing weighted means
@ -90,4 +96,7 @@ a=eks,wmean=4890.3815931472145,mean=4956.2900763358775
a=wye,wmean=4946.987746229947,mean=4920.001017293998
a=zee,wmean=5164.719684856538,mean=5123.092330239375
a=hat,wmean=4925.533162478552,mean=4967.743946419371
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

@ -50,6 +50,9 @@ you can simply do
</pre>
<pre class="pre-non-highlight-in-pair">
x_sum 4986.019681679581
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
or
@ -64,6 +67,9 @@ wye 1023.5484702619565
zee 979.7420161495838
eks 1016.7728571314786
hat 1000.192668193983
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
rather than the more tedious
@ -78,6 +84,9 @@ rather than the more tedious
</pre>
<pre class="pre-non-highlight-in-pair">
x_sum 4986.019681679581
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
or
@ -97,6 +106,9 @@ wye 1023.5484702619565
zee 979.7420161495838
eks 1016.7728571314786
hat 1000.192668193983
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
The former (`mlr stats1` et al.) has the advantages of being easier to type, being less error-prone to type, and running faster.
@ -143,6 +155,9 @@ NR x x_pct
3 0.204603 0
4 0.381399 31.90825807289974
5 0.573288 66.54051068806446
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Line-number ratios
@ -170,6 +185,9 @@ I N PCT a b i x y
3 5 60 wye wye 3 0.204603 0.338318
4 5 80 eks wye 4 0.381399 0.134188
5 5 100 wye pan 5 0.573288 0.863624
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Records having max value
@ -212,6 +230,9 @@ blue purple 2 0.208785
purple purple 1 0.455077
red purple 4 0.477187
blue red 4 0.007487
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Of course, the largest value of `n` isn't known until after all data have been read. Using an [out-of-stream variable](reference-dsl-variables.md#out-of-stream-variables) we can [retain all records as they are read](operating-on-all-records.md), then filter them at the end:
@ -251,6 +272,9 @@ purple red 5 0.454779
orange blue 5 0.705700
purple red 5 0.072936
green purple 5 0.203577
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Feature-counting
@ -349,6 +373,9 @@ Then
"key_fraction": 0.08333333333333333
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -373,6 +400,9 @@ latency 0.5833333333333334
name 0.3333333333333333
uid 0.25
uid2 0.08333333333333333
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Unsparsing
@ -465,6 +495,9 @@ end {
"w": 2
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -476,6 +509,9 @@ a,b,v,u,x,w
,2,,1,,
1,,2,,3,
,,1,,,2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -487,6 +523,9 @@ a b v u x w
- 2 - 1 - -
1 - 2 - 3 -
- - 1 - - 2
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Mean without/with oosvars
@ -497,6 +536,9 @@ a b v u x w
<pre class="pre-non-highlight-in-pair">
x_mean
0.49860196816795804
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -512,6 +554,9 @@ x_mean
<pre class="pre-non-highlight-in-pair">
x_mean
0.49860196816795804
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Keyed mean without/with oosvars
@ -546,6 +591,9 @@ zee hat 0.46772617655014515
wye zee 0.5059066170573692
eks hat 0.5006790659966355
wye eks 0.5306035254809106
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -587,6 +635,9 @@ hat zee 0.5099985721987774
hat eks 0.48587864619953547
hat hat 0.47993053101017374
hat pan 0.4643355557376876
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Variance and standard deviation without/with oosvars
@ -600,6 +651,9 @@ x_sum 4986.019681679581
x_mean 0.49860196816795804
x_var 0.08426974433144456
x_stddev 0.2902925151144007
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -627,6 +681,9 @@ sumx2 3328.652400179729
mean 0.49860196816795804
var 0.08426974433144456
stddev 0.2902925151144007
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
You can also do this keyed, of course, imitating the keyed-mean example above.
@ -639,6 +696,9 @@ You can also do this keyed, of course, imitating the keyed-mean example above.
<pre class="pre-non-highlight-in-pair">
x_min 0.00004509679127584487
x_max 0.999952670371898
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -651,6 +711,9 @@ x_max 0.999952670371898
<pre class="pre-non-highlight-in-pair">
x_min 0.00004509679127584487
x_max 0.999952670371898
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Keyed min/max without/with oosvars
@ -665,6 +728,9 @@ eks 0.0006917972627396018 0.9988110946859143
wye 0.0001874794831505655 0.9998228522652893
zee 0.0005486114815762555 0.9994904324789629
hat 0.00004509679127584487 0.999952670371898
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -683,6 +749,9 @@ eks 0.0006917972627396018 0.9988110946859143
wye 0.0001874794831505655 0.9998228522652893
zee 0.0005486114815762555 0.9994904324789629
hat 0.00004509679127584487 0.999952670371898
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Delta without/with oosvars
@ -697,6 +766,9 @@ eks pan 2 0.758679 0.522151 0.411888
wye wye 3 0.204603 0.338318 -0.554076
eks wye 4 0.381399 0.134188 0.17679599999999998
wye pan 5 0.573288 0.863624 0.19188900000000003
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -712,6 +784,9 @@ eks pan 2 0.758679 0.522151 0.411888
wye wye 3 0.204603 0.338318 -0.554076
eks wye 4 0.381399 0.134188 0.17679599999999998
wye pan 5 0.573288 0.863624 0.19188900000000003
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Keyed delta without/with oosvars
@ -726,6 +801,9 @@ eks pan 2 0.758679 0.522151 0
wye wye 3 0.204603 0.338318 0
eks wye 4 0.381399 0.134188 -0.37728
wye pan 5 0.573288 0.863624 0.36868500000000004
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -741,6 +819,9 @@ eks pan 2 0.758679 0.522151 0
wye wye 3 0.204603 0.338318 0
eks wye 4 0.381399 0.134188 -0.37728
wye pan 5 0.573288 0.863624 0.36868500000000004
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
## Exponentially weighted moving averages without/with oosvars
@ -755,6 +836,9 @@ eks pan 2 0.758679 0.522151 0.3879798
wye wye 3 0.204603 0.338318 0.36964211999999996
eks wye 4 0.381399 0.134188 0.37081780799999997
wye pan 5 0.573288 0.863624 0.3910648272
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -771,4 +855,7 @@ eks pan 2 0.758679 0.522151 0.3879798
wye wye 3 0.204603 0.338318 0.36964211999999996
eks wye 4 0.381399 0.134188 0.37081780799999997
wye pan 5 0.573288 0.863624 0.3910648272
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

View file

@ -47,6 +47,9 @@ a,b,c
1,2,3
4,5,6
7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
<pre class="pre-highlight-in-pair">
@ -57,6 +60,9 @@ a,b,c
7,8,9
4,5,6
1,2,3
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>
Likewise with `mlr sort`, `mlr tac`, and so on.