mirror of https://github.com/johnkerl/miller.git (synced 2026-01-23 02:14:13 +00:00)
Reorganization of new-in-miller-6 docpage (#750)
parent 86f5c6bdef
commit dba3ed5dc4
21 changed files with 274 additions and 166 deletions
|
|
@ -1,17 +1,20 @@
|
|||
# Scope
|
||||
|
||||
This note is for a developer point of view. For a user point of view, please see [https://miller.readthedocs.io/en/latest/new-in-miller-6](https://miller.readthedocs.io/en/latest/new-in-miller-6).
|
||||
|
||||
# Quickstart for developers
|
||||
|
||||
See `makefile` in the repo base directory.
|
||||
|
||||
# Continuous integration
|
||||
|
||||
* The Go implementation is auto-built using GitHub Actions: see [../.github/workflows/go.yml](../.github/workflows/go.yml). This works splendidly on Linux, MacOS, and Windows.
|
||||
* See also [../README.md](../README.md).
|
||||
The Go implementation is auto-built using GitHub Actions: see [.github/workflows/go.yml](.github/workflows/go.yml). This works splendidly on Linux, macOS, and Windows.
|
||||
|
||||
# Benefits of porting to Go
|
||||
|
||||
* The [lack of a streaming (record-by-record) JSON reader](http://johnkerl.org/miller/doc/file-formats.html#JSON_non-streaming) in the C implementation ([issue 99](https://github.com/johnkerl/miller/issues/99)) is immediately solved in the Go implementation.
|
||||
* In the C implementation, [arrays were not supported in the DSL](http://johnkerl.org/miller/doc/file-formats.html#Arrays); in the Go implementation they are.
|
||||
* [Flattening nested map structures to output records](http://johnkerl.org/miller/doc/file-formats.html#Formatting_JSON_options) was clumsy. Now, Miller will be a JSON-to-JSON processor, if your inputs and outputs are both JSON; JSON input and output will be idiomatic.
|
||||
* The lack of a streaming (record-by-record) JSON reader in the C implementation ([issue 99](https://github.com/johnkerl/miller/issues/99)) is immediately solved in the Go implementation.
|
||||
* In the C implementation, arrays were not supported in the DSL; in the Go implementation they are.
|
||||
* Flattening nested map structures to output records was clumsy. Now, Miller will be a JSON-to-JSON processor, if your inputs and outputs are both JSON; JSON input and output will be idiomatic.
|
||||
* The quoted-DKVP feature from [issue 266](https://github.com/johnkerl/miller/issues/266) will be easily addressed.
|
||||
* String/number-formatting issues in [issue 211](https://github.com/johnkerl/miller/issues/211), [issue 178](https://github.com/johnkerl/miller/issues/178), [issue 151](https://github.com/johnkerl/miller/issues/151), and [issue 259](https://github.com/johnkerl/miller/issues/259) will be fixed during the Go port.
|
||||
* I think some DST/timezone issues such as [issue 359](https://github.com/johnkerl/miller/issues/359) will be easier to fix using the Go datetime library than using the C datetime library.
|
||||
|
|
@ -28,7 +31,7 @@ See `makefile` in the repo base directory.
|
|||
|
||||
# Efficiency of the Go port
|
||||
|
||||
As I wrote [here](http://johnkerl.org/miller/doc/whyc.html) back in 2015 I couldn't get Rust or Go (or any other language I tried) to do some test-case processing as quickly as C, so I stuck with C.
|
||||
As I wrote [here](https://johnkerl.org//miller-docs-by-release/1.0.0/performance.html) back in 2015 I couldn't get Rust or Go (or any other language I tried) to do some test-case processing as quickly as C, so I stuck with C.
|
||||
|
||||
Either Go has improved since 2015, or I'm a better Go programmer than I used to be, or both -- but as of 2020 I can get Go-Miller to process data about as quickly as C-Miller.
|
||||
|
||||
|
|
@ -53,10 +56,10 @@ During the coding of Miller, I've been guided by the following:
|
|||
* `README.md` files throughout the directory tree are intended to give you a sense of what is where, what to read first and what doesn't need reading right away, and so on -- so you spend a minimum of time being confused or frustrated.
|
||||
* Names of files, variables, functions, etc. should be fully spelled out (e.g. `NewEvaluableLeafNode`), except for a small number of most-used names where a longer name would cause unnecessary line-wraps (e.g. `Mlrval` instead of `MillerValue` since this appears very very often).
|
||||
* Code should not be too clever. This includes some reasonable amounts of code duplication from time to time, to keep things inline, rather than lasagna code.
|
||||
* Things should be transparent. For example, `mlr -n put -v '$y = 3 + 0.1 * $x'` shows you the abstract syntax tree derived from the DSL expression.
|
||||
* Comments should be robust with respect to reasonably anticipated changes. For example, one package should cross-link to another in its comments, but I try to avoid mentioning specific filenames too much in the comments and README files since these may change over time. I make an exception for stable points such as [mlr.go](./mlr.go), [mlr.bnf](./internal/pkg/parsing/mlr.bnf), [stream.go](./internal/pkg/stream/stream.go), etc.
|
||||
* Things should be transparent. For example, the `-v` in `mlr -n put -v '$y = 3 + 0.1 * $x'` shows you the abstract syntax tree derived from the DSL expression.
|
||||
* Comments should be robust with respect to reasonably anticipated changes. For example, one package should cross-link to another in its comments, but I try to avoid mentioning specific filenames too much in the comments and README files since these may change over time. I make an exception for stable points such as [cmd/mlr/main.go](./cmd/mlr/main.go), [mlr.bnf](./internal/pkg/parsing/mlr.bnf), [stream.go](./internal/pkg/stream/stream.go), etc.
|
||||
* *Miller should be pleasant to write.*
|
||||
* It should be quick to answer the question *Did I just break anything?* -- hence the `build` and `reg_test/run` regression scripts.
|
||||
* It should be quick to answer the question *Did I just break anything?* -- hence `mlr regtest` functionality.
|
||||
* It should be quick to find out what to do next as you iteratively develop -- see for example [cst/README.md](./internal/pkg/dsl/cst/README.md).
|
||||
* *The language should be an asset, not a liability.*
|
||||
* One of the reasons I chose Go is that (personally anyway) I find it to be reasonably efficient, well-supported with standard libraries, straightforward, and fun. I hope you enjoy it as much as I have.
|
||||
|
|
@ -78,7 +81,7 @@ So, in broad overview, the key packages are:
|
|||
|
||||
* [internal/pkg/stream](./internal/pkg/stream) -- connect input -> transforms -> output via Go channels
|
||||
* [internal/pkg/input](./internal/pkg/input) -- read input records
|
||||
* [internal/pkg/transforming](./internal/pkg/transforming) -- transform input records to output records
|
||||
* [internal/pkg/transformers](./internal/pkg/transformers) -- transform input records to output records
|
||||
* [internal/pkg/output](./internal/pkg/output) -- write output records
|
||||
* The rest are details to support this.
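The reader -> transformers -> writer flow listed above can be pictured with a small sketch. This is not Miller's actual code: the `record` type and the functions `readRecords`, `transformRecords`, and `runPipeline` are hypothetical names chosen for illustration, and the input parsing is a toy DKVP-style split. It only shows the general shape of chaining stages with Go channels.

```go
package main

import (
	"fmt"
	"strings"
)

// record is a hypothetical stand-in for a Miller record (not the real Mlrmap).
type record map[string]string

// readRecords parses toy "a=1,b=2" lines and sends one record per line.
// It closes the channel when input is exhausted so downstream stages stop.
func readRecords(lines []string, out chan<- record) {
	defer close(out)
	for _, line := range lines {
		rec := record{}
		for _, pair := range strings.Split(line, ",") {
			kv := strings.SplitN(pair, "=", 2)
			if len(kv) == 2 {
				rec[kv[0]] = kv[1]
			}
		}
		out <- rec
	}
}

// transformRecords is a toy transformer: it tags every record it sees.
func transformRecords(in <-chan record, out chan<- record) {
	defer close(out)
	for rec := range in {
		rec["seen"] = "true"
		out <- rec
	}
}

// runPipeline wires reader -> transformer and collects what a writer
// stage would otherwise format and print.
func runPipeline(lines []string) []record {
	readerOut := make(chan record)
	transformerOut := make(chan record)
	go readRecords(lines, readerOut)
	go transformRecords(readerOut, transformerOut)

	var results []record
	for rec := range transformerOut {
		results = append(results, rec)
	}
	return results
}

func main() {
	for _, rec := range runPipeline([]string{"a=1,b=2", "a=3,b=4"}) {
		fmt.Println(rec)
	}
}
```

Each stage owns its output channel and closes it when done, so end-of-stream propagates down the chain without any shared state.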
|
||||
|
||||
|
|
@ -99,7 +102,7 @@ So, in broad overview, the key packages are:
|
|||
|
||||
### Miller per se
|
||||
|
||||
* The main entry point is [mlr.go](./mlr.go); everything else in [internal/pkg](./internal/pkg).
|
||||
* The main entry point is [cmd/mlr/main.go](./cmd/mlr/main.go); everything else is in [internal/pkg](./internal/pkg).
|
||||
* [internal/pkg/entrypoint](./internal/pkg/entrypoint): All the usual contents of `main()` are here, for ease of testing.
|
||||
* [internal/pkg/platform](./internal/pkg/platform): Platform-dependent code, which as of early 2021 is the command-line parser. Handling single quotes and double quotes is different on Windows unless particular care is taken, which is what this package does.
|
||||
* [internal/pkg/lib](./internal/pkg/lib):
|
||||
|
|
@ -107,12 +110,11 @@ So, in broad overview, the key packages are:
|
|||
* [`Mlrmap`](./internal/pkg/types/mlrmap.go) is the sequence of key-value pairs which represents a Miller record. The key-lookup mechanism is optimized for Miller read/write usage patterns -- please see [mlrmap.go](./internal/pkg/types/mlrmap.go) for more details.
|
||||
* [`context`](./internal/pkg/types/context.go) supports AWK-like variables such as `FILENAME`, `NF`, `NR`, and so on.
|
||||
* [internal/pkg/cli](./internal/pkg/cli) is the flag-parsing logic for supporting Miller's command-line interface. When you type something like `mlr --icsv --ojson put '$sum = $a + $b' then filter '$sum > 1000' myfile.csv`, it's the CLI parser which makes it possible for Miller to construct a CSV record-reader, a transformer-chain of `put` then `filter`, and a JSON record-writer.
|
||||
* [internal/pkg/cliutil](./internal/pkg/cliutil) contains datatypes for the CLI-parser, which was split out to avoid a Go package-import cycle.
|
||||
* [internal/pkg/climain](./internal/pkg/climain) contains a layer which invokes `internal/pkg/cli`, which was split out to avoid a Go package-import cycle.
|
||||
* [internal/pkg/stream](./internal/pkg/stream) is as above -- it uses Go channels to pipe together file-reads, to record-reading/parsing, to a chain of record-transformers, to record-writing/formatting, to terminal standard output.
|
||||
* [internal/pkg/input](./internal/pkg/input) is as above -- one record-reader type per supported input file format, and a factory method.
|
||||
* [internal/pkg/output](./internal/pkg/output) is as above -- one record-writer type per supported output file format, and a factory method.
|
||||
* [internal/pkg/transforming](./internal/pkg/transforming) contains the abstract record-transformer interface datatype, as well as the Go-channel chaining mechanism for piping one transformer into the next.
|
||||
* [internal/pkg/transformers](./internal/pkg/transformers) is all the concrete record-transformers such as `cat`, `tac`, `sort`, `put`, and so on. I put it here, not in `transforming`, so all files in `transformers` would be of the same type.
|
||||
* [internal/pkg/transformers](./internal/pkg/transformers) contains the abstract record-transformer interface datatype, as well as the Go-channel chaining mechanism for piping one transformer into the next. It also contains all the concrete record-transformers such as `cat`, `tac`, `sort`, `put`, and so on.
|
||||
* [internal/pkg/parsing](./internal/pkg/parsing) contains a single source file, `mlr.bnf`, which is the lexical/semantic grammar file for the Miller `put`/`filter` DSL using the GOCC framework. All subdirectories of `internal/pkg/parsing/` are autogen code created by GOCC's processing of `mlr.bnf`. If you need to edit `mlr.bnf`, please use [tools/build-dsl](./tools/build-dsl) to autogenerate Go code from it (using the GOCC tool). (This takes several minutes to run.)
|
||||
* [internal/pkg/dsl](./internal/pkg/dsl) contains [`ast_types.go`](internal/pkg/dsl/ast_types.go) which is the abstract syntax tree datatype shared between GOCC and Miller. I didn't use an `internal/pkg/dsl/ast` naming convention, although that would have been nice, in order to avoid a Go package-dependency cycle.
|
||||
* [internal/pkg/dsl/cst](./internal/pkg/dsl/cst) is the concrete syntax tree, constructed from an AST produced by GOCC. The CST is what is actually executed on every input record when you do things like `$z = $x * 0.3 * $y`. Please see the [internal/pkg/dsl/cst/README.md](./internal/pkg/dsl/cst/README.md) for more information.
|
||||
|
|
@ -149,17 +151,17 @@ nil through the reader/transformer/writer sequence.
|
|||
|
||||
[`Mlrval`](./internal/pkg/types/mlrval.go) is the datatype of record values, as well as expression/variable values in the Miller `put`/`filter` DSL. It includes string/int/float/boolean/void/absent/error types, not unlike PHP's `zval`.
|
||||
|
||||
* Miller's `absent` type is like Javascript's `undefined` -- it's for times when there is no such key, as in a DSL expression `$out = $foo` when the input record is `$x=3,y=4` -- there is no `$foo` so `$foo` has `absent` type. Nothing is written to the `$out` field in this case. See also [here](http://johnkerl.org/miller/doc/reference.html#Null_data:_empty_and_absent) for more information.
|
||||
* Miller's `absent` type is like Javascript's `undefined` -- it's for times when there is no such key, as in a DSL expression `$out = $foo` when the input record is `$x=3,y=4` -- there is no `$foo` so `$foo` has `absent` type. Nothing is written to the `$out` field in this case. See also [here](https://miller.readthedocs.io/en/latest/reference-main-null-data) for more information.
|
||||
* Miller's `void` type is like Javascript's `null` -- it's for times when there is a key with no value, as in `$out = $x` when the input record is `$x=,$y=4`. This is an overlap with `string` type, since a void value looks like an empty string. I've gone back and forth on this (including when I was writing the C implementation) -- whether to retain `void` as a distinct type from empty-string, or not. I ended up keeping it as it made the `Mlrval` logic easier to understand.
|
||||
* Miller's `error` type is for things like doing type-uncoerced addition of strings. Data-dependent errors are intended to result in `(error)`-valued output, rather than crashing Miller. See also [here](http://johnkerl.org/miller/doc/reference.html#Data_types) for more information.
|
||||
* Miller's `error` type is for things like doing type-uncoerced addition of strings. Data-dependent errors are intended to result in `(error)`-valued output, rather than crashing Miller. See also [here](https://miller.readthedocs.io/en/latest/reference-main-data-types) for more information.
|
||||
* Miller's number handling makes auto-overflow from int to float transparent, while preserving the possibility of 64-bit bitwise arithmetic.
|
||||
* This is different from JavaScript, which has only double-precision floats and thus no support for 64-bit numbers (note however that there is now [`BigInt`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt)).
|
||||
* This is also different from C and Go, wherein casts are necessary -- without which int arithmetic overflows.
|
||||
* See also [here](http://johnkerl.org/miller/doc/reference.html#Arithmetic) for the semantics of Miller arithmetic, which the [`Mlrval`](./internal/pkg/types/mlrval.go) class implements.
|
||||
* See also [here](https://miller.readthedocs.io/en/latest/reference-main-arithmetic) for the semantics of Miller arithmetic, which the [`Mlrval`](./internal/pkg/types/mlrval.go) class implements.
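The absent/void distinction described above can be sketched in a few lines of Go. This is an illustration only, not the real `Mlrval`: the `value` type and the `lookup` and `assign` helpers are hypothetical names. It assumes a record is a plain string map, that a missing key yields an absent value, that a key with an empty value yields a void value, and that assigning an absent value writes nothing.

```go
package main

import "fmt"

// valueKind tags a value with its type, in the spirit of a tagged union.
type valueKind int

const (
	kindAbsent valueKind = iota // no such key in the record
	kindVoid                    // key present, value empty
	kindString                  // key present, non-empty value
)

// value is a hypothetical, much-reduced stand-in for Mlrval.
type value struct {
	kind valueKind
	text string
}

// lookup returns absent for a missing key and void for an empty value.
func lookup(rec map[string]string, key string) value {
	text, present := rec[key]
	if !present {
		return value{kind: kindAbsent}
	}
	if text == "" {
		return value{kind: kindVoid}
	}
	return value{kind: kindString, text: text}
}

// assign skips the write entirely when the right-hand side is absent.
func assign(rec map[string]string, key string, v value) {
	if v.kind == kindAbsent {
		return
	}
	rec[key] = v.text
}

func main() {
	rec := map[string]string{"x": "", "y": "4"}

	assign(rec, "out1", lookup(rec, "foo")) // absent: nothing is written
	assign(rec, "out2", lookup(rec, "x"))   // void: an empty value is written

	_, hasOut1 := rec["out1"]
	_, hasOut2 := rec["out2"]
	fmt.Println(hasOut1, hasOut2)
}
```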
|
||||
|
||||
## Software-testing methodology
|
||||
|
||||
See [./regtest/README.md](./regtest/README.md).
|
||||
See [./test/README.md](./test/README.md).
|
||||
|
||||
## Godoc
|
||||
|
||||
|
|
@ -172,7 +174,7 @@ To view doc material, you can:
|
|||
* `cd go`
|
||||
* `godoc -http=:6060 -goroot .`
|
||||
* Browse to `http://localhost:6060`
|
||||
* Note: control-C an restart the server, then reload in the browser, to pick up edits to source files
|
||||
* Note: control-C and restart the server, then reload in the browser, to pick up edits to source files
|
||||
|
||||
## Source-code indexing
|
||||
|
||||
|
|
|
|||
|
|
@ -39,7 +39,7 @@ key-value-pair data in a variety of data formats.
|
|||
# More documentation links
|
||||
|
||||
* [**Full documentation**](https://miller.readthedocs.io/)
|
||||
* [Miller's license is two-clause BSD](https://github.com/johnkerl/miller/blob/master/LICENSE.txt)
|
||||
* [Miller's license is two-clause BSD](https://github.com/johnkerl/miller/blob/main/LICENSE.txt)
|
||||
* [Notes about issue-labeling in the Github repo](https://github.com/johnkerl/miller/wiki/Issue-labeling)
|
||||
* [Active issues](https://github.com/johnkerl/miller/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc)
|
||||
|
||||
|
|
@ -79,12 +79,12 @@ See also [building from source](https://miller.readthedocs.io/en/latest/build.ht
|
|||
# Building from source
|
||||
|
||||
* `make` and `make check`
|
||||
* The Miller executable is `go/mlr` (or `go\mlr.exe` on Windows)
|
||||
* For more developer information please see [go/README.md](./go/README.md)
|
||||
* The Miller executable is `./mlr` (or `.\mlr.exe` on Windows)
|
||||
* For more developer information please see [README-go-port.md](./README-go-port.md)
|
||||
|
||||
# License
|
||||
|
||||
[License: BSD2](https://github.com/johnkerl/miller/blob/master/LICENSE.txt)
|
||||
[License: BSD2](https://github.com/johnkerl/miller/blob/main/LICENSE.txt)
|
||||
|
||||
# Community
|
||||
|
||||
|
|
|
|||
|
|
@ -22,7 +22,7 @@ You will need to first install Go version 1.15 or higher: please see [https://go
|
|||
|
||||
## Miller license
|
||||
|
||||
Two-clause BSD license [https://github.com/johnkerl/miller/blob/master/LICENSE.txt](https://github.com/johnkerl/miller/blob/master/LICENSE.txt).
|
||||
Two-clause BSD license [https://github.com/johnkerl/miller/blob/main/LICENSE.txt](https://github.com/johnkerl/miller/blob/main/LICENSE.txt).
|
||||
|
||||
## From release tarball
|
||||
|
||||
|
|
@ -30,7 +30,7 @@ Two-clause BSD license [https://github.com/johnkerl/miller/blob/master/LICENSE.t
|
|||
* `tar zxvf mlr-i.j.k.tar.gz`
|
||||
* `cd mlr-i.j.k`
|
||||
* `cd go`
|
||||
* `make` creates the `go/mlr` (or `go\mlr.exe` on Windows) executable
|
||||
* `make` creates the `./mlr` (or `.\mlr.exe` on Windows) executable
|
||||
* `make check` runs tests
|
||||
* `make install` installs the `mlr` executable and the `mlr` manpage
|
||||
* On Windows, if you don't have `make`, then you can do `choco install make` -- or, alternatively:
|
||||
|
|
@ -95,7 +95,7 @@ In this example I am using version 6.1.0 to 6.2.0; of course that will change fo
|
|||
* Notify:
|
||||
|
||||
* Submit `brew` pull request; notify any other distros which don't appear to have autoupdated since the previous release (notes below)
|
||||
* Similarly for `macports`: [https://github.com/macports/macports-ports/blob/master/textproc/miller/Portfile](https://github.com/macports/macports-ports/blob/master/textproc/miller/Portfile)
|
||||
* Similarly for `macports`: [https://github.com/macports/macports-ports/blob/main/textproc/miller/Portfile](https://github.com/macports/macports-ports/blob/main/textproc/miller/Portfile)
|
||||
* Social-media updates.
|
||||
|
||||
<pre class="pre-non-highlight-non-pair">
|
||||
|
|
|
|||
|
|
@ -14,7 +14,7 @@ Two-clause BSD license [https://github.com/johnkerl/miller/blob/master/LICENSE.t
|
|||
* `tar zxvf mlr-i.j.k.tar.gz`
|
||||
* `cd mlr-i.j.k`
|
||||
* `cd go`
|
||||
* `make` creates the `go/mlr` (or `go\mlr.exe` on Windows) executable
|
||||
* `make` creates the `./mlr` (or `.\mlr.exe` on Windows) executable
|
||||
* `make check` runs tests
|
||||
* `make install` installs the `mlr` executable and the `mlr` manpage
|
||||
* On Windows, if you don't have `make`, then you can do `choco install make` -- or, alternatively:
|
||||
|
|
|
|||
|
|
@ -40,9 +40,9 @@ As of Miller-6's current pre-release status, the best way to test is to either b
|
|||
|
||||
Issues: [https://github.com/johnkerl/miller/issues](https://github.com/johnkerl/miller/issues)
|
||||
|
||||
Developer notes: [https://github.com/johnkerl/miller/blob/main/go/README.md](https://github.com/johnkerl/miller/blob/main/go/README.md)
|
||||
Developer notes: [https://github.com/johnkerl/miller/blob/main/README-go-port.md](https://github.com/johnkerl/miller/blob/main/README-go-port.md)
|
||||
|
||||
PRs which pass regression test ([https://github.com/johnkerl/miller/blob/main/go/regtest/README.md](https://github.com/johnkerl/miller/blob/main/go/regtest/README.md)) are always welcome!
|
||||
PRs which pass regression test ([https://github.com/johnkerl/miller/blob/main/test/README.md](https://github.com/johnkerl/miller/blob/main/test/README.md)) are always welcome!
|
||||
|
||||
## Build script
|
||||
|
||||
|
|
|
|||
|
|
@ -24,9 +24,9 @@ As of Miller-6's current pre-release status, the best way to test is to either b
|
|||
|
||||
Issues: [https://github.com/johnkerl/miller/issues](https://github.com/johnkerl/miller/issues)
|
||||
|
||||
Developer notes: [https://github.com/johnkerl/miller/blob/main/go/README.md](https://github.com/johnkerl/miller/blob/main/go/README.md)
|
||||
Developer notes: [https://github.com/johnkerl/miller/blob/main/README-go-port.md](https://github.com/johnkerl/miller/blob/main/README-go-port.md)
|
||||
|
||||
PRs which pass regression test ([https://github.com/johnkerl/miller/blob/main/go/regtest/README.md](https://github.com/johnkerl/miller/blob/main/go/regtest/README.md)) are always welcome!
|
||||
PRs which pass regression test ([https://github.com/johnkerl/miller/blob/main/test/README.md](https://github.com/johnkerl/miller/blob/main/test/README.md)) are always welcome!
|
||||
|
||||
## Build script
|
||||
|
||||
|
|
|
|||
|
|
@ -243,7 +243,7 @@ purple triangle 0 257 0.435535 0.859129 0.812290 5.753095
|
|||
red square 0 322 0.201551 0.953110 0.771991 5.612050
|
||||
</pre>
|
||||
|
||||
Look at uncategorized stats (using [creach](https://github.com/johnkerl/scripts/blob/master/fundam/creach) for spacing).
|
||||
Look at uncategorized stats (using [creach](https://github.com/johnkerl/scripts/blob/main/fundam/creach) for spacing).
|
||||
|
||||
Here it looks reasonable that `u` is unit-uniform; something's up with `v` but we can't yet see what:
|
||||
|
||||
|
|
|
|||
|
|
@ -18,7 +18,9 @@ Quick links:
|
|||
|
||||
See also the [list of issues tagged with go-port](https://github.com/johnkerl/miller/issues?q=label%3Ago-port).
|
||||
|
||||
## Documentation improvements
|
||||
## User experience
|
||||
|
||||
### Documentation improvements
|
||||
|
||||
Documentation (what you're reading here) and online help (`mlr --help`) have been completely reworked.
|
||||
|
||||
|
|
@ -40,45 +42,88 @@ pages have been split up into separate pages. (See also
|
|||
Since CSV is overwhelmingly the most popular data format for Miller, it is
|
||||
now discussed first, and more examples use CSV.
|
||||
|
||||
## Improved internationalization support
|
||||
|
||||
You can now write field names, local variables, etc. all in UTF-8, e.g. `mlr
|
||||
--c2p filter '$σχήμα == "κύκλος"' παράδειγμα.csv`. See the
|
||||
[internationalization page](internationalization.md) for examples.
|
||||
|
||||
## Improved datetime/timezone support
|
||||
|
||||
Including support for specifying timezone via function arguments, as an alternative to
|
||||
the `TZ` environment variable. Please see [DSL datetime/timezone functions](reference-dsl-time.md).
|
||||
|
||||
## Improved JSON support, and arrays
|
||||
|
||||
Arrays are now supported in Miller's `put`/`filter` programming language, as
|
||||
described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is
|
||||
now a keyword so this is no longer usable as a local-variable or UDF name.)
|
||||
|
||||
JSON support is improved:
|
||||
|
||||
* Direct support for arrays means that you can now use Miller to process more JSON files.
|
||||
* Streamable JSON parsing: Miller's internal record-processing pipeline starts as soon as the first record is read (which was already the case for other file formats). This means that, unless records are wrapped with outermost `[...]`, Miller now handles JSON in `tail -f` contexts like it does for other file formats.
|
||||
* Flatten/unflatten -- conversion of JSON nested data structures (arrays and/or maps in record values) to/from non-JSON formats is a powerful new feature, discussed in the page [Flatten/unflatten: JSON vs. tabular formats](flatten-unflatten.md).
|
||||
* Since types are better handled now, the workaround flags `--jvquoteall` and `--jknquoteint` no longer have meaning -- although they're accepted as no-ops at the command line for backward compatibility.
|
||||
* Multi-line JSON is now the default. Use `--no-jvstack` for Miller-5 style, which required `--jvstack` to get multiline output.
|
||||
|
||||
See also the [Arrays reference](reference-main-arrays.md) for more information.
|
||||
|
||||
## Improved Windows experience
|
||||
### Improved Windows experience
|
||||
|
||||
Stronger support for Windows (with or without MSYS2), with a couple of
|
||||
exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
|
||||
|
||||
Binaries are reliably available using GitHub Actions: see also [Installation](installing-miller.md).
|
||||
|
||||
## In-process support for compressed input
|
||||
### Output colorization
|
||||
|
||||
Miller uses separate, customizable colors for keys and values whenever the output is to a terminal. See [Output Colorization](output-colorization.md).
|
||||
|
||||
### Improved command-line parsing
|
||||
|
||||
Miller 6 has getoptish command-line parsing ([pull request 467](https://github.com/johnkerl/miller/pull/467)):
|
||||
|
||||
* `-xyz` expands automatically to `-x -y -z`, so (for example) `mlr cut -of shape,flag` is the same as `mlr cut -o -f shape,flag`.
|
||||
* `--foo=bar` expands automatically to `--foo bar`, so (for example) `mlr --ifs=comma` is the same as `mlr --ifs comma`.
|
||||
* `--mfrom`, `--load`, `--mload` as described in the [flags reference](reference-main-flag-list.md#miscellaneous-flags).
|
||||
|
||||
A small but nice item: since **mlr --csv** and **mlr --json** are so common, you can now use alternate shorthands **mlr -c** and **mlr -j**, respectively.
|
||||
|
||||
### Improved error messages for DSL parsing
|
||||
|
||||
For `mlr put` and `mlr filter`, parse-error messages now include location information:
|
||||
|
||||
<pre class="pre-non-highlight-non-pair">
|
||||
mlr: cannot parse DSL expression.
|
||||
Parse error on token ">" at line 63 columnn 7.
|
||||
</pre>
|
||||
|
||||
### REPL
|
||||
|
||||
Miller now has a read-evaluate-print-loop ([REPL](repl.md)) where you can single-step through your data-file record, express arbitrary statements to converse with the data, etc.
|
||||
|
||||
<pre class="pre-highlight-in-pair">
|
||||
<b>mlr repl</b>
|
||||
</pre>
|
||||
<pre class="pre-non-highlight-in-pair">
|
||||
|
||||
[mlr] 1 + 2
|
||||
3
|
||||
|
||||
[mlr] apply([1,2,3,4,5], func(e) {return e ** 3})
|
||||
[1, 8, 27, 64, 125]
|
||||
|
||||
[mlr] :open example.csv
|
||||
|
||||
[mlr] :read
|
||||
|
||||
[mlr] $*
|
||||
{
|
||||
"color": "yellow",
|
||||
"shape": "triangle",
|
||||
"flag": "true",
|
||||
"k": 1,
|
||||
"index": 11,
|
||||
"quantity": 43.6498,
|
||||
"rate": 9.8870
|
||||
}
|
||||
|
||||
</pre>
|
||||
|
||||
## Localization and internationalization
|
||||
|
||||
### Improved internationalization support
|
||||
|
||||
You can now write field names, local variables, etc. all in UTF-8, e.g. `mlr
|
||||
--c2p filter '$σχήμα == "κύκλος"' παράδειγμα.csv`. See the
|
||||
[internationalization page](internationalization.md) for examples.
|
||||
|
||||
### Improved datetime/timezone support
|
||||
|
||||
Including support for specifying timezone via function arguments, as an alternative to
|
||||
the `TZ` environment variable. Please see [DSL datetime/timezone functions](reference-dsl-time.md).
|
||||
|
||||
## Data ingestion
|
||||
|
||||
### In-process support for compressed input
|
||||
|
||||
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. Similarly for `.z` and `.bz2` files. Please see the page on [Compressed data](reference-main-compressed-data.md) for more information.
|
||||
|
||||
## Support for reading web URLs
|
||||
### Support for reading web URLs
|
||||
|
||||
You can read input with prefixes `https://`, `http://`, and `file://`:
|
||||
|
||||
|
|
@ -100,11 +145,25 @@ purple,triangle,false,5,51,81.2290,8.5910
|
|||
purple,triangle,false,7,65,80.1405,5.8240
|
||||
</pre>
|
||||
|
||||
## Output colorization
|
||||
## Data processing
|
||||
|
||||
Miller uses separate, customizable colors for keys and values whenever the output is to a terminal. See [Output Colorization](output-colorization.md).
|
||||
### Improved JSON support, and arrays
|
||||
|
||||
## Improved numeric conversion
|
||||
Arrays are now supported in Miller's `put`/`filter` programming language, as
|
||||
described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is
|
||||
now a keyword so this is no longer usable as a local-variable or UDF name.)
|
||||
|
||||
JSON support is improved:
|
||||
|
||||
* Direct support for arrays means that you can now use Miller to process more JSON files.
|
||||
* Streamable JSON parsing: Miller's internal record-processing pipeline starts as soon as the first record is read (which was already the case for other file formats). This means that, unless records are wrapped with outermost `[...]`, Miller now handles JSON in `tail -f` contexts like it does for other file formats.
|
||||
* Flatten/unflatten -- conversion of JSON nested data structures (arrays and/or maps in record values) to/from non-JSON formats is a powerful new feature, discussed in the page [Flatten/unflatten: JSON vs. tabular formats](flatten-unflatten.md).
|
||||
* Since types are better handled now, the workaround flags `--jvquoteall` and `--jknquoteint` no longer have meaning -- although they're accepted as no-ops at the command line for backward compatibility.
|
||||
* Multi-line JSON is now the default. Use `--no-jvstack` for Miller-5 style, which required `--jvstack` to get multiline output.
|
||||
|
||||
See also the [Arrays reference](reference-main-arrays.md) for more information.
|
||||
|
||||
### Improved numeric conversion
|
||||
|
||||
The most central part of Miller 6 is a deep refactor of how data values are parsed
|
||||
from file contents, how types are inferred, and how they're converted back to
|
||||
|
|
@ -146,11 +205,7 @@ For example (see [https://github.com/johnkerl/miller/issues/178](https://github.
|
|||
}
|
||||
</pre>
|
||||
|
||||
## REPL
|
||||
|
||||
Miller now has a read-evaluate-print-loop ([REPL](repl.md)) where you can single-step through your data-file record, express arbitrary statements to converse with the data, etc.
|
||||
|
||||
## Regex support for IFS and IPS
|
||||
### Regex support for IFS and IPS
|
||||
|
||||
You can now split fields on whitespace when whitespace is a mix of tabs and
|
||||
spaces. As well, you can use regular expressions for the input field-separator
|
||||
|
|
@ -159,11 +214,11 @@ and the input pair-separator. Please see the section on
|
|||
|
||||
In particular, for NIDX format, the default IFS now allows splitting on one or more of space or tab.
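Splitting on "one or more of space or tab" is what a regular-expression separator buys you. A minimal sketch with Go's `regexp` package follows; the pattern and the `splitFields` helper are illustrative assumptions, not Miller's actual default IFS definition.

```go
package main

import (
	"fmt"
	"regexp"
)

// whitespaceSeparator matches any run of spaces and/or tabs.
var whitespaceSeparator = regexp.MustCompile(`[ \t]+`)

// splitFields splits a line on runs of mixed spaces and tabs.
// Note: leading whitespace would produce an empty first field here.
func splitFields(line string) []string {
	return whitespaceSeparator.Split(line, -1)
}

func main() {
	fmt.Println(splitFields("abc  def\t \tghi"))
}
```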
|
||||
|
||||
## Case-folded sorting options
|
||||
### Case-folded sorting options
|
||||
|
||||
The [sort](reference-verbs.md#sort) verb now accepts `-c` and `-cr` options for case-folded ascending/descending sort, respectively.
|
||||
|
||||
## New DSL functions / operators
|
||||
### New DSL functions / operators
|
||||
|
||||
* Higher-order functions [`select`](reference-dsl-builtin-functions.md#select), [`apply`](reference-dsl-builtin-functions.md#apply), [`reduce`](reference-dsl-builtin-functions.md#reduce), [`fold`](reference-dsl-builtin-functions.md#fold), and [`sort`](reference-dsl-builtin-functions.md#sort). See the [sorting page](sorting.md) and the [higher-order-functions page](reference-dsl-higher-order-functions.md) for more information.
|
||||
|
||||
|
|
@ -178,26 +233,9 @@ absent-empty-coalesce operator [`???`](reference-dsl-builtin-functions.md#absent
|
|||
|
||||
* Unsigned right-shift [`>>>`](reference-dsl-builtin-functions.md#ursh) along with `>>>=`.
|
||||
|
||||
## Improved command-line parsing
|
||||
|
||||
Miller 6 has getoptish command-line parsing ([pull request 467](https://github.com/johnkerl/miller/pull/467)):
|
||||
|
||||
* `-xyz` expands automatically to `-x -y -z`, so (for example) `mlr cut -of shape,flag` is the same as `mlr cut -o -f shape,flag`.
|
||||
* `--foo=bar` expands automatically to `--foo bar`, so (for example) `mlr --ifs=comma` is the same as `mlr --ifs comma`.
|
||||
* `--mfrom`, `--load`, `--mload` as described in the [flags reference](reference-main-flag-list.md#miscellaneous-flags).
|
||||
|
||||
## Improved error messages for DSL parsing
|
||||
|
||||
For `mlr put` and `mlr filter`, parse-error messages now include location information:
|
||||
|
||||
<pre class="pre-non-highlight-non-pair">
|
||||
mlr: cannot parse DSL expression.
|
||||
Parse error on token ">" at line 63 columnn 7.
|
||||
</pre>
|
||||
|
||||
## Developer-specific aspects
|
||||
|
||||
* Miller has been ported from C to Go. Developer notes: [https://github.com/johnkerl/miller/blob/main/go/README.md](https://github.com/johnkerl/miller/blob/main/go/README.md).
|
||||
* Miller has been ported from C to Go. Developer notes: [https://github.com/johnkerl/miller/blob/main/README-go-port.md](https://github.com/johnkerl/miller/blob/main/README-go-port.md).
|
||||
* Regression testing has been completely reworked, including regression-testing now running fully on Windows (alongside Linux and Mac) [on each GitHub commit](https://github.com/johnkerl/miller/actions).
|
||||
|
||||
## Changes from Miller 5
|
||||
|
|
|
|||
|
|
@ -2,7 +2,9 @@
|
|||
|
||||
See also the [list of issues tagged with go-port](https://github.com/johnkerl/miller/issues?q=label%3Ago-port).
|
||||
|
||||
## Documentation improvements
|
||||
## User experience
|
||||
|
||||
### Documentation improvements
|
||||
|
||||
Documentation (what you're reading here) and online help (`mlr --help`) have been completely reworked.
|
||||
|
||||
|
|
@ -24,18 +26,97 @@ pages have been split up into separate pages. (See also
|
|||
Since CSV is overwhelmingly the most popular data format for Miller, it is
|
||||
now discussed first, and more examples use CSV.
|
||||
|
||||
## Improved internationalization support
|
||||
### Improved Windows experience
|
||||
|
||||
Stronger support for Windows (with or without MSYS2), with a couple of
|
||||
exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
|
||||
|
||||
Binaries are reliably available using GitHub Actions: see also [Installation](installing-miller.md).
|
||||
|
||||
### Output colorization
|
||||
|
||||
Miller uses separate, customizable colors for keys and values whenever the output is to a terminal. See [Output Colorization](output-colorization.md).
|
||||
|
||||
### Improved command-line parsing
|
||||
|
||||
Miller 6 has getoptish command-line parsing ([pull request 467](https://github.com/johnkerl/miller/pull/467)):
|
||||
|
||||
* `-xyz` expands automatically to `-x -y -z`, so (for example) `mlr cut -of shape,flag` is the same as `mlr cut -o -f shape,flag`.
|
||||
* `--foo=bar` expands automatically to `--foo bar`, so (for example) `mlr --ifs=comma` is the same as `mlr --ifs comma`.
|
||||
* `--mfrom`, `--load`, `--mload` as described in the [flags reference](reference-main-flag-list.md#miscellaneous-flags).
|
||||
|
||||
A small but nice item: since **mlr --csv** and **mlr --json** are so common, you can now use alternate shorthands **mlr -c** and **mlr -j**, respectively.
|
||||
|
||||
### Improved error messages for DSL parsing
|
||||
|
||||
For `mlr put` and `mlr filter`, parse-error messages now include location information:
|
||||
|
||||
GENMD-CARDIFY
|
||||
mlr: cannot parse DSL expression.
|
||||
Parse error on token ">" at line 63 columnn 7.
|
||||
GENMD-EOF
|
||||
|
||||
### REPL
|
||||
|
||||
Miller now has a read-evaluate-print-loop ([REPL](repl.md)) where you can single-step through your data-file record, express arbitrary statements to converse with the data, etc.
|
||||
|
||||
GENMD-CARDIFY-HIGHLIGHT-ONE
|
||||
mlr repl
|
||||
|
||||
[mlr] 1 + 2
|
||||
3
|
||||
|
||||
[mlr] apply([1,2,3,4,5], func(e) {return e ** 3})
|
||||
[1, 8, 27, 64, 125]
|
||||
|
||||
[mlr] :open example.csv
|
||||
|
||||
[mlr] :read
|
||||
|
||||
[mlr] $*
|
||||
{
|
||||
"color": "yellow",
|
||||
"shape": "triangle",
|
||||
"flag": "true",
|
||||
"k": 1,
|
||||
"index": 11,
|
||||
"quantity": 43.6498,
|
||||
"rate": 9.8870
|
||||
}
|
||||
|
||||
GENMD-EOF
|
||||
|
||||
## Localization and internationalization
|
||||
|
||||
### Improved internationalization support
|
||||
|
||||
You can now write field names, local variables, etc. all in UTF-8, e.g. `mlr
|
||||
--c2p filter '$σχήμα == "κύκλος"' παράδειγμα.csv`. See the
|
||||
[internationalization page](internationalization.md) for examples.
|
||||
|
||||
## Improved datetime/timezone support
|
||||
### Improved datetime/timezone support
|
||||
|
||||
Including support for specifying timezone via function arguments, as an alternative to
|
||||
the `TZ` environment variable. Please see [DSL datetime/timezone functions](reference-dsl-time.md).
|
||||
|
||||
## Improved JSON support, and arrays
|
||||
## Data ingestion
|
||||
|
||||
### In-process support for compressed input
|
||||
|
||||
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. Similarly for `.z` and `.bz2` files. Please see the page on [Compressed data](reference-main-compressed-data.md) for more information.
|
||||
|
||||
### Support for reading web URLs
|
||||
|
||||
You can read input with prefixes `https://`, `http://`, and `file://`:
|
||||
|
||||
GENMD-RUN-COMMAND
|
||||
mlr --csv sort -f shape \
|
||||
https://raw.githubusercontent.com/johnkerl/miller/main/docs/src/gz-example.csv.gz
|
||||
GENMD-EOF
|
||||
|
||||
## Data processing
|
||||
|
||||
### Improved JSON support, and arrays
|
||||
|
||||
Arrays are now supported in Miller's `put`/`filter` programming language, as
|
||||
described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is
|
||||
|
|
@ -51,31 +132,7 @@ JSON support is improved:
|
|||
|
||||
See also the [Arrays reference](reference-main-arrays.md) for more information.
|
||||
|
||||
## Improved Windows experience
|
||||
|
||||
Stronger support for Windows (with or without MSYS2), with a couple of
|
||||
exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
|
||||
|
||||
Binaries are reliably available using GitHub Actions: see also [Installation](installing-miller.md).
|
||||
|
||||
## In-process support for compressed input
|
||||
|
||||
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. Similarly for `.z` and `.bz2` files. Please see the page on [Compressed data](reference-main-compressed-data.md) for more information.
|
||||
|
||||
## Support for reading web URLs
|
||||
|
||||
You can read input with prefixes `https://`, `http://`, and `file://`:
|
||||
|
||||
GENMD-RUN-COMMAND
|
||||
mlr --csv sort -f shape \
|
||||
https://raw.githubusercontent.com/johnkerl/miller/main/docs/src/gz-example.csv.gz
|
||||
GENMD-EOF
|
||||
|
||||
## Output colorization
|
||||
|
||||
Miller uses separate, customizable colors for keys and values whenever the output is to a terminal. See [Output Colorization](output-colorization.md).
|
||||
|
||||
## Improved numeric conversion
|
||||
### Improved numeric conversion
|
||||
|
||||
The most central part of Miller 6 is a deep refactor of how data values are parsed
|
||||
from file contents, how types are inferred, and how they're converted back to
|
||||
|
|
@ -106,11 +163,7 @@ GENMD-RUN-COMMAND
|
|||
echo '{ "x": 1.230, "y": 1.230000000 }' | mlr --json cat
|
||||
GENMD-EOF
|
||||
|
||||
## REPL
|
||||
|
||||
Miller now has a read-evaluate-print-loop ([REPL](repl.md)) where you can single-step through your data-file record, express arbitrary statements to converse with the data, etc.
|
||||
|
||||
## Regex support for IFS and IPS
|
||||
### Regex support for IFS and IPS
|
||||
|
||||
You can now split fields on whitespace when whitespace is a mix of tabs and
|
||||
spaces. As well, you can use regular expressions for the input field-separator
|
||||
|
|
@ -119,11 +172,11 @@ and the input pair-separator. Please see the section on
|
|||
|
||||
In particular, for NIDX format, the default IFS now allows splitting on one or more of space or tab.
|
||||
|
||||
## Case-folded sorting options
|
||||
### Case-folded sorting options
|
||||
|
||||
The [sort](reference-verbs.md#sort) verb now accepts `-c` and `-cr` options for case-folded ascending/descending sort, respectively.
|
||||
|
||||
## New DSL functions / operators
|
||||
### New DSL functions / operators
|
||||
|
||||
* Higher-order functions [`select`](reference-dsl-builtin-functions.md#select), [`apply`](reference-dsl-builtin-functions.md#apply), [`reduce`](reference-dsl-builtin-functions.md#reduce), [`fold`](reference-dsl-builtin-functions.md#fold), and [`sort`](reference-dsl-builtin-functions.md#sort). See the [sorting page](sorting.md) and the [higher-order-functions page](reference-dsl-higher-order-functions.md) for more information.
|
||||
|
||||
|
|
@ -138,26 +191,9 @@ absent-empty-coalesce operator [`???`](reference-dsl-builtin-functions.md#absent
|
|||
|
||||
* Unsigned right-shift [`>>>`](reference-dsl-builtin-functions.md#ursh) along with `>>>=`.
|
||||
|
||||
## Improved command-line parsing
|
||||
|
||||
Miller 6 has getoptish command-line parsing ([pull request 467](https://github.com/johnkerl/miller/pull/467)):
|
||||
|
||||
* `-xyz` expands automatically to `-x -y -z`, so (for example) `mlr cut -of shape,flag` is the same as `mlr cut -o -f shape,flag`.
|
||||
* `--foo=bar` expands automatically to `--foo bar`, so (for example) `mlr --ifs=comma` is the same as `mlr --ifs comma`.
|
||||
* `--mfrom`, `--load`, `--mload` as described in the [flags reference](reference-main-flag-list.md#miscellaneous-flags).
|
||||
|
||||
## Improved error messages for DSL parsing
|
||||
|
||||
For `mlr put` and `mlr filter`, parse-error messages now include location information:
|
||||
|
||||
GENMD-CARDIFY
|
||||
mlr: cannot parse DSL expression.
|
||||
Parse error on token ">" at line 63 columnn 7.
|
||||
GENMD-EOF
|
||||
|
||||
## Developer-specific aspects
|
||||
|
||||
* Miller has been ported from C to Go. Developer notes: [https://github.com/johnkerl/miller/blob/main/go/README.md](https://github.com/johnkerl/miller/blob/main/go/README.md).
|
||||
* Miller has been ported from C to Go. Developer notes: [https://github.com/johnkerl/miller/blob/main/README-go-port.md](https://github.com/johnkerl/miller/blob/main/README-go-port.md).
|
||||
* Regression testing has been completely reworked, including regression-testing now running fully on Windows (alongside Linux and Mac) [on each GitHub commit](https://github.com/johnkerl/miller/actions).
|
||||
|
||||
## Changes from Miller 5
|
||||
|
|
|
|||
|
|
@ -159,7 +159,7 @@ coinmate
|
|||
|
||||
## Randomly generating jabberwocky words
|
||||
|
||||
These are simple *n*-grams as [described here](http://johnkerl.org/randspell/randspell-slides-ts.pdf). Some common functions are [located here](https://github.com/johnkerl/miller/blob/master/docs/ngrams/ngfuncs.mlr.txt). Then here are scripts for [1-grams](https://github.com/johnkerl/miller/blob/master/docs/ngrams/ng1.mlr.txt), [2-grams](https://github.com/johnkerl/miller/blob/master/docs/ngrams/ng2.mlr.txt), [3-grams](https://github.com/johnkerl/miller/blob/master/docs/ngrams/ng3.mlr.txt), [4-grams](https://github.com/johnkerl/miller/blob/master/docs/ngrams/ng4.mlr.txt), and [5-grams](https://github.com/johnkerl/miller/blob/master/docs/ngrams/ng5.mlr.txt).
|
||||
These are simple *n*-grams as [described here](http://johnkerl.org/randspell/randspell-slides-ts.pdf). Some common functions are [located here](https://github.com/johnkerl/miller/blob/main/docs/ngrams/ngfuncs.mlr.txt). Then here are scripts for [1-grams](https://github.com/johnkerl/miller/blob/main/docs/ngrams/ng1.mlr.txt), [2-grams](https://github.com/johnkerl/miller/blob/main/docs/ngrams/ng2.mlr.txt), [3-grams](https://github.com/johnkerl/miller/blob/main/docs/ngrams/ng3.mlr.txt), [4-grams](https://github.com/johnkerl/miller/blob/main/docs/ngrams/ng4.mlr.txt), and [5-grams](https://github.com/johnkerl/miller/blob/main/docs/ngrams/ng5.mlr.txt).
|
||||
|
||||
The idea is that words from the input file are consumed, then taken apart and pasted back together in ways which imitate the letter-to-letter transitions found in the word list -- giving us automatically generated words in the same vein as *bromance* and *spork*:
|
||||
|
||||
|
|
|
|||
|
|
@ -140,4 +140,4 @@ green,678.12
|
|||
orange,123.45
|
||||
</pre>
|
||||
|
||||
Additionally, [`mlr help`](online-help.md), [`mlr repl`](repl.md), and [`mlr regtest`](https://github.com/johnkerl/miller/blob/main/go/regtest/README.md) are implemented here.
|
||||
Additionally, [`mlr help`](online-help.md), [`mlr repl`](repl.md), and [`mlr regtest`](https://github.com/johnkerl/miller/blob/main/test/README.md) are implemented here.
|
||||
|
|
|
|||
|
|
@ -44,4 +44,4 @@ GENMD-RUN-COMMAND
|
|||
mlr hex -r data/budget.csv | sed 's/20/2a/g' | mlr unhex
|
||||
GENMD-EOF
|
||||
|
||||
Additionally, [`mlr help`](online-help.md), [`mlr repl`](repl.md), and [`mlr regtest`](https://github.com/johnkerl/miller/blob/main/go/regtest/README.md) are implemented here.
|
||||
Additionally, [`mlr help`](online-help.md), [`mlr repl`](repl.md), and [`mlr regtest`](https://github.com/johnkerl/miller/blob/main/test/README.md) are implemented here.
|
||||
|
|
|
|||
|
|
@ -27,6 +27,25 @@ Miller's REPL isn't a source-level debugger which lets you execute one source-co
|
|||
|
||||
[mlr] 1 + 2
|
||||
3
|
||||
|
||||
[mlr] apply([1,2,3,4,5], func(e) {return e ** 3})
|
||||
[1, 8, 27, 64, 125]
|
||||
|
||||
[mlr] :open example.csv
|
||||
|
||||
[mlr] :read
|
||||
|
||||
[mlr] $*
|
||||
{
|
||||
"color": "yellow",
|
||||
"shape": "triangle",
|
||||
"flag": "true",
|
||||
"k": 1,
|
||||
"index": 11,
|
||||
"quantity": 43.6498,
|
||||
"rate": 9.8870
|
||||
}
|
||||
|
||||
</pre>
|
||||
|
||||
## Using Miller without the REPL
|
||||
|
|
|
|||
|
|
@ -9,6 +9,25 @@ mlr repl
|
|||
|
||||
[mlr] 1 + 2
|
||||
3
|
||||
|
||||
[mlr] apply([1,2,3,4,5], func(e) {return e ** 3})
|
||||
[1, 8, 27, 64, 125]
|
||||
|
||||
[mlr] :open example.csv
|
||||
|
||||
[mlr] :read
|
||||
|
||||
[mlr] $*
|
||||
{
|
||||
"color": "yellow",
|
||||
"shape": "triangle",
|
||||
"flag": "true",
|
||||
"k": 1,
|
||||
"index": 11,
|
||||
"quantity": 43.6498,
|
||||
"rate": 9.8870
|
||||
}
|
||||
|
||||
GENMD-EOF
|
||||
|
||||
## Using Miller without the REPL
|
||||
|
|
|
|||
|
|
@ -1 +1 @@
|
|||
Please see [go/README.md](https://github.com/johnkerl/miller/blob/master/go/README.md) for an overview; please see each subdirectory for details about it.
|
||||
Please see [../../README-go-port.md](../../README-go-port.md) for an overview; please see each subdirectory for details about it.
|
||||
|
|
|
|||
|
|
@ -1 +1 @@
|
|||
See [../../../regtest/README.md](../../../regtest/README.md).
|
||||
See [../../../test/README.md](../../../test/README.md).
|
||||
|
|
|
|||
|
|
@ -66,12 +66,6 @@ func RegTestMain(args []string) int {
|
|||
exeName = args[argi]
|
||||
argi++
|
||||
|
||||
} else if arg == "-c" {
|
||||
exeName = "../c/mlr"
|
||||
|
||||
} else if arg == "-g" {
|
||||
exeName = "../go/mlr"
|
||||
|
||||
} else if arg == "-s" {
|
||||
if argi >= argc {
|
||||
regTestUsage(verbName, os.Stderr, 1)
|
||||
|
|
|
|||
|
|
@ -19,7 +19,7 @@
|
|||
// * Optionally, an 'env' file with environment variables to be set before the
|
||||
// case and unset after.
|
||||
// * Optionally a case-local 'input' file; many cases use shared/common data
|
||||
// in regtest/input/.
|
||||
// in test/input/.
|
||||
// * The cmd file can refer to '${CASEDIR}' which is expanded at runtime to
|
||||
// the case directory path, so cases can refer to their 'input' and 'mlr'
|
||||
// files.
|
||||
|
|
@ -33,7 +33,7 @@
|
|||
// test/cases/dsl-redirects/0109/mlr
|
||||
//
|
||||
// $ cat test/cases/dsl-redirects//0109/cmd
|
||||
// mlr head -n 4 then put -q -f ${CASEDIR}/mlr regtest/input/abixy
|
||||
// mlr head -n 4 then put -q -f ${CASEDIR}/mlr test/input/abixy
|
||||
//
|
||||
// $ cat test/cases/dsl-redirects//0109/experr
|
||||
// NR=1,a=pan,b=pan
|
||||
|
|
|
|||
|
|
@ -4,9 +4,9 @@ This contains the implementation of the [`types.Mlrval`](./mlrval.go) datatype w
|
|||
|
||||
The [`types.Mlrval`](./mlrval.go) structure includes **string, int, float, boolean, array-of-mlrval, map-string-to-mlrval, void, absent, and error** types as well as type-conversion logic for various operators.
|
||||
|
||||
* Miller's `absent` type is like Javascript's `undefined` -- it's for times when there is no such key, as in a DSL expression `$out = $foo` when the input record is `$x=3,y=4` -- there is no `$foo` so `$foo` has `absent` type. Nothing is written to the `$out` field in this case. See also [here](http://johnkerl.org/miller/doc/reference.html#Null_data:_empty_and_absent) for more information.
|
||||
* Miller's `absent` type is like Javascript's `undefined` -- it's for times when there is no such key, as in a DSL expression `$out = $foo` when the input record is `$x=3,y=4` -- there is no `$foo` so `$foo` has `absent` type. Nothing is written to the `$out` field in this case. See also [here](https://miller.readthedocs.io/en/latest/reference-main-null-data) for more information.
|
||||
* Miller's `void` type is like Javascript's `null` -- it's for times when there is a key with no value, as in `$out = $x` when the input record is `$x=,$y=4`. This is an overlap with `string` type, since a void value looks like an empty string. I've gone back and forth on this (including when I was writing the C implementation) -- whether to retain `void` as a distinct type from empty-string, or not. I ended up keeping it as it made the `Mlrval` logic easier to understand.
|
||||
* Miller's `error` type is for things like doing type-uncoerced addition of strings. Data-dependent errors are intended to result in `(error)`-valued output, rather than crashing Miller. See also [here](http://johnkerl.org/miller/doc/reference.html#Data_types) for more information.
|
||||
* Miller's `error` type is for things like doing type-uncoerced addition of strings. Data-dependent errors are intended to result in `(error)`-valued output, rather than crashing Miller. See also [here](https://miller.readthedocs.io/en/latest/reference-main-data-types) for more information.
|
||||
* Miller's number handling makes auto-overflow from int to float transparent, while preserving the possibility of 64-bit bitwise arithmetic.
|
||||
* This is different from JavaScript, which has only double-precision floats and thus no support for 64-bit numbers (note however that there is now [`BigInt`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt)).
|
||||
* This is also different from C and Go, wherein casts are necessary -- without which int arithmetic overflows.
|
||||
|
|
@ -15,7 +15,7 @@ The [`types.Mlrval`](./mlrval.go) structure includes **string, int, float, boole
|
|||
* Bitwise operators such as `|`, `&`, and `^` map ints to ints.
|
||||
* The auto-overflowing math operators `+`, `*`, etc. map ints to ints unless they overflow in which case float is produced.
|
||||
* The int-preserving math operators `.+`, `.*`, etc. map ints to ints even if they overflow.
|
||||
* See also [here](http://johnkerl.org/miller/doc/reference.html#Arithmetic) for the semantics of Miller arithmetic, which the `Mlrval` class implements.
|
||||
* See also [here](https://miller.readthedocs.io/en/latest/reference-main-arithmetic) for the semantics of Miller arithmetic, which the `Mlrval` class implements.
|
||||
* Since a Mlrval can be of type array-of-mlrval or map-string-to-mlrval, a Mlrval is suited for JSON decoding/encoding.
|
||||
|
||||
# Mlrmap
|
||||
|
|
|
|||
|
|
@ -1,7 +1,8 @@
|
|||
# Respective MANPATH entries would include /usr/local/share/man or $HOME/man.
|
||||
# This should be run after make in the ../c directory but before make in the ../docs directory,
|
||||
# since ../go/mlr is used to autogenerate ./manpage.txt which is used in ../docs.
|
||||
# See also https://miller.readthedocs.io/en/latest/build.html#creating-a-new-release-for-developers
|
||||
# This should be run after make in the base directory but before make in the ../docs directory,
|
||||
# since ../mlr is used to autogenerate ./manpage.txt which is used in ../docs.
|
||||
# See also ../Makefile and
|
||||
# https://miller.readthedocs.io/en/latest/build.html#creating-a-new-release-for-developers
|
||||
PREFIX=/usr/local
|
||||
INSTALLDIR=$(PREFIX)/share/man/man1
|
||||
|
||||
|
|
@ -27,5 +28,4 @@ install:
|
|||
mkdir -p $(DESTDIR)/$(INSTALLDIR)
|
||||
cp mlr.1 $(DESTDIR)/$(INSTALLDIR)/mlr.1
|
||||
|
||||
# ----------------------------------------------------------------
|
||||
.PHONY: build install
|
||||
|
|
|
|||
|
|
@ -6,7 +6,7 @@ The vast majority of Miller tests, though -- thousands of cases -- are tested by
|
|||
|
||||
## How to run the regression tests, in brief
|
||||
|
||||
*Note: while this `README.md` file is within the `go/regtest/` subdirectory, all paths in this file are written from the perspective of the user being cd'ed into the `go/` directory, i.e. this directory's parent directory.*
|
||||
*Note: while this `README.md` file is within the `test/` subdirectory, all paths in this file are written from the perspective of the user being cd'ed into the repository base directory, i.e. this directory's parent directory.*
|
||||
|
||||
* `mlr regtest --help`
|
||||
|
||||
|
|
|
|||