Compare commits

462 commits
v6.8.0 ... main

Author SHA1 Message Date
dependabot[bot]
f98a35bb05
Bump actions/cache from 5.0.1 to 5.0.2 (#1941)
Bumps [actions/cache](https://github.com/actions/cache) from 5.0.1 to 5.0.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](9255dc7a25...8b402f58fb)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 5.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-20 09:20:41 -05:00
dependabot[bot]
09083a0d25
Bump github.com/klauspost/compress from 1.18.2 to 1.18.3 (#1940)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.18.2 to 1.18.3.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Commits](https://github.com/klauspost/compress/compare/v1.18.2...v1.18.3)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-version: 1.18.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 09:52:25 -05:00
dependabot[bot]
b13037c84f
Bump actions/setup-go from 6.1.0 to 6.2.0 (#1938)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 6.1.0 to 6.2.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](4dc6199c7b...7a3fe6cf4c)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: 6.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-13 10:38:30 -05:00
dependabot[bot]
8ec8de61e3
Bump github/codeql-action from 4.31.9 to 4.31.10 (#1939)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.9 to 4.31.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](5d4e8d1aca...cdefb33c0f)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-13 10:33:32 -05:00
dependabot[bot]
888d27acdb
Bump golang.org/x/term from 0.38.0 to 0.39.0 (#1936)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.38.0 to 0.39.0.
- [Commits](https://github.com/golang/term/compare/v0.38.0...v0.39.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.39.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-12 10:55:11 -05:00
dependabot[bot]
49869ba8e4
Bump golang.org/x/text from 0.32.0 to 0.33.0 (#1937)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.32.0 to 0.33.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.32.0...v0.33.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.33.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-12 10:49:50 -05:00
John Kerl
eb972e19eb
Use GOCC fork for performance improvement (#1934)
* Use GOCC fork for performance improvement

* fix versions
2026-01-10 16:48:53 -05:00
dependabot[bot]
4ce21e998b
Bump golang.org/x/sys from 0.39.0 to 0.40.0 (#1933)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.39.0 to 0.40.0.
- [Commits](https://github.com/golang/sys/compare/v0.39.0...v0.40.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.40.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-09 09:54:19 -05:00
John Kerl
e08e3ca80c
Add snapcraft.io link to install instructions
2026-01-07 13:45:36 -05:00
dependabot[bot]
1cc17e27b0
Bump actions/checkout from 4 to 6 (#1932)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-05 09:37:08 -05:00
John Kerl
a504e16b93
Try to build for Ubuntu arm64 (#1931)
* Fix Snap link

* Try to build for Ubuntu arm64
2026-01-02 15:19:34 -05:00
John Kerl
cee04c0747
Fix Snap link
2026-01-02 14:50:20 -05:00
John Kerl
421042833a
README.md
2026-01-02 14:24:51 -05:00
John Kerl
b8db798a2f
Miller 6.16.0 (#1930)
* Miller 6.16.0

* make dev
2026-01-02 13:57:59 -05:00
John Kerl
5b6f64669a
Snap notes (#1929)
2026-01-02 13:29:41 -05:00
John Kerl
7b8822e2ef
Snap name is not mlr but miller (#1928)
2026-01-02 11:58:49 -05:00
kz6fittycent
ac30743242
Fixed README (#1871)
* initial snap commit

* Needs network interface

Network interface added - should correct connectivity issue.
Added workflow badge, too

* not needed

* README updates

* One more indentation/gap fixed

* Changed name to mlr from miller

No alias needed with this name change.

---------

Co-authored-by: John Kerl <kerl.john.r@gmail.com>
2026-01-02 11:52:52 -05:00
John Kerl
0b8da34b4a
Use snap name mlr, not miller (#1872)
2026-01-02 11:02:02 -05:00
dependabot[bot]
dc9105a922
Bump github/codeql-action from 4.31.8 to 4.31.9 (#1926)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.8 to 4.31.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](1b168cd394...5d4e8d1aca)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-17 09:40:33 -05:00
dependabot[bot]
38e9ff212b
Bump actions/cache from 5.0.0 to 5.0.1 (#1924)
Bumps [actions/cache](https://github.com/actions/cache) from 5.0.0 to 5.0.1.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](a783357455...9255dc7a25)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 5.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-15 09:55:46 -05:00
dependabot[bot]
8f1e327b4e
Bump actions/upload-artifact from 5.0.0 to 6.0.0 (#1925)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 5.0.0 to 6.0.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](330a01c490...b7c566a772)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-15 09:55:39 -05:00
dependabot[bot]
e5d65fd28c
Bump actions/cache from 4.3.0 to 5.0.0 (#1922)
Bumps [actions/cache](https://github.com/actions/cache) from 4.3.0 to 5.0.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0057852bfa...a783357455)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 5.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-12 09:39:39 -05:00
dependabot[bot]
fe6c8d57bc
Bump github/codeql-action from 4.31.7 to 4.31.8 (#1923)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.7 to 4.31.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](cf1bb45a27...1b168cd394)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-12 09:39:28 -05:00
dependabot[bot]
c078c80361
Bump golang.org/x/term from 0.37.0 to 0.38.0 (#1921)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.37.0 to 0.38.0.
- [Commits](https://github.com/golang/term/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-10 11:11:28 -05:00
dependabot[bot]
34b1f0d4e9
Bump golang.org/x/term from 0.36.0 to 0.37.0 (#1909)
* `mlr sort -b` feature

* mlr regtest -p test/cases/cli-help && make dev

* Bump golang.org/x/term from 0.36.0 to 0.37.0

Bumps [golang.org/x/term](https://github.com/golang/term) from 0.36.0 to 0.37.0.
- [Commits](https://github.com/golang/term/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: John Kerl <john.kerl@datadoghq.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: John Kerl <kerl.john.r@gmail.com>
2025-12-09 10:21:08 -05:00
dependabot[bot]
9920e28b91
Bump golang.org/x/text from 0.31.0 to 0.32.0 (#1919)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.31.0 to 0.32.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.31.0...v0.32.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.32.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-09 10:11:14 -05:00
dependabot[bot]
1279a9b4a7
Bump golang.org/x/sys from 0.38.0 to 0.39.0 (#1920)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.38.0 to 0.39.0.
- [Commits](https://github.com/golang/sys/compare/v0.38.0...v0.39.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.39.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-09 10:10:19 -05:00
dependabot[bot]
155227cb4c
Bump github/codeql-action from 4.31.6 to 4.31.7 (#1918)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.6 to 4.31.7.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](fe4161a26a...cf1bb45a27)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-08 09:39:53 -05:00
dependabot[bot]
2f46fec72d
Bump github/codeql-action from 4.31.5 to 4.31.6 (#1916)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.5 to 4.31.6.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](fdbfb4d275...fe4161a26a)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-02 09:29:17 -05:00
dependabot[bot]
93be5051ff
Bump github.com/klauspost/compress from 1.18.1 to 1.18.2 (#1917)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.18.1 to 1.18.2.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Commits](https://github.com/klauspost/compress/compare/v1.18.1...v1.18.2)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-version: 1.18.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-02 09:18:20 -05:00
dependabot[bot]
df74ffe40d
Bump github/codeql-action from 4.31.4 to 4.31.5 (#1915)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.4 to 4.31.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](e12f017898...fdbfb4d275)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 09:59:00 -05:00
dependabot[bot]
439c4a2061
Bump actions/checkout from 5 to 6 (#1913)
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-21 08:51:40 -05:00
dependabot[bot]
efb7b55da5
Bump actions/setup-go from 6.0.0 to 6.1.0 (#1912)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 6.0.0 to 6.1.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](4469467582...4dc6199c7b)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: 6.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-20 09:00:05 -05:00
dependabot[bot]
2aa664bfea
Bump github/codeql-action from 4.31.3 to 4.31.4 (#1911)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.3 to 4.31.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](014f16e7ab...e12f017898)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-19 11:17:48 -05:00
dependabot[bot]
e5218ed8e7
Bump github/codeql-action from 4.31.2 to 4.31.3 (#1910)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.2 to 4.31.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](0499de31b9...014f16e7ab)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-14 09:43:26 -05:00
dependabot[bot]
a66e45539d
Bump golang.org/x/text from 0.30.0 to 0.31.0 (#1908)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.30.0 to 0.31.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.30.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.31.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-12 08:57:25 -05:00
dependabot[bot]
6351f51eeb
Bump golang.org/x/sys from 0.37.0 to 0.38.0 (#1907)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.37.0 to 0.38.0.
- [Commits](https://github.com/golang/sys/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 10:20:11 -05:00
dependabot[bot]
df8e979b66
Bump codespell-project/actions-codespell from 2.1 to 2.2 (#1906)
Bumps [codespell-project/actions-codespell](https://github.com/codespell-project/actions-codespell) from 2.1 to 2.2.
- [Release notes](https://github.com/codespell-project/actions-codespell/releases)
- [Commits](406322ec52...8f01853be1)

---
updated-dependencies:
- dependency-name: codespell-project/actions-codespell
  dependency-version: '2.2'
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-07 09:36:06 -05:00
dependabot[bot]
2a78d165ae
Bump github/codeql-action from 4.31.1 to 4.31.2 (#1904)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.1 to 4.31.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](5fe9434cd2...0499de31b9)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-31 10:28:52 -04:00
dependabot[bot]
bc9c718cf9
Bump github/codeql-action from 4.31.0 to 4.31.1 (#1903)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.31.0 to 4.31.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4e94bd11f7...5fe9434cd2)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-30 09:24:19 -04:00
dependabot[bot]
9149fd0d34
Bump github/codeql-action from 4.30.9 to 4.31.0 (#1902)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.30.9 to 4.31.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](16140ae1a1...4e94bd11f7)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.31.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-27 09:37:21 -04:00
dependabot[bot]
aea74327ff
Bump actions/upload-artifact from 4.6.2 to 5.0.0 (#1901)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.6.2 to 5.0.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](ea165f8d65...330a01c490)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: 5.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-27 09:36:08 -04:00
dependabot[bot]
6100f21785
Bump github.com/klauspost/compress from 1.18.0 to 1.18.1 (#1899)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.18.0 to 1.18.1.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.18.0...v1.18.1)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-version: 1.18.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-20 09:01:46 -04:00
dependabot[bot]
3e374f8861
Bump github/codeql-action from 4.30.8 to 4.30.9 (#1900)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.30.8 to 4.30.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f443b600d9...16140ae1a1)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.30.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-20 08:59:32 -04:00
dependabot[bot]
74f4901d05
Bump github/codeql-action from 4.30.7 to 4.30.8 (#1897)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.30.7 to 4.30.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](e296a93559...f443b600d9)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.30.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-13 09:38:05 -04:00
dependabot[bot]
e71b36d8c1
Bump golang.org/x/term from 0.35.0 to 0.36.0 (#1895)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.35.0 to 0.36.0.
- [Commits](https://github.com/golang/term/compare/v0.35.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.36.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-09 13:22:41 +02:00
dependabot[bot]
1557e47ae1
Bump golang.org/x/text from 0.29.0 to 0.30.0 (#1894)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.29.0 to 0.30.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.29.0...v0.30.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.30.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-09 13:14:58 +02:00
dependabot[bot]
8f882b2f75
Bump golang.org/x/sys from 0.36.0 to 0.37.0 (#1896)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.36.0 to 0.37.0.
- [Commits](https://github.com/golang/sys/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-09 13:08:43 +02:00
dependabot[bot]
f5226e87fe
Bump github/codeql-action from 3.30.6 to 4.30.7 (#1893)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.6 to 4.30.7.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](64d10c1313...e296a93559)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.30.7
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-08 13:08:08 +02:00
dependabot[bot]
f485bc07a5
Bump github/codeql-action from 3.30.5 to 3.30.6 (#1892)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.5 to 3.30.6.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](3599b3baa1...64d10c1313)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-03 09:24:44 -04:00
dependabot[bot]
eac1785756
Bump github/codeql-action from 3.30.4 to 3.30.5 (#1891)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.4 to 3.30.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](303c0aef88...3599b3baa1)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-29 10:57:13 -04:00
dependabot[bot]
f350581175
Bump actions/cache from 4.2.4 to 4.3.0 (#1889)
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.4 to 4.3.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0400d5f644...0057852bfa)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 4.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-25 09:41:02 -04:00
dependabot[bot]
5c5281fe28
Bump github/codeql-action from 3.30.3 to 3.30.4 (#1890)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.3 to 3.30.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](192325c861...303c0aef88)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-25 09:40:06 -04:00
dependabot[bot]
14e0229c34
Bump github/codeql-action from 3.30.2 to 3.30.3 (#1887)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.2 to 3.30.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](d3678e237b...192325c861)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-11 10:21:57 -04:00
dependabot[bot]
fbe1143e8a
Bump github/codeql-action from 3.30.1 to 3.30.2 (#1886)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.1 to 3.30.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f1f6e5f6af...d3678e237b)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-09 08:31:04 -04:00
dependabot[bot]
46a86503ea
Bump golang.org/x/term from 0.34.0 to 0.35.0 (#1883)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.34.0 to 0.35.0.
- [Commits](https://github.com/golang/term/compare/v0.34.0...v0.35.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.35.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 08:58:40 -04:00
dependabot[bot]
2d29beb204
Bump golang.org/x/sys from 0.35.0 to 0.36.0 (#1884)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.35.0 to 0.36.0.
- [Commits](https://github.com/golang/sys/compare/v0.35.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.36.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 08:46:45 -04:00
Rehan Daphedar
aec5c03093
fix go install command (#1881) 2025-09-08 08:45:57 -04:00
dependabot[bot]
26826a0b4b
Bump github/codeql-action from 3.30.0 to 3.30.1 (#1882)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.0 to 3.30.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](2d92b76c45...f1f6e5f6af)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 08:41:52 -04:00
dependabot[bot]
46653f0a8f
Bump golang.org/x/text from 0.28.0 to 0.29.0 (#1885)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.28.0 to 0.29.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 08:40:04 -04:00
dependabot[bot]
d87bd9f7d3
Bump actions/setup-go from 5.5.0 to 6.0.0 (#1880)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.5.0 to 6.0.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](d35c59abb0...4469467582)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-04 10:31:45 -04:00
John Kerl
3b9f169162
Support -o jsonl as well as --ojsonl (#1879)
* `mlr sort -b` feature

* mlr regtest -p test/cases/cli-help && make dev

* Support `-o jsonl` as well as `--ojsonl`
2025-09-02 16:47:19 -04:00
dependabot[bot]
05429ee3ba
Bump github/codeql-action from 3.29.11 to 3.30.0 (#1878)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.11 to 3.30.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](3c3833e0f8...2d92b76c45)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.30.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-01 20:15:19 -04:00
Stephen Kitt
2f3b6d38f9
Allow any Go 1.24 version (#1876)
Miller doesn't require 1.24.5 specifically, so reduce the language level
to 1.24.0. This allows building with any 1.24 toolchain.

Signed-off-by: Stephen Kitt <steve@sk2.org>
2025-08-29 09:02:26 -04:00
dependabot[bot]
74e8e3cef6
Bump github.com/stretchr/testify from 1.11.0 to 1.11.1 (#1875)
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.11.0 to 1.11.1.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.11.0...v1.11.1)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-version: 1.11.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-28 09:04:25 -04:00
dependabot[bot]
2f38933a87
Bump github.com/stretchr/testify from 1.10.0 to 1.11.0 (#1874)
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-version: 1.11.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-25 18:26:46 -04:00
dependabot[bot]
43f6fa9ea6
Bump github/codeql-action from 3.29.10 to 3.29.11 (#1873)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.10 to 3.29.11.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](96f518a34f...3c3833e0f8)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.11
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-22 10:41:54 -04:00
John Kerl
d0f824aefe
Run make dev after merge of PR 1868 (#1869) 2025-08-20 10:21:51 -07:00
Andrea Borruso
120e977c1e
Update subs.go (#1868)
If I read “Convert all field names,” I think the verb acts on the field names. I think it would be better to write “Convert all fields.”
2025-08-20 10:11:34 -07:00
dependabot[bot]
6266a869eb
Bump github/codeql-action from 3.29.9 to 3.29.10 (#1867)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.9 to 3.29.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](df559355d5...96f518a34f)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-20 09:55:58 -07:00
dependabot[bot]
6509ed4586
Bump actions/checkout from 4 to 5 (#1866)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-20 09:53:09 -07:00
kz6fittycent
db11c17e54
initial snap commit (#1864) 2025-08-15 20:30:56 -04:00
John Kerl
3c2d4b22d2
Miller 6.15.0-dev (#1862)
* 6.15.0-dev

* make dev
2025-08-15 19:55:46 -04:00
John Kerl
3ad00b5686 unit-test coverage for error-handling 2025-08-15 19:54:36 -04:00
John Kerl
d2925aafe5 error-handling in the CSV-reader's constructor 2025-08-15 19:52:37 -04:00
John Kerl
8b524b3ada make dev 2025-08-15 19:48:48 -04:00
John Kerl
4d83e88ff6 Note that comment prefix for CSV must be single-character 2025-08-15 19:45:18 -04:00
dependabot[bot]
cd6431f7aa
Bump goreleaser/goreleaser-action from 6.3.0 to 6.4.0 (#1863)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 6.3.0 to 6.4.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](9c156ee8a1...e435ccd777)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-version: 6.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-15 08:07:06 -04:00
John Kerl
4ebef873d2
Miller 6.15.0 (#1860)
* miller 6.15.0

* make dev
2025-08-14 18:00:22 -04:00
John Kerl
06e16ea3ee
Don't parse CSV comments (#1859)
* `mlr sort -b` feature

* mlr regtest -p test/cases/cli-help && make dev

* Don't parse CSV comments

* Add tests for PR 1346

* Add tests for PR 1787

* Add test CSV files
2025-08-13 18:07:32 -04:00
dependabot[bot]
369156b70d
Bump actions/checkout from 4.2.2 to 5.0.0 (#1857)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.2.2 to 5.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](11bd71901b...08c6903cd8)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 5.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-12 14:43:42 -04:00
dependabot[bot]
78da997077
Bump github/codeql-action from 3.29.8 to 3.29.9 (#1856)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.8 to 3.29.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](76621b61de...df559355d5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-12 13:18:26 -04:00
dependabot[bot]
d4ace7527b
Bump golang.org/x/text from 0.27.0 to 0.28.0 (#1850)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.27.0 to 0.28.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.27.0...v0.28.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.28.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-11 09:16:11 -04:00
dependabot[bot]
f3a8fd42bc
Bump golang.org/x/term from 0.33.0 to 0.34.0 (#1851)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.33.0 to 0.34.0.
- [Commits](https://github.com/golang/term/compare/v0.33.0...v0.34.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-10 17:12:29 -05:00
dependabot[bot]
24a6e98709
Bump golang.org/x/sys from 0.34.0 to 0.35.0 (#1852)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.34.0 to 0.35.0.
- [Commits](https://github.com/golang/sys/compare/v0.34.0...v0.35.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.35.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-10 16:54:39 -05:00
dependabot[bot]
ab7a80cbf4
Bump github/codeql-action from 3.29.7 to 3.29.8 (#1853)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.7 to 3.29.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](51f77329af...76621b61de)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-10 16:54:11 -05:00
dependabot[bot]
44ddaea651
Bump actions/cache from 4.2.3 to 4.2.4 (#1854)
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.3 to 4.2.4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](5a3ec84eff...0400d5f644)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 4.2.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-10 16:53:54 -05:00
John Kerl
19e72f9dac
Preserve file mode on mlr -I (#1849)
* extract a helper function

* Preserve file mode on mlr -I
2025-08-05 18:11:27 -05:00
dependabot[bot]
3b8668d06f
Bump github/codeql-action from 3.29.4 to 3.29.5 (#1847)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.4 to 3.29.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4e828ff8d4...51f77329af)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-30 09:32:31 -04:00
dependabot[bot]
e6ca3f6856
Bump github.com/lestrrat-go/strftime from 1.1.0 to 1.1.1 (#1846)
Bumps [github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/lestrrat-go/strftime/releases)
- [Changelog](https://github.com/lestrrat-go/strftime/blob/master/Changes)
- [Commits](https://github.com/lestrrat-go/strftime/compare/v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: github.com/lestrrat-go/strftime
  dependency-version: 1.1.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-29 09:12:48 -04:00
John Kerl
1ef87c6278 add github.com/GuilloteauQ/miller-exercises to README.md 2025-07-26 13:10:43 -04:00
dependabot[bot]
226c9555ef
Bump github/codeql-action from 3.29.3 to 3.29.4 (#1845)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.3 to 3.29.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](d6bbdef45e...4e828ff8d4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-24 09:21:32 -04:00
dependabot[bot]
cf03b6d49c
Bump github.com/klauspost/compress from 1.17.11 to 1.18.0 (#1844)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.11 to 1.18.0.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.11...v1.18.0)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-version: 1.18.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 10:53:04 -04:00
dependabot[bot]
f3fdfc4e29
try to fix build error (#1757)
Co-authored-by: John Kerl <kerl.john.r@gmail.com>
2025-07-22 23:11:25 -04:00
John Kerl
52b7a47ae9
Use Go 1.24.5 (#1843) 2025-07-22 20:15:48 -04:00
Duncan Lock
c4c3ae2119
Add scoop install to README.md (#1842)
Add `scoop install main/miller` to Windows installation options.
2025-07-22 09:10:01 -04:00
dependabot[bot]
b77d9826ea
Bump github/codeql-action from 3.29.2 to 3.29.3 (#1841)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.2 to 3.29.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](181d5eefc2...d6bbdef45e)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-21 09:47:44 -04:00
John Kerl
9445046bfe
Force decimal formatting for ints on JSON output (#1840)
* Force decimal formatting for ints on JSON output

* update a test case
2025-07-20 17:42:37 -04:00
John Kerl
fccdf215e6
DKVP --incr-key option (#1839)
* Code support for --incr-key

* Add source code for online help for new flag

* Run `make dev`
2025-07-20 17:05:24 -04:00
John Kerl
d264f562dc
Fix doc typo re empty and multiplication (#1838)
* Fix docs typo re empty and multiplication

* Run `make dev`
2025-07-20 16:36:50 -04:00
John Kerl
e7fe363d9a
mlr sort -b feature (#1833)
* `mlr sort -b` feature

* mlr regtest -p test/cases/cli-help && make dev
2025-07-11 12:41:04 -04:00
dependabot[bot]
865c9cc563
Bump golang.org/x/text from 0.26.0 to 0.27.0 (#1830)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.27.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:31:58 -04:00
dependabot[bot]
23acc8424a
Bump golang.org/x/term from 0.32.0 to 0.33.0 (#1831)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.32.0 to 0.33.0.
- [Commits](https://github.com/golang/term/compare/v0.32.0...v0.33.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.33.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:05:38 -04:00
dependabot[bot]
f673c1a30e
Bump golang.org/x/sys from 0.33.0 to 0.34.0 (#1832)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.33.0 to 0.34.0.
- [Commits](https://github.com/golang/sys/compare/v0.33.0...v0.34.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:01:21 -04:00
John Kerl
3137313867 Release-specific docs for 6.14.0 2025-07-04 15:18:08 -04:00
John Kerl
0ba6710a79 Update main version to 6.14.0-dev 2025-07-04 15:10:26 -04:00
John Kerl
127c4925a2 Merge branch 'main' of https://github.com/johnkerl/miller 2025-07-04 14:16:14 -04:00
John Kerl
fefb304650 Update release docs on xattr trick for macOS 2025-07-04 14:15:59 -04:00
John Kerl
7a6958926d
Miller 6.14.0 (#1828) 2025-07-04 13:55:56 -04:00
John Kerl
b7248bae98
Doc copy edits (#1827)
* Update index.md.in

* more copy-editing

* swipes.sh

* swipes.sh

* run `make docs` to generate `*.md` from `*.md.in`
2025-07-04 13:43:22 -04:00
John Kerl
99a98b0dc7
Add -c, -t, -j to doc matrix in PR 1824 (#1826)
* Add `-c`, `-t`, `-j` to doc matrix in PR 1824

* Run `make dev`
2025-07-03 19:23:38 -04:00
Balki
d6cd981c87
Add Keystroke savers for same format (#1824) 2025-07-03 19:01:17 -04:00
Balki
e67bdef98e
cut: Consider -o flag even when using regexes with -r (#1823)
* cut: Consider `-o` flag even when using regexes with `-r`

* update doc for cut -r flag
2025-07-03 18:54:09 -04:00
dependabot[bot]
4d84f99120
Bump github/codeql-action from 3.29.1 to 3.29.2 (#1825)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.1 to 3.29.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](39edc492db...181d5eefc2)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-01 09:32:07 -04:00
dependabot[bot]
de05d9665b
Bump github/codeql-action from 3.29.0 to 3.29.1 (#1822)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.29.0 to 3.29.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](ce28f5bb42...39edc492db)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-27 10:36:05 -04:00
John Kerl
d30501a69b
Argument parsing is different in mlr -s scripts (#1817) 2025-06-13 13:54:34 -04:00
dependabot[bot]
34c9d764d8
Bump github/codeql-action from 3.28.19 to 3.29.0 (#1814)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.19 to 3.29.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](fca7ace96b...ce28f5bb42)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-12 17:08:14 -04:00
dependabot[bot]
8e07a2f78d
Bump golang.org/x/text from 0.25.0 to 0.26.0 (#1813)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.25.0 to 0.26.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.25.0...v0.26.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.26.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-06 09:30:38 -04:00
dependabot[bot]
cc7f72b741
Bump github/codeql-action from 3.28.18 to 3.28.19 (#1812)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.18 to 3.28.19.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](ff0a06e83c...fca7ace96b)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.19
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-04 08:11:40 -04:00
dependabot[bot]
68f2845578
Bump github/codeql-action from 3.28.17 to 3.28.18 (#1808)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.17 to 3.28.18.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](60168efe1c...ff0a06e83c)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.18
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-16 08:32:28 -04:00
John Kerl
ea242a242a
Docs for new surv verb (#1807) 2025-05-15 19:41:58 -04:00
dependabot[bot]
d14dc76318
Bump golang.org/x/term from 0.29.0 to 0.32.0 (#1799)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.29.0 to 0.32.0.
- [Commits](https://github.com/golang/term/compare/v0.29.0...v0.32.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-version: 0.32.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-15 18:29:17 -04:00
dependabot[bot]
230b348a71
Bump golang.org/x/text from 0.22.0 to 0.25.0 (#1800)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.22.0 to 0.25.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.22.0...v0.25.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-version: 0.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-15 18:23:42 -04:00
dependabot[bot]
e9637bba9d
Bump golang.org/x/sys from 0.30.0 to 0.33.0 (#1801)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.30.0 to 0.33.0.
- [Commits](https://github.com/golang/sys/compare/v0.30.0...v0.33.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.33.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-15 18:23:33 -04:00
Christian G. Warden
df73ad8ec0
Add surv Verb to Estimate a Survival Curve (#1788)
Add a surv verb to estimate a survival curve using Kaplan-Meier.  It
requires duration and status (event or censored) columns, and outputs
each distinct duration and corresponding probability of survival.
2025-05-15 18:17:08 -04:00
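The Kaplan-Meier estimate described in the surv commit (#1788) can be sketched as follows. This is a minimal illustration of the general technique, not Miller's implementation: the function name `kaplan_meier` and the status encoding (1 for an observed event, 0 for censored) are assumptions made for this example. At each distinct duration, the running survival probability is multiplied by one minus the ratio of events to subjects still at risk.

```python
from collections import Counter


def kaplan_meier(durations, statuses):
    """Return [(duration, survival_probability)] for each distinct duration.

    statuses[i] is truthy if the event was observed at durations[i]
    and falsy if the observation was censored (assumed encoding).
    """
    if len(durations) != len(statuses):
        raise ValueError("durations and statuses must have the same length")

    events = Counter()   # observed events per distinct duration
    leaving = Counter()  # subjects leaving the risk set (events + censored)
    for t, s in zip(durations, statuses):
        leaving[t] += 1
        if s:
            events[t] += 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events[t]
        if d:
            # Product-limit step: S(t) = S(previous) * (1 - d / n)
            survival *= 1.0 - d / at_risk
        curve.append((t, survival))
        # Everyone with this duration leaves the risk set afterwards.
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    for duration, prob in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(duration, round(prob, 4))
```

Censored observations lower the number at risk for later durations without reducing the survival probability at their own duration, which is why the curve stays flat at a duration that has only censored records.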
dependabot[bot]
35c7eeb977
Bump actions/setup-go from 5.4.0 to 5.5.0 (#1802)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](0aaccfd150...d35c59abb0)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: 5.5.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-08 08:15:02 -04:00
John Kerl
ca7d47454d
Improve help message on non-existent verb (#1798) 2025-05-05 09:33:03 -04:00
dependabot[bot]
bbcf903647
Bump github/codeql-action from 3.28.16 to 3.28.17 (#1796)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.16 to 3.28.17.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](28deaeda66...60168efe1c)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.17
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-02 09:31:18 -04:00
John Kerl
34bc8a1c3d
Fix print within begin{}/end{} (#1795)
* codemod per se

* unit-test coverage

* lint
2025-05-01 17:18:17 -04:00
John Kerl
100166532c
Fix joinv with "" separator (#1794)
* codemod per se

* unit-test coverage
2025-05-01 17:08:55 -04:00
dependabot[bot]
629aebb989
Bump github/codeql-action from 3.28.15 to 3.28.16 (#1790)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.15 to 3.28.16.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](45775bd823...28deaeda66)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.16
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-24 08:37:41 -04:00
dependabot[bot]
121dd9425f
Bump github/codeql-action from 3.28.14 to 3.28.15 (#1783)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.14 to 3.28.15.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](fc7e4a0fa0...45775bd823)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.15
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-08 08:57:58 -04:00
dependabot[bot]
07130d8d65
Bump github/codeql-action from 3.28.13 to 3.28.14 (#1779)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.13 to 3.28.14.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](1b549b9259...fc7e4a0fa0)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.14
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-07 08:54:34 -04:00
dependabot[bot]
b6ee2eb202
Bump goreleaser/goreleaser-action from 6.2.1 to 6.3.0 (#1778)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 6.2.1 to 6.3.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](90a3faa9d0...9c156ee8a1)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 08:42:13 -04:00
dependabot[bot]
6e6e893bda
Bump github/codeql-action from 3.28.12 to 3.28.13 (#1776)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.12 to 3.28.13.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](5f8171a638...1b549b9259)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:10:26 -04:00
dependabot[bot]
f13a246754
Bump github/codeql-action from 3.28.11 to 3.28.12 (#1772)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.11 to 3.28.12.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](6bb031afdd...5f8171a638)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-20 08:49:25 -04:00
dependabot[bot]
48eba537aa
Bump actions/upload-artifact from 4.6.1 to 4.6.2 (#1773)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.6.1 to 4.6.2.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](4cec3d8aa0...ea165f8d65)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-20 08:49:17 -04:00
dependabot[bot]
1bfb8b0cc4
Bump actions/cache from 4.2.2 to 4.2.3 (#1774)
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.2 to 4.2.3.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](d4323d4df1...5a3ec84eff)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-20 08:47:44 -04:00
dependabot[bot]
b0addbe4f7
Bump actions/setup-go from 5.3.0 to 5.4.0 (#1771)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.3.0 to 5.4.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](f111f3307d...0aaccfd150)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-19 09:23:55 -04:00
dependabot[bot]
d45e7b06a6
Bump github/codeql-action from 3.28.10 to 3.28.11 (#1769)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.10 to 3.28.11.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](b56ba49b26...6bb031afdd)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-10 08:43:38 -04:00
John Kerl
d08ee47732
Use Go 1.21 in CI (#1768) 2025-03-06 08:32:03 -05:00
John Kerl
9963df4090
Switch to generics (#1763)
* gradually replace list.List with slices

* gradually replace list.List with slices

* more

* more

* more
2025-03-05 08:19:15 -05:00
dependabot[bot]
7d51030b88
Bump actions/cache from 4.2.1 to 4.2.2 (#1762)
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0c907a75c2...d4323d4df1)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-28 09:15:59 -05:00
dependabot[bot]
8e11fd36d5
Bump github/codeql-action from 3.28.9 to 3.28.10 (#1759)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.9 to 3.28.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](9e8d0789d4...b56ba49b26)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-24 09:12:48 -05:00
dependabot[bot]
4fe7051c1e
Bump actions/upload-artifact from 4.6.0 to 4.6.1 (#1760)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.6.0 to 4.6.1.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](65c4c4a1dd...4cec3d8aa0)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-24 09:01:06 -05:00
dependabot[bot]
ea0550b09b
Bump actions/cache from 4.2.0 to 4.2.1 (#1756)
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.0 to 4.2.1.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](1bd1e32a3b...0c907a75c2)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-19 09:04:29 -05:00
dependabot[bot]
a9a2549074
Bump goreleaser/goreleaser-action from 6.1.0 to 6.2.1 (#1755)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 6.1.0 to 6.2.1.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](9ed2f89a66...90a3faa9d0)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-11 08:11:12 -05:00
dependabot[bot]
20e1c87801
Bump github/codeql-action from 3.28.8 to 3.28.9 (#1753)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.8 to 3.28.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](dd746615b3...9e8d0789d4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-07 07:57:56 -05:00
Michel Lind
bd2497a285
Fix non-constant format string errors with Go 1.24 (#1745)
Use `errors.New` instead of `fmt.Errorf` and `fmt.Fprint` instead of
`fmt.Fprintf` if a non-constant string is used

Signed-off-by: Michel Lind <salimma@fedoraproject.org>
2025-02-05 08:32:23 -05:00
dependabot[bot]
225072384a
Bump golang.org/x/term from 0.28.0 to 0.29.0 (#1751)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.28.0 to 0.29.0.
- [Commits](https://github.com/golang/term/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-05 08:30:24 -05:00
dependabot[bot]
6bed7bb560
Bump golang.org/x/text from 0.21.0 to 0.22.0 (#1752)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.21.0 to 0.22.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.21.0...v0.22.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-05 07:58:10 -05:00
dependabot[bot]
813a5204dc
Bump github/codeql-action from 3.28.6 to 3.28.8 (#1748)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.6 to 3.28.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](17a820bf2e...dd746615b3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-30 09:17:09 -05:00
dependabot[bot]
70c485695c
Bump github/codeql-action from 3.28.5 to 3.28.6 (#1747)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.5 to 3.28.6.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f6091c0113...17a820bf2e)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-28 09:11:58 -05:00
dependabot[bot]
107e57e3e4
Bump github/codeql-action from 3.28.4 to 3.28.5 (#1746)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.4 to 3.28.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](ee117c905a...f6091c0113)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-27 09:26:00 -05:00
dependabot[bot]
cf458f0230
Bump github/codeql-action from 3.28.3 to 3.28.4 (#1744)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.3 to 3.28.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](dd196fa9ce...ee117c905a)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-24 09:47:39 -05:00
dependabot[bot]
3738b617ae
Bump github/codeql-action from 3.28.2 to 3.28.3 (#1743)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.2 to 3.28.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](d68b2d4edb...dd196fa9ce)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-23 09:27:59 -05:00
dependabot[bot]
ce3123b3fa
Bump github/codeql-action from 3.28.1 to 3.28.2 (#1742)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.1 to 3.28.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](b6a472f63d...d68b2d4edb)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-22 09:11:00 -05:00
dependabot[bot]
e3a1e833f0
Bump actions/setup-go from 5.2.0 to 5.3.0 (#1741)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.2.0 to 5.3.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](3041bf56c9...f111f3307d)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-21 08:34:39 -05:00
dependabot[bot]
2b6fa35388
Bump github/codeql-action from 3.28.0 to 3.28.1 (#1740)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.0 to 3.28.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](48ab28a6f5...b6a472f63d)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-13 08:53:46 -05:00
dependabot[bot]
9bf883233e
Bump actions/upload-artifact from 4.5.0 to 4.6.0 (#1739)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.5.0 to 4.6.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](6f51ac03b9...65c4c4a1dd)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-10 09:42:51 -05:00
dependabot[bot]
a83470d16c
Bump golang.org/x/term from 0.27.0 to 0.28.0 (#1737)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.27.0 to 0.28.0.
- [Commits](https://github.com/golang/term/compare/v0.27.0...v0.28.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-06 09:13:45 -05:00
dependabot[bot]
6287b04fa8
Bump golang.org/x/sys from 0.28.0 to 0.29.0 (#1738)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.28.0 to 0.29.0.
- [Commits](https://github.com/golang/sys/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-06 09:03:18 -05:00
John Kerl
0060cceafc
Fix section-title typos for docs in #1735 (#1736)
* fix typo in flatten/unflatten doc section titles

* run `make docs`
2024-12-23 14:19:51 -05:00
John Kerl
cc1cd954ea
Fix unflatten with field names like . .x or x..y (#1735)
* Fix unflatten with field name like `.` `.x` or `x..y`

* docs & test data
2024-12-23 12:27:08 -05:00
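The commit message does not say how names such as `.`, `.x`, or `x..y` are treated after the fix. The sketch below assumes one plausible policy, namely that a key with an empty segment is left flat rather than split, to show why such names need special handling. The `unflatten` and `hasEmpty` functions are hypothetical and are not Miller's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// hasEmpty reports whether any segment of a split key is empty,
// as happens for ".", ".x", and "x..y".
func hasEmpty(parts []string) bool {
	for _, p := range parts {
		if p == "" {
			return true
		}
	}
	return false
}

// unflatten turns keys like "a.b" into nested maps. Under the policy
// assumed here, keys with an empty segment are copied through unchanged.
// Conflicting inputs (for example both "a" and "a.b") are not handled.
func unflatten(flat map[string]any) map[string]any {
	out := make(map[string]any)
	for key, value := range flat {
		parts := strings.Split(key, ".")
		if hasEmpty(parts) {
			out[key] = value
			continue
		}
		cur := out
		for _, p := range parts[:len(parts)-1] {
			next, ok := cur[p].(map[string]any)
			if !ok {
				next = make(map[string]any)
				cur[p] = next
			}
			cur = next
		}
		cur[parts[len(parts)-1]] = value
	}
	return out
}

func main() {
	flat := map[string]any{"a.b": 1, ".x": 2, "x..y": 3, ".": 4}
	fmt.Println(unflatten(flat))
}
```

Without the empty-segment check, splitting `.x` on `.` would yield an empty map key at the top level, which is the kind of edge case the commit title points at.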
dependabot[bot]
8088850505
Bump github/codeql-action from 3.27.9 to 3.28.0 (#1734)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.9 to 3.28.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](df409f7d92...48ab28a6f5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-23 09:26:34 -05:00
dependabot[bot]
06e33c0f82
Bump actions/upload-artifact from 4.4.3 to 4.5.0 (#1732)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.3 to 4.5.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](b4b15b8c7c...6f51ac03b9)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-18 08:34:21 -05:00
dependabot[bot]
929a2357d0
Bump github/codeql-action from 3.27.7 to 3.27.9 (#1731)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.7 to 3.27.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](babb554ede...df409f7d92)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-13 08:07:35 -05:00
dependabot[bot]
dde2cd20a7
Bump actions/setup-go from 5.1.0 to 5.2.0 (#1729)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.1.0 to 5.2.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](41dfa10bad...3041bf56c9)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-11 08:09:33 -05:00
dependabot[bot]
8bc3c5f645
Bump github/codeql-action from 3.27.6 to 3.27.7 (#1730)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.6 to 3.27.7.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](aa57810251...babb554ede)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-11 08:07:28 -05:00
dependabot[bot]
63654683f0
Bump actions/cache from 4.1.2 to 4.2.0 (#1728)
Bumps [actions/cache](https://github.com/actions/cache) from 4.1.2 to 4.2.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](6849a64899...1bd1e32a3b)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-06 09:42:15 -05:00
dependabot[bot]
c01fe78fbd
Bump golang.org/x/term from 0.26.0 to 0.27.0 (#1726)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.26.0 to 0.27.0.
- [Commits](https://github.com/golang/term/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-05 07:33:02 -05:00
dependabot[bot]
e62a0b4b20
Bump golang.org/x/text from 0.20.0 to 0.21.0 (#1727)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.20.0 to 0.21.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.20.0...v0.21.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-05 07:01:46 -05:00
dependabot[bot]
0614b37dfa
Bump github/codeql-action from 3.27.5 to 3.27.6 (#1724)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.5 to 3.27.6.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f09c1c0a94...aa57810251)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-04 08:51:27 -05:00
dependabot[bot]
a728524bf3
Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 (#1723)
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.9.0 to 1.10.0.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.9.0...v1.10.0)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 09:19:57 -05:00
John Kerl
9f77bbe096
Add help strings for -a/-r in sub/gsub/ssub (#1721)
* Help strings for `-a`/`-r` in `sub`/`gsub`/`ssub`

* `mlr regtest -p test/cases/cli-help` to update expected outputs

* artifacts from `make dev`
2024-11-23 10:13:36 -05:00
John Kerl
5c65edba95 Merge branch 'main' of https://github.com/johnkerl/miller 2024-11-23 10:12:17 -05:00
John Kerl
019b15a310 delve.txt 2024-11-23 10:12:13 -05:00
dependabot[bot]
3050e0aeea
Bump github/codeql-action from 3.27.4 to 3.27.5 (#1719)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.4 to 3.27.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](ea9e4e3799...f09c1c0a94)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-21 08:00:11 -05:00
John Kerl
87da641d48 Merge branch 'main' of https://github.com/johnkerl/miller 2024-11-19 18:54:51 -05:00
John Kerl
2868fb6e7e rm xtodo.txt 2024-11-19 18:54:48 -05:00
dependabot[bot]
c189b6a2d8
Bump github/codeql-action from 3.27.3 to 3.27.4 (#1718)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.3 to 3.27.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](396bb3e453...ea9e4e3799)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-15 07:59:04 -05:00
dependabot[bot]
b0f9e03609
Bump github/codeql-action from 3.27.2 to 3.27.3 (#1717)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.2 to 3.27.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](9278e42166...396bb3e453)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-13 08:45:13 -05:00
dependabot[bot]
3d17ca117c
Bump github/codeql-action from 3.27.1 to 3.27.2 (#1716)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.1 to 3.27.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4f3212b617...9278e42166)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-12 09:05:02 -05:00
dependabot[bot]
cd3b0a62ab
Bump github/codeql-action from 3.27.0 to 3.27.1 (#1715)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.0 to 3.27.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](662472033e...4f3212b617)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-11 08:35:09 -07:00
dependabot[bot]
214129a95e
Bump golang.org/x/text from 0.19.0 to 0.20.0 (#1714)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.19.0 to 0.20.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-08 20:39:42 -07:00
dependabot[bot]
193a2ee37b
Bump goreleaser/goreleaser-action from 6.0.0 to 6.1.0 (#1711)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 6.0.0 to 6.1.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](286f3b13b1...9ed2f89a66)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-08 09:01:14 -05:00
dependabot[bot]
296430fe41
Bump golang.org/x/term from 0.25.0 to 0.26.0 (#1712)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.25.0 to 0.26.0.
- [Commits](https://github.com/golang/term/compare/v0.25.0...v0.26.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-08 09:00:52 -05:00
John Kerl
5424e753a4
Static-check fixes from @lespea #1657, batch 8/n (#1710) 2024-10-27 12:16:49 -04:00
John Kerl
41649bf4f9
Static-check fixes from @lespea #1657, batch 7/n (#1709) 2024-10-27 12:11:28 -04:00
John Kerl
b4ff26a7d0
Static-check fixes from @lespea #1657, batch 6/n (#1708)
* Static-check fixes from @lespea #1657, batch 2/n

* Static-check fixes from @lespea #1657, batch 3/n

* Static-check fixes from @lespea #1657, batch 4/n

* Static-check fixes from @lespea #1657, batch 5/n

* Static-check fixes from @lespea #1657, batch 6/n
2024-10-27 12:06:17 -04:00
John Kerl
02bd5344b9
Static-check fixes from @lespea #1657, batch 5/n (#1707)
* Static-check fixes from @lespea #1657, batch 2/n

* Static-check fixes from @lespea #1657, batch 3/n

* Static-check fixes from @lespea #1657, batch 4/n

* Static-check fixes from @lespea #1657, batch 5/n
2024-10-27 12:05:48 -04:00
John Kerl
8c791f5466
Static-check fixes from @lespea #1657, batch 4/n (#1706)
* Static-check fixes from @lespea #1657, batch 2/n

* Static-check fixes from @lespea #1657, batch 3/n

* Static-check fixes from @lespea #1657, batch 4/n
2024-10-27 12:00:25 -04:00
John Kerl
04a9b9decd
Static-check fixes from @lespea #1657, batch 3/n (#1705)
* Static-check fixes from @lespea #1657, batch 2/n

* Static-check fixes from @lespea #1657, batch 3/n
2024-10-27 11:55:38 -04:00
John Kerl
cc8a3c4b4e
Static-check fixes from @lespea #1657, batch 2/n (#1704) 2024-10-27 11:50:15 -04:00
John Kerl
047cb4bc28
Static-check fixes from @lespea #1657, batch 1/n (#1703) 2024-10-27 11:42:43 -04:00
dependabot[bot]
d7a5997d70
Bump actions/setup-go from 5.0.2 to 5.1.0 (#1700)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.0.2 to 5.1.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](0a12ed9d6a...41dfa10bad)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-25 08:02:54 -04:00
dependabot[bot]
1f6432e260
Bump actions/checkout from 4.2.1 to 4.2.2 (#1699)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](eef61447b9...11bd71901b)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-24 08:16:21 -04:00
dependabot[bot]
7225f2c094
Bump github/codeql-action from 3.26.13 to 3.27.0 (#1697)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.13 to 3.27.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f779452ac5...662472033e)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 07:58:15 -04:00
dependabot[bot]
bf320bcc99
Bump actions/cache from 4.1.1 to 4.1.2 (#1698)
Bumps [actions/cache](https://github.com/actions/cache) from 4.1.1 to 4.1.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](3624ceb22c...6849a64899)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 07:58:02 -04:00
John Kerl
05aa16cfcf
Join docs wrong link (#1695)
* Fix join-docs link in online help

* run `make dev` and commit the artifacts
2024-10-17 09:11:03 -04:00
dependabot[bot]
07c896833c
Bump github.com/klauspost/compress from 1.17.10 to 1.17.11 (#1691)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.10 to 1.17.11.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.10...v1.17.11)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-14 08:02:39 -04:00
dependabot[bot]
979addd3c3
Bump github/codeql-action from 3.26.12 to 3.26.13 (#1692)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.12 to 3.26.13.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](c36620d31a...f779452ac5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-14 08:02:27 -04:00
dependabot[bot]
2b4a0c2ca8
Bump actions/upload-artifact from 4.4.2 to 4.4.3 (#1690)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.2 to 4.4.3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](84480863f2...b4b15b8c7c)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-10 07:53:37 -04:00
dependabot[bot]
4e3b500f94
Bump actions/upload-artifact from 4.4.1 to 4.4.2 (#1689)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.1 to 4.4.2.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](604373da63...84480863f2)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 08:15:07 -04:00
dependabot[bot]
acc8a490e8
Bump actions/cache from 4.1.0 to 4.1.1 (#1688)
Bumps [actions/cache](https://github.com/actions/cache) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](2cdf405574...3624ceb22c)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 08:14:56 -04:00
dependabot[bot]
e9fbd9f48d
Bump actions/checkout from 4.2.0 to 4.2.1 (#1685)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.2.0 to 4.2.1.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](d632683dd7...eef61447b9)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 07:52:52 -04:00
dependabot[bot]
6ea8e238db
Bump actions/upload-artifact from 4.4.0 to 4.4.1 (#1686)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.0 to 4.4.1.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](50769540e7...604373da63)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 07:52:41 -04:00
dependabot[bot]
fd3e0d8ffc
Bump github/codeql-action from 3.26.11 to 3.26.12 (#1687)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.11 to 3.26.12.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](6db8d6351f...c36620d31a)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 07:52:30 -04:00
dependabot[bot]
bfa1fd4b28
Bump golang.org/x/term from 0.24.0 to 0.25.0 (#1680)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.24.0 to 0.25.0.
- [Commits](https://github.com/golang/term/compare/v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 08:30:19 -04:00
dependabot[bot]
e18eac29db
Bump golang.org/x/text from 0.18.0 to 0.19.0 (#1681)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.18.0 to 0.19.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.18.0...v0.19.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 08:25:31 -04:00
dependabot[bot]
8789f73d7b
Bump golang.org/x/sys from 0.25.0 to 0.26.0 (#1682)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.25.0 to 0.26.0.
- [Commits](https://github.com/golang/sys/compare/v0.25.0...v0.26.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 08:24:13 -04:00
dependabot[bot]
6eb5721070
Bump actions/cache from 4.0.2 to 4.1.0 (#1683)
Bumps [actions/cache](https://github.com/actions/cache) from 4.0.2 to 4.1.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0c45773b62...2cdf405574)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 08:24:00 -04:00
Stephen Kitt
7a0320fc27
Typo fix: programmatically (#1679)
Signed-off-by: Stephen Kitt <steve@sk2.org>
2024-10-06 17:30:12 -04:00
John Kerl
39c88041d6 make dev for previous commit 2024-10-05 10:50:37 -04:00
John Kerl
a0d65c3035 6.13.0-dev 2024-10-05 10:47:43 -04:00
John Kerl
f751084013 6.13 release docs 2024-10-05 10:46:36 -04:00
John Kerl
52f930ba31 trying again with go version / go mod tidy 2024-10-05 10:17:36 -04:00
John Kerl
7ef83f3a23 go mod tidy requires go 1.20 2024-10-05 09:50:07 -04:00
John Kerl
c66094a184 miller 6.13.0 2024-10-05 09:32:15 -04:00
Austin Letson
5cd457d565
Fix minor typo (#1673) 2024-10-05 09:27:31 -04:00
John Kerl
31d6164181
Fix 1668 error-source (#1672)
* Fix 1668 error-source

* run `make dev`
2024-10-05 09:25:47 -04:00
Andrea Borruso
26e55f2ec3
Characters to be removed (#1668)
It seems to me that they are to be removed
2024-10-05 08:49:53 -04:00
dependabot[bot]
5b3698402d
Bump github/codeql-action from 3.26.10 to 3.26.11 (#1669)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.10 to 3.26.11.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](e2b3eafc8d...6db8d6351f)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 08:53:13 -04:00
John Kerl
4a2f349289
Update source material for #1665 (#1666)
* Fix source info for #1665

* run `make dev`
2024-10-02 08:46:27 -04:00
Andrea Borruso
56210b045b
Update reference-verbs.md (#1665)
This should be a typo
2024-10-02 08:08:49 -04:00
dependabot[bot]
8b2290bd70
Bump github/codeql-action from 3.26.9 to 3.26.10 (#1664)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.9 to 3.26.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](461ef6c76d...e2b3eafc8d)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 07:32:02 -04:00
dependabot[bot]
563fd4b3d0
Bump actions/checkout from 4.1.7 to 4.2.0 (#1662)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.7 to 4.2.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](692973e3d9...d632683dd7)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 09:15:38 -04:00
dependabot[bot]
fbd7ef446f
Bump github/codeql-action from 3.26.8 to 3.26.9 (#1660)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.8 to 3.26.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](294a9d9291...461ef6c76d)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 07:47:09 -04:00
dependabot[bot]
025ba0707c
Bump github.com/klauspost/compress from 1.17.9 to 1.17.10 (#1659)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.9 to 1.17.10.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.9...v1.17.10)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 08:38:16 -04:00
Adam Lesperance
7afa99dec4
Compiling on newer go versions doesn't work (#1655)
For whatever reason, compiling with go `1.23` complains about needing to
run `go mod tidy`; running that bumps the go version to `1.21`, adds the
`toolchain` directive, and updates go.sum.

Compiling on go `1.20` works just fine without this update.

Not sure if you want to go all the way to `1.23` or do the minimum of
`1.21` so I just picked the latter and can change if you want to.
2024-09-20 12:52:15 -04:00
Adam Lesperance
085e831668
The package version must match the major tag version (#1654)
* Update package version

* Update makefile targets

* Update readme packages

* Remaining old packages via rg/sd
2024-09-20 12:10:11 -04:00
dependabot[bot]
a91abf5d5c
Bump github/codeql-action from 3.26.7 to 3.26.8 (#1652)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.7 to 3.26.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](8214744c54...294a9d9291)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 08:41:17 -04:00
Balki
d1767e7c18
Fix local time when TZ is not set (#1649)
Do not override time.Local when TZ is empty or unset. It is already set correctly by the Go standard library.
2024-09-17 10:58:09 -04:00
dependabot[bot]
5ef01ca356
Bump github/codeql-action from 3.26.6 to 3.26.7 (#1648)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.6 to 3.26.7.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4dd16135b6...8214744c54)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 08:13:24 -04:00
dependabot[bot]
73b1a4b40e
Bump golang.org/x/term from 0.23.0 to 0.24.0 (#1642)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.23.0 to 0.24.0.
- [Commits](https://github.com/golang/term/compare/v0.23.0...v0.24.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 08:33:09 -04:00
dependabot[bot]
e1ac188f49
Bump golang.org/x/text from 0.17.0 to 0.18.0 (#1641)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.17.0 to 0.18.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.17.0...v0.18.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 08:03:18 -04:00
dependabot[bot]
d739298160
Bump actions/upload-artifact from 4.3.6 to 4.4.0 (#1640)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.6 to 4.4.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](834a144ee9...50769540e7)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 10:10:52 -04:00
dependabot[bot]
bea792b136
Bump github/codeql-action from 3.26.5 to 3.26.6 (#1638)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.5 to 3.26.6.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](2c779ab0d0...4dd16135b6)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-29 09:39:58 -04:00
dependabot[bot]
52f28538f4
Bump github.com/lestrrat-go/strftime from 1.0.6 to 1.1.0 (#1637)
Bumps [github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime) from 1.0.6 to 1.1.0.
- [Release notes](https://github.com/lestrrat-go/strftime/releases)
- [Changelog](https://github.com/lestrrat-go/strftime/blob/master/Changes)
- [Commits](https://github.com/lestrrat-go/strftime/compare/v1.0.6...v1.1.0)

---
updated-dependencies:
- dependency-name: github.com/lestrrat-go/strftime
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 07:31:12 -04:00
Andrea Borruso
1fe2645989
Enable admonition extension (#1636)
In PR #1634 I added an admonition note.
I assumed that the admonition extension was enabled, but it was not. I apologize, John.

I have now enabled it as per the documentation:
https://squidfunk.github.io/mkdocs-material/reference/admonitions/?h=ad#admonitions
2024-08-27 12:02:27 -04:00
John Kerl
ab637328cd
Source-file update for PR 1634 (#1635) 2024-08-27 11:42:24 -04:00
Andrea Borruso
b63f66ff8c
A note about positional field names (#1634)
The inspiration comes from this question
https://stackoverflow.com/q/78908146/757714
2024-08-27 11:26:17 -04:00
Andrea Borruso
807775c519
Update extra.css (#1633)
removed a duplicate and corrected a typo
2024-08-27 08:44:20 -04:00
Andrea Borruso
d247fab73d
To have edit and copy code in each page (#1632) 2024-08-26 09:20:08 -04:00
Andrea Borruso
24e3c77280
To realize which chapter and section are active (#1631) 2024-08-26 09:14:53 -04:00
dependabot[bot]
ffa062adae
Bump github/codeql-action from 3.26.4 to 3.26.5 (#1630)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.4 to 3.26.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f0f3afee80...2c779ab0d0)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-26 08:00:44 -04:00
John Kerl
f33c0b2cd6
Error in splita/splitax when field contains a single non-string value (#1629) 2024-08-25 19:00:24 -04:00
John Kerl
73e2117b43
Misc. codespell findings (#1628) 2024-08-25 17:40:57 -04:00
John Kerl
1015f18e7b
Fix prepipe handling when filenames have whitespace (#1627)
* Fix prepipe handling when filenames have whitespace

* unit-test data

* Windows-only unit-test item

* Fix Windows fails; neaten
2024-08-25 17:40:07 -04:00
John Kerl
16a898cff4
Fix binary data in JSON output (#1626) 2024-08-25 15:00:51 -04:00
dependabot[bot]
60bdd6c922
Bump github/codeql-action from 3.26.3 to 3.26.4 (#1624)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.3 to 3.26.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](883d8588e5...f0f3afee80)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-22 09:03:18 -04:00
dependabot[bot]
f5010f4605
Bump github/codeql-action from 3.26.2 to 3.26.3 (#1623)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.2 to 3.26.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](429e197704...883d8588e5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-20 09:12:59 -04:00
dependabot[bot]
bdd26736a5
Bump codespell-project/actions-codespell from 2.0 to 2.1 (#1622)
Bumps [codespell-project/actions-codespell](https://github.com/codespell-project/actions-codespell) from 2.0 to 2.1.
- [Release notes](https://github.com/codespell-project/actions-codespell/releases)
- [Commits](94259cd8be...406322ec52)

---
updated-dependencies:
- dependency-name: codespell-project/actions-codespell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-19 07:42:56 -04:00
John Kerl
6bee4ebbf2
RS aliases for ASCII top-of-table control characters are misnamed (#1620)
* Fix misnames of ASCII control-character aliases

* artifacts from `make dev`
2024-08-16 10:25:25 -04:00
dependabot[bot]
7a2fa0bf07
Bump github/codeql-action from 3.26.1 to 3.26.2 (#1617)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.1 to 3.26.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](29d86d22a3...429e197704)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-15 09:03:54 -04:00
dependabot[bot]
753464d0f6
Bump github/codeql-action from 3.26.0 to 3.26.1 (#1615)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.0 to 3.26.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](eb055d739a...29d86d22a3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-14 08:03:57 -04:00
Eng Zer Jun
3966a6a0a1
lib/regex: use string version of regexp methods to reduce allocs (#1614)
Both `(*Regexp).Match` and `(*Regexp).FindAllSubmatchIndex` have
string-based equivalents: `(*Regexp).MatchString` and
`(*Regexp).FindAllStringSubmatchIndex`. We should use the string version
to avoid unnecessary `[]byte` conversions.

Benchmark:

import (
	"regexp"
	"testing"
)

var regex = regexp.MustCompile("foo.*")

func BenchmarkMatch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.Match([]byte("foo bar baz")); !match {
			b.Fail()
		}
	}
}

func BenchmarkMatchString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.MatchString("foo bar baz"); !match {
			b.Fail()
		}
	}
}

func BenchmarkFindAllSubmatchIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.FindAllSubmatchIndex([]byte("foo bar baz"), -1); len(match) == 0 {
			b.Fail()
		}
	}
}

func BenchmarkFindAllStringSubmatchIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := regex.FindAllStringSubmatchIndex("foo bar baz", -1); len(match) == 0 {
			b.Fail()
		}
	}
}

goos: linux
goarch: amd64
pkg: github.com/johnkerl/miller/pkg/lib
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkMatch-16                         	 2198350	       517.5 ns/op	      16 B/op	       1 allocs/op
BenchmarkMatchString-16                   	 3143605	       371.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkFindAllSubmatchIndex-16          	  921711	      1199 ns/op	     273 B/op	       3 allocs/op
BenchmarkFindAllStringSubmatchIndex-16    	 1212321	       981.0 ns/op	     257 B/op	       2 allocs/op
PASS
coverage: 0.0% of statements
ok  	github.com/johnkerl/miller/pkg/lib	6.576s

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2024-08-09 13:09:53 -04:00
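As a small companion to the benchmark in #1614, the sketch below checks that the byte-based and string-based regexp methods give the same answers on the same input, so that the switch affects allocations only. The helper names `matchViaBytes` and `matchViaString` and the variable `fooRegex` are made up for this illustration and are not taken from Miller.

```go
package main

import (
	"fmt"
	"regexp"
)

// fooRegex mirrors the pattern used in the benchmark above.
var fooRegex = regexp.MustCompile("foo.*")

// matchViaBytes converts the string to []byte before matching, which is
// the conversion the commit message describes as unnecessary.
func matchViaBytes(s string) bool {
	return fooRegex.Match([]byte(s))
}

// matchViaString matches directly on the string, with no conversion.
func matchViaString(s string) bool {
	return fooRegex.MatchString(s)
}

func main() {
	input := "foo bar baz"

	// Both forms report the same result for the same input.
	fmt.Println(matchViaBytes(input), matchViaString(input)) // true true

	// The string variant of the submatch-index call returns byte offsets
	// into the input string.
	fmt.Println(fooRegex.FindAllStringSubmatchIndex(input, -1)) // [[0 11]]
}
```

When the caller already holds a `string`, the string methods are the natural fit; the byte methods remain the better choice when the data starts out as `[]byte`.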
dependabot[bot]
dfe1ca1164
Bump golang.org/x/sys from 0.23.0 to 0.24.0 (#1613)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.23.0 to 0.24.0.
- [Commits](https://github.com/golang/sys/compare/v0.23.0...v0.24.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 08:58:49 -04:00
dependabot[bot]
cd91ab0a27
Bump golang.org/x/text from 0.16.0 to 0.17.0 (#1611)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.16.0 to 0.17.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.16.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 08:55:21 -04:00
dependabot[bot]
0e2ed5fbef
Bump github/codeql-action from 3.25.15 to 3.26.0 (#1610)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.15 to 3.26.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](afb54ba388...eb055d739a)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 08:07:32 -04:00
dependabot[bot]
afca7388f7
Bump actions/upload-artifact from 4.3.5 to 4.3.6 (#1609)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.5 to 4.3.6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](89ef406dd8...834a144ee9)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 08:06:10 -04:00
dependabot[bot]
247a86c998
Bump golang.org/x/term from 0.22.0 to 0.23.0 (#1612)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.22.0 to 0.23.0.
- [Commits](https://github.com/golang/term/compare/v0.22.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 08:05:39 -04:00
dependabot[bot]
93574580f9
Bump actions/upload-artifact from 4.3.4 to 4.3.5 (#1606)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.4 to 4.3.5.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](0b2256b8c0...89ef406dd8)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-05 08:50:49 -04:00
dependabot[bot]
018e3aa039
Bump golang.org/x/sys from 0.22.0 to 0.23.0 (#1605)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.22.0 to 0.23.0.
- [Commits](https://github.com/golang/sys/compare/v0.22.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-05 08:50:01 -04:00
dependabot[bot]
627e7bc510
Bump github/codeql-action from 3.25.14 to 3.25.15 (#1604)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.14 to 3.25.15.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](5cf07d8b70...afb54ba388)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 09:29:27 -04:00
dependabot[bot]
9bac6b4413
Bump github/codeql-action from 3.25.13 to 3.25.14 (#1603)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.13 to 3.25.14.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](2d790406f5...5cf07d8b70)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-25 08:08:28 -04:00
dependabot[bot]
c8c4759bb2
Bump github/codeql-action from 3.25.12 to 3.25.13 (#1602)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.12 to 3.25.13.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4fa2a79536...2d790406f5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 08:38:09 -04:00
dependabot[bot]
44c5594310
Bump github/codeql-action from 3.25.11 to 3.25.12 (#1598)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.11 to 3.25.12.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](b611370bb5...4fa2a79536)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-12 09:13:41 -04:00
dependabot[bot]
003c7aa44f
Bump actions/setup-go from 5.0.1 to 5.0.2 (#1597)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.0.1 to 5.0.2.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](cdcb360436...0a12ed9d6a)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-11 09:58:36 -04:00
dependabot[bot]
1029c960e0
Bump actions/upload-artifact from 4.3.3 to 4.3.4 (#1596)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.3 to 4.3.4.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](65462800fd...0b2256b8c0)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 09:00:57 -04:00
dependabot[bot]
ca6c09c6cf
Bump golang.org/x/term from 0.21.0 to 0.22.0 (#1594)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.21.0 to 0.22.0.
- [Commits](https://github.com/golang/term/compare/v0.21.0...v0.22.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-05 09:42:27 -04:00
dependabot[bot]
33cb41bc39
Bump golang.org/x/sys from 0.21.0 to 0.22.0 (#1595)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.21.0 to 0.22.0.
- [Commits](https://github.com/golang/sys/compare/v0.21.0...v0.22.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-05 08:18:22 -04:00
dependabot[bot]
13f4f7eb4a
Bump github/codeql-action from 3.25.10 to 3.25.11 (#1593)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.10 to 3.25.11.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](23acc5c183...b611370bb5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 08:47:42 -04:00
dependabot[bot]
95ade3c56f
Bump github/codeql-action from 3.25.9 to 3.25.10 (#1588)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.9 to 3.25.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](530d4feaa9...23acc5c183)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-14 11:07:16 -04:00
dependabot[bot]
2cd9793922
Bump github/codeql-action from 3.25.8 to 3.25.9 (#1587)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.8 to 3.25.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](2e230e8fe0...530d4feaa9)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 09:25:03 -04:00
dependabot[bot]
2ffbedf4c9
Bump actions/checkout from 4.1.6 to 4.1.7 (#1586)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.6 to 4.1.7.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](a5ac7e51b4...692973e3d9)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 09:24:44 -04:00
dependabot[bot]
97c299a491
Bump github.com/klauspost/compress from 1.17.8 to 1.17.9 (#1585)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.8 to 1.17.9.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.8...v1.17.9)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 10:10:52 -04:00
John Kerl
71d9388bff
Be smarter about auto-unflatten (#1584)
2024-06-08 20:58:26 -04:00
John Kerl
6520bf4758
Bash process substitution not working with put -f (#1583)
* Bash process substitution not working with `put -f`

* run `make dev`
2024-06-08 20:37:31 -04:00
John Kerl
dc21fa3cd5
Note IANA TSV support (#1582)
* Note IANA TSV support

* run `make docs`
2024-06-08 20:16:56 -04:00
John Kerl
202a79d0e2
On-line help for mlr summary --transpose (#1581)
* On-line help for `mlr summary --transpose`

* run `make dev`
2024-06-08 13:37:07 -04:00
John Kerl
8223903621
Support $NO_COLOR (#1580)
* Support `$NO_COLOR`

* run `make dev`
2024-06-08 13:08:15 -04:00
Andrew Onyshchuk
66abef6704
fraction bugfix (#1579)
2024-06-06 09:04:25 -04:00
dependabot[bot]
51ed2cfa1d
Bump golang.org/x/term from 0.20.0 to 0.21.0 (#1577)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.20.0 to 0.21.0.
- [Commits](https://github.com/golang/term/compare/v0.20.0...v0.21.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 08:14:41 -04:00
dependabot[bot]
d7df999d9b
Bump golang.org/x/sys from 0.20.0 to 0.21.0 (#1578)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.20.0 to 0.21.0.
- [Commits](https://github.com/golang/sys/compare/v0.20.0...v0.21.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 08:14:00 -04:00
dependabot[bot]
e61247b02c
Bump golang.org/x/text from 0.15.0 to 0.16.0 (#1576)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.15.0 to 0.16.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.15.0...v0.16.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 08:13:51 -04:00
dependabot[bot]
11d8a2647b
Bump github/codeql-action from 3.25.7 to 3.25.8 (#1575)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.7 to 3.25.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f079b84933...2e230e8fe0)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 07:51:42 -04:00
dependabot[bot]
45ea27bce2
Bump goreleaser/goreleaser-action from 5.1.0 to 6.0.0 (#1574)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 5.1.0 to 6.0.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](5742e2a039...286f3b13b1)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 07:51:33 -04:00
dependabot[bot]
571801b05c
Bump github/codeql-action from 3.25.6 to 3.25.7 (#1570)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.6 to 3.25.7.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](9fdb3e4972...f079b84933)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-31 09:28:19 -04:00
dependabot[bot]
589c5563c4
--- (#1568)
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 08:51:02 -04:00
dependabot[bot]
8c9d82f1f2
Bump github/codeql-action from 2.13.4 to 3.25.5 (#1567)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.13.4 to 3.25.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](cdcdbb5797...b7cec75265)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 09:43:26 -04:00
dependabot[bot]
8ca13caa14
Bump actions/checkout from 4.1.5 to 4.1.6 (#1566)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.5 to 4.1.6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](44c2b7a8a4...a5ac7e51b4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-17 10:48:04 -04:00
dependabot[bot]
cf6a80af4d
Bump goreleaser/goreleaser-action from 5.0.0 to 5.1.0 (#1563)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 5.0.0 to 5.1.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](7ec5c2b0c6...5742e2a039)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-13 08:46:57 -04:00
John Kerl
16ab199194
Add mad accumulator for stats1 DSL function (#1561)
* Add `mad` accumulator for `stats1` DSL function

* regression files

* make dev output
2024-05-11 15:55:27 -04:00
John Kerl
5ac48516f7
Add a stat DSL function (#1560)
* Add a `stat` DSL function [WIP]

* artifacts from `make dev`

* regression test
2024-05-09 18:39:44 -04:00
dependabot[bot]
956e65c118
Bump actions/checkout from 4.1.4 to 4.1.5 (#1557)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.4 to 4.1.5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](0ad4b8fada...44c2b7a8a4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-07 08:32:58 -04:00
dependabot[bot]
e0e7f3c7a9
Bump golang.org/x/term from 0.19.0 to 0.20.0 (#1555)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.19.0 to 0.20.0.
- [Commits](https://github.com/golang/term/compare/v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-06 23:04:08 -04:00
dependabot[bot]
f93089be3f
Bump golang.org/x/text from 0.14.0 to 0.15.0 (#1556)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.14.0 to 0.15.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.14.0...v0.15.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-06 08:11:20 -04:00
dependabot[bot]
729365d759
Bump golang.org/x/sys from 0.19.0 to 0.20.0 (#1554)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.19.0 to 0.20.0.
- [Commits](https://github.com/golang/sys/compare/v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-06 08:10:55 -04:00
dependabot[bot]
4e6f747a23
Bump actions/setup-go from 5.0.0 to 5.0.1 (#1553)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5.0.0 to 5.0.1.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](0c52d547c9...cdcb360436)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-03 07:56:16 -04:00
dependabot[bot]
4ee3a59aab
Bump actions/checkout from 4.1.3 to 4.1.4 (#1552)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.3 to 4.1.4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](1d96c772d1...0ad4b8fada)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-25 09:33:33 -04:00
dependabot[bot]
97debc3030
Bump actions/upload-artifact from 4.3.2 to 4.3.3 (#1551)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.2 to 4.3.3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](1746f4ab65...65462800fd)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-23 08:09:40 -04:00
forcedebug
0e30619966
Fix mismatched method names in comments (#1549)
Signed-off-by: forcedebug <forcedebug@outlook.com>
2024-04-22 12:16:30 -04:00
dependabot[bot]
004fed3279
Bump actions/checkout from 4.1.2 to 4.1.3 (#1550)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.2 to 4.1.3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](9bb56186c3...1d96c772d1)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-22 08:23:06 -04:00
John Kerl
b3b097c40d
Try to build readthedocs .epub and .pdf (#1548)
2024-04-21 21:44:07 -04:00
dependabot[bot]
cb5265e796
Bump actions/upload-artifact from 4.3.1 to 4.3.2 (#1547)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.1 to 4.3.2.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](5d5d22a312...1746f4ab65)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-19 09:32:39 -04:00
camcui
12480c4ab5
chore: fix function name in comment (#1543)
Signed-off-by: camcui <cuishua@sina.cn>
2024-04-12 09:38:31 -04:00
John Kerl
e714738a7d
Fix typo in online help for --no-jlistwrap (#1541)
* Add --no-auto-unsparsify flag

* Fix typo in online help for `--no-jlistwrap`

* Artifacts from `make dev`
2024-04-11 08:12:45 -04:00
dependabot[bot]
03b8cce048
Bump github.com/klauspost/compress from 1.17.7 to 1.17.8 (#1538)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.7 to 1.17.8.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.7...v1.17.8)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-09 07:29:35 -04:00
dependabot[bot]
417009d257
Bump golang.org/x/term from 0.18.0 to 0.19.0 (#1536)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.18.0 to 0.19.0.
- [Commits](https://github.com/golang/term/compare/v0.18.0...v0.19.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-05 09:12:57 -04:00
dependabot[bot]
5f36b22f3f
Bump actions/cache from 4.0.1 to 4.0.2 (#1532)
Bumps [actions/cache](https://github.com/actions/cache) from 4.0.1 to 4.0.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](ab5e6d0c87...0c45773b62)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-20 08:54:19 -04:00
John Kerl
f6e378c8df
build previous
2024-03-16 17:54:32 -04:00
John Kerl
b37c3a5e56
6.12.0 doc link
2024-03-16 17:51:17 -04:00
John Kerl
a0bead4093
miller 6.12.0
2024-03-16 17:19:05 -04:00
John Kerl
83c44e6d74
Add descriptions for put and filter verbs (#1529)
* Add more info in online help about what put/filter do

* `make dev` artifacts
2024-03-16 17:09:01 -04:00
John Kerl
f01bb92da7
Avoid spurious [] on JSON output in some cases (#1528)
* JSON empty vs `[]` handling [WIP]

* unit-test mods
2024-03-16 17:00:59 -04:00
dependabot[bot]
d0a1acecec
Bump actions/checkout from 4.1.1 to 4.1.2 (#1526)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.1 to 4.1.2.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](b4ffde65f4...9bb56186c3)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-12 09:18:53 -04:00
dependabot[bot]
78aa768cbe
Bump golang.org/x/term from 0.17.0 to 0.18.0 (#1522)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.17.0 to 0.18.0.
- [Commits](https://github.com/golang/term/compare/v0.17.0...v0.18.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-06 18:13:20 -05:00
dependabot[bot]
99e13f6105
Bump golang.org/x/sys from 0.17.0 to 0.18.0 (#1521)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.17.0 to 0.18.0.
- [Commits](https://github.com/golang/sys/compare/v0.17.0...v0.18.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-05 09:32:39 -05:00
dependabot[bot]
8d6455dfab
Bump github.com/stretchr/testify from 1.8.4 to 1.9.0 (#1516)
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.8.4 to 1.9.0.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.8.4...v1.9.0)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-04 09:11:04 -05:00
dependabot[bot]
e528b9e112
Bump actions/cache from 4.0.0 to 4.0.1 (#1511)
Bumps [actions/cache](https://github.com/actions/cache) from 4.0.0 to 4.0.1.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](13aacd865c...ab5e6d0c87)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-01 09:54:59 -05:00
John Kerl
aff4b9f32d
Improved file-not-found handling (#1508)
2024-02-26 00:12:31 -05:00
John Kerl
3201f9c675
Merge branch 'main' of https://github.com/johnkerl/miller
2024-02-25 21:56:55 -05:00
John Kerl
9004098499
python/make-tsv.py
2024-02-25 21:56:52 -05:00
John Kerl
fb1f7f8421
Enable record-hashing by default (#1507)
* Enable record-hashing by default

* comments
2024-02-25 21:51:41 -05:00
John Kerl
3ff43fa818
Miller produces no output on TSV with > 64K characters per line (#1505)
* Switch to bufio.Reader, first pass

* temp

* Simplify ILineReader by making it stateless

* Interface not necessary; ILineReader -> TLineReader

* neaten

* iterating
2024-02-25 15:50:50 -05:00
John Kerl
57b32c3e9b
Separate out ILineReader abstraction (#1504)
* Split up pkg/input/record_reader.go

* new ILineReader/TLineReader
2024-02-24 22:07:56 -05:00
dependabot[bot]
296ff87ae2
Bump github.com/klauspost/compress from 1.17.6 to 1.17.7 (#1502)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.6 to 1.17.7.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.6...v1.17.7)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-22 09:08:13 -05:00
John Kerl
7bd460a3b8
Support thousands separator in fmtnum (#1499)
* Support thousands separator in `fmtnum`

* doc bits
2024-02-18 14:01:46 -05:00
John Kerl
0424320199
make dev artifacts for sparsify
2024-02-18 13:54:42 -05:00
John Kerl
f5eaf290cf
mlr sparsify (#1498)
* mlr sparsify

* regression-test cases

* typofix

* Remove mods due to processor-architecture change
2024-02-18 10:56:26 -05:00
dependabot[bot]
cd6d42736f
Bump golang.org/x/term from 0.16.0 to 0.17.0 (#1494)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.16.0 to 0.17.0.
- [Commits](https://github.com/golang/term/compare/v0.16.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-08 09:34:39 -05:00
dependabot[bot]
56d6730f21
Bump github.com/klauspost/compress from 1.17.5 to 1.17.6 (#1492)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.17.5 to 1.17.6.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.17.5...v1.17.6)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-07 20:39:08 -05:00
dependabot[bot]
2ea00b0e40
Bump actions/upload-artifact from 4.3.0 to 4.3.1 (#1491)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.0 to 4.3.1.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](26f96dfa69...5d5d22a312)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 09:32:55 -05:00
John Kerl
62220ca0fa
sort-link doc update
2024-02-05 09:39:49 -05:00
dependabot[bot]
3a2149b9ae
Bump github.com/klauspost/compress from 1.16.7 to 1.17.5 (#1486)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.16.7 to 1.17.5.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.16.7...v1.17.5)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-29 08:33:10 -05:00
John Kerl
c0e9be0e0c
6.11.0-dev (#1484)
* 6.11.0-dev

* 6.11.0-dev
2024-01-24 13:27:04 -05:00
dependabot[bot]
02ff56bd21
Bump actions/upload-artifact from 4.2.0 to 4.3.0 (#1483)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.2.0 to 4.3.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](694cdabd8b...26f96dfa69)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-24 09:40:39 -05:00
John Kerl
f26bc0d9a1
update release docs
2024-01-23 18:32:56 -05:00
John Kerl
6f24fb3999
miller.spec typofix
2024-01-23 17:35:31 -05:00
John Kerl
1834a925b3
Miller 6.11.0 (#1481)
* miller 6.11.0

* Artifacts from `make dev`
2024-01-23 17:31:58 -05:00
John Kerl
e5ec9f67bd
Implement all/by-regex field selection (-a/-r) for mlr sub, gsub, and ssub (#1480)
* Code-dedupe `sub`, `gsub`, and `ssub` verbs

* More dedupe

* Start with -a

* Implement -r

* unit-test cases

* Windows command-line parsing
2024-01-23 17:18:13 -05:00
John Kerl
81d11365a0
mlr reorder with regex support [WIP] (#1473)
* mlr reorder with regex support for field-name selection

* neaten

* -r -b/-a; unit-test cases
2024-01-21 15:17:33 -05:00
John Kerl
ac65675ab1
Auto-unsparsify CSV and TSV on output (#1479)
* Auto-unsparsify CSV

* Update unit-test cases

* More unit-test cases

* Key-change handling for CSV output

* Same for TSV, with unit-test and doc updates
2024-01-20 18:43:49 -05:00
John Kerl
af021f28d7
Support markdown format on input (#1478)
* Support markdown on input

* unit-test files

* doc mods

* Unit-test cases for I/O-format keystroke-savers

* -i/-o md as well as -i/-o markdown
2024-01-20 16:51:15 -05:00
John Kerl
2abb9b4729
Don't run regression tests twice in GitHub CI (#1477)
2024-01-20 14:24:12 -05:00
John Kerl
36b4654445
Fix typos in tests for PPRINT barred input (#1476)
2024-01-20 14:07:27 -05:00
John Kerl
bfc829a381
Internal name-neatens (#1475)
2024-01-20 13:36:28 -05:00
John Kerl
aff07efe3a
typofix
2024-01-20 13:01:37 -05:00
John Kerl
794a754c36
Support PPRINT barred input (#1472)
* Support PPRINT barred input

* regression-test files

* output from `make dev`

* doc updates
2024-01-20 12:59:12 -05:00
dependabot[bot]
76408f3358
Bump actions/upload-artifact from 4.1.0 to 4.2.0 (#1471)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.1.0 to 4.2.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](1eb3cb2b3e...694cdabd8b)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-19 09:17:18 -05:00
dependabot[bot]
ee30154c6f
Bump actions/cache from 3.3.3 to 4.0.0 (#1470)
Bumps [actions/cache](https://github.com/actions/cache) from 3.3.3 to 4.0.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](e12d46a63a...13aacd865c)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-17 09:30:49 -05:00
dependabot[bot]
4c0bd62b64
Bump actions/upload-artifact from 4.0.0 to 4.1.0 (#1469)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.0.0 to 4.1.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](c7d193f32e...1eb3cb2b3e)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-15 15:24:47 -05:00
dependabot[bot]
f2be82b7bb
Bump actions/cache from 3.3.2 to 3.3.3 (#1468)
Bumps [actions/cache](https://github.com/actions/cache) from 3.3.2 to 3.3.3.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](704facf57e...e12d46a63a)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-12 09:42:38 -05:00
dependabot[bot]
664a84fadb
Bump golang.org/x/term from 0.15.0 to 0.16.0 (#1466)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.15.0 to 0.16.0.
- [Commits](https://github.com/golang/term/compare/v0.15.0...v0.16.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-05 07:41:39 -05:00
John Kerl
d2559b8387
Have clean_whitespace re-run type inference (#1464)
* Have `clean_whitespace` re-infer types

* make dev output

* unit-test files

* drive-by typofix

* make dev output
2024-01-01 18:39:27 -05:00
John Kerl
2f42c6f508
Fix #1462: remove limit of 1000 on dedupe field names (#1463)
* Fix #1462: remove limit of 1000 on dedupe field names

* make dev output
2024-01-01 17:50:56 -05:00
John Kerl
e3b98cd621
On-line help info for mlr join --lk "" (#1458)
* Doc info for `mlr join --lk ""`

* make dev output
2023-12-24 12:43:26 -05:00
John Kerl
0e3a54ed68
Implement mlr uniq -x (#1457)
* mlr uniq -x

* unit-test cases

* make dev
2023-12-23 16:20:11 -05:00
Eng Zer Jun
f4cf166358
Replace deprecated io/ioutil functions (#1452)
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

[1]: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-12-20 09:44:02 -05:00
John Kerl
c6b745537a
New strmatch/strmatchx DSL functions (#1448)
* New `match`/`matchx` DSL functions

* unit-test cases

* match/matchx -> strmatch/strmatchx

* help strings for strmatch and strmatchx

* update regex doc page re strmatch/strmatchx

* unit-test update
2023-12-19 14:34:54 -05:00
John Kerl
211b15ad4f make docs 2023-12-19 09:52:16 -05:00
John Kerl
4706b4bb78
Document and unit-test regex-capture reset logic (#1451)
* mlr --norc cat was erroring

* Document and unit-test regex-capture reset logic
2023-12-19 09:47:59 -05:00
John Kerl
b13adbe6c0
mlr --norc cat was erroring (#1450) 2023-12-19 09:33:34 -05:00
John Kerl
4053d7684c
Preserve regex captures across stack frames (#1447)
* privatize state.RegexCaptures

* stack frame for regex captures

* merge

* unit-test case

* docs re stack frames for regex captures

* more
2023-12-18 10:21:09 -05:00
John Kerl
1ae670fd4a
Rename internal regex functions (#1446) 2023-12-17 12:46:28 -05:00
dependabot[bot]
b5dbd7a751
Bump actions/upload-artifact from 3.1.3 to 4.0.0 (#1445)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3.1.3 to 4.0.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](a8a3f3ad30...c7d193f32e)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-15 09:42:47 -05:00
John Kerl
4e60ef58ae release docs including 6.9.0 and 6.10.0 2023-12-13 20:51:37 -05:00
John Kerl
856131f7a2 6.10.0-dev 2023-12-13 19:31:59 -05:00
John Kerl
c680f3316e add doc note re snag found on last commit 2023-12-13 19:04:48 -05:00
John Kerl
34abb952a4 update go 1.18 -> 1.19 in more spots 2023-12-13 19:00:57 -05:00
John Kerl
fbf320d88a update path in create_release_tarball 2023-12-13 18:46:16 -05:00
John Kerl
1f0e9be581 Merge branch 'main' of github.com:johnkerl/miller 2023-12-13 18:43:21 -05:00
John Kerl
9caa24d7f1
miller 6.10.0 (#1442)
* neaten

* miller 6.10.0
2023-12-13 18:43:00 -05:00
John Kerl
f1bc1dace9 neaten 2023-12-13 17:58:07 -05:00
John Kerl
8750d0e3c4
Update to Go 1.19 (#1441) 2023-12-11 17:38:13 -05:00
dependabot[bot]
b1e2438b28
Bump actions/setup-go from 4.1.0 to 5.0.0 (#1436)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 4.1.0 to 5.0.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](93397bea11...0c52d547c9)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-07 07:56:29 -05:00
John Kerl
bae1daf847
Absent variable on left side of boolean OR (||) expression makes it absent (#1434)
* Absent-handling with short-circuiting operators `&&` and `||`

* add a missing file

* artifacts from make dev

* type-errors

* doc content

* artifacts from make dev
2023-12-02 16:00:05 -05:00
dependabot[bot]
3a3595e404
Bump golang.org/x/term from 0.14.0 to 0.15.0 (#1432)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.14.0 to 0.15.0.
- [Commits](https://github.com/golang/term/compare/v0.14.0...v0.15.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-28 09:25:11 -05:00
John Kerl
18a9eaa377
Fix ragged-CSV auto-pad (#1428) 2023-11-19 23:53:53 -05:00
John Kerl
2bcf8813d3
Add a --files option (#1426)
* mlr --files

* doc mods
2023-11-11 19:09:02 -05:00
John Kerl
5b6a1d4713
JSONL output does not properly handle keys with quotes (#1425)
* mlr --l2j, --j2l

* make dev for previous commit

* fix #1424

* unit-test cases

* iterate
2023-11-11 18:58:49 -05:00
dependabot[bot]
f2a9ae5ca4
Bump golang.org/x/term from 0.13.0 to 0.14.0 (#1423)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.13.0 to 0.14.0.
- [Commits](https://github.com/golang/term/compare/v0.13.0...v0.14.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-08 08:26:13 -05:00
dependabot[bot]
dd12026fba
Bump golang.org/x/sys from 0.13.0 to 0.14.0 (#1420)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.13.0 to 0.14.0.
- [Commits](https://github.com/golang/sys/compare/v0.13.0...v0.14.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-06 09:35:40 -05:00
dependabot[bot]
e4882b11ed
Bump golang.org/x/text from 0.13.0 to 0.14.0 (#1419)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.13.0 to 0.14.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.13.0...v0.14.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-06 09:35:23 -05:00
Eng Zer Jun
4b34f80f6a
transformers/grep: avoid allocations with (*regexp.Regexp).MatchString (#1416)
We should use `(*regexp.Regexp).MatchString` instead of
`(*regexp.Regexp).Match([]byte(...))` when matching a string, to avoid
unnecessary `[]byte` conversions and reduce allocations.

Example benchmark:

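package transformers

import (
	"regexp"
	"testing"
)

// Compiled once so the benchmarks measure matching, not compilation.
var grepRegex = regexp.MustCompile("foo.*")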

func BenchmarkMatch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := grepRegex.Match([]byte("foo bar baz")); !match {
			b.Fail()
		}
	}
}

func BenchmarkMatchString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if match := grepRegex.MatchString("foo bar baz"); !match {
			b.Fail()
		}
	}
}

goos: linux
goarch: amd64
pkg: github.com/johnkerl/miller/pkg/transformers
cpu: AMD Ryzen 7 PRO 4750U with Radeon Graphics
BenchmarkMatch-16          	 5700908	       210.3 ns/op	      16 B/op	       1 allocs/op
BenchmarkMatchString-16    	 8006731	       156.4 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	github.com/johnkerl/miller/pkg/transformers	2.857s

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-10-27 09:15:12 -04:00
John Kerl
6aab161cb0 neaten README.md 2023-10-24 09:12:47 -04:00
Ralph Ursprung
d3798c5aee
add winget to README (#1414)
@teo-tsirpanis added miller to `winget` with
microsoft/winget-pkgs#123507 (thanks!).
Accordingly, it should also be mentioned in the README so that people are
aware of it.

fixes #1331
2023-10-24 09:10:18 -04:00
dependabot[bot]
9a8951fc78
Bump actions/checkout from 4.1.0 to 4.1.1 (#1412)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](8ade135a41...b4ffde65f4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-18 09:36:53 -04:00
dependabot[bot]
a343d0f34c
Bump github.com/mattn/go-isatty from 0.0.19 to 0.0.20 (#1411)
Bumps [github.com/mattn/go-isatty](https://github.com/mattn/go-isatty) from 0.0.19 to 0.0.20.
- [Commits](https://github.com/mattn/go-isatty/compare/v0.0.19...v0.0.20)

---
updated-dependencies:
- dependency-name: github.com/mattn/go-isatty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-17 09:44:39 -04:00
dependabot[bot]
d785ea3e55
Bump golang.org/x/term from 0.12.0 to 0.13.0 (#1404)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.12.0 to 0.13.0.
- [Commits](https://github.com/golang/term/compare/v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-06 08:29:25 -04:00
dependabot[bot]
654577c776
Bump actions/checkout from 4.0.0 to 4.1.0 (#1400)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.0.0 to 4.1.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](3df4ab11eb...8ade135a41)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-25 09:02:02 -04:00
dependabot[bot]
d19b91ec6b
Bump goreleaser/goreleaser-action from 4.6.0 to 5.0.0 (#1396)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 4.6.0 to 5.0.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](5fdedb94ab...7ec5c2b0c6)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-12 08:36:06 -04:00
John Kerl
087f4bb4c9
Include null in any typemask (#1395) 2023-09-11 17:15:37 -04:00
John Kerl
5136507192
Name-neaten for #1392 (#1393) 2023-09-10 20:01:41 -04:00
John Kerl
39fa3a19bc
Better API example (#1392) 2023-09-10 19:47:42 -04:00
John Kerl
03eed305f9 doc tweaks 2023-09-10 17:23:50 -04:00
John Kerl
268a96d002
Export library code in pkg/ (#1391)
* Export library code in `pkg/`

* new doc page
2023-09-10 17:15:13 -04:00
dependabot[bot]
93b7c8eac0
Bump actions/cache from 3.3.1 to 3.3.2 (#1390)
Bumps [actions/cache](https://github.com/actions/cache) from 3.3.1 to 3.3.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](88522ab9f3...704facf57e)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-08 10:01:26 -04:00
dependabot[bot]
5519179122
Bump actions/upload-artifact from 3.1.2 to 3.1.3 (#1387)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](0b7f8abb15...a8a3f3ad30)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-07 09:09:23 -04:00
dependabot[bot]
2b77328b0f
Bump goreleaser/goreleaser-action from 4.4.0 to 4.6.0 (#1385)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 4.4.0 to 4.6.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](3fa32b8bb5...5fdedb94ab)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-06 09:45:47 -04:00
dependabot[bot]
587b7ce313
Bump actions/checkout from 3.6.0 to 4.0.0 (#1383)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3.6.0 to 4.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](f43a0e5ff2...3df4ab11eb)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-05 09:36:21 -04:00
dependabot[bot]
67bd565a53
Bump golang.org/x/term from 0.11.0 to 0.12.0 (#1380)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.11.0 to 0.12.0.
- [Commits](https://github.com/golang/term/compare/v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-04 20:29:51 -04:00
dependabot[bot]
9ab9c2f4e8
Bump golang.org/x/sys from 0.11.0 to 0.12.0 (#1381)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.11.0 to 0.12.0.
- [Commits](https://github.com/golang/sys/compare/v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-04 17:22:41 -04:00
dependabot[bot]
717189b6b1
Bump golang.org/x/text from 0.12.0 to 0.13.0 (#1382)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.12.0 to 0.13.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-04 17:21:23 -04:00
John Kerl
80bb82df6b
macos -> darwin in .goreleaser.yml (#1377) 2023-08-31 10:51:07 -04:00
John Kerl
640dbdc730
Remove replacements from .goreleaser.yaml (#1376) 2023-08-31 10:15:29 -04:00
John Kerl
acc10cdc37 miller 6.9.0 2023-08-31 09:00:29 -04:00
John Kerl
0493a0debd
Fatal-on-data-error mlr -x option (#1373)
* Fatal-on-data-error `mlr -x` option [WIP]

* arithmetic.go error-reason propagation

* more

* more

* more

* renames

* doc page

* namefix

* fix broken test

* make dev
2023-08-30 19:39:22 -04:00
John Kerl
879f272f79
Typofix in uif/uof percentiles (#1375)
* typofix in uif/uof percentiles

* fix regression-test data
2023-08-30 11:13:35 -04:00
John Kerl
2fd353c6be docmods for typofix 2023-08-30 09:00:24 -04:00
John Kerl
4c26b479f0 typofix 2023-08-30 07:32:04 -04:00
John Kerl
5146dd7f90
New contains DSL function (#1374)
* New `contains` DSL function

* unit-test files, and docs
2023-08-27 21:46:24 -04:00
John Kerl
5b29169b08
Update 2015-era Python sketch to Python 3 (#1372) 2023-08-27 10:08:34 -04:00
John Kerl
71171bc04c
Treat empty like absent in + - * (#1371)
* empty plus value is value

* unit-test cases

* make-docs output

* docs files

* on-line table for null-handling arithmetic rules

* doc mods
2023-08-26 23:41:50 -04:00
John Kerl
c7b4ed59d0 Merge branch 'main' of https://github.com/johnkerl/miller 2023-08-26 22:49:52 -04:00
John Kerl
67d16c89c1 typofix 2023-08-26 22:49:28 -04:00
John Kerl
069c068298
Summing up empty data (#1370)
* empty plus value is value

* unit-test cases

* make-docs output

* docs files

* on-line table for null-handling arithmetic rules

* doc mods
2023-08-26 21:24:34 -04:00
John Kerl
fccb7c63bb doc-neaten 2023-08-26 16:51:44 -04:00
John Kerl
fb3e3d15cd make dev 2023-08-26 16:47:19 -04:00
John Kerl
44e3a62373 typofix 2023-08-26 16:44:39 -04:00
John Kerl
077fc3702d more doc-neatens for percentiles on-line help 2023-08-26 16:41:37 -04:00
John Kerl
4cfb0ba112 neaten online help for the percentiles function 2023-08-26 16:30:21 -04:00
John Kerl
deb5d692a8 typofixes 2023-08-26 16:23:48 -04:00
John Kerl
d341cc6dd3
DSL functions for summary stats over arrays / maps (#1364)
* DSL stats functions [WIP]

* refactor

* move percentile computation to bifs module; iterate

* mode and antimode

* percentile iterate

* percentile sketching

* neaten

* unit-test iterate

* unify old & new min & max functions

* unit-test cases

* code-dedupe between mode and antimode

* make mode/antimode ties deterministic via first-found-wins rule

* online help strings for new stats DSL functions

* artifacts from `make dev`

* help info on how min/max now recurse into collections

* artifacts from `make dev`

* typofix
2023-08-26 16:02:30 -04:00
dependabot[bot]
392b34fd04
Bump actions/checkout from 3.5.3 to 3.6.0 (#1369)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3.5.3 to 3.6.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](c85c95e3d7...f43a0e5ff2)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-25 09:06:03 -04:00
John Kerl
4405f732a1 make-dev artifacts from previous commit 2023-08-23 16:19:37 -04:00
John Kerl
deda2a967e 1366 follow-up 2023-08-23 16:09:40 -04:00
Mr. Lance E Sloan
e2338195ba
filename options for split (iss. #1365) (#1366)
* #1365 - filename options for `split`

* Don't use joiner string when prefix is empty.
* Add option to specify joiner string.
* Add option to not URL-escape file names.

* #1365 - update documentation

* #1365 - don't URL-escape file name prefix

I **_thought_** it'd be cool to apply URL-escaping to the file name prefix as well, just in case it included spaces or other characters.  I forgot that a common use for the prefix is to specify a directory path that will contain the file.  When the slashes ("`/`") of the path are URL-escaped, they become "`%2F`" and the directories will not be created.  So, I moved the prefix handling code to come after the URL-escaping.
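
To illustrate the ordering issue described above, here is a minimal sketch, not the code from this change. The function names and the fixed `_` joiner are hypothetical, and `url.QueryEscape` is assumed for the escaping step; the point is only that escaping a prefix containing a directory path turns `/` into `%2F`.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeAll (hypothetical) escapes the prefix together with the name,
// which mangles any directory separators in the prefix.
func escapeAll(prefix, name, suffix string) string {
	return url.QueryEscape(prefix+"_"+name) + "." + suffix
}

// escapeNameOnly (hypothetical) escapes only the data-derived part and
// prepends the prefix afterwards, so a directory path survives intact.
func escapeNameOnly(prefix, name, suffix string) string {
	return prefix + "_" + url.QueryEscape(name) + "." + suffix
}

func main() {
	fmt.Println(escapeAll("out/split", "a b", "csv"))      // out%2Fsplit_a+b.csv
	fmt.Println(escapeNameOnly("out/split", "a b", "csv")) // out/split_a+b.csv
}
```
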

* #1365 - new `split` options for CLI help output

* #1365 - fix escape/suffix logic error

Trying to make the `return` statement cleaner, I thought it'd be good to add the file name suffix immediately after the file name is URL-escaped.  I'd forgotten that the suffix will not be added if the new `-e` option is used to skip URL-escaping.  So, I put the suffix back where I had it.

* #1365 - add `split` to the "10 minutes" document

Not strictly part of this issue, but as I was checking for docs that I should update as a result of my changes, I noticed this document showed how to split data using the `put` and `tee` combination, but did not mention the `split` verb.

* #1365 - updated manpage

When I ran `make dev`, generating `data-diving-examples.md` failed.  The two `manpage.txt` files ended up empty, but `mlr.1` seems to be correct.

---------

Co-authored-by: Mr. Lance E Sloan (sloanlance) <sloanlance@users.noreply.github.com>
2023-08-23 16:08:48 -04:00
Eng Zer Jun
12f3b14ce6
Remove redundant nil check (#1367)
From the Go docs [1]:

  "1. For a nil slice, the number of iterations is 0."
  "3. If the map is nil, the number of iterations is 0."

Therefore, an additional nil check before the loop is unnecessary.

[1]: https://go.dev/ref/spec#For_range
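
A small sketch of the behavior quoted above, for illustration only; the helper names `countSlice` and `countMap` are hypothetical and do not come from the Miller code base. Ranging over a nil slice or a nil map runs the loop body zero times, so no guard is needed.

```go
package main

import "fmt"

// countSlice (hypothetical) reports how many times the loop body runs.
func countSlice(xs []string) int {
	n := 0
	for range xs {
		n++
	}
	return n
}

// countMap (hypothetical) does the same for a map.
func countMap(m map[string]int) int {
	n := 0
	for range m {
		n++
	}
	return n
}

func main() {
	var xs []string      // nil slice
	var m map[string]int // nil map
	fmt.Println(countSlice(xs), countMap(m)) // prints "0 0"
}
```
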

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-08-23 10:18:22 -04:00
John Kerl
9ad9e213da fix codespell ci 2023-08-23 09:55:57 -04:00
John Kerl
aed6de2adb fix some broken links in README-dev.md 2023-08-21 15:33:33 -04:00
John Kerl
2107d520fa
Can't use ${field_name} if it contains UTF-8 characters also encodeable as Latin-1 (#1363)
* unit-test data

* docgen

* windows unit-test accommodations
2023-08-20 12:20:15 -04:00
John Kerl
9d1d2e07ca
Do wildcard globbing on Windows (#1362)
* Glob wildcards on Windows

* test/cases/globbing/0001
2023-08-19 17:40:35 -04:00
John Kerl
793f52c470
sub, gsub, and ssub verbs (#1361)
* sub, gsub, and ssub verbs

* doc mods

* content for verbs reference page

* test/cases/verb-sub-gsub-ssub/
2023-08-19 17:23:01 -04:00
John Kerl
d4a3bf99b2
Support ZSTD compression in-process (#1360)
* Support ZSTD compression in-process

* doc mods

* unit-test cases

* doc-gen artifacts
2023-08-19 15:22:59 -04:00
John Kerl
8b22708c27
Support comments in mlr -s files (#1359)
* Support comments in `mlr -s` files

* doc mods

* artifacts from `make dev`

* neaten
2023-08-19 13:32:09 -04:00
dependabot[bot]
c1572f4787
Bump golang.org/x/term from 0.10.0 to 0.11.0 (#1348)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.10.0 to 0.11.0.
- [Commits](https://github.com/golang/term/compare/v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-19 12:22:54 -04:00
dependabot[bot]
e62a09e9b9
Bump goreleaser/goreleaser-action from 4.3.0 to 4.4.0 (#1354)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 4.3.0 to 4.4.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](336e29918d...3fa32b8bb5)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-11 00:53:57 -04:00
John Kerl
52db2bf422
Small typos in documentation of mlr nest (#1352)
* Typofix in `nest` documentation

* update test/cases/cli-help

* artifacts from `make dev`
2023-08-09 10:50:26 -04:00
dependabot[bot]
fcd201d147
Bump actions/setup-go from 4.0.1 to 4.1.0 (#1351)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 4.0.1 to 4.1.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](fac708d667...93397bea11)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-09 08:09:08 -04:00
dependabot[bot]
f409aa4fd2
Bump golang.org/x/text from 0.11.0 to 0.12.0 (#1349)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.11.0 to 0.12.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-07 11:42:13 -04:00
dependabot[bot]
ad10d16f4e
Bump golang.org/x/sys from 0.10.0 to 0.11.0 (#1347)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.10.0 to 0.11.0.
- [Commits](https://github.com/golang/sys/compare/v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-07 07:57:37 -04:00
Benson Muite
7aa9483528
Update Fedora link (#1339) 2023-07-12 08:27:23 -04:00
dependabot[bot]
3e23153aac
Bump golang.org/x/term from 0.9.0 to 0.10.0 (#1338)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.9.0 to 0.10.0.
- [Commits](https://github.com/golang/term/compare/v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-06 08:40:13 -04:00
dependabot[bot]
1f69807836
Bump golang.org/x/sys from 0.9.0 to 0.10.0 (#1336)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.9.0 to 0.10.0.
- [Commits](https://github.com/golang/sys/compare/v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-05 09:52:53 -04:00
dependabot[bot]
ff65820214
Bump golang.org/x/text from 0.10.0 to 0.11.0 (#1337)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.10.0 to 0.11.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-05 07:36:09 -04:00
John Kerl
b30aceae36
Add %s format specifier for strftime (#1335) 2023-07-04 17:00:02 -04:00
John Kerl
3baebea7a3
Add %N and %O for strfntime (#1334)
* Add `%N` and `%O` for strfntime

* Unit-test mods

* artifacts from `make dev`
2023-07-02 15:49:41 -04:00
John Kerl
3e5c3e2398
Add empty-key check to mlr check (#1330)
* Add empty-key check to `mlr check`

* Update `mlr check --help`

* Update to on-line help
2023-06-25 19:12:26 -04:00
John Kerl
dff2206b62 todo 2023-06-25 15:40:06 -04:00
John Kerl
d72ef826fb
Add DSL functions for integer nanoseconds since the epoch (#1326)
* DSL functions for 64-bit nano-epoch timestamps

* strfntime

* nsec2gmt; move sec/nsec pairs adjacent to one another

* update on-line help

* artifacts from `make dev`

* unit-test files
2023-06-24 17:05:15 -04:00
dependabot[bot]
4c0731d395
Bump golang.org/x/text from 0.9.0 to 0.10.0 (#1322)
Bumps [golang.org/x/text](https://github.com/golang/text) from 0.9.0 to 0.10.0.
- [Release notes](https://github.com/golang/text/releases)
- [Commits](https://github.com/golang/text/compare/v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: golang.org/x/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-13 10:01:02 -04:00
dependabot[bot]
2086c154fd
Bump goreleaser/goreleaser-action from 4.2.0 to 4.3.0 (#1320)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 4.2.0 to 4.3.0.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](f82d6c1c34...336e29918d)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-13 09:41:22 -04:00
dependabot[bot]
be68f5fc90
Bump golang.org/x/term from 0.8.0 to 0.9.0 (#1321)
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.8.0 to 0.9.0.
- [Commits](https://github.com/golang/term/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: golang.org/x/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-13 09:40:20 -04:00
dependabot[bot]
adeab1153b
Bump github/codeql-action from 2.3.6 to 2.13.4 (#1318)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.3.6 to 2.13.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](83f0fe6c49...cdcdbb5797)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-12 08:28:12 -04:00
dependabot[bot]
d5c03e8a8b
Bump actions/checkout from 3.5.2 to 3.5.3 (#1319)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3.5.2 to 3.5.3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](8e5e7e5ab8...c85c95e3d7)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-12 08:27:54 -04:00
John Kerl
c5ceb20a4e
Fix mlr grep docs re OFS/OPS (#1309)
* Fix `mlr grep` doc re OFS/OPS

* make-dev artifacts
2023-06-06 00:18:51 -04:00
John Kerl
4050f566fa
fix mis-spelling for head docs
2023-06-04 18:01:03 -04:00
John Kerl
ab4705ab7a
Update readthedocs notes in the how-to-release page (#1308)
2023-06-04 17:53:42 -04:00
John Kerl
21fb5f9cd6
release 6.8.0 docs
2023-06-04 17:12:57 -04:00
John Kerl
e158e1c616
post-6.8.0
2023-06-04 16:38:18 -04:00
1401 changed files with 19943 additions and 7015 deletions

@ -36,11 +36,11 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@8e5e7e5ab8b370d6c329ec480221332ada57f0ab
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@83f0fe6c4988d98a455712a27f0255212bba9bd4
uses: github/codeql-action/init@cdefb33c0f6224e58673d9004f47f7cb3e328b89
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@ -51,7 +51,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@83f0fe6c4988d98a455712a27f0255212bba9bd4
uses: github/codeql-action/autobuild@cdefb33c0f6224e58673d9004f47f7cb3e328b89
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@ -65,4 +65,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@83f0fe6c4988d98a455712a27f0255212bba9bd4
uses: github/codeql-action/analyze@cdefb33c0f6224e58673d9004f47f7cb3e328b89

@ -21,7 +21,7 @@ jobs:
steps:
# Check out the code base
- name: Check out code
uses: actions/checkout@8e5e7e5ab8b370d6c329ec480221332ada57f0ab
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8
with:
# Full git history is needed to get a proper list of changed files within `super-linter`
fetch-depth: 0
@ -29,8 +29,17 @@ jobs:
# Run linter against code base
# https://github.com/codespell-project/codespell
- name: Codespell
uses: codespell-project/actions-codespell@94259cd8be02ad2903ba34a22d9c13de21a74461
uses: codespell-project/actions-codespell@8f01853be192eb0f849a5c7d721450e7a467c579
with:
check_filenames: true
ignore_words_file: .codespellignore
skip: "*.csv,*.dkvp,*.txt,*.js,*.html,*.map,*.z,./tags,./test/cases,./docs/src/shapes-of-data.md.in,./docs/src/shapes-of-data.md,test/input/latin1.xtab"
# As of August 2023 or so, Codespell started exiting with status 1 just _examining_ the
# latin1.xtab file which is (intentionally) not UTF-8. Before, it said
#
# Warning: WARNING: Cannot decode file using encoding "utf-8": ./test/input/latin1.xtab
# WARNING: Trying next encoding "iso-8859-1"
#
# but would exit 0. After, it started exiting with a 1. This is annoying as it makes
# every PR red in CI. So we have to use warning mode now.
only_warn: 1

@ -15,18 +15,18 @@ jobs:
os: [ubuntu-latest, macos-latest, windows-latest]
steps:
- uses: actions/checkout@8e5e7e5ab8b370d6c329ec480221332ada57f0ab
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8
- name: Set up Go
uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753
uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5
with:
go-version: 1.18
go-version: 1.24
- name: Build
run: make build
- name: Test
run: make check
- name: Unit tests
run: make unit-test
- name: Regression tests
# We run these with a convoluted path to ensure the tests don't
@ -41,7 +41,7 @@ jobs:
if: matrix.os == 'windows-latest'
run: mkdir -p bin/${{matrix.os}} && cp mlr.exe bin/${{matrix.os}}
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce
- uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: mlr-${{matrix.os}}
path: bin/${{matrix.os}}/*

.github/workflows/release-snap.yaml (new file, 29 lines)

@ -0,0 +1,29 @@
name: Release for Snap
on:
push:
tags:
- v*
workflow_dispatch:
jobs:
snap:
strategy:
matrix:
os: [ubuntu-latest, ubuntu-24.04-arm]
runs-on: ${{ matrix.os }}
steps:
- name: Checkout code
uses: actions/checkout@v6
- name: Build snap
uses: snapcore/action-build@v1
id: build
- name: Publish to Snap Store
uses: snapcore/action-publish@v1
env:
SNAPCRAFT_STORE_CREDENTIALS: ${{ secrets.SNAPCRAFT_TOKEN }}
with:
snap: ${{ steps.build.outputs.snap }}
# release: stable # or edge, beta, candidate
release: stable

@ -1,4 +1,4 @@
name: Release
name: Release for GitHub
on:
push:
tags:
@ -6,7 +6,7 @@ on:
workflow_dispatch:
env:
GO_VERSION: 1.18.10
GO_VERSION: 1.24.5
jobs:
release:
@ -17,19 +17,19 @@ jobs:
runs-on: ${{ matrix.platform }}
steps:
- name: Set up Go
uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753
uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5
with:
go-version: ${{ env.GO_VERSION }}
id: go
- name: Check out code into the Go module directory
uses: actions/checkout@8e5e7e5ab8b370d6c329ec480221332ada57f0ab
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8
with:
fetch-depth: 0
# https://github.com/marketplace/actions/cache
- name: Cache Go modules
uses: actions/cache@88522ab9f39a2ea568f7027eddc7d8d8bc9d59c8
uses: actions/cache@8b402f58fbc84540c8b491a91e594a4576fec3d7
with:
path: |
~/.cache/go-build
@ -40,7 +40,7 @@ jobs:
# https://goreleaser.com/ci/actions/
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@f82d6c1c344bcacabba2c841718984797f664a6b
uses: goreleaser/goreleaser-action@e435ccd777264be153ace6237001ef4d979d3a7a
#if: startsWith(github.ref, 'refs/tags/v')
with:
version: latest

@ -0,0 +1,28 @@
name: 🧪 Snap Builds
on:
push:
branches: '*'
pull_request:
branches: '*'
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
node-version: [20.x]
steps:
- uses: actions/checkout@v6
- uses: snapcore/action-build@v1
id: build
- uses: diddlesnaps/snapcraft-review-action@v1
with:
snap: ${{ steps.build.outputs.snap }}
isClassic: 'false'
# Plugs and Slots declarations to override default denial (requires store assertion to publish)
# plugs: ./plug-declaration.json
# slots: ./slot-declaration.json

@ -70,8 +70,6 @@ archives:
format_overrides:
- goos: windows
format: zip
replacements:
darwin: macos
name_template: '{{ .ProjectName }}-{{ .Version }}-{{ .Os }}-{{ .Arch }}{{ if .Arm }}v{{ .Arm }}{{ end }}'
files:
- LICENSE.txt

@ -17,3 +17,5 @@ python:
mkdocs:
configuration: docs/mkdocs.yml
formats: all

@ -7,10 +7,13 @@ INSTALLDIR=$(PREFIX)/bin
# This must remain the first target in this file, which is what 'make' with no
# arguments will run.
build:
go build github.com/johnkerl/miller/cmd/mlr
go build github.com/johnkerl/miller/v6/cmd/mlr
@echo "Build complete. The Miller executable is ./mlr (or .\mlr.exe on Windows)."
@echo "You can use 'make check' to run tests".
quiet:
@go build github.com/johnkerl/miller/v6/cmd/mlr
# For interactive use, 'mlr regtest' offers more options and transparency.
check: unit-test regression-test
@echo "Tests complete. You can use 'make install' if you like, optionally preceded"
@ -30,25 +33,25 @@ install: build
# ----------------------------------------------------------------
# Unit tests (small number)
unit-test ut: build
go test github.com/johnkerl/miller/internal/pkg/...
go test github.com/johnkerl/miller/v6/pkg/...
ut-lib:build
go test github.com/johnkerl/miller/internal/pkg/lib...
go test github.com/johnkerl/miller/v6/pkg/lib...
ut-scan:build
go test github.com/johnkerl/miller/internal/pkg/scan/...
go test github.com/johnkerl/miller/v6/pkg/scan/...
ut-mlv:build
go test github.com/johnkerl/miller/internal/pkg/mlrval/...
go test github.com/johnkerl/miller/v6/pkg/mlrval/...
ut-bifs:build
go test github.com/johnkerl/miller/internal/pkg/bifs/...
go test github.com/johnkerl/miller/v6/pkg/bifs/...
ut-input:build
go test github.com/johnkerl/miller/internal/pkg/input/...
go test github.com/johnkerl/miller/v6/pkg/input/...
bench:build
go test -run=nonesuch -bench=. github.com/johnkerl/miller/internal/pkg/...
go test -run=nonesuch -bench=. github.com/johnkerl/miller/v6/pkg/...
bench-mlv:build
go test -run=nonesuch -bench=. github.com/johnkerl/miller/internal/pkg/mlrval/...
go test -run=nonesuch -bench=. github.com/johnkerl/miller/v6/pkg/mlrval/...
bench-input:build
go test -run=nonesuch -bench=. github.com/johnkerl/miller/internal/pkg/input/...
go test -run=nonesuch -bench=. github.com/johnkerl/miller/v6/pkg/input/...
# ----------------------------------------------------------------
# Regression tests (large number)
@ -56,7 +59,7 @@ bench-input:build
# See ./regression_test.go for information on how to get more details
# for debugging. TL;DR is for CI jobs, we have 'go test -v'; for
# interactive use, instead of 'go test -v' simply use 'mlr regtest
# -vvv' or 'mlr regtest -s 20'. See also internal/pkg/terminals/regtest.
# -vvv' or 'mlr regtest -s 20'. See also pkg/terminals/regtest.
regression-test: build
go test -v regression_test.go
@ -65,7 +68,7 @@ regression-test: build
# go fmt ./... finds experimental C files which we want to ignore.
fmt format:
-go fmt ./cmd/...
-go fmt ./internal/pkg/...
-go fmt ./pkg/...
-go fmt ./regression_test.go
# ----------------------------------------------------------------
@ -98,7 +101,8 @@ dev:
make -C docs
@echo DONE
docs:
docs: build
make -C docs/src forcebuild
make -C docs
# ----------------------------------------------------------------
@ -110,7 +114,7 @@ it: build check
so: install
mlr:
go build github.com/johnkerl/miller/cmd/mlr
go build github.com/johnkerl/miller/v6/cmd/mlr
# ----------------------------------------------------------------
# Please see comments in ./create-release-tarball as well as

@ -61,10 +61,10 @@ During the coding of Miller, I've been guided by the following:
* Names of files, variables, functions, etc. should be fully spelled out (e.g. `NewEvaluableLeafNode`), except for a small number of most-used names where a longer name would cause unnecessary line-wraps (e.g. `Mlrval` instead of `MillerValue` since this appears very very often).
* Code should not be too clever. This includes some reasonable amounts of code duplication from time to time, to keep things inline, rather than lasagna code.
* Things should be transparent. For example, the `-v` in `mlr -n put -v '$y = 3 + 0.1 * $x'` shows you the abstract syntax tree derived from the DSL expression.
* Comments should be robust with respect to reasonably anticipated changes. For example, one package should cross-link to another in its comments, but I try to avoid mentioning specific filenames too much in the comments and README files since these may change over time. I make an exception for stable points such as [cmd/mlr/main.go](./cmd/mlr/main.go), [mlr.bnf](./internal/pkg/parsing/mlr.bnf), [stream.go](./internal/pkg/stream/stream.go), etc.
* Comments should be robust with respect to reasonably anticipated changes. For example, one package should cross-link to another in its comments, but I try to avoid mentioning specific filenames too much in the comments and README files since these may change over time. I make an exception for stable points such as [cmd/mlr/main.go](./cmd/mlr/main.go), [mlr.bnf](./pkg/parsing/mlr.bnf), [stream.go](./pkg/stream/stream.go), etc.
* *Miller should be pleasant to write.*
* It should be quick to answer the question *Did I just break anything?* -- hence `mlr regtest` functionality.
* It should be quick to find out what to do next as you iteratively develop -- see for example [cst/README.md](./internal/pkg/dsl/cst/README.md).
* It should be quick to find out what to do next as you iteratively develop -- see for example [cst/README.md](./pkg/dsl/cst/README.md).
* *The language should be an asset, not a liability.*
* One of the reasons I chose Go is that (personally anyway) I find it to be reasonably efficient, well-supported with standard libraries, straightforward, and fun. I hope you enjoy it as much as I have.
@ -83,10 +83,10 @@ sequence of key-value pairs. The basic **stream** operation is:
So, in broad overview, the key packages are:
* [internal/pkg/stream](./internal/pkg/stream) -- connect input -> transforms -> output via Go channels
* [internal/pkg/input](./internal/pkg/input) -- read input records
* [internal/pkg/transformers](./internal/pkg/transformers) -- transform input records to output records
* [internal/pkg/output](./internal/pkg/output) -- write output records
* [pkg/stream](./pkg/stream) -- connect input -> transforms -> output via Go channels
* [pkg/input](./pkg/input) -- read input records
* [pkg/transformers](./pkg/transformers) -- transform input records to output records
* [pkg/output](./pkg/output) -- write output records
* The rest are details to support this.
## Directory-structure details
@ -95,33 +95,34 @@ So, in broad overview, the key packages are:
* Miller dependencies are all in the Go standard library, except two:
* GOCC lexer/parser code-generator from [github.com/goccmack/gocc](https://github.com/goccmack/gocc):
* Forked at [github.com/johnkerl/gocc](github.com/johnkerl/gocc).
* This package defines the grammar for Miller's domain-specific language (DSL) for the Miller `put` and `filter` verbs. And, GOCC is a joy to use. :)
* It is used on the terms of its open-source license.
* [golang.org/x/term](https://pkg.go.dev/golang.org/x/term):
* Just a one-line Miller callsite for is-a-terminal checking for the [Miller REPL](./internal/pkg/terminals/repl/README.md).
* Just a one-line Miller callsite for is-a-terminal checking for the [Miller REPL](./pkg/terminals/repl/README.md).
* It is used on the terms of its open-source license.
* See also [./go.mod](go.mod). Setup:
* `go get github.com/goccmack/gocc`
* `go get github.com/johnkerl/gocc`
* `go get golang.org/x/term`
### Miller per se
* The main entry point is [cmd/mlr/main.go](./cmd/mlr/main.go); everything else in [internal/pkg](./internal/pkg).
* [internal/pkg/entrypoint](./internal/pkg/entrypoint): All the usual contents of `main()` are here, for ease of testing.
* [internal/pkg/platform](./internal/pkg/platform): Platform-dependent code, which as of early 2021 is the command-line parser. Handling single quotes and double quotes is different on Windows unless particular care is taken, which is what this package does.
* [internal/pkg/lib](./internal/pkg/lib):
* Implementation of the [`Mlrval`](./internal/pkg/types/mlrval.go) datatype which includes string/int/float/boolean/void/absent/error types. These are used for record values, as well as expression/variable values in the Miller `put`/`filter` DSL. See also below for more details.
* [`Mlrmap`](./internal/pkg/types/mlrmap.go) is the sequence of key-value pairs which represents a Miller record. The key-lookup mechanism is optimized for Miller read/write usage patterns -- please see [mlrmap.go](./internal/pkg/types/mlrmap.go) for more details.
* [`context`](./internal/pkg/types/context.go) supports AWK-like variables such as `FILENAME`, `NF`, `NR`, and so on.
* [internal/pkg/cli](./internal/pkg/cli) is the flag-parsing logic for supporting Miller's command-line interface. When you type something like `mlr --icsv --ojson put '$sum = $a + $b' then filter '$sum > 1000' myfile.csv`, it's the CLI parser which makes it possible for Miller to construct a CSV record-reader, a transformer-chain of `put` then `filter`, and a JSON record-writer.
* [internal/pkg/climain](./internal/pkg/climain) contains a layer which invokes `internal/pkg/cli`, which was split out to avoid a Go package-import cycle.
* [internal/pkg/stream](./internal/pkg/stream) is as above -- it uses Go channels to pipe together file-reads, to record-reading/parsing, to a chain of record-transformers, to record-writing/formatting, to terminal standard output.
* [internal/pkg/input](./internal/pkg/input) is as above -- one record-reader type per supported input file format, and a factory method.
* [internal/pkg/output](./internal/pkg/output) is as above -- one record-writer type per supported output file format, and a factory method.
* [internal/pkg/transformers](./internal/pkg/transformers) contains the abstract record-transformer interface datatype, as well as the Go-channel chaining mechanism for piping one transformer into the next. It also contains all the concrete record-transformers such as `cat`, `tac`, `sort`, `put`, and so on.
* [internal/pkg/parsing](./internal/pkg/parsing) contains a single source file, `mlr.bnf`, which is the lexical/semantic grammar file for the Miller `put`/`filter` DSL using the GOCC framework. All subdirectories of `internal/pkg/parsing/` are autogen code created by GOCC's processing of `mlr.bnf`. If you need to edit `mlr.bnf`, please use [tools/build-dsl](./tools/build-dsl) to autogenerate Go code from it (using the GOCC tool). (This takes several minutes to run.)
* [internal/pkg/dsl](./internal/pkg/dsl) contains [`ast_types.go`](internal/pkg/dsl/ast_types.go) which is the abstract syntax tree datatype shared between GOCC and Miller. I didn't use a `internal/pkg/dsl/ast` naming convention, although that would have been nice, in order to avoid a Go package-dependency cycle.
* [internal/pkg/dsl/cst](./internal/pkg/dsl/cst) is the concrete syntax tree, constructed from an AST produced by GOCC. The CST is what is actually executed on every input record when you do things like `$z = $x * 0.3 * $y`. Please see the [internal/pkg/dsl/cst/README.md](./internal/pkg/dsl/cst/README.md) for more information.
* The main entry point is [cmd/mlr/main.go](./cmd/mlr/main.go); everything else in [pkg](./pkg).
* [pkg/entrypoint](./pkg/entrypoint): All the usual contents of `main()` are here, for ease of testing.
* [pkg/platform](./pkg/platform): Platform-dependent code, which as of early 2021 is the command-line parser. Handling single quotes and double quotes is different on Windows unless particular care is taken, which is what this package does.
* [pkg/lib](./pkg/lib):
* Implementation of the [`Mlrval`](./pkg/types/mlrval.go) datatype which includes string/int/float/boolean/void/absent/error types. These are used for record values, as well as expression/variable values in the Miller `put`/`filter` DSL. See also below for more details.
* [`Mlrmap`](./pkg/types/mlrmap.go) is the sequence of key-value pairs which represents a Miller record. The key-lookup mechanism is optimized for Miller read/write usage patterns -- please see [mlrmap.go](./pkg/types/mlrmap.go) for more details.
* [`context`](./pkg/types/context.go) supports AWK-like variables such as `FILENAME`, `NF`, `NR`, and so on.
* [pkg/cli](./pkg/cli) is the flag-parsing logic for supporting Miller's command-line interface. When you type something like `mlr --icsv --ojson put '$sum = $a + $b' then filter '$sum > 1000' myfile.csv`, it's the CLI parser which makes it possible for Miller to construct a CSV record-reader, a transformer-chain of `put` then `filter`, and a JSON record-writer.
* [pkg/climain](./pkg/climain) contains a layer which invokes `pkg/cli`, which was split out to avoid a Go package-import cycle.
* [pkg/stream](./pkg/stream) is as above -- it uses Go channels to pipe together file-reads, to record-reading/parsing, to a chain of record-transformers, to record-writing/formatting, to terminal standard output.
* [pkg/input](./pkg/input) is as above -- one record-reader type per supported input file format, and a factory method.
* [pkg/output](./pkg/output) is as above -- one record-writer type per supported output file format, and a factory method.
* [pkg/transformers](./pkg/transformers) contains the abstract record-transformer interface datatype, as well as the Go-channel chaining mechanism for piping one transformer into the next. It also contains all the concrete record-transformers such as `cat`, `tac`, `sort`, `put`, and so on.
* [pkg/parsing](./pkg/parsing) contains a single source file, `mlr.bnf`, which is the lexical/semantic grammar file for the Miller `put`/`filter` DSL using the GOCC framework. All subdirectories of `pkg/parsing/` are autogen code created by GOCC's processing of `mlr.bnf`. If you need to edit `mlr.bnf`, please use [tools/build-dsl](./tools/build-dsl) to autogenerate Go code from it (using the GOCC tool). (This takes several minutes to run.)
* [pkg/dsl](./pkg/dsl) contains [`ast_types.go`](pkg/dsl/ast_types.go) which is the abstract syntax tree datatype shared between GOCC and Miller. I didn't use a `pkg/dsl/ast` naming convention, although that would have been nice, in order to avoid a Go package-dependency cycle.
* [pkg/dsl/cst](./pkg/dsl/cst) is the concrete syntax tree, constructed from an AST produced by GOCC. The CST is what is actually executed on every input record when you do things like `$z = $x * 0.3 * $y`. Please see the [pkg/dsl/cst/README.md](./pkg/dsl/cst/README.md) for more information.
## Nil-record conventions
@ -153,7 +154,7 @@ nil through the reader/transformer/writer sequence.
## More about mlrvals
[`Mlrval`](./internal/pkg/types/mlrval.go) is the datatype of record values, as well as expression/variable values in the Miller `put`/`filter` DSL. It includes string/int/float/boolean/void/absent/error types, not unlike PHP's `zval`.
[`Mlrval`](./pkg/types/mlrval.go) is the datatype of record values, as well as expression/variable values in the Miller `put`/`filter` DSL. It includes string/int/float/boolean/void/absent/error types, not unlike PHP's `zval`.
* Miller's `absent` type is like Javascript's `undefined` -- it's for times when there is no such key, as in a DSL expression `$out = $foo` when the input record is `$x=3,y=4` -- there is no `$foo` so `$foo` has `absent` type. Nothing is written to the `$out` field in this case. See also [here](https://miller.readthedocs.io/en/latest/reference-main-null-data) for more information.
* Miller's `void` type is like Javascript's `null` -- it's for times when there is a key with no value, as in `$out = $x` when the input record is `$x=,$y=4`. This is an overlap with `string` type, since a void value looks like an empty string. I've gone back and forth on this (including when I was writing the C implementation) -- whether to retain `void` as a distinct type from empty-string, or not. I ended up keeping it as it made the `Mlrval` logic easier to understand.
@ -161,7 +162,7 @@ nil through the reader/transformer/writer sequence.
* Miller's number handling makes auto-overflow from int to float transparent, while preserving the possibility of 64-bit bitwise arithmetic.
* This is different from JavaScript, which has only double-precision floats and thus no support for 64-bit numbers (note however that there is now [`BigInt`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt)).
* This is also different from C and Go, wherein casts are necessary -- without which int arithmetic overflows.
* See also [here](https://miller.readthedocs.io/en/latest/reference-main-arithmetic) for the semantics of Miller arithmetic, which the [`Mlrval`](./internal/pkg/types/mlrval.go) class implements.
* See also [here](https://miller.readthedocs.io/en/latest/reference-main-arithmetic) for the semantics of Miller arithmetic, which the [`Mlrval`](./pkg/types/mlrval.go) class implements.
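
As a rough illustration of the int-to-float auto-overflow described above, here is a minimal sketch. It is not Miller's actual `Mlrval` code: `addWithAutoOverflow` is a hypothetical helper, and the real implementation lives behind the arithmetic semantics linked above. The idea is only that a 64-bit integer sum is kept as an integer when it fits, and falls back to floating point when it does not.

```go
package main

import (
	"fmt"
	"math"
)

// addWithAutoOverflow is a hypothetical helper (not part of Miller).
// It adds two int64 values. If the exact sum fits in 64 bits it is
// returned as an int with isInt == true; otherwise the sum is computed
// in floating point and isInt is false.
func addWithAutoOverflow(a, b int64) (isum int64, fsum float64, isInt bool) {
	// Go defines signed integer overflow as two's-complement wraparound,
	// so computing c is safe even when the true sum does not fit.
	c := a + b

	// Overflow occurred exactly when both operands are on the same side
	// of zero and the wrapped result landed on the other side.
	overflowed := (a >= 0) == (b >= 0) && (c >= 0) != (a >= 0)
	if overflowed {
		return 0, float64(a) + float64(b), false
	}
	return c, 0, true
}

func main() {
	fmt.Println(addWithAutoOverflow(3, 4))             // stays an int
	fmt.Println(addWithAutoOverflow(math.MaxInt64, 1)) // falls back to float
}
```

Bitwise operators would bypass a helper like this and operate on the raw 64-bit integers, which is how the "possibility of 64-bit bitwise arithmetic" can coexist with the overflow-to-float behavior.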
## Performance optimizations
@ -179,8 +180,8 @@ See also [./README-profiling.md](./README-profiling.md) and [https://miller.read
In summary:
* #765, #774, and #787 were low-hanging fruit.
* #424 was a bit more involved, and reveals that memory allocation -- not just GC -- needs to be handled more mindfully in Go than in C.
* #779 was a bit more involved, and reveals that Go's elegant goroutine/channel processing model comes with the caveat that channelized data should not be organized in many, small pieces.
* #809 was also bit more involved, and reveals that library functions are convenient, but profiling and analysis can sometimes reveal an opportunity for an impact, custom solution.
* #786 was a massive refactor involving about 10KLOC -- in hindsight it would have been best to do this work at the start of the Go port, not at the end.
* [#765](https://github.com/johnkerl/miller/pull/765), [#774](https://github.com/johnkerl/miller/pull/774), and [#787](https://github.com/johnkerl/miller/pull/787) were low-hanging fruit.
* [#424](https://github.com/johnkerl/miller/pull/424) was a bit more involved, and reveals that memory allocation -- not just GC -- needs to be handled more mindfully in Go than in C.
* [#779](https://github.com/johnkerl/miller/pull/779) was a bit more involved, and reveals that Go's elegant goroutine/channel processing model comes with the caveat that channelized data should not be organized in many, small pieces.
* [#809](https://github.com/johnkerl/miller/pull/809) was also bit more involved, and reveals that library functions are convenient, but profiling and analysis can sometimes reveal an opportunity for an impact, custom solution.
* [#786](https://github.com/johnkerl/miller/pull/786) was a massive refactor involving about 10KLOC -- in hindsight it would have been best to do this work at the start of the Go port, not at the end.
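
The point about channelized data can be sketched as follows. This is an illustrative toy, not the code in `pkg/stream`: the `record` type, the `batchSize` value, and the function names are all assumptions made for the example. It shows the reader -> transformer -> writer shape described earlier, with records sent over channels in batches rather than one at a time.

```go
package main

import "fmt"

// record stands in for a Miller record (hypothetical; Miller uses Mlrmap).
type record map[string]string

// batchSize is an illustrative value, not one taken from Miller.
const batchSize = 500

// reader produces n records and sends them downstream in batches,
// closing the channel when done.
func reader(n int, out chan<- []record) {
	batch := make([]record, 0, batchSize)
	for i := 0; i < n; i++ {
		batch = append(batch, record{"i": fmt.Sprint(i)})
		if len(batch) == batchSize {
			out <- batch
			batch = make([]record, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		out <- batch // final partial batch
	}
	close(out)
}

// transformer applies f to every record of every batch it receives.
func transformer(in <-chan []record, out chan<- []record, f func(record) record) {
	for batch := range in {
		for i, r := range batch {
			batch[i] = f(r)
		}
		out <- batch
	}
	close(out)
}

// runPipeline wires reader -> transformer -> writer and returns the
// number of records that reached the writer end.
func runPipeline(n int) int {
	readerOut := make(chan []record, 1)
	writerIn := make(chan []record, 1)

	go reader(n, readerOut)
	go transformer(readerOut, writerIn, func(r record) record {
		r["seen"] = "true"
		return r
	})

	count := 0
	for batch := range writerIn { // the "writer" here just counts
		count += len(batch)
	}
	return count
}

func main() {
	fmt.Println(runPipeline(1234)) // prints 1234
}
```

With a batch size of 1 the same program does one channel send per record per stage; larger batches amortize that per-send cost, which is the caveat the summary above attributes to #779.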

@ -29,6 +29,7 @@ key-value-pair data in a variety of data formats.
* [Miller in 10 minutes](https://miller.readthedocs.io/en/latest/10min)
* [A Guide To Command-Line Data Manipulation](https://www.smashingmagazine.com/2022/12/guide-command-line-data-manipulation-cli-miller)
* [A quick tutorial on Miller](https://www.ict4g.net/adolfo/notes/data-analysis/miller-quick-tutorial.html)
* [Miller Exercises](https://github.com/GuilloteauQ/miller-exercises)
* [Tools to manipulate CSV files from the Command Line](https://www.ict4g.net/adolfo/notes/data-analysis/tools-to-manipulate-csv.html)
* [www.togaware.com/linux/survivor/CSV_Files.html](https://www.togaware.com/linux/survivor/CSV_Files.html)
* [MLR for CSV manipulation](https://guillim.github.io/terminal/2018/06/19/MLR-for-CSV-manipulation.html)
@ -45,31 +46,28 @@ key-value-pair data in a variety of data formats.
* [Active issues](https://github.com/johnkerl/miller/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc)
# Installing
There's a good chance you can get Miller pre-built for your system:
[![Ubuntu](https://img.shields.io/badge/distros-ubuntu-db4923.svg)](https://launchpad.net/ubuntu/+source/miller)
[![Ubuntu 16.04 LTS](https://img.shields.io/badge/distros-ubuntu1604lts-db4923.svg)](https://launchpad.net/ubuntu/xenial/+package/miller)
[![Fedora](https://img.shields.io/badge/distros-fedora-173b70.svg)](https://apps.fedoraproject.org/packages/miller)
[![Fedora](https://img.shields.io/badge/distros-fedora-173b70.svg)](https://packages.fedoraproject.org/pkgs/miller/miller/)
[![Debian](https://img.shields.io/badge/distros-debian-c70036.svg)](https://packages.debian.org/stable/miller)
[![Gentoo](https://img.shields.io/badge/distros-gentoo-4e4371.svg)](https://packages.gentoo.org/packages/sys-apps/miller)
[![Pro-Linux](https://img.shields.io/badge/distros-prolinux-3a679d.svg)](http://www.pro-linux.de/cgi-bin/DBApp/check.cgi?ShowApp..20427.100)
[![Arch Linux](https://img.shields.io/badge/distros-archlinux-1792d0.svg)](https://aur.archlinux.org/packages/miller-git)
[![NetBSD](https://img.shields.io/badge/distros-netbsd-f26711.svg)](http://pkgsrc.se/textproc/miller)
[![FreeBSD](https://img.shields.io/badge/distros-freebsd-8c0707.svg)](https://www.freshports.org/textproc/miller/)
[![Anaconda](https://img.shields.io/badge/distros-anaconda-63ad41.svg)](https://anaconda.org/conda-forge/miller/)
[![Snap](https://img.shields.io/badge/distros-snap-d85f33.svg)](https://snapcraft.io/miller)
[![Homebrew/MacOSX](https://img.shields.io/badge/distros-homebrew-ba832b.svg)](https://formulae.brew.sh/formula/miller)
[![MacPorts/MacOSX](https://img.shields.io/badge/distros-macports-1376ec.svg)](https://www.macports.org/ports.php?by=name&substr=miller)
[![Chocolatey](https://img.shields.io/badge/distros-chocolatey-red.svg)](https://chocolatey.org/packages/miller)
[![WinGet](https://img.shields.io/badge/distros-winget-392f55.svg)](https://github.com/microsoft/winget-pkgs/tree/master/manifests/m/Miller/Miller)
|OS|Installation command|
|---|---|
|Linux|`yum install miller`<br/> `apt-get install miller`|
|Linux|`yum install miller`<br/> `apt-get install miller`<br/> `snap install miller`|
|Mac|`brew install miller`<br/>`port install miller`|
|Windows|`choco install miller`|
|Windows|`choco install miller`<br/>`winget install Miller.Miller`<br/>`scoop install main/miller`|
See also [README-versions.md](./README-versions.md) for a full list of package versions. Note that long-term-support (LtS) releases will likely be on older versions.
@ -93,6 +91,7 @@ See also [building from source](https://miller.readthedocs.io/en/latest/build.ht
[![Multi-platform build status](https://github.com/johnkerl/miller/actions/workflows/go.yml/badge.svg)](https://github.com/johnkerl/miller/actions/workflows/go.yml)
[![CodeQL status](https://github.com/johnkerl/miller/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/johnkerl/miller/actions/workflows/codeql-analysis.yml)
[![Codespell status](https://github.com/johnkerl/miller/actions/workflows/codespell.yml/badge.svg)](https://github.com/johnkerl/miller/actions/workflows/codespell.yml)
[![🧪 Snap Builds](https://github.com/johnkerl/miller/actions/workflows/test-snap-can-build.yml/badge.svg)](https://github.com/johnkerl/miller/actions/workflows/test-snap-can-build.yml)
<!--
[![Release status](https://github.com/johnkerl/miller/actions/workflows/release.yml/badge.svg)](https://github.com/johnkerl/miller/actions/workflows/release.yml)
-->
@ -109,9 +108,9 @@ See also [building from source](https://miller.readthedocs.io/en/latest/build.ht
* To install: `make install`. This installs the executable `/usr/local/bin/mlr` and manual page `/usr/local/share/man/man1/mlr.1` (so you can do `man mlr`).
* You can do `./configure --prefix=/some/install/path` before `make install` if you want to install somewhere other than `/usr/local`.
* Without `make`:
* To build: `go build github.com/johnkerl/miller/cmd/mlr`.
* To run tests: `go test github.com/johnkerl/miller/internal/pkg/...` and `mlr regtest`.
* To install: `go install github.com/johnkerl/miller/cmd/mlr` will install to _GOPATH_`/bin/mlr`.
* To build: `go build github.com/johnkerl/miller/v6/cmd/mlr`.
* To run tests: `go test github.com/johnkerl/miller/v6/pkg/...` and `mlr regtest`.
* To install: `go install github.com/johnkerl/miller/v6/cmd/mlr@latest` will install to _GOPATH_`/bin/mlr`.
* See also the doc page on [building from source](https://miller.readthedocs.io/en/latest/build).
* For more developer information please see [README-dev.md](./README-dev.md).

View file

@ -3,15 +3,18 @@ package main
import (
"fmt"
"github.com/johnkerl/miller/internal/pkg/colorizer"
"github.com/johnkerl/miller/v6/pkg/colorizer"
)
const boldString = "\u001b[1m"
const underlineString = "\u001b[4m"
const reversedString = "\u001b[7m"
const redString = "\u001b[1;31m"
const blueString = "\u001b[1;34m"
const defaultString = "\u001b[0m"
const (
boldString = "\u001b[1m"
reversedString = "\u001b[7m"
redString = "\u001b[1;31m"
blueString = "\u001b[1;34m"
defaultString = "\u001b[0m"
// underlineString = "\u001b[4m"
)
func main() {
fmt.Printf("Hello, world!\n")

View file

@ -28,9 +28,9 @@ mkdir -p $dir
# ----------------------------------------------------------------
# Run the parser-generator
# Build the bin/gocc executable:
go get github.com/goccmack/gocc
#go get github.com/johnkerl/gocc
# Build the bin/gocc executable (use my fork for performance):
# go get github.com/goccmack/gocc
go get github.com/johnkerl/gocc
bingocc="$GOPATH/bin/gocc"
if [ ! -x "$bingocc" ]; then

View file

@ -1,5 +1,5 @@
module one
go 1.16
go 1.24
require github.com/goccmack/gocc v0.0.0-20210322175033-34358ebe5808 // indirect
toolchain go1.24.5

View file

@ -1,26 +0,0 @@
github.com/goccmack/gocc v0.0.0-20210322175033-34358ebe5808 h1:MBgZdx/wBJWTR2Q79mQfP6c8uXdQiu5JowfEz3KhFac=
github.com/goccmack/gocc v0.0.0-20210322175033-34358ebe5808/go.mod h1:dWhnuKE5wcnGTExA2DH6Iicu21YnWwOPMrc/GyhtbCk=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/mod v0.3.0 h1:RM4zey1++hCTbCVQfnWeKs9/IEsaBLA8vTkd0WVtmH4=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210119212857-b64e53b001e4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.1.0/go.mod h1:xkSsbof2nBLbhDlRMhhhyNLN/zl3eTqcnHD5viDpcZ0=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 h1:go1bK/D/BFZV2I8cIQd1NKEZ+0owSTG1fDTci4IqFcE=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=

View file

@ -28,9 +28,9 @@ mkdir -p $dir
# ----------------------------------------------------------------
# Run the parser-generator
# Build the bin/gocc executable:
go get github.com/goccmack/gocc
#go get github.com/johnkerl/gocc
# Build the bin/gocc executable (use my fork for performance):
# go get github.com/goccmack/gocc
go get github.com/johnkerl/gocc
bingocc="$GOPATH/bin/gocc"
if [ ! -x "$bingocc" ]; then
exit 1

View file

@ -1,5 +1,5 @@
module two
go 1.16
go 1.24
require github.com/goccmack/gocc v0.0.0-20210322175033-34358ebe5808 // indirect
toolchain go1.24.5

View file

@ -1,26 +0,0 @@
github.com/goccmack/gocc v0.0.0-20210322175033-34358ebe5808 h1:MBgZdx/wBJWTR2Q79mQfP6c8uXdQiu5JowfEz3KhFac=
github.com/goccmack/gocc v0.0.0-20210322175033-34358ebe5808/go.mod h1:dWhnuKE5wcnGTExA2DH6Iicu21YnWwOPMrc/GyhtbCk=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/mod v0.3.0 h1:RM4zey1++hCTbCVQfnWeKs9/IEsaBLA8vTkd0WVtmH4=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210119212857-b64e53b001e4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.1.0/go.mod h1:xkSsbof2nBLbhDlRMhhhyNLN/zl3eTqcnHD5viDpcZ0=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 h1:go1bK/D/BFZV2I8cIQd1NKEZ+0owSTG1fDTci4IqFcE=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=

View file

@ -11,7 +11,7 @@ import (
"strings"
"time"
"github.com/johnkerl/miller/internal/pkg/entrypoint"
"github.com/johnkerl/miller/v6/pkg/entrypoint"
"github.com/pkg/profile" // for trace.out
)

View file

@ -8,7 +8,7 @@ import (
"fmt"
"os"
"github.com/johnkerl/miller/internal/pkg/scan"
"github.com/johnkerl/miller/v6/pkg/scan"
)
func main() {

View file

@ -3,7 +3,7 @@
// ================================================================
/*
go build github.com/johnkerl/miller/cmd/sizes
go build github.com/johnkerl/miller/v6/cmd/sizes
*/
package main
@ -11,7 +11,7 @@ package main
import (
"fmt"
"github.com/johnkerl/miller/internal/pkg/mlrval"
"github.com/johnkerl/miller/v6/pkg/mlrval"
)
func main() {

View file

@ -91,7 +91,7 @@ $tar \
./go.mod \
./go.sum \
./cmd \
./internal \
./pkg \
./regression_test.go \
./man \
./test \

5
delve.txt Normal file
View file

@ -0,0 +1,5 @@
dlv exec ./mlr -- --csv --from x.csv sub -a def ghi
break main.main
# or wherever
restart
continue

View file

@ -8,6 +8,8 @@ theme:
code: Lato Mono
features:
- navigation.top
- content.action.edit
- content.action.view
custom_dir: overrides
repo_url: https://github.com/johnkerl/miller
repo_name: miller
@ -109,11 +111,16 @@ nav:
- "Auxiliary commands": "reference-main-auxiliary-commands.md"
- "Manual page": "manpage.md"
- "Building from source": "build.md"
- "Miller as a library": "miller-as-library.md"
- "How to create a new release": "how-to-release.md"
- "Documents for previous releases": "release-docs.md"
- "Glossary": "glossary.md"
- "What's new in Miller 6": "new-in-miller-6.md"
markdown_extensions:
- toc:
- toc:
permalink: true
- admonition
- pymdownx.details
- pymdownx.superfences

View file

@ -20,7 +20,7 @@ Quick links:
Let's take a quick look at some of the most useful Miller verbs -- file-format-aware, name-index-empowered equivalents of standard system commands.
For most of this section we'll use our [example.csv](./example.csv).
For most of this section, we'll use our [example.csv](./example.csv).
`mlr cat` is like system `cat` (or `type` on Windows) -- it passes the data through unmodified:
@ -909,3 +909,40 @@ yellow,triangle,true,1,11,43.6498,9.8870
purple,triangle,false,5,51,81.2290,8.5910
purple,triangle,false,7,65,80.1405,5.8240
</pre>
Alternatively, the `split` verb can do the same thing:
<pre class="pre-highlight-non-pair">
<b>mlr --csv --from example.csv split -g shape</b>
</pre>
<pre class="pre-highlight-in-pair">
<b>cat split_circle.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
color,shape,flag,k,index,quantity,rate
red,circle,true,3,16,13.8103,2.9010
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
</pre>
<pre class="pre-highlight-in-pair">
<b>cat split_square.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
color,shape,flag,k,index,quantity,rate
red,square,true,2,15,79.2778,0.0130
red,square,false,4,48,77.5542,7.4670
red,square,false,6,64,77.1991,9.5310
purple,square,false,10,91,72.3735,8.2430
</pre>
<pre class="pre-highlight-in-pair">
<b>cat split_triangle.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
purple,triangle,false,5,51,81.2290,8.5910
purple,triangle,false,7,65,80.1405,5.8240
</pre>

View file

@ -4,7 +4,7 @@
Let's take a quick look at some of the most useful Miller verbs -- file-format-aware, name-index-empowered equivalents of standard system commands.
For most of this section we'll use our [example.csv](./example.csv).
For most of this section, we'll use our [example.csv](./example.csv).
`mlr cat` is like system `cat` (or `type` on Windows) -- it passes the data through unmodified:
@ -434,3 +434,21 @@ GENMD-EOF
GENMD-RUN-COMMAND
cat triangle.csv
GENMD-EOF
Alternatively, the `split` verb can do the same thing:
GENMD-RUN-COMMAND
mlr --csv --from example.csv split -g shape
GENMD-EOF
GENMD-RUN-COMMAND
cat split_circle.csv
GENMD-EOF
GENMD-RUN-COMMAND
cat split_square.csv
GENMD-EOF
GENMD-RUN-COMMAND
cat split_triangle.csv
GENMD-EOF

View file

@ -18,7 +18,7 @@ Quick links:
Please also see [Installation](installing-miller.md) for information about pre-built executables.
You will need to first install Go version 1.15 or higher: please see [https://go.dev](https://go.dev).
You will need to first install Go ([this version](https://github.com/johnkerl/miller/blob/main/go.mod#L17)): please see [https://go.dev](https://go.dev).
## Miller license
@ -31,16 +31,16 @@ Two-clause BSD license [https://github.com/johnkerl/miller/blob/master/LICENSE.t
* `cd mlr-i.j.k`
* `cd go`
* `make` creates the `./mlr` (or `.\mlr.exe` on Windows) executable
* Without `make`: `go build github.com/johnkerl/miller/cmd/mlr`
* Without `make`: `go build github.com/johnkerl/miller/v6/cmd/mlr`
* `make check` runs tests
* Without `make`: `go test github.com/johnkerl/miller/internal/pkg/...` and `mlr regtest`
* Without `make`: `go test github.com/johnkerl/miller/v6/pkg/...` and `mlr regtest`
* `make install` installs the `mlr` executable and the `mlr` manpage
* Without make: `go install github.com/johnkerl/miller/cmd/mlr` will install to _GOPATH_`/bin/mlr`
* Without make: `go install github.com/johnkerl/miller/v6/cmd/mlr` will install to _GOPATH_`/bin/mlr`
## From git clone
* `git clone https://github.com/johnkerl/miller`
* `make`/`go build github.com/johnkerl/miller/cmd/mlr` as above
* `make`/`go build github.com/johnkerl/miller/v6/cmd/mlr` as above
## In case of problems

View file

@ -2,7 +2,7 @@
Please also see [Installation](installing-miller.md) for information about pre-built executables.
You will need to first install Go version 1.15 or higher: please see [https://go.dev](https://go.dev).
You will need to first install Go ([this version](https://github.com/johnkerl/miller/blob/main/go.mod#L17)): please see [https://go.dev](https://go.dev).
## Miller license
@ -15,16 +15,16 @@ Two-clause BSD license [https://github.com/johnkerl/miller/blob/master/LICENSE.t
* `cd mlr-i.j.k`
* `cd go`
* `make` creates the `./mlr` (or `.\mlr.exe` on Windows) executable
* Without `make`: `go build github.com/johnkerl/miller/cmd/mlr`
* Without `make`: `go build github.com/johnkerl/miller/v6/cmd/mlr`
* `make check` runs tests
* Without `make`: `go test github.com/johnkerl/miller/internal/pkg/...` and `mlr regtest`
* Without `make`: `go test github.com/johnkerl/miller/v6/pkg/...` and `mlr regtest`
* `make install` installs the `mlr` executable and the `mlr` manpage
* Without make: `go install github.com/johnkerl/miller/cmd/mlr` will install to _GOPATH_`/bin/mlr`
* Without make: `go install github.com/johnkerl/miller/v6/cmd/mlr` will install to _GOPATH_`/bin/mlr`
## From git clone
* `git clone https://github.com/johnkerl/miller`
* `make`/`go build github.com/johnkerl/miller/cmd/mlr` as above
* `make`/`go build github.com/johnkerl/miller/v6/cmd/mlr` as above
## In case of problems

View file

@ -50,7 +50,7 @@ and the `--csv` part will automatically be understood. If you do want to process
* You can include any command-line flags, except the "terminal" ones such as `--help`.
* The `--prepipe`, `--load`, and `--mload` flags aren't allowed in `.mlrrc` as they control code execution, and could result in your scripts running things you don't expect if you receive data from someone with a `./.mlrrc` in it. You can use `--prepipe-bz2`, `--prepipe-gunzip`, and `--prepipe-zcat` in `.mlrrc`, though.
* The `--prepipe`, `--load`, and `--mload` flags aren't allowed in `.mlrrc` as they control code execution, and could result in your scripts running things you don't expect if you receive data from someone with a `./.mlrrc` in it. You can use `--prepipe-bz2`, `--prepipe-gunzip`, `--prepipe-zcat`, and `--prepipe-zstdcat` in `.mlrrc`, though.
* The formatting rule is you need to put one flag beginning with `--` per line: for example, `--csv` on one line and `--nr-progress-mod 1000` on a separate line.

View file

@ -34,7 +34,7 @@ and the `--csv` part will automatically be understood. If you do want to process
* You can include any command-line flags, except the "terminal" ones such as `--help`.
* The `--prepipe`, `--load`, and `--mload` flags aren't allowed in `.mlrrc` as they control code execution, and could result in your scripts running things you don't expect if you receive data from someone with a `./.mlrrc` in it. You can use `--prepipe-bz2`, `--prepipe-gunzip`, and `--prepipe-zcat` in `.mlrrc`, though.
* The `--prepipe`, `--load`, and `--mload` flags aren't allowed in `.mlrrc` as they control code execution, and could result in your scripts running things you don't expect if you receive data from someone with a `./.mlrrc` in it. You can use `--prepipe-bz2`, `--prepipe-gunzip`, `--prepipe-zcat`, and `--prepipe-zstdcat` in `.mlrrc`, though.
* The formatting rule is you need to put one flag beginning with `--` per line: for example, `--csv` on one line and `--nr-progress-mod 1000` on a separate line.

View file

@ -26,7 +26,7 @@ Vertical-tabular format is good for a quick look at CSV data layout -- seeing wh
<b>wc -l data/flins.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
36635 data/flins.csv
36635 data/flins.csv
</pre>
<pre class="pre-highlight-in-pair">
@ -227,7 +227,7 @@ Peek at the data:
<b>wc -l data/colored-shapes.dkvp</b>
</pre>
<pre class="pre-non-highlight-in-pair">
10078 data/colored-shapes.dkvp
10078 data/colored-shapes.dkvp
</pre>
<pre class="pre-highlight-in-pair">

6
docs/src/data-error.csv Normal file
View file

@ -0,0 +1,6 @@
x
1
2
3
text
4

View file

@ -0,0 +1,2 @@
data/a.csv
data/b.csv

View file

@ -0,0 +1,2 @@
a,b.,.c,.,d..e,f.g
1,2,3,4,5,6

View file

@ -0,0 +1,5 @@
[
{ "a": 1, "b": 2, "c": 3 },
{ "a": 4, "b": 5, "c": 6 },
{ "a": 7, "X": 8, "c": 9 }
]

View file

@ -0,0 +1,6 @@
[
{ "a": 1, "b": 2, "c": 3 },
{ "a": 4, "b": 5, "c": 6, "d": 7 },
{ "a": 7, "b": 8 },
{ "a": 9, "b": 10, "c": 11 }
]

View file

@ -68,7 +68,7 @@ date,qoh
<b>wc -l data/miss-date.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
1372 data/miss-date.csv
1372 data/miss-date.csv
</pre>
Since there are 1372 lines in the data file, some automation is called for. To find the missing dates, you can convert the dates to seconds since the epoch using `strptime`, then compute adjacent differences (the `cat -n` simply inserts record-counters):
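The same idea (convert each date to epoch seconds, then look at adjacent differences) can be sketched outside Miller in Go. This is only an illustration: `findGaps` is a hypothetical helper, and the `YYYY-MM-DD` date layout is an assumption about the `date` column.

```go
package main

import (
	"fmt"
	"time"
)

// findGaps parses each date (assumed layout: YYYY-MM-DD, interpreted as UTC),
// converts it to seconds since the epoch, and reports every adjacent pair
// whose difference is not exactly one day (86400 seconds).
func findGaps(dates []string) ([]string, error) {
	var gaps []string
	var prev int64
	for i, d := range dates {
		t, err := time.Parse("2006-01-02", d)
		if err != nil {
			return nil, err
		}
		sec := t.Unix()
		if i > 0 && sec-prev != 86400 {
			gaps = append(gaps, fmt.Sprintf("%s -> %s: %d seconds", dates[i-1], d, sec-prev))
		}
		prev = sec
	}
	return gaps, nil
}

func main() {
	// Made-up sample dates, not taken from data/miss-date.csv.
	gaps, err := findGaps([]string{"2012-03-05", "2012-03-06", "2012-03-08"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, g := range gaps {
		fmt.Println(g)
	}
}
```

Any pair reported here corresponds to a row where the adjacent difference is something other than 86400.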

View file

@ -1,5 +1,5 @@
#!/usr/bin/env mlr -s
--c2p
filter '$quantity != 20'
filter '$quantity != 20' # Here is a comment
then count-distinct -f shape
then fraction -f count

View file

@ -236,3 +236,8 @@ img {
--md-footer-fg-color: #800000;
--md-footer-fg-color: #eae2cb;
}
.md-nav__link--active {
text-decoration: underline;
}

View file

@ -16,7 +16,7 @@ Quick links:
</div>
# Features
Miller is like awk, sed, cut, join, and sort for **name-indexed data such as
Miller is like awk, sed, cut, join, and sort for **name-indexed data, such as
CSV, TSV, JSON, and JSON Lines**. You get to work with your data using named
fields, without needing to count positional column indices.
@ -36,9 +36,9 @@ including but not limited to the familiar CSV, TSV, JSON, and JSON Lines.
* Miller complements SQL **databases**: you can slice, dice, and reformat data on the client side on its way into or out of a database. (See [SQL Examples](sql-examples.md).) You can also reap some of the benefits of databases for quick, setup-free one-off tasks when you just need to query some data in disk files in a hurry.
* Miller also goes beyond the classic Unix tools by stepping fully into our modern, **no-SQL** world: its essential record-heterogeneity property allows Miller to operate on data where records with different schema (field names) are interleaved.
* Miller also goes beyond the classic Unix tools by stepping fully into our modern, **no-SQL** world: its essential record-heterogeneity property allows Miller to operate on data where records with different schemas (field names) are interleaved.
* Miller is **streaming**: most operations need only a single record in memory at a time, rather than ingesting all input before producing any output. For those operations which require deeper retention (`sort`, `tac`, `stats1`), Miller retains only as much data as needed. This means that whenever functionally possible, you can operate on files which are larger than your system's available RAM, and you can use Miller in **tail -f** contexts.
* Miller is **streaming**: most operations need only a single record in memory at a time, rather than ingesting all input before producing any output. For those operations that require deeper retention (`sort`, `tac`, `stats1`), Miller retains only as much data as needed. This means that whenever functionally possible, you can operate on files that are larger than your system's available RAM, and you can use Miller in **tail -f** contexts.
* Miller is **pipe-friendly** and interoperates with the Unix toolkit
@ -46,10 +46,10 @@ including but not limited to the familiar CSV, TSV, JSON, and JSON Lines.
* Miller does **conversion** between formats
* Miller's **processing is format-aware**: e.g. CSV `sort` and `tac` keep header lines first
* Miller's **processing is format-aware**: e.g., CSV `sort` and `tac` keep header lines first
* Miller has high-throughput **performance** on par with the Unix toolkit
* Not unlike [jq](https://stedolan.github.io/jq/) (for JSON), Miller is written in Go which is a portable, modern language, and Miller has no runtime dependencies. You can download or compile a single binary, `scp` it to a faraway machine, and expect it to work.
* Not unlike [jq](https://stedolan.github.io/jq/) (for JSON), Miller is written in Go, which is a portable, modern language, and Miller has no runtime dependencies. You can download or compile a single binary, `scp` it to a faraway machine, and expect it to work.
Releases and release notes: [https://github.com/johnkerl/miller/releases](https://github.com/johnkerl/miller/releases).

View file

@ -1,6 +1,6 @@
# Features
Miller is like awk, sed, cut, join, and sort for **name-indexed data such as
Miller is like awk, sed, cut, join, and sort for **name-indexed data, such as
CSV, TSV, JSON, and JSON Lines**. You get to work with your data using named
fields, without needing to count positional column indices.
@ -20,9 +20,9 @@ including but not limited to the familiar CSV, TSV, JSON, and JSON Lines.
* Miller complements SQL **databases**: you can slice, dice, and reformat data on the client side on its way into or out of a database. (See [SQL Examples](sql-examples.md).) You can also reap some of the benefits of databases for quick, setup-free one-off tasks when you just need to query some data in disk files in a hurry.
* Miller also goes beyond the classic Unix tools by stepping fully into our modern, **no-SQL** world: its essential record-heterogeneity property allows Miller to operate on data where records with different schema (field names) are interleaved.
* Miller also goes beyond the classic Unix tools by stepping fully into our modern, **no-SQL** world: its essential record-heterogeneity property allows Miller to operate on data where records with different schemas (field names) are interleaved.
* Miller is **streaming**: most operations need only a single record in memory at a time, rather than ingesting all input before producing any output. For those operations which require deeper retention (`sort`, `tac`, `stats1`), Miller retains only as much data as needed. This means that whenever functionally possible, you can operate on files which are larger than your system's available RAM, and you can use Miller in **tail -f** contexts.
* Miller is **streaming**: most operations need only a single record in memory at a time, rather than ingesting all input before producing any output. For those operations that require deeper retention (`sort`, `tac`, `stats1`), Miller retains only as much data as needed. This means that whenever functionally possible, you can operate on files that are larger than your system's available RAM, and you can use Miller in **tail -f** contexts.
* Miller is **pipe-friendly** and interoperates with the Unix toolkit
@ -30,10 +30,10 @@ including but not limited to the familiar CSV, TSV, JSON, and JSON Lines.
* Miller does **conversion** between formats
* Miller's **processing is format-aware**: e.g. CSV `sort` and `tac` keep header lines first
* Miller's **processing is format-aware**: e.g., CSV `sort` and `tac` keep header lines first
* Miller has high-throughput **performance** on par with the Unix toolkit
* Not unlike [jq](https://stedolan.github.io/jq/) (for JSON), Miller is written in Go which is a portable, modern language, and Miller has no runtime dependencies. You can download or compile a single binary, `scp` it to a faraway machine, and expect it to work.
* Not unlike [jq](https://stedolan.github.io/jq/) (for JSON), Miller is written in Go, which is a portable, modern language, and Miller has no runtime dependencies. You can download or compile a single binary, `scp` it to a faraway machine, and expect it to work.
Releases and release notes: [https://github.com/johnkerl/miller/releases](https://github.com/johnkerl/miller/releases).

View file

@ -20,7 +20,7 @@ Miller handles name-indexed data using several formats: some you probably know
by name, such as CSV, TSV, JSON, and JSON Lines -- and other formats you're likely already
seeing and using in your structured data.
Additionally, Miller gives you the option of including comments within your data.
Additionally, Miller gives you the option to include comments within your data.
## Examples
@ -69,7 +69,7 @@ PPRINT: pretty-printed tabular
| 4 5 6 | Record 2: "apple":"4", "bat":"5", "cog":"6"
+---------------------+
Markdown tabular (supported for output only):
Markdown tabular:
+-----------------------+
| | apple | bat | cog | |
| | --- | --- | --- | |
@ -102,21 +102,27 @@ NIDX: implicitly numerically indexed (Unix-toolkit style)
## CSV/TSV/ASV/USV/etc.
When `mlr` is invoked with the `--csv` or `--csvlite` option, key names are found on the first record and values are taken from subsequent records. This includes the case of CSV-formatted files. See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.
When `mlr` is invoked with the `--csv` or `--csvlite` option, key names are found on the first record, and values are taken from subsequent records. This includes the case of CSV-formatted files. See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.
Miller has record separator `RS` and field separator `FS`, just as `awk` does. (See also the [separators page](reference-main-separators.md).)
**TSV (tab-separated values):** `FS` is tab and `RS` is newline (or carriage return + linefeed for
Windows). On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return,
newline, tab, and backslash, respectively. On output, the reverse is done -- for example, if a field
has an embedded newline, that newline is replaced by `\n`.
**CSV (comma-separated values):** Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180).
* This includes CRLF line terminators by default, regardless of platform.
* Any cell containing a comma or a carriage return within it must be double-quoted.
**TSV (tab-separated values):** Miller's `--tsv` supports [IANA TSV](https://www.iana.org/assignments/media-types/text/tab-separated-values).
* `FS` is tab and `RS` is newline (or carriage return + linefeed for Windows).
* On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return, newline, tab, and backslash, respectively.
* On output, the reverse is done -- for example, if a field has an embedded newline, that newline is replaced by `\n`.
* A tab within a cell must be encoded as `\t`.
* A newline within a cell must be encoded as `\n`.
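The TSV escaping rules above can be sketched in Go as follows. This is a minimal illustration, not Miller's implementation: `decodeTSVField` and `encodeTSVField` are hypothetical names, and the sketch assumes that a backslash followed by any other character is passed through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// escapes maps the letter after a backslash to the byte it stands for.
var escapes = map[byte]byte{'r': '\r', 'n': '\n', 't': '\t', '\\': '\\'}

// decodeTSVField turns the two-character sequences \r, \n, \t, and \\ into
// carriage return, newline, tab, and backslash (the input direction).
func decodeTSVField(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] == '\\' && i+1 < len(s) {
			if c, ok := escapes[s[i+1]]; ok {
				b.WriteByte(c)
				i++ // skip the escape letter
				continue
			}
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

// encodeTSVField is the reverse (the output direction). The replacer scans
// the input once, so backslashes it emits are never re-escaped.
func encodeTSVField(s string) string {
	r := strings.NewReplacer("\\", "\\\\", "\r", "\\r", "\n", "\\n", "\t", "\\t")
	return r.Replace(s)
}

func main() {
	raw := "line one\nline two\twith tab"
	enc := encodeTSVField(raw)
	fmt.Printf("encoded: %s\n", enc)
	fmt.Printf("round trip ok: %v\n", decodeTSVField(enc) == raw)
}
```

Because the encoded form never contains a literal tab or newline, splitting a TSV line on tab and a file on newline stays unambiguous.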
**ASV (ASCII-separated values):** the flags `--asv`, `--iasv`, `--oasv`, `--asvlite`, `--iasvlite`, and `--oasvlite` are analogous except they use ASCII FS and RS `0x1f` and `0x1e`, respectively.
**USV (Unicode-separated values):** likewise, the flags `--usv`, `--iusv`, `--ousv`, `--usvlite`, `--iusvlite`, and `--ousvlite` use Unicode FS and RS `U+241F` (UTF-8 `0xe2909f`) and `U+241E` (UTF-8 `0xe2909e`), respectively.
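As a quick check of the byte values quoted for the USV separators, this small Go sketch prints the UTF-8 encoding of each code point. `utf8Hex` is a hypothetical helper name used only for this illustration.

```go
package main

import "fmt"

// utf8Hex returns the UTF-8 encoding of a code point as lowercase hex digits.
func utf8Hex(r rune) string {
	return fmt.Sprintf("%x", []byte(string(r)))
}

func main() {
	// U+241F and U+241E are the USV field and record separators.
	for _, r := range []rune{0x241F, 0x241E} {
		fmt.Printf("U+%04X -> UTF-8 0x%s\n", r, utf8Hex(r))
	}
}
```

Each separator is three bytes long in UTF-8, unlike the single-byte ASV separators `0x1f` and `0x1e`.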
Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180). This includes CRLF line-terminators by default, regardless of platform.
Here are the differences between CSV and CSV-lite:
* CSV-lite naively splits lines on newline, and fields on comma -- embedded commas and newlines are not escaped in any way.
@ -125,30 +131,98 @@ Here are the differences between CSV and CSV-lite:
* CSV does not allow heterogeneous data; CSV-lite does (see also [Record Heterogeneity](record-heterogeneity.md)).
* TSV-lite is simply CSV-lite with field separator set to tab instead of comma.
In particular, no encode/decode of `\r`, `\n`, `\t`, or `\\` is done.
* TSV-lite is simply CSV-lite with the field separator set to tab instead of a comma.
In particular, no encoding/decoding of `\r`, `\n`, `\t`, or `\\` is done.
* CSV-lite allows changing FS and/or RS to any values, perhaps multi-character.
* CSV-lite and TSV-lite handle schema changes ("schema" meaning "ordered list of field names in a given record") by adding a newline and re-emitting the header. CSV and TSV, by contrast, do the following:
* If there are too few keys, but these match the header, empty fields are emitted.
* If there are too many keys, but these match the header up to the number of header fields, the extra fields are emitted.
* If keys don't match the header, this is an error.
<pre class="pre-highlight-in-pair">
<b>cat data/under-over.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
[
{ "a": 1, "b": 2, "c": 3 },
{ "a": 4, "b": 5, "c": 6, "d": 7 },
{ "a": 7, "b": 8 },
{ "a": 9, "b": 10, "c": 11 }
]
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --ijson --ocsvlite cat data/under-over.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b,c
1,2,3
a,b,c,d
4,5,6,7
a,b
7,8
a,b,c
9,10,11
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --ijson --ocsvlite cat data/key-change.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b,c
1,2,3
4,5,6
a,X,c
7,8,9
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --ijson --ocsv cat data/under-over.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b,c
1,2,3
4,5,6,7
7,8,
9,10,11
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --ijson --ocsv cat data/key-change.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b,c
1,2,3
4,5,6
mlr: CSV schema change: first keys "a,b,c"; current keys "a,X,c"
mlr: exiting due to data error.
</pre>
* In short, use-cases for CSV-lite and TSV-lite are often found when dealing with CSV/TSV files which are formatted in some non-standard way -- you have a little more flexibility available to you. (As an example of this flexibility: ASV and USV are nothing more than CSV-lite with different values for FS and RS.)
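The three CSV/TSV schema-change rules in the list above (too few keys, too many keys, mismatched keys) can be sketched in Go. This is an illustration under stated assumptions, not Miller's writer: `csvLine` and `errSchemaChange` are hypothetical names, "match the header" is read as a position-by-position comparison over the overlapping keys, and quoting is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errSchemaChange signals that a record's keys disagree with the header.
var errSchemaChange = errors.New("CSV schema change")

// csvLine formats one record's values against an already-emitted header:
//   - keys that disagree with the header at any shared position are an error;
//   - records with too few keys are padded with empty fields;
//   - records with too many keys have their extra values emitted as-is.
func csvLine(header, keys, values []string) (string, error) {
	n := len(keys)
	if len(header) < n {
		n = len(header)
	}
	for i := 0; i < n; i++ {
		if keys[i] != header[i] {
			return "", errSchemaChange
		}
	}
	out := append([]string{}, values...)
	for len(out) < len(header) {
		out = append(out, "")
	}
	return strings.Join(out, ","), nil
}

func main() {
	header := []string{"a", "b", "c"}
	records := []struct{ keys, values []string }{
		{[]string{"a", "b", "c", "d"}, []string{"4", "5", "6", "7"}},
		{[]string{"a", "b"}, []string{"7", "8"}},
		{[]string{"a", "X", "c"}, []string{"7", "8", "9"}},
	}
	fmt.Println(strings.Join(header, ","))
	for _, r := range records {
		line, err := csvLine(header, r.keys, r.values)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(line)
	}
}
```

The first two records mirror rows of `data/under-over.json`, and the third mirrors the key change in `data/key-change.json`.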
CSV, TSV, CSV-lite, and TSV-lite have in common the `--implicit-csv-header` flag for input and the `--headerless-csv-output` flag for output.
See also the [`--lazy-quotes` flag](reference-main-flag-list.md#csv-only-flags) which can help with CSV files which are not fully compliant with RFC-4180.
See also the [`--lazy-quotes` flag](reference-main-flag-list.md#csv-only-flags), which can help with CSV files that are not fully compliant with RFC-4180.
## JSON
[JSON](https://json.org) is a format which supports scalars (numbers, strings,
boolean, etc.) as well as "objects" (maps) and "arrays" (lists), while Miller
booleans, etc.) as well as "objects" (maps) and "arrays" (lists), while Miller
is a tool for handling **tabular data** only. By *tabular JSON* I mean the
data is either a sequence of one or more objects, or an array consisting of one
or more objects. Miller treats JSON objects as name-indexed records.
This means Miller cannot (and should not) handle arbitrary JSON. In practice,
though, Miller can handle single JSON objects as well as list of them. The only
kinds of JSON that are unmillerable are single scalars (e.g. file contents `3`)
and arrays of non-object (e.g. file contents `[1,2,3,4,5]`). Check out
[jq](https://stedolan.github.io/jq/) for a tool which handles all valid JSON.
though, Miller can handle single JSON objects as well as lists of them. The only
kinds of JSON that are unmillerable are single scalars (e.g., file contents `3`)
and arrays of non-objects (e.g., file contents `[1,2,3,4,5]`). Check out
[jq](https://stedolan.github.io/jq/) for a tool that handles all valid JSON.
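As a rough illustration of the rule above, here is a minimal Python sketch that classifies a single JSON document as millerable or not. The function name `is_millerable` is hypothetical and this is not how Miller itself is implemented; it also checks only one JSON value, not a concatenated sequence of objects.

```python
import json

def is_millerable(text: str) -> bool:
    """Return True if the JSON text is a single object or an array of objects.

    Hypothetical helper for illustration only; it mirrors the prose rule,
    not Miller's actual JSON reader.
    """
    value = json.loads(text)
    if isinstance(value, dict):
        # A single object is treated as one record.
        return True
    if isinstance(value, list):
        # An array is fine only if every element is an object.
        return all(isinstance(item, dict) for item in value)
    # Bare scalars such as 3 or "abc" are not tabular.
    return False

print(is_millerable('{"a": 1, "b": 2}'))
print(is_millerable('[{"a": 1}, {"a": 2}]'))
print(is_millerable('3'))
print(is_millerable('[1,2,3,4,5]'))
```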
In short, if you have tabular data represented in JSON -- lists of objects,
either with or without outermost `[...]` -- [then Miller can handle that for
@ -262,7 +336,7 @@ input as well as output in JSON format, JSON structure is preserved throughout t
]
</pre>
But if the input format is JSON and the output format is not (or vice versa) then key-concatenation applies:
But if the input format is JSON and the output format is not (or vice versa), then key-concatenation applies:
<pre class="pre-highlight-in-pair">
<b>mlr --ijson --opprint head -n 4 data/json-example-2.json</b>
@ -281,7 +355,7 @@ Use `--jflatsep yourseparatorhere` to specify the string used for key concatenat
### JSON-in-CSV
It's quite common to have CSV data which contains stringified JSON as a column.
It's quite common to have CSV data that contains stringified JSON as a column.
See the [JSON parse and stringify section](reference-main-data-types.md#json-parse-and-stringify) for ways to
decode these in Miller.
@ -336,7 +410,7 @@ records; using `--ojsonl`, you get no outermost `[...]`, and one line per record
## PPRINT: Pretty-printed tabular
Miller's pretty-print format is like CSV, but column-aligned. For example, compare
Miller's pretty-print format is similar to CSV, but with column alignment. For example, compare
<pre class="pre-highlight-in-pair">
<b>mlr --ocsv cat data/small</b>
@ -362,11 +436,11 @@ eks wye 4 0.381399 0.134188
wye pan 5 0.573288 0.863624
</pre>
Note that while Miller is a line-at-a-time processor and retains input lines in memory only where necessary (e.g. for sort), pretty-print output requires it to accumulate all input lines (so that it can compute maximum column widths) before producing any output. This has two consequences: (a) pretty-print output won't work on `tail -f` contexts, where Miller will be waiting for an end-of-file marker which never arrives; (b) pretty-print output for large files is constrained by available machine memory.
Note that while Miller is a line-at-a-time processor and retains input lines in memory only where necessary (e.g., for sort), pretty-print output requires it to accumulate all input lines (so that it can compute maximum column widths) before producing any output. This has two consequences: (a) Pretty-print output will not work in `tail -f` contexts, where Miller will be waiting for an end-of-file marker that never arrives; (b) Pretty-print output for large files is constrained by the available machine memory.
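To see why column alignment forces all records to be held before any output, here is a minimal Python sketch of left-aligned pretty-printing. The function name `pprint_lines` is hypothetical, the records are assumed to share the same keys, and this is not Miller's actual writer.

```python
def pprint_lines(records):
    """Format same-keyed records as space-padded, left-aligned columns."""
    # Everything must be materialized first: the width of each column
    # depends on the longest value in that column across all records.
    records = list(records)
    if not records:
        return []
    header = list(records[0].keys())
    rows = [header] + [[str(rec[key]) for key in header] for rec in records]
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return [
        " ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]

for line in pprint_lines([{"a": "pan", "x": 0.346791}, {"a": "eks", "x": 0.5}]):
    print(line)
```

Only after the last record has been read are the widths known, which is why nothing can be printed while a `tail -f` stream is still open.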
See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.
For output only (this isn't supported in the input-scanner as of 5.0.0) you can use `--barred` with pprint output format:
Since Miller 5.0.0, you can use `--barred` or `--barred-output` with pprint output format:
<pre class="pre-highlight-in-pair">
<b>mlr --opprint --barred cat data/small</b>
@ -383,6 +457,37 @@ For output only (this isn't supported in the input-scanner as of 5.0.0) you can
+-----+-----+---+----------+----------+
</pre>
Since Miller 6.11.0, you can use `--barred-input` with pprint input format:
<pre class="pre-highlight-in-pair">
<b>mlr -o pprint --barred cat data/small | mlr -i pprint --barred-input -o json filter '$b == "pan"'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
[
{
"a": "pan",
"b": "pan",
"i": 1,
"x": 0.346791,
"y": 0.726802
},
{
"a": "eks",
"b": "pan",
"i": 2,
"x": 0.758679,
"y": 0.522151
},
{
"a": "wye",
"b": "pan",
"i": 5,
"x": 0.573288,
"y": 0.863624
}
]
</pre>
## Markdown tabular
Markdown format looks like this:
@ -400,11 +505,12 @@ Markdown format looks like this:
| wye | pan | 5 | 0.573288 | 0.863624 |
</pre>
which renders like this when dropped into various web tools (e.g. github comments):
which renders like this when dropped into various web tools (e.g., GitHub comments):
![pix/omd.png](pix/omd.png)
As of Miller 4.3.0, markdown format is supported only for output, not input.
As of Miller 4.3.0, markdown format was supported only for output, not input; as of Miller 6.11.0, markdown format
is supported for input as well.
## XTAB: Vertical tabular
@ -488,7 +594,7 @@ a=eks,b=wye,i=4,x=0.381399,y=0.134188
a=wye,b=pan,i=5,x=0.573288,y=0.863624
</pre>
Such data are easy to generate, e.g. in Ruby with
Such data is easy to generate, e.g., in Ruby with
<pre class="pre-non-highlight-non-pair">
puts "host=#{hostname},seconds=#{t2-t1},message=#{msg}"
@ -510,7 +616,7 @@ logger.log("type=3,user=$USER,date=$date\n");
Fields lacking an IPS will have positional index (starting at 1) used as the key, as in NIDX format. For example, `dish=7,egg=8,flint` is parsed as `"dish" => "7", "egg" => "8", "3" => "flint"` and `dish,egg,flint` is parsed as `"1" => "dish", "2" => "egg", "3" => "flint"`.
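The positional-key rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Miller's parser; the function name `parse_dkvp_line` and its parameter names are assumptions.

```python
def parse_dkvp_line(line: str, ifs: str = ",", ips: str = "=") -> dict:
    """Parse one DKVP line into an ordered mapping of key to value.

    Fields without the pair separator (IPS) are keyed by their
    1-up position in the line, as in NIDX format.
    """
    record = {}
    for position, field in enumerate(line.split(ifs), start=1):
        key, separator, value = field.partition(ips)
        if separator:
            record[key] = value
        else:
            # No IPS found: use the positional index as the key.
            record[str(position)] = field
    return record

print(parse_dkvp_line("dish=7,egg=8,flint"))
print(parse_dkvp_line("dish,egg,flint"))
```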
As discussed in [Record Heterogeneity](record-heterogeneity.md), Miller handles changes of field names within the same data stream. But using DKVP format this is particularly natural. One of my favorite use-cases for Miller is in application/server logs, where I log all sorts of lines such as
As discussed in [Record Heterogeneity](record-heterogeneity.md), Miller handles changes of field names within the same data stream. But using DKVP format, this is particularly natural. One of my favorite use-cases for Miller is in application/server logs, where I log all sorts of lines such as
<pre class="pre-non-highlight-non-pair">
resource=/path/to/file,loadsec=0.45,ok=true
@ -518,10 +624,9 @@ record_count=100, resource=/path/to/file
resource=/some/other/path,loadsec=0.97,ok=false
</pre>
etc. and I just log them as needed. Then later, I can use `grep`, `mlr --opprint group-like`, etc.
to analyze my logs.
etc., and I log them as needed. Then later, I can use `grep`, `mlr --opprint group-like`, etc. to analyze my logs.
See the [separators page](reference-main-separators.md) regarding how to specify separators other than the default equals-sign and comma.
See the [separators page](reference-main-separators.md) regarding how to specify separators other than the default equals sign and comma.
## NIDX: Index-numbered (toolkit style)
@ -604,19 +709,19 @@ While you can do format conversion using `mlr --icsv --ojson cat myfile.csv`, th
FORMAT-CONVERSION KEYSTROKE-SAVER FLAGS
As keystroke-savers for format-conversion you may use the following.
The letters c, t, j, l, d, n, x, p, and m refer to formats CSV, TSV, JSON, JSON Lines,
DKVP, NIDX, XTAB, PPRINT, and markdown, respectively. Note that markdown
format is available for output only.
DKVP, NIDX, XTAB, PPRINT, and markdown, respectively.
| In\out | CSV | TSV | JSON | JSONL | DKVP | NIDX | XTAB | PPRINT | Markdown |
+--------+-------+-------+--------+--------+--------+--------+--------+----------+
| CSV | | --c2t | --c2j | --c2l | --c2d | --c2n | --c2x | --c2p | --c2m |
| TSV | --t2c | | --t2j | --t2l | --t2d | --t2n | --t2x | --t2p | --t2m |
| JSON | --j2c | --j2t | | --j2l | --j2d | --j2n | --j2x | --j2p | --j2m |
| JSONL | --l2c | --l2t | | | --l2d | --l2n | --l2x | --l2p | --l2m |
| DKVP | --d2c | --d2t | --d2j | --d2l | | --d2n | --d2x | --d2p | --d2m |
| NIDX | --n2c | --n2t | --n2j | --n2l | --n2d | | --n2x | --n2p | --n2m |
| XTAB | --x2c | --x2t | --x2j | --x2l | --x2d | --x2n | | --x2p | --x2m |
| PPRINT | --p2c | --p2t | --p2j | --p2l | --p2d | --p2n | --p2x | | --p2m |
| In\out | CSV | TSV | JSON | JSONL | DKVP | NIDX | XTAB | PPRINT | Markdown |
+----------+----------+----------+----------+-------+-------+-------+-------+--------+----------+
| CSV | --c2c,-c | --c2t | --c2j | --c2l | --c2d | --c2n | --c2x | --c2p | --c2m |
| TSV | --t2c | --t2t,-t | --t2j | --t2l | --t2d | --t2n | --t2x | --t2p | --t2m |
| JSON | --j2c | --j2t | --j2j,-j | --j2l | --j2d | --j2n | --j2x | --j2p | --j2m |
| JSONL | --l2c | --l2t | --l2j | --l2l | --l2d | --l2n | --l2x | --l2p | --l2m |
| DKVP | --d2c | --d2t | --d2j | --d2l | --d2d | --d2n | --d2x | --d2p | --d2m |
| NIDX | --n2c | --n2t | --n2j | --n2l | --n2d | --n2n | --n2x | --n2p | --n2m |
| XTAB | --x2c | --x2t | --x2j | --x2l | --x2d | --x2n | --x2x | --x2p | --x2m |
| PPRINT | --p2c | --p2t | --p2j | --p2l | --p2d | --p2n | --p2x | --p2p | --p2m |
| Markdown | --m2c | --m2t | --m2j | --m2l | --m2d | --m2n | --m2x | --m2p | |
-p Keystroke-saver for `--nidx --fs space --repifs`.
-T Keystroke-saver for `--nidx --fs tab`.
@ -624,7 +729,7 @@ format is available for output only.
## Comments in data
You can include comments within your data files, and either have them ignored, or passed directly through to the standard output as soon as they are encountered:
You can include comments within your data files, and either have them ignored or passed directly through to the standard output as soon as they are encountered:
<pre class="pre-highlight-in-pair">
<b>mlr help comments-in-data-flags</b>
@ -652,12 +757,14 @@ Notes:
within the input.
--pass-comments-with {string}
Immediately print commented lines within input, with
specified prefix.
specified prefix. For CSV input format, the prefix
must be a single character.
--skip-comments Ignore commented lines (prefixed by `#`) within the
input.
--skip-comments-with {string}
Ignore commented lines within input, with specified
prefix.
prefix. For CSV input format, the prefix must be a
single character.
</pre>
Examples:

@ -4,7 +4,7 @@ Miller handles name-indexed data using several formats: some you probably know
by name, such as CSV, TSV, JSON, and JSON Lines -- and other formats you're likely already
seeing and using in your structured data.
Additionally, Miller gives you the option of including comments within your data.
Additionally, Miller gives you the option to include comments within your data.
## Examples
@ -14,21 +14,27 @@ GENMD-EOF
## CSV/TSV/ASV/USV/etc.
When `mlr` is invoked with the `--csv` or `--csvlite` option, key names are found on the first record and values are taken from subsequent records. This includes the case of CSV-formatted files. See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.
When `mlr` is invoked with the `--csv` or `--csvlite` option, key names are found on the first record, and values are taken from subsequent records. This includes the case of CSV-formatted files. See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.
Miller has record separator `RS` and field separator `FS`, just as `awk` does. (See also the [separators page](reference-main-separators.md).)
**TSV (tab-separated values):** `FS` is tab and `RS` is newline (or carriage return + linefeed for
Windows). On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return,
newline, tab, and backslash, respectively. On output, the reverse is done -- for example, if a field
has an embedded newline, that newline is replaced by `\n`.
**CSV (comma-separated values):** Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180).
* This includes CRLF line terminators by default, regardless of platform.
* Any cell containing a comma or a carriage return within it must be double-quoted.
**TSV (tab-separated values):** Miller's `--tsv` supports [IANA TSV](https://www.iana.org/assignments/media-types/text/tab-separated-values).
* `FS` is tab and `RS` is newline (or carriage return + linefeed for Windows).
* On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return, newline, tab, and backslash, respectively.
* On output, the reverse is done -- for example, if a field has an embedded newline, that newline is replaced by `\n`.
* A tab within a cell must be encoded as `\t`.
* A newline within a cell must be encoded as `\n`.
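The TSV escaping convention in the list above can be sketched as a pair of Python helpers. The names `tsv_decode_field` and `tsv_encode_field` are hypothetical, and this is a sketch of the convention as described, not Miller's own code.

```python
# Two-character escape sequences and the characters they stand for.
_DECODE = {"\\r": "\r", "\\n": "\n", "\\t": "\t", "\\\\": "\\"}
_ENCODE = {"\\": "\\\\", "\r": "\\r", "\n": "\\n", "\t": "\\t"}

def tsv_decode_field(field: str) -> str:
    """Turn the escapes for CR, LF, tab, and backslash into real characters."""
    out = []
    i = 0
    while i < len(field):
        pair = field[i:i + 2]
        if pair in _DECODE:
            out.append(_DECODE[pair])
            i += 2
        else:
            # Not a recognized escape: keep the character as-is.
            out.append(field[i])
            i += 1
    return "".join(out)

def tsv_encode_field(value: str) -> str:
    """Reverse of tsv_decode_field: escape CR, LF, tab, and backslash."""
    return "".join(_ENCODE.get(ch, ch) for ch in value)

print(tsv_encode_field("line1\nline2"))
print(repr(tsv_decode_field("a\\tb")))
```

Scanning left to right, rather than chaining `str.replace` calls, keeps an escaped backslash followed by `n` from being misread as a newline.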
**ASV (ASCII-separated values):** the flags `--asv`, `--iasv`, `--oasv`, `--asvlite`, `--iasvlite`, and `--oasvlite` are analogous except they use ASCII FS and RS `0x1f` and `0x1e`, respectively.
**USV (Unicode-separated values):** likewise, the flags `--usv`, `--iusv`, `--ousv`, `--usvlite`, `--iusvlite`, and `--ousvlite` use Unicode FS and RS `U+241F` (UTF-8 `0xe2909f`) and `U+241E` (UTF-8 `0xe2909e`), respectively.
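The separator values for ASV and USV can be checked with a short Python sketch. The constant names and the `split_usv` helper are hypothetical; the splitting is deliberately naive, with no quoting or escaping.

```python
# ASCII separators: unit separator (FS) and record separator (RS).
ASV_FS, ASV_RS = "\x1f", "\x1e"
# Unicode "symbol for" unit separator and record separator.
USV_FS, USV_RS = "\u241f", "\u241e"

def split_usv(text: str) -> list:
    """Naively split USV text into records, and each record into fields."""
    return [record.split(USV_FS) for record in text.split(USV_RS) if record]

# The UTF-8 encodings of the USV separators, as hex.
print(USV_FS.encode("utf-8").hex())
print(USV_RS.encode("utf-8").hex())
print(split_usv("a\u241fb\u241e1\u241f2\u241e"))
```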
Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180). This includes CRLF line-terminators by default, regardless of platform.
Here are the differences between CSV and CSV-lite:
* CSV-lite naively splits lines on newline, and fields on comma -- embedded commas and newlines are not escaped in any way.
@ -37,30 +43,55 @@ Here are the differences between CSV and CSV-lite:
* CSV does not allow heterogeneous data; CSV-lite does (see also [Record Heterogeneity](record-heterogeneity.md)).
* TSV-lite is simply CSV-lite with field separator set to tab instead of comma.
In particular, no encode/decode of `\r`, `\n`, `\t`, or `\\` is done.
* TSV-lite is simply CSV-lite with the field separator set to tab instead of a comma.
In particular, no encoding/decoding of `\r`, `\n`, `\t`, or `\\` is done.
* CSV-lite allows changing FS and/or RS to any values, perhaps multi-character.
* CSV-lite and TSV-lite handle schema changes ("schema" meaning "ordered list of field names in a given record") by adding a newline and re-emitting the header. CSV and TSV, by contrast, do the following:
* If there are too few keys, but these match the header, empty fields are emitted.
* If there are too many keys, but these match the header up to the number of header fields, the extra fields are emitted.
* If keys don't match the header, this is an error.
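The three CSV/TSV rules above can be sketched in Python as follows. This is a minimal illustration, assuming that "match the header" means the record's keys agree with the header position by position; the function name `csv_output_row` and the exact error text are hypothetical, not Miller's implementation.

```python
def csv_output_row(header: list, record: dict) -> list:
    """Return the output fields for one record, given the first record's header."""
    keys = list(record.keys())
    values = [str(value) for value in record.values()]
    shared = min(len(keys), len(header))
    if keys[:shared] != header[:shared]:
        # Keys disagree with the header: treated as a data error.
        raise ValueError(
            'schema change: first keys "%s"; current keys "%s"'
            % (",".join(header), ",".join(keys))
        )
    if len(keys) < len(header):
        # Too few keys: pad with empty fields.
        return values + [""] * (len(header) - len(keys))
    # Same length, or too many keys: emit every value, extras included.
    return values

header = ["a", "b", "c"]
print(",".join(csv_output_row(header, {"a": 7, "b": 8})))
print(",".join(csv_output_row(header, {"a": 4, "b": 5, "c": 6, "d": 7})))
```

CSV-lite, by contrast, would print a blank line and a fresh header instead of padding or raising.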
GENMD-RUN-COMMAND
cat data/under-over.json
GENMD-EOF
GENMD-RUN-COMMAND
mlr --ijson --ocsvlite cat data/under-over.json
GENMD-EOF
GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr --ijson --ocsvlite cat data/key-change.json
GENMD-EOF
GENMD-RUN-COMMAND
mlr --ijson --ocsv cat data/under-over.json
GENMD-EOF
GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr --ijson --ocsv cat data/key-change.json
GENMD-EOF
* In short, use-cases for CSV-lite and TSV-lite are often found when dealing with CSV/TSV files which are formatted in some non-standard way -- you have a little more flexibility available to you. (As an example of this flexibility: ASV and USV are nothing more than CSV-lite with different values for FS and RS.)
CSV, TSV, CSV-lite, and TSV-lite have in common the `--implicit-csv-header` flag for input and the `--headerless-csv-output` flag for output.
See also the [`--lazy-quotes` flag](reference-main-flag-list.md#csv-only-flags) which can help with CSV files which are not fully compliant with RFC-4180.
See also the [`--lazy-quotes` flag](reference-main-flag-list.md#csv-only-flags), which can help with CSV files that are not fully compliant with RFC-4180.
## JSON
[JSON](https://json.org) is a format which supports scalars (numbers, strings,
boolean, etc.) as well as "objects" (maps) and "arrays" (lists), while Miller
booleans, etc.) as well as "objects" (maps) and "arrays" (lists), while Miller
is a tool for handling **tabular data** only. By *tabular JSON* I mean the
data is either a sequence of one or more objects, or an array consisting of one
or more objects. Miller treats JSON objects as name-indexed records.
This means Miller cannot (and should not) handle arbitrary JSON. In practice,
though, Miller can handle single JSON objects as well as list of them. The only
kinds of JSON that are unmillerable are single scalars (e.g. file contents `3`)
and arrays of non-object (e.g. file contents `[1,2,3,4,5]`). Check out
[jq](https://stedolan.github.io/jq/) for a tool which handles all valid JSON.
though, Miller can handle single JSON objects as well as lists of them. The only
kinds of JSON that are unmillerable are single scalars (e.g., file contents `3`)
and arrays of non-objects (e.g., file contents `[1,2,3,4,5]`). Check out
[jq](https://stedolan.github.io/jq/) for a tool that handles all valid JSON.
In short, if you have tabular data represented in JSON -- lists of objects,
either with or without outermost `[...]` -- [then Miller can handle that for
@ -98,7 +129,7 @@ GENMD-RUN-COMMAND
mlr --json head -n 2 data/json-example-2.json
GENMD-EOF
But if the input format is JSON and the output format is not (or vice versa) then key-concatenation applies:
But if the input format is JSON and the output format is not (or vice versa), then key-concatenation applies:
GENMD-RUN-COMMAND
mlr --ijson --opprint head -n 4 data/json-example-2.json
@ -110,7 +141,7 @@ Use `--jflatsep yourseparatorhere` to specify the string used for key concatenat
### JSON-in-CSV
It's quite common to have CSV data which contains stringified JSON as a column.
It's quite common to have CSV data that contains stringified JSON as a column.
See the [JSON parse and stringify section](reference-main-data-types.md#json-parse-and-stringify) for ways to
decode these in Miller.
@ -139,7 +170,7 @@ records; using `--ojsonl`, you get no outermost `[...]`, and one line per record
## PPRINT: Pretty-printed tabular
Miller's pretty-print format is like CSV, but column-aligned. For example, compare
Miller's pretty-print format is similar to CSV, but with column alignment. For example, compare
GENMD-RUN-COMMAND
mlr --ocsv cat data/small
@ -149,16 +180,22 @@ GENMD-RUN-COMMAND
mlr --opprint cat data/small
GENMD-EOF
Note that while Miller is a line-at-a-time processor and retains input lines in memory only where necessary (e.g. for sort), pretty-print output requires it to accumulate all input lines (so that it can compute maximum column widths) before producing any output. This has two consequences: (a) pretty-print output won't work on `tail -f` contexts, where Miller will be waiting for an end-of-file marker which never arrives; (b) pretty-print output for large files is constrained by available machine memory.
Note that while Miller is a line-at-a-time processor and retains input lines in memory only where necessary (e.g., for sort), pretty-print output requires it to accumulate all input lines (so that it can compute maximum column widths) before producing any output. This has two consequences: (a) Pretty-print output will not work in `tail -f` contexts, where Miller will be waiting for an end-of-file marker that never arrives; (b) Pretty-print output for large files is constrained by the available machine memory.
See [Record Heterogeneity](record-heterogeneity.md) for how Miller handles changes of field names within a single data stream.
For output only (this isn't supported in the input-scanner as of 5.0.0) you can use `--barred` with pprint output format:
Since Miller 5.0.0, you can use `--barred` or `--barred-output` with pprint output format:
GENMD-RUN-COMMAND
mlr --opprint --barred cat data/small
GENMD-EOF
Since Miller 6.11.0, you can use `--barred-input` with pprint input format:
GENMD-RUN-COMMAND
mlr -o pprint --barred cat data/small | mlr -i pprint --barred-input -o json filter '$b == "pan"'
GENMD-EOF
## Markdown tabular
Markdown format looks like this:
@ -167,11 +204,12 @@ GENMD-RUN-COMMAND
mlr --omd cat data/small
GENMD-EOF
which renders like this when dropped into various web tools (e.g. github comments):
which renders like this when dropped into various web tools (e.g., GitHub comments):
![pix/omd.png](pix/omd.png)
As of Miller 4.3.0, markdown format is supported only for output, not input.
As of Miller 4.3.0, markdown format was supported only for output, not input; as of Miller 6.11.0, markdown format
is supported for input as well.
## XTAB: Vertical tabular
@ -242,7 +280,7 @@ GENMD-RUN-COMMAND
mlr cat data/small
GENMD-EOF
Such data are easy to generate, e.g. in Ruby with
Such data is easy to generate, e.g., in Ruby with
GENMD-CARDIFY
puts "host=#{hostname},seconds=#{t2-t1},message=#{msg}"
@ -264,7 +302,7 @@ GENMD-EOF
Fields lacking an IPS will have positional index (starting at 1) used as the key, as in NIDX format. For example, `dish=7,egg=8,flint` is parsed as `"dish" => "7", "egg" => "8", "3" => "flint"` and `dish,egg,flint` is parsed as `"1" => "dish", "2" => "egg", "3" => "flint"`.
As discussed in [Record Heterogeneity](record-heterogeneity.md), Miller handles changes of field names within the same data stream. But using DKVP format this is particularly natural. One of my favorite use-cases for Miller is in application/server logs, where I log all sorts of lines such as
As discussed in [Record Heterogeneity](record-heterogeneity.md), Miller handles changes of field names within the same data stream. But using DKVP format, this is particularly natural. One of my favorite use-cases for Miller is in application/server logs, where I log all sorts of lines such as
GENMD-CARDIFY
resource=/path/to/file,loadsec=0.45,ok=true
@ -272,10 +310,9 @@ record_count=100, resource=/path/to/file
resource=/some/other/path,loadsec=0.97,ok=false
GENMD-EOF
etc. and I just log them as needed. Then later, I can use `grep`, `mlr --opprint group-like`, etc.
to analyze my logs.
etc., and I log them as needed. Then later, I can use `grep`, `mlr --opprint group-like`, etc. to analyze my logs.
See the [separators page](reference-main-separators.md) regarding how to specify separators other than the default equals-sign and comma.
See the [separators page](reference-main-separators.md) regarding how to specify separators other than the default equals sign and comma.
## NIDX: Index-numbered (toolkit style)
@ -323,7 +360,7 @@ GENMD-EOF
## Comments in data
You can include comments within your data files, and either have them ignored, or passed directly through to the standard output as soon as they are encountered:
You can include comments within your data files, and either have them ignored or passed directly through to the standard output as soon as they are encountered:
GENMD-RUN-COMMAND
mlr help comments-in-data-flags

@ -348,6 +348,50 @@ a.1,a.3,a.5
]
</pre>
## Non-inferencing cases
An additional heuristic is that if a field name starts with a `.`, ends with
a `.`, or has two or more consecutive `.` characters, no attempt is made
to unflatten it on conversion from non-JSON to JSON.
<pre class="pre-highlight-in-pair">
<b>cat data/flatten-dots.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b.,.c,.,d..e,f.g
1,2,3,4,5,6
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --icsv --oxtab cat data/flatten-dots.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a 1
b. 2
.c 3
. 4
d..e 5
f.g 6
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --icsv --ojson cat data/flatten-dots.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
[
{
"a": 1,
"b.": 2,
".c": 3,
".": 4,
"d..e": 5,
"f": {
"g": 6
}
}
]
</pre>
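The heuristic above can be sketched in Python. This is a simplified illustration, not Miller's code: the names `should_unflatten` and `unflatten` are hypothetical, only nested maps are built (no conversion of consecutive integer keys to arrays), and key collisions are not handled.

```python
def should_unflatten(key: str, sep: str = ".") -> bool:
    """Apply the heuristic: skip keys with leading, trailing, or doubled separators."""
    return (
        sep in key
        and not key.startswith(sep)
        and not key.endswith(sep)
        and sep * 2 not in key
    )

def unflatten(record: dict, sep: str = ".") -> dict:
    """Expand keys like 'f.g' into nested maps; leave other keys untouched."""
    out = {}
    for key, value in record.items():
        if not should_unflatten(key, sep):
            out[key] = value
            continue
        parts = key.split(sep)
        node = out
        for part in parts[:-1]:
            # Descend, creating intermediate maps as needed.
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return out

flat = {"a": 1, "b.": 2, ".c": 3, ".": 4, "d..e": 5, "f.g": 6}
print(unflatten(flat))
```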
## Manual control
To see what our options are for manually controlling flattening and

@ -156,6 +156,24 @@ GENMD-RUN-COMMAND
mlr --c2j cat data/non-consecutive.csv
GENMD-EOF
## Non-inferencing cases
An additional heuristic is that if a field name starts with a `.`, ends with
a `.`, or has two or more consecutive `.` characters, no attempt is made
to unflatten it on conversion from non-JSON to JSON.
GENMD-RUN-COMMAND
cat data/flatten-dots.csv
GENMD-EOF
GENMD-RUN-COMMAND
mlr --icsv --oxtab cat data/flatten-dots.csv
GENMD-EOF
GENMD-RUN-COMMAND
mlr --icsv --ojson cat data/flatten-dots.csv
GENMD-EOF
## Manual control
To see what our options are for manually controlling flattening and

@ -905,3 +905,8 @@ See also the [arrays page](reference-main-arrays.md), as well as the page on
A [data-compression format supported by Miller](reference-main-compressed-data.md).
Files compressed using ZLIB compression normally end in `.z`.
## ZSTD / .zst
A [data-compression format supported by Miller](reference-main-compressed-data.md).
Files compressed using ZSTD compression normally end in `.zst`.

@ -889,3 +889,8 @@ See also the [arrays page](reference-main-arrays.md), as well as the page on
A [data-compression format supported by Miller](reference-main-compressed-data.md).
Files compressed using ZLIB compression normally end in `.z`.
## ZSTD / .zst
A [data-compression format supported by Miller](reference-main-compressed-data.md).
Files compressed using ZSTD compression normally end in `.zst`.

@ -22,19 +22,25 @@ In this example I am using version 6.2.0 to 6.3.0; of course that will change fo
* Update version found in `mlr --version` and `man mlr`:
* Edit `internal/pkg/version/version.go` from `6.2.0-dev` to `6.3.0`.
* Edit `pkg/version/version.go` from `6.2.0-dev` to `6.3.0`.
* Edit `miller.spec`: `Version`, and `changelog` entry
* Run `make dev` in the Miller repo base directory
* The ordering in this makefile rule is important: the first build creates `mlr`; the second runs `mlr` to create `manpage.txt`; the third includes `manpage.txt` into one of its outputs.
* Commit and push.
* If Go version is being updated: edit all three of
* `go.mod`
* `.github/workflows/go.yml`
* `.github/workflows/release.yml`
* Create the release tarball:
* `make release_tarball`
* This creates `miller-6.3.0.tar.gz` which we'll upload to GitHub, the URL of which will be in our `miller.spec`
* Prepare the source RPM following [README-RPM.md](https://github.com/johnkerl/miller/blob/main/README-RPM.md).
* Create the Github release tag:
* Create the GitHub release tag:
* Don't forget the `v` in `v6.3.0`
* Write the release notes -- save as a pre-release until below
@ -42,12 +48,19 @@ In this example I am using version 6.2.0 to 6.3.0; of course that will change fo
* Thanks to [PR 822](https://github.com/johnkerl/miller/pull/822) which introduces [goreleaser](https://github.com/johnkerl/miller/blob/main/.goreleaser.yml) there are versions for many platforms auto-built and auto-attached to the GitHub release.
* Attach the release tarball and SRPM. Double-check assets were successfully uploaded.
* Publish the release in pre-release mode, until all CI jobs finish successfully. Note that goreleaser will create and attach the rest of the binaries.
* Before marking the release as public, download an executable from among the generated binaries and make sure its `mlr version` prints what you expect -- else, restart this process.
* Before marking the release as public, download an executable from among the generated binaries and make sure its `mlr version` prints what you expect -- else, restart this process. On macOS, run `xattr -d com.apple.quarantine ./mlr` first.
* Then mark the release as public.
* Check the release-specific docs:
* Build the release-specific docs:
* Look at [https://miller.readthedocs.io](https://miller.readthedocs.io) for new-version docs, after a few minutes' propagation time.
* Note: the GitHub release above created a tag `v6.3.0` which is correct. Here we'll create a branch named `6.3.0` which is also correct.
* Create a branch `6.3.0` (not `v6.3.0`). Locally: `git checkout -b 6.3.0`, then `git push`.
* Edit `docs/mkdocs.yml`, replacing "Miller Dev Documentation" with "Miller 6.3.0 Documentation". Commit and push.
* At the Miller Read the Docs admin page, [https://readthedocs.org/projects/miller](https://readthedocs.org/projects/miller), in the Versions tab, scroll down to _Activate a version_, then activate 6.3.0.
* In the Admin tab, in Advanced Settings, set the Default Version and Default Branch both to 6.3.0. Scroll to the end of the page and poke Save.
* In the Builds tab, if they're not already building, build 6.3.0 as well as latest.
* Verify that [https://miller.readthedocs.io/en/6.3.0](https://miller.readthedocs.io/en/6.3.0) now exists.
* Verify that [https://miller.readthedocs.io/en/latest](https://miller.readthedocs.io/en/latest) (with hard page-reload) shows _Miller 6.8.0 Documentation_ in the upper left of the doc pages.
* Notify:
@ -62,6 +75,6 @@ In this example I am using version 6.2.0 to 6.3.0; of course that will change fo
* Afterwork:
* Edit `internal/pkg/version/version.go` to change version from `6.3.0` to `6.3.0-dev`.
* Edit `pkg/version/version.go` to change version from `6.3.0` to `6.3.0-dev`.
* `make dev`
* Commit and push.

@ -6,19 +6,25 @@ In this example I am using version 6.2.0 to 6.3.0; of course that will change fo
* Update version found in `mlr --version` and `man mlr`:
* Edit `internal/pkg/version/version.go` from `6.2.0-dev` to `6.3.0`.
* Edit `pkg/version/version.go` from `6.2.0-dev` to `6.3.0`.
* Edit `miller.spec`: `Version`, and `changelog` entry
* Run `make dev` in the Miller repo base directory
* The ordering in this makefile rule is important: the first build creates `mlr`; the second runs `mlr` to create `manpage.txt`; the third includes `manpage.txt` into one of its outputs.
* Commit and push.
* If Go version is being updated: edit all three of
* `go.mod`
* `.github/workflows/go.yml`
* `.github/workflows/release.yml`
* Create the release tarball:
* `make release_tarball`
* This creates `miller-6.3.0.tar.gz` which we'll upload to GitHub, the URL of which will be in our `miller.spec`
* Prepare the source RPM following [README-RPM.md](https://github.com/johnkerl/miller/blob/main/README-RPM.md).
* Create the Github release tag:
* Create the GitHub release tag:
* Don't forget the `v` in `v6.3.0`
* Write the release notes -- save as a pre-release until below
@ -26,12 +32,19 @@ In this example I am using version 6.2.0 to 6.3.0; of course that will change fo
* Thanks to [PR 822](https://github.com/johnkerl/miller/pull/822) which introduces [goreleaser](https://github.com/johnkerl/miller/blob/main/.goreleaser.yml) there are versions for many platforms auto-built and auto-attached to the GitHub release.
* Attach the release tarball and SRPM. Double-check assets were successfully uploaded.
* Publish the release in pre-release mode, until all CI jobs finish successfully. Note that goreleaser will create and attach the rest of the binaries.
* Before marking the release as public, download an executable from among the generated binaries and make sure its `mlr version` prints what you expect -- else, restart this process.
* Before marking the release as public, download an executable from among the generated binaries and make sure its `mlr version` prints what you expect -- else, restart this process. On macOS, run `xattr -d com.apple.quarantine ./mlr` first.
* Then mark the release as public.
* Check the release-specific docs:
* Build the release-specific docs:
* Look at [https://miller.readthedocs.io](https://miller.readthedocs.io) for new-version docs, after a few minutes' propagation time.
* Note: the GitHub release above created a tag `v6.3.0` which is correct. Here we'll create a branch named `6.3.0` which is also correct.
* Create a branch `6.3.0` (not `v6.3.0`). Locally: `git checkout -b 6.3.0`, then `git push`.
* Edit `docs/mkdocs.yml`, replacing "Miller Dev Documentation" with "Miller 6.3.0 Documentation". Commit and push.
* At the Miller Read the Docs admin page, [https://readthedocs.org/projects/miller](https://readthedocs.org/projects/miller), in the Versions tab, scroll down to _Activate a version_, then activate 6.3.0.
* In the Admin tab, in Advanced Settings, set the Default Version and Default Branch both to 6.3.0. Scroll to the end of the page and poke Save.
* In the Builds tab, if they're not already building, build 6.3.0 as well as latest.
* Verify that [https://miller.readthedocs.io/en/6.3.0](https://miller.readthedocs.io/en/6.3.0) now exists.
* Verify that [https://miller.readthedocs.io/en/latest](https://miller.readthedocs.io/en/latest) (with hard page-reload) shows _Miller 6.8.0 Documentation_ in the upper left of the doc pages.
* Notify:
@ -46,6 +59,6 @@ In this example I am using version 6.2.0 to 6.3.0; of course that will change fo
* Afterwork:
* Edit `internal/pkg/version/version.go` to change version from `6.3.0` to `6.3.0-dev`.
* Edit `pkg/version/version.go` to change version from `6.3.0` to `6.3.0-dev`.
* `make dev`
* Commit and push.


@ -16,20 +16,20 @@ Quick links:
</div>
# Introduction
**Miller is a command-line tool for querying, shaping, and reformatting data files in various formats including CSV, TSV, JSON, and JSON Lines.**
**Miller is a command-line tool for querying, shaping, and reformatting data files in various formats, including CSV, TSV, JSON, and JSON Lines.**
**The big picture:** Even well into the 21st century, our world is full of text-formatted data like CSV. Google _CSV memes_, for example. We need tooling to _thrive in this world_, nimbly manipulating data which is in CSVs. And we need tooling to _move beyond CSV_, to be able to pull data out and into other storage and processing systems. Miller is designed for both these goals.
**The big picture:** Even well into the 21st century, our world is full of text-formatted data such as CSV. Google _CSV memes_, for example. We need tooling to _thrive in this world_, nimbly manipulating data which is in CSVs. And we need tooling to _move beyond CSV_, to be able to pull data out and into other storage and processing systems. Miller is designed for both of these goals.
In several senses, Miller is more than one tool:
**Format conversion:** You can convert CSV files to JSON, or vice versa, or
pretty-print your data horizontally or vertically to make it easier to read.
**Data manipulation:** With a few keystrokes you can remove columns you don't care about -- or, make new ones.
**Data manipulation:** With a few keystrokes, you can remove columns you don't care about -- or make new ones.
**Pre-processing/post-processing vs standalone use:** You can use Miller to clean data files and put them into standard formats, perhaps in preparation to load them into a database or a hands-off data-processing pipeline. Or you can use it post-process and summary database-query output. As well, you can use Miller to explore and analyze your data interactively.
**Pre-processing/post-processing vs standalone use:** You can use Miller to clean data files and put them into standard formats, perhaps in preparation for loading them into a database or a hands-off data-processing pipeline. Or you can use it to post-process and summarize database-query output. As well, you can use Miller to explore and analyze your data interactively.
**Compact verbs vs programming language:** For low-keystroking you can do things like
**Compact verbs vs programming language:** For low-keystroking, you can do things like
<pre class="pre-highlight-non-pair">
<b>mlr --csv sort -f name input.csv</b>
@ -39,16 +39,16 @@ pretty-print your data horizontally or vertically to make it easier to read.
<b>mlr --json head -n 1 myfile.json</b>
</pre>
The `sort`, `head`, etc are called *verbs*. They're analogs of familiar command-line tools like `sort`, `head`, and so on -- but they're aware of name-indexed, multi-line file formats like CSV, TSV, and JSON. In addition, though, using Miller's `put` verb you can use programming-language statements for expressions like
The `sort`, `head`, etc., are called *verbs*. They're analogs of familiar command-line tools like `sort`, `head`, and so on -- but they're aware of name-indexed, multi-line file formats like CSV, TSV, and JSON. In addition, though, using Miller's `put` verb, you can use programming-language statements for expressions like
<pre class="pre-highlight-non-pair">
<b>mlr --csv put '$rate = $units / $seconds' input.csv</b>
</pre>
which allow you to succintly express your own logic.
which allow you to express your own logic succinctly.
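For readers who think in Go, the `put` expression above can be sketched as a per-record function. This is only an illustration of the idea, not how Miller implements `put`; the helper name `addRate` and the use of a plain string map for a record are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// addRate is a hypothetical helper mirroring `$rate = $units / $seconds`:
// it parses two fields of a record and stores their quotient in "rate".
func addRate(record map[string]string) error {
	units, err := strconv.ParseFloat(record["units"], 64)
	if err != nil {
		return fmt.Errorf("field units: %w", err)
	}
	seconds, err := strconv.ParseFloat(record["seconds"], 64)
	if err != nil {
		return fmt.Errorf("field seconds: %w", err)
	}
	record["rate"] = strconv.FormatFloat(units/seconds, 'f', -1, 64)
	return nil
}

func main() {
	record := map[string]string{"units": "10", "seconds": "4"}
	if err := addRate(record); err != nil {
		panic(err)
	}
	fmt.Println(record["rate"])
}
```

The one-line DSL expression does the same work without the parsing and error-handling boilerplate.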
**Multiple domains:** People use Miller for data analysis, data science, software engineering, devops/system-administration, journalism, scientific research, and more.
In the following you can see how CSV, TSV, tabular, JSON, and other **file formats** share a common theme which is **lists of key-value-pairs**. Miller embraces this common theme.
In the following, you can see how CSV, TSV, tabular, JSON, and other **file formats** share a common theme which is **lists of key-value-pairs**. Miller embraces this common theme.
![coverart/cover-combined.png](coverart/cover-combined.png)
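The "lists of key-value pairs" theme can be made concrete with a small standard-library sketch: CSV text is read into one map per data line, keyed by the header line, and the same records are then written as JSON. The helper name `csvToRecords` and the sample data are assumptions for illustration, and Go maps are unordered, so this sketch does not preserve column order.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strings"
)

// csvToRecords is a hypothetical helper: it returns one key-value map per
// CSV data line, using the first line as the keys.
func csvToRecords(text string) ([]map[string]string, error) {
	rows, err := csv.NewReader(strings.NewReader(text)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, nil
	}
	header := rows[0]
	records := make([]map[string]string, 0, len(rows)-1)
	for _, row := range rows[1:] {
		// encoding/csv rejects rows whose field count differs from the
		// first row, so indexing row[i] is safe here.
		record := make(map[string]string, len(header))
		for i, key := range header {
			record[key] = row[i]
		}
		records = append(records, record)
	}
	return records, nil
}

func main() {
	records, err := csvToRecords("name,units\nalice,3\nbob,5\n")
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(records)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The same list of records could equally be written back as CSV or TSV; only the reader and writer change, not the records in between.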


@ -1,19 +1,19 @@
# Introduction
**Miller is a command-line tool for querying, shaping, and reformatting data files in various formats including CSV, TSV, JSON, and JSON Lines.**
**Miller is a command-line tool for querying, shaping, and reformatting data files in various formats, including CSV, TSV, JSON, and JSON Lines.**
**The big picture:** Even well into the 21st century, our world is full of text-formatted data like CSV. Google _CSV memes_, for example. We need tooling to _thrive in this world_, nimbly manipulating data which is in CSVs. And we need tooling to _move beyond CSV_, to be able to pull data out and into other storage and processing systems. Miller is designed for both these goals.
**The big picture:** Even well into the 21st century, our world is full of text-formatted data such as CSV. Google _CSV memes_, for example. We need tooling to _thrive in this world_, nimbly manipulating data which is in CSVs. And we need tooling to _move beyond CSV_, to be able to pull data out and into other storage and processing systems. Miller is designed for both of these goals.
In several senses, Miller is more than one tool:
**Format conversion:** You can convert CSV files to JSON, or vice versa, or
pretty-print your data horizontally or vertically to make it easier to read.
**Data manipulation:** With a few keystrokes you can remove columns you don't care about -- or, make new ones.
**Data manipulation:** With a few keystrokes, you can remove columns you don't care about -- or make new ones.
**Pre-processing/post-processing vs standalone use:** You can use Miller to clean data files and put them into standard formats, perhaps in preparation to load them into a database or a hands-off data-processing pipeline. Or you can use it post-process and summary database-query output. As well, you can use Miller to explore and analyze your data interactively.
**Pre-processing/post-processing vs standalone use:** You can use Miller to clean data files and put them into standard formats, perhaps in preparation for loading them into a database or a hands-off data-processing pipeline. Or you can use it to post-process and summarize database-query output. As well, you can use Miller to explore and analyze your data interactively.
**Compact verbs vs programming language:** For low-keystroking you can do things like
**Compact verbs vs programming language:** For low-keystroking, you can do things like
GENMD-SHOW-COMMAND
mlr --csv sort -f name input.csv
@ -23,16 +23,16 @@ GENMD-SHOW-COMMAND
mlr --json head -n 1 myfile.json
GENMD-EOF
The `sort`, `head`, etc are called *verbs*. They're analogs of familiar command-line tools like `sort`, `head`, and so on -- but they're aware of name-indexed, multi-line file formats like CSV, TSV, and JSON. In addition, though, using Miller's `put` verb you can use programming-language statements for expressions like
The `sort`, `head`, etc., are called *verbs*. They're analogs of familiar command-line tools like `sort`, `head`, and so on -- but they're aware of name-indexed, multi-line file formats like CSV, TSV, and JSON. In addition, though, using Miller's `put` verb, you can use programming-language statements for expressions like
GENMD-SHOW-COMMAND
mlr --csv put '$rate = $units / $seconds' input.csv
GENMD-EOF
which allow you to succintly express your own logic.
which allow you to express your own logic succinctly.
**Multiple domains:** People use Miller for data analysis, data science, software engineering, devops/system-administration, journalism, scientific research, and more.
In the following you can see how CSV, TSV, tabular, JSON, and other **file formats** share a common theme which is **lists of key-value-pairs**. Miller embraces this common theme.
In the following, you can see how CSV, TSV, tabular, JSON, and other **file formats** share a common theme which is **lists of key-value-pairs**. Miller embraces this common theme.
![coverart/cover-combined.png](coverart/cover-combined.png)


@ -21,7 +21,7 @@ You can install Miller for various platforms as follows.
Download a binary:
* You can get binaries for several platforms on the [releases page](https://github.com/johnkerl/miller/releases).
* You can get latest (head) builds for Linux, MacOS, and Windows by visiting [https://github.com/johnkerl/miller/actions](https://github.com/johnkerl/miller/actions), selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)
* You can get the latest (head) builds for Linux, MacOS, and Windows by visiting [https://github.com/johnkerl/miller/actions](https://github.com/johnkerl/miller/actions), selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)
* See also the [build page](build.md) if you prefer to build from source.
Using a package manager:
@ -30,6 +30,7 @@ Using a package manager:
* MacOS: `brew update` and `brew install miller`, or `sudo port selfupdate` and `sudo port install miller`, depending on your preference of [Homebrew](https://brew.sh) or [MacPorts](https://macports.org).
* Windows: `choco install miller` using [Chocolatey](https://chocolatey.org).
* Note: Miller 6 was released 2022-01-09; [several platforms](https://github.com/johnkerl/miller/blob/main/README-versions.md) may have Miller 5 available.
* As of Miller 6.16.0, you can do `snap install miller`. Note however that the executable is named `miller`, _not_ `mlr`. See also [https://snapcraft.io/miller](https://snapcraft.io/miller).
See also:
@ -37,7 +38,7 @@ See also:
* [@jauderho](https://github.com/jauderho)'s [docker images](https://hub.docker.com/r/jauderho/miller/tags) as discussed in [GitHub Discussions](https://github.com/johnkerl/miller/discussions/851#discussioncomment-1943255)
* Example invocation: `docker run --rm -i jauderho/miller:latest --csv sort -f shape < ./example.csv`
Note that the [Miller releases page](https://github.com/johnkerl/miller/releases), `brew`, `macports`, `chocolatey`, and `conda` tend to have current versions; `yum` and `apt-get` may have outdate versions depending on your platform.
Note that the [Miller releases page](https://github.com/johnkerl/miller/releases), `brew`, `macports`, `chocolatey`, and `conda` tend to have current versions; `yum` and `apt-get` may have outdated versions depending on your platform.
As a first check, you should be able to run `mlr --version` at your system's command prompt and see something like the following:
@ -50,7 +51,7 @@ mlr 6.0.0
A note on documentation:
* If you downloaded the Miller binary from a tagged release, or installed it using a package manager, you should see a version like `mlr 6.0.0` or `mlr 5.10.3` -- please see the [release docs page](release-docs.md) to find the documentation for your version.
* If you downloaded the Miller binary from a tagged release or installed it using a package manager, you should see a version like `mlr 6.0.0` or `mlr 5.10.3` -- please see the [release docs page](release-docs.md) to find the documentation for your version.
* If you installed from source or using a recent build artifact from GitHub Actions, you should see a version like `mlr 6.0.0-dev` -- [https://miller.readthedocs.io](https://miller.readthedocs.io) is the correct reference, since it contains information for the latest contributions to the [Miller repository](https://github.com/johnkerl/miller).
As a second check, given [example.csv](./example.csv) you should be able to do
@ -89,6 +90,6 @@ yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
</pre>
If you run into issues on these checks, please check out the resources on the [community page](community.md) for help.
If you encounter issues with these checks, please refer to the resources on the [community page](community.md) for help.
Otherwise, let's go on to [Miller in 10 minutes](10min.md)!


@ -5,7 +5,7 @@ You can install Miller for various platforms as follows.
Download a binary:
* You can get binaries for several platforms on the [releases page](https://github.com/johnkerl/miller/releases).
* You can get latest (head) builds for Linux, MacOS, and Windows by visiting [https://github.com/johnkerl/miller/actions](https://github.com/johnkerl/miller/actions), selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)
* You can get the latest (head) builds for Linux, MacOS, and Windows by visiting [https://github.com/johnkerl/miller/actions](https://github.com/johnkerl/miller/actions), selecting the latest build, and clicking _Artifacts_. (These are retained for 5 days after each commit.)
* See also the [build page](build.md) if you prefer to build from source.
Using a package manager:
@ -14,6 +14,7 @@ Using a package manager:
* MacOS: `brew update` and `brew install miller`, or `sudo port selfupdate` and `sudo port install miller`, depending on your preference of [Homebrew](https://brew.sh) or [MacPorts](https://macports.org).
* Windows: `choco install miller` using [Chocolatey](https://chocolatey.org).
* Note: Miller 6 was released 2022-01-09; [several platforms](https://github.com/johnkerl/miller/blob/main/README-versions.md) may have Miller 5 available.
* As of Miller 6.16.0, you can do `snap install miller`. Note however that the executable is named `miller`, _not_ `mlr`. See also [https://snapcraft.io/miller](https://snapcraft.io/miller).
See also:
@ -21,7 +22,7 @@ See also:
* [@jauderho](https://github.com/jauderho)'s [docker images](https://hub.docker.com/r/jauderho/miller/tags) as discussed in [GitHub Discussions](https://github.com/johnkerl/miller/discussions/851#discussioncomment-1943255)
* Example invocation: `docker run --rm -i jauderho/miller:latest --csv sort -f shape < ./example.csv`
Note that the [Miller releases page](https://github.com/johnkerl/miller/releases), `brew`, `macports`, `chocolatey`, and `conda` tend to have current versions; `yum` and `apt-get` may have outdate versions depending on your platform.
Note that the [Miller releases page](https://github.com/johnkerl/miller/releases), `brew`, `macports`, `chocolatey`, and `conda` tend to have current versions; `yum` and `apt-get` may have outdated versions depending on your platform.
As a first check, you should be able to run `mlr --version` at your system's command prompt and see something like the following:
@ -32,7 +33,7 @@ GENMD-EOF
A note on documentation:
* If you downloaded the Miller binary from a tagged release, or installed it using a package manager, you should see a version like `mlr 6.0.0` or `mlr 5.10.3` -- please see the [release docs page](release-docs.md) to find the documentation for your version.
* If you downloaded the Miller binary from a tagged release or installed it using a package manager, you should see a version like `mlr 6.0.0` or `mlr 5.10.3` -- please see the [release docs page](release-docs.md) to find the documentation for your version.
* If you installed from source or using a recent build artifact from GitHub Actions, you should see a version like `mlr 6.0.0-dev` -- [https://miller.readthedocs.io](https://miller.readthedocs.io) is the correct reference, since it contains information for the latest contributions to the [Miller repository](https://github.com/johnkerl/miller).
As a second check, given [example.csv](./example.csv) you should be able to do
@ -45,6 +46,6 @@ GENMD-RUN-COMMAND
mlr --icsv --opprint cat example.csv
GENMD-EOF
If you run into issues on these checks, please check out the resources on the [community page](community.md) for help.
If you encounter issues with these checks, please refer to the resources on the [community page](community.md) for help.
Otherwise, let's go on to [Miller in 10 minutes](10min.md)!


@ -18,7 +18,7 @@ Quick links:
## Short format specifiers, including --c2p
In our examples so far we've often made use of `mlr --icsv --opprint` or `mlr --icsv --ojson`. These are such frequently occurring patterns that they have short options like `--c2p` and `--c2j`:
In our examples so far, we've often made use of `mlr --icsv --opprint` or `mlr --icsv --ojson`. These are such frequently occurring patterns that they have short options like `--c2p` and `--c2j`:
<pre class="pre-highlight-in-pair">
<b>mlr --c2p head -n 2 example.csv</b>
@ -59,7 +59,7 @@ You can get the full list [here](file-formats.md#data-conversion-keystroke-saver
## File names up front, including --from
Already we saw that you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
Already, we saw that you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
<pre class="pre-highlight-in-pair">
<b>mlr --c2p --from example.csv sort -nr index then head -n 3</b>
@ -87,6 +87,16 @@ If there's more than one input file, you can use `--mfrom`, then however many fi
<b>mlr --c2p --mfrom data/*.csv -- sort -n index</b>
</pre>
Alternatively, you may place filenames within another file, one per line:
<pre class="pre-highlight-non-pair">
<b>cat data/filenames.txt</b>
</pre>
<pre class="pre-highlight-non-pair">
<b>mlr --c2p --files data/filenames.txt cat</b>
</pre>
## Shortest flags for CSV, TSV, and JSON
The following have even shorter versions:
@ -100,7 +110,7 @@ I think `mlr --csv ...` explains itself better than `mlr -c ...`. Nonetheless, t
## .mlrrc file
If you want the default file format for Miller to be CSV, you can simply put `--csv` on a line by itself in your `~/.mlrrc` file. Then instead of `mlr --csv cat example.csv` you can just do `mlr cat example.csv`. This is just a personal default, though, so `mlr --opprint cat example.csv` will use default CSV format for input, and PPRINT (tabular) for output.
If you want the default file format for Miller to be CSV, you can put `--csv` on a line by itself in your `~/.mlrrc` file. Then, instead of `mlr --csv cat example.csv` you can just do `mlr cat example.csv`. This is just a personal default, though, so `mlr --opprint cat example.csv` will use default CSV format for input, and PPRINT (tabular) for output.
You can read more about this at the [Customization](customization.md) page.
@ -116,6 +126,6 @@ fraction -f count \
filename-which-varies.csv
</pre>
Typing this out can get a bit old, if the only thing that changes for you is the filename.
Typing this out can get a bit old if the only thing that changes for you is the filename.
See [Scripting with Miller](scripting.md) for some keystroke-saving options.


@ -2,7 +2,7 @@
## Short format specifiers, including --c2p
In our examples so far we've often made use of `mlr --icsv --opprint` or `mlr --icsv --ojson`. These are such frequently occurring patterns that they have short options like `--c2p` and `--c2j`:
In our examples so far, we've often made use of `mlr --icsv --opprint` or `mlr --icsv --ojson`. These are such frequently occurring patterns that they have short options like `--c2p` and `--c2j`:
GENMD-RUN-COMMAND
mlr --c2p head -n 2 example.csv
@ -16,7 +16,7 @@ You can get the full list [here](file-formats.md#data-conversion-keystroke-saver
## File names up front, including --from
Already we saw that you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
Already, we saw that you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
GENMD-RUN-COMMAND
mlr --c2p --from example.csv sort -nr index then head -n 3
@ -32,6 +32,16 @@ GENMD-SHOW-COMMAND
mlr --c2p --mfrom data/*.csv -- sort -n index
GENMD-EOF
Alternatively, you may place filenames within another file, one per line:
GENMD-SHOW-COMMAND
cat data/filenames.txt
GENMD-EOF
GENMD-SHOW-COMMAND
mlr --c2p --files data/filenames.txt cat
GENMD-EOF
## Shortest flags for CSV, TSV, and JSON
The following have even shorter versions:
@ -45,7 +55,7 @@ I think `mlr --csv ...` explains itself better than `mlr -c ...`. Nonetheless, t
## .mlrrc file
If you want the default file format for Miller to be CSV, you can simply put `--csv` on a line by itself in your `~/.mlrrc` file. Then instead of `mlr --csv cat example.csv` you can just do `mlr cat example.csv`. This is just a personal default, though, so `mlr --opprint cat example.csv` will use default CSV format for input, and PPRINT (tabular) for output.
If you want the default file format for Miller to be CSV, you can put `--csv` on a line by itself in your `~/.mlrrc` file. Then, instead of `mlr --csv cat example.csv` you can just do `mlr cat example.csv`. This is just a personal default, though, so `mlr --opprint cat example.csv` will use default CSV format for input, and PPRINT (tabular) for output.
You can read more about this at the [Customization](customization.md) page.
@ -61,6 +71,6 @@ fraction -f count \
filename-which-varies.csv
GENMD-EOF
Typing this out can get a bit old, if the only thing that changes for you is the filename.
Typing this out can get a bit old if the only thing that changes for you is the filename.
See [Scripting with Miller](scripting.md) for some keystroke-saving options.


@ -152,7 +152,7 @@ $ helm list | mlr --itsv --ojson head -n 1
]
</pre>
A solution here is Miller's
A solution here is Miller's
[clean-whitespace verb](reference-verbs.md#clean-whitespace):
<pre class="pre-non-highlight-non-pair">


@ -136,7 +136,7 @@ $ helm list | mlr --itsv --ojson head -n 1
]
GENMD-EOF
A solution here is Miller's
A solution here is Miller's
[clean-whitespace verb](reference-verbs.md#clean-whitespace):
GENMD-CARDIFY

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff


@ -0,0 +1,227 @@
<!--- PLEASE DO NOT EDIT DIRECTLY. EDIT THE .md.in FILE PLEASE. --->
<div>
<span class="quicklinks">
Quick links:
&nbsp;
<a class="quicklink" href="../reference-main-flag-list/index.html">Flags</a>
&nbsp;
<a class="quicklink" href="../reference-verbs/index.html">Verbs</a>
&nbsp;
<a class="quicklink" href="../reference-dsl-builtin-functions/index.html">Functions</a>
&nbsp;
<a class="quicklink" href="../glossary/index.html">Glossary</a>
&nbsp;
<a class="quicklink" href="../release-docs/index.html">Release docs</a>
</span>
</div>
# Miller as a library
As of Miller 6.9.1, on an initial and experimental basis, Go developers can use Miller's source
code (moved from `internal/pkg/` to `pkg/`) within their own Go projects.
Caveat emptor: Miller's backward-compatibility guarantees are at the CLI level; the Go API is not guaranteed to be stable.
For this reason, please be careful with your version pins.
I'm happy to discuss this new area further at the [discussions page](https://github.com/johnkerl/miller/discussions).
## Setup
```
$ mkdir use-mlr
$ cd use-mlr
$ go mod init github.com/johnkerl/miller-library-example
go: creating new go.mod: module github.com/johnkerl/miller-library-example
# One of:
$ go get github.com/johnkerl/miller
$ go get github.com/johnkerl/miller@0f27a39a9f92d4c633dd29d99ad203e95a484dd3
# Etc.
$ go mod tidy
```
## One example use
<pre class="pre-non-highlight-non-pair">
package main
import (
"fmt"
"github.com/johnkerl/miller/v6/pkg/bifs"
"github.com/johnkerl/miller/v6/pkg/mlrval"
)
func main() {
a := mlrval.FromInt(2)
b := mlrval.FromInt(60)
c := bifs.BIF_pow(a, b)
fmt.Println(c.String())
}
</pre>
```
$ go build main1.go
$ ./main1
1152921504606846976
```
Or simply:
```
$ go run main1.go
1152921504606846976
```
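As a cross-check on the value printed above, 2 to the 60th power can be computed with the standard library alone. This sketch does not use Miller at all; `powInt` is a hypothetical helper built on `math/big`.

```go
package main

import (
	"fmt"
	"math/big"
)

// powInt is a hypothetical helper returning base**exp as an
// arbitrary-precision integer.
func powInt(base, exp int64) *big.Int {
	return new(big.Int).Exp(big.NewInt(base), big.NewInt(exp), nil)
}

func main() {
	p := powInt(2, 60)
	fmt.Println(p)           // the exact value of 2**60
	fmt.Println(p.IsInt64()) // whether it fits in a signed 64-bit integer
}
```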
## Another example use
<pre class="pre-non-highlight-non-pair">
// This is an example of using Miller as a library.
package main
import (
"bufio"
"container/list"
"fmt"
"os"
"github.com/johnkerl/miller/v6/pkg/bifs"
"github.com/johnkerl/miller/v6/pkg/cli"
"github.com/johnkerl/miller/v6/pkg/input"
"github.com/johnkerl/miller/v6/pkg/output"
"github.com/johnkerl/miller/v6/pkg/types"
)
// Put your record-processing logic here.
func custom_record_processor(irac *types.RecordAndContext) (*types.RecordAndContext, error) {
irec := irac.Record
v := irec.Get("i")
if v == nil {
return nil, fmt.Errorf("did not find key \"i\" at filename %s record number %d",
irac.Context.FILENAME, irac.Context.FNR,
)
}
v2 := bifs.BIF_times(v, v)
irec.PutReference("i2", v2)
return irac, nil
}
// Put your various options here.
func custom_options() *cli.TOptions {
return &cli.TOptions{
ReaderOptions: cli.TReaderOptions{
InputFileFormat: "csv",
IFS: ",",
IRS: "\n",
RecordsPerBatch: 1,
},
WriterOptions: cli.TWriterOptions{
OutputFileFormat: "json",
},
}
}
// This function you don't need to modify.
func run_custom_processor(
fileNames []string,
options *cli.TOptions,
record_processor func(irac *types.RecordAndContext) (*types.RecordAndContext, error),
) error {
outputStream := os.Stdout
outputIsStdout := true
// Since Go is concurrent, the context struct needs to be duplicated and
// passed through the channels along with each record.
initialContext := types.NewContext()
// Instantiate the record-reader.
// RecordsPerBatch is tracked separately from ReaderOptions since join/repl
// may use batch size of 1.
recordReader, err := input.Create(&options.ReaderOptions, options.ReaderOptions.RecordsPerBatch)
if err != nil {
return err
}
// Set up the channels for the record-reader.
readerChannel := make(chan *list.List, 2) // list of *types.RecordAndContext
inputErrorChannel := make(chan error, 1)
// Not needed in this example
readerDownstreamDoneChannel := make(chan bool, 1)
// Instantiate the record-writer
recordWriter, err := output.Create(&options.WriterOptions)
if err != nil {
return err
}
bufferedOutputStream := bufio.NewWriter(outputStream)
// Start the record-reader.
go recordReader.Read(
fileNames, *initialContext, readerChannel, inputErrorChannel, readerDownstreamDoneChannel)
// Loop through the record stream.
var retval error
done := false
for !done {
select {
case ierr := &lt;-inputErrorChannel:
// A bare break would leave only the select, so end the loop explicitly.
retval = ierr
done = true
case iracs := &lt;-readerChannel:
// Handle the record batch
for e := iracs.Front(); e != nil; e = e.Next() {
irac := e.Value.(*types.RecordAndContext)
if irac.Record != nil {
orac, err := record_processor(irac)
if err != nil {
retval = err
done = true
break
}
recordWriter.Write(orac.Record, bufferedOutputStream, outputIsStdout)
}
if irac.OutputString != "" {
fmt.Fprintln(bufferedOutputStream, irac.OutputString)
}
if irac.EndOfStream {
done = true
}
}
break
}
}
bufferedOutputStream.Flush()
return retval
}
func main() {
options := custom_options()
err := run_custom_processor(os.Args[1:], options, custom_record_processor)
if err != nil {
fmt.Fprintf(os.Stderr, "%v\n", err)
}
}
</pre>
<pre class="pre-non-highlight-non-pair">
host,status
apoapsis.east.our.org,up
nadir.west.our.org,down
</pre>
```
$ go build main2.go
$ ./main2 data/small.csv
{"a": "pan", "b": "pan", "i": 1, "x": 0.3467901443380824, "y": 0.7268028627434533, "i2": 1}
{"a": "eks", "b": "pan", "i": 2, "x": 0.7586799647899636, "y": 0.5221511083334797, "i2": 4}
{"a": "wye", "b": "wye", "i": 3, "x": 0.20460330576630303, "y": 0.33831852551664776, "i2": 9}
{"a": "eks", "b": "wye", "i": 4, "x": 0.38139939387114097, "y": 0.13418874328430463, "i2": 16}
{"a": "wye", "b": "pan", "i": 5, "x": 0.5732889198020006, "y": 0.8636244699032729, "i2": 25}
```


@ -0,0 +1,58 @@
# Miller as a library
As of Miller 6.9.1, on an initial and experimental basis, Go developers can use Miller's source
code (moved from `internal/pkg/` to `pkg/`) within their own Go projects.
Caveat emptor: Miller's backward-compatibility guarantees are at the CLI level; the Go API is not guaranteed to be stable.
For this reason, please be careful with your version pins.
I'm happy to discuss this new area further at the [discussions page](https://github.com/johnkerl/miller/discussions).
## Setup
```
$ mkdir use-mlr
$ cd use-mlr
$ go mod init github.com/johnkerl/miller-library-example
go: creating new go.mod: module github.com/johnkerl/miller-library-example
# One of:
$ go get github.com/johnkerl/miller
$ go get github.com/johnkerl/miller@0f27a39a9f92d4c633dd29d99ad203e95a484dd3
# Etc.
$ go mod tidy
```
## One example use
GENMD-INCLUDE-ESCAPED(miller-as-library/main1.go)
```
$ go build main1.go
$ ./main1
1152921504606846976
```
Or simply:
```
$ go run main1.go
1152921504606846976
```
## Another example use
GENMD-INCLUDE-ESCAPED(miller-as-library/main2.go)
GENMD-INCLUDE-ESCAPED(data/hostnames.csv)
```
$ go build main2.go
$ ./main2 data/small.csv
{"a": "pan", "b": "pan", "i": 1, "x": 0.3467901443380824, "y": 0.7268028627434533, "i2": 1}
{"a": "eks", "b": "pan", "i": 2, "x": 0.7586799647899636, "y": 0.5221511083334797, "i2": 4}
{"a": "wye", "b": "wye", "i": 3, "x": 0.20460330576630303, "y": 0.33831852551664776, "i2": 9}
{"a": "eks", "b": "wye", "i": 4, "x": 0.38139939387114097, "y": 0.13418874328430463, "i2": 16}
{"a": "wye", "b": "pan", "i": 5, "x": 0.5732889198020006, "y": 0.8636244699032729, "i2": 25}
```


@ -0,0 +1,15 @@
package main
import (
"fmt"
"github.com/johnkerl/miller/v6/pkg/bifs"
"github.com/johnkerl/miller/v6/pkg/mlrval"
)
func main() {
a := mlrval.FromInt(2)
b := mlrval.FromInt(60)
c := bifs.BIF_pow(a, b)
fmt.Println(c.String())
}


@ -0,0 +1,132 @@
// This is an example of using Miller as a library.
package main
import (
"bufio"
"container/list"
"fmt"
"os"
"github.com/johnkerl/miller/v6/pkg/bifs"
"github.com/johnkerl/miller/v6/pkg/cli"
"github.com/johnkerl/miller/v6/pkg/input"
"github.com/johnkerl/miller/v6/pkg/output"
"github.com/johnkerl/miller/v6/pkg/types"
)
// Put your record-processing logic here.
func custom_record_processor(irac *types.RecordAndContext) (*types.RecordAndContext, error) {
irec := irac.Record
v := irec.Get("i")
if v == nil {
return nil, fmt.Errorf("did not find key \"i\" at filename %s record number %d",
irac.Context.FILENAME, irac.Context.FNR,
)
}
v2 := bifs.BIF_times(v, v)
irec.PutReference("i2", v2)
return irac, nil
}
// Put your various options here.
func custom_options() *cli.TOptions {
return &cli.TOptions{
ReaderOptions: cli.TReaderOptions{
InputFileFormat: "csv",
IFS: ",",
IRS: "\n",
RecordsPerBatch: 1,
},
WriterOptions: cli.TWriterOptions{
OutputFileFormat: "json",
},
}
}
// This function you don't need to modify.
func run_custom_processor(
fileNames []string,
options *cli.TOptions,
record_processor func(irac *types.RecordAndContext) (*types.RecordAndContext, error),
) error {
outputStream := os.Stdout
outputIsStdout := true
// Since Go is concurrent, the context struct needs to be duplicated and
// passed through the channels along with each record.
initialContext := types.NewContext()
// Instantiate the record-reader.
// RecordsPerBatch is tracked separately from ReaderOptions since join/repl
// may use batch size of 1.
recordReader, err := input.Create(&options.ReaderOptions, options.ReaderOptions.RecordsPerBatch)
if err != nil {
return err
}
// Set up the channels for the record-reader.
readerChannel := make(chan *list.List, 2) // list of *types.RecordAndContext
inputErrorChannel := make(chan error, 1)
// Not needed in this example
readerDownstreamDoneChannel := make(chan bool, 1)
// Instantiate the record-writer
recordWriter, err := output.Create(&options.WriterOptions)
if err != nil {
return err
}
bufferedOutputStream := bufio.NewWriter(outputStream)
// Start the record-reader.
go recordReader.Read(
fileNames, *initialContext, readerChannel, inputErrorChannel, readerDownstreamDoneChannel)
// Loop through the record stream.
var retval error
done := false
for !done {
select {
case ierr := <-inputErrorChannel:
retval = ierr
done = true // a bare break would leave only the select, not the for loop
case iracs := <-readerChannel:
// Handle the record batch
for e := iracs.Front(); e != nil; e = e.Next() {
irac := e.Value.(*types.RecordAndContext)
if irac.Record != nil {
orac, err := record_processor(irac)
if err != nil {
retval = err
done = true
break
}
recordWriter.Write(orac.Record, bufferedOutputStream, outputIsStdout)
}
if irac.OutputString != "" {
fmt.Fprintln(bufferedOutputStream, irac.OutputString)
}
if irac.EndOfStream {
done = true
}
}
break
}
}
bufferedOutputStream.Flush()
return retval
}
func main() {
options := custom_options()
err := run_custom_processor(os.Args[1:], options, custom_record_processor)
if err != nil {
fmt.Fprintf(os.Stderr, "%v\n", err)
}
}


@ -0,0 +1,111 @@
package main
import (
"bufio"
"container/list"
"errors"
"fmt"
"os"
"github.com/johnkerl/miller/v6/pkg/cli"
"github.com/johnkerl/miller/v6/pkg/input"
"github.com/johnkerl/miller/v6/pkg/output"
"github.com/johnkerl/miller/v6/pkg/transformers"
"github.com/johnkerl/miller/v6/pkg/types"
)
func convert_csv_to_json(fileNames []string) error {
options := &cli.TOptions{
ReaderOptions: cli.TReaderOptions{
InputFileFormat: "csv",
IFS: ",",
IRS: "\n",
RecordsPerBatch: 1,
},
WriterOptions: cli.TWriterOptions{
OutputFileFormat: "json",
},
}
outputStream := os.Stdout
outputIsStdout := true
// Since Go is concurrent, the context struct needs to be duplicated and
// passed through the channels along with each record.
initialContext := types.NewContext()
// Instantiate the record-reader.
// RecordsPerBatch is tracked separately from ReaderOptions since join/repl
// may use batch size of 1.
recordReader, err := input.Create(&options.ReaderOptions, options.ReaderOptions.RecordsPerBatch)
if err != nil {
return err
}
// Instantiate the record-writer
recordWriter, err := output.Create(&options.WriterOptions)
if err != nil {
return err
}
cat, err := transformers.NewTransformerCat(
false, // doCounters bool,
"", // counterFieldName string,
nil, // groupByFieldNames []string,
false, // doFileName bool,
false, // doFileNum bool,
)
if err != nil {
return err
}
recordTransformers := []transformers.IRecordTransformer{cat}
// Set up the reader-to-transformer and transformer-to-writer channels.
readerChannel := make(chan *list.List, 2) // list of *types.RecordAndContext
writerChannel := make(chan *list.List, 1) // list of *types.RecordAndContext
// We're done when a fatal error is registered on input (file not found,
// etc) or when the record-writer has written all its output. We use
// channels to communicate both of these conditions.
inputErrorChannel := make(chan error, 1)
doneWritingChannel := make(chan bool, 1)
dataProcessingErrorChannel := make(chan bool, 1)
readerDownstreamDoneChannel := make(chan bool, 1)
// Start the reader, transformer, and writer. Let them run until fatal input
// error or end-of-processing happens.
bufferedOutputStream := bufio.NewWriter(outputStream)
go recordReader.Read(fileNames, *initialContext, readerChannel, inputErrorChannel, readerDownstreamDoneChannel)
go transformers.ChainTransformer(readerChannel, readerDownstreamDoneChannel, recordTransformers,
writerChannel, options)
go output.ChannelWriter(writerChannel, recordWriter, &options.WriterOptions, doneWritingChannel,
dataProcessingErrorChannel, bufferedOutputStream, outputIsStdout)
var retval error
done := false
for !done {
select {
case ierr := <-inputErrorChannel:
retval = ierr
done = true // a bare break would leave only the select, not the for loop
case <-dataProcessingErrorChannel:
retval = errors.New("exiting due to data error") // details already printed
break
case <-doneWritingChannel:
done = true
break
}
}
bufferedOutputStream.Flush()
return retval
}
func main() {
err := convert_csv_to_json(os.Args[1:])
if err != nil {
fmt.Fprintf(os.Stderr, "%v\n", err)
}
}


@ -18,7 +18,7 @@ Quick links:
## Native builds as of Miller 6
Miller was originally developed for Unix-like operating systems including Linux and MacOS. Since Miller 5.2.0 which was the first version to support Windows at all, that support has been partial. But as of version 6.0.0, Miller builds directly on Windows.
Miller was originally developed for Unix-like operating systems, including Linux and MacOS. Since Miller 5.2.0, which was the first version to support Windows at all, that support has been partial. But as of version 6.0.0, Miller builds directly on Windows.
**The experience is now almost the same on Windows as it is on Linux, NetBSD/FreeBSD, and MacOS.**
@ -28,7 +28,7 @@ See [Installation](installing-miller.md) for how to get a copy of `mlr.exe`.
## Setup
Simply place `mlr.exe` somewhere within your `PATH` variable.
Place `mlr.exe` somewhere within your `PATH` variable.
![pix/miller-windows.png](pix/miller-windows.png)
@ -38,7 +38,7 @@ To use Miller from within MSYS2/Cygwin, also make sure `mlr.exe` is within the `
## Differences
The Windows-support code within Miller makes effort to support Linux/Unix/MacOS-like command-line syntax including single-quoting of expressions for `mlr put` and `mlr filter` -- and in the examples above, this often works. However, there are still some cases where more complex expressions aren't successfully parsed from the Windows prompt, even though they are from MSYS2:
The Windows-support code within Miller makes an effort to support Linux/Unix/MacOS-like command-line syntax, including single-quoting of expressions for `mlr put` and `mlr filter` -- and in the examples above, this often works. However, there are still some cases where more complex expressions aren't successfully parsed from the Windows prompt, even though they are from MSYS2:
![pix/miller-windows-complex.png](pix/miller-windows-complex.png)


@ -2,7 +2,7 @@
## Native builds as of Miller 6
Miller was originally developed for Unix-like operating systems including Linux and MacOS. Since Miller 5.2.0 which was the first version to support Windows at all, that support has been partial. But as of version 6.0.0, Miller builds directly on Windows.
Miller was originally developed for Unix-like operating systems, including Linux and MacOS. Since Miller 5.2.0, which was the first version to support Windows at all, that support has been partial. But as of version 6.0.0, Miller builds directly on Windows.
**The experience is now almost the same on Windows as it is on Linux, NetBSD/FreeBSD, and MacOS.**
@ -12,7 +12,7 @@ See [Installation](installing-miller.md) for how to get a copy of `mlr.exe`.
## Setup
Simply place `mlr.exe` somewhere within your `PATH` variable.
Place `mlr.exe` somewhere within your `PATH` variable.
![pix/miller-windows.png](pix/miller-windows.png)
@ -22,7 +22,7 @@ To use Miller from within MSYS2/Cygwin, also make sure `mlr.exe` is within the `
## Differences
The Windows-support code within Miller makes effort to support Linux/Unix/MacOS-like command-line syntax including single-quoting of expressions for `mlr put` and `mlr filter` -- and in the examples above, this often works. However, there are still some cases where more complex expressions aren't successfully parsed from the Windows prompt, even though they are from MSYS2:
The Windows-support code within Miller makes an effort to support Linux/Unix/MacOS-like command-line syntax, including single-quoting of expressions for `mlr put` and `mlr filter` -- and in the examples above, this often works. However, there are still some cases where more complex expressions aren't successfully parsed from the Windows prompt, even though they are from MSYS2:
![pix/miller-windows-complex.png](pix/miller-windows-complex.png)


@ -16,11 +16,11 @@ Quick links:
</div>
# Intro to Miller's programming language
In the [Miller in 10 minutes](10min.md) page we took a tour of some of Miller's most-used [verbs](reference-verbs.md) including `cat`, `head`, `tail`, `cut`, and `sort`. These are analogs of familiar system commands, but empowered by field-name indexing and file-format awareness: the system `sort` command only knows about lines and column names like `1,2,3,4`, while `mlr sort` knows about CSV/TSV/JSON/etc records, and field names like `color,shape,flag,index`.
On the [Miller in 10 minutes](10min.md) page, we took a tour of some of Miller's most-used [verbs](reference-verbs.md), including `cat`, `head`, `tail`, `cut`, and `sort`. These are analogs of familiar system commands, but empowered by field-name indexing and file-format awareness: the system `sort` command only knows about lines and column names like `1,2,3,4`, while `mlr sort` knows about CSV/TSV/JSON/etc records, and field names like `color,shape,flag,index`.
We also caught a glimpse of Miller's `put` and `filter` verbs. These two are special since they let you express statements using Miller's programming language. It's a *embedded domain-specific language* since it's inside Miller: often referred to simply as the *Miller DSL*.
We also caught a glimpse of Miller's `put` and `filter` verbs. These two are special because they allow you to express statements using Miller's programming language. It's an *embedded domain-specific language* since it's inside Miller: often referred to simply as the *Miller DSL*.
In the [DSL reference](reference-dsl.md) page we have a complete reference to Miller's programming language. For now, let's take a quick look at key features -- you can use as few or as many features as you like.
On the [DSL reference](reference-dsl.md) page, we have a complete reference to Miller's programming language. For now, let's take a quick look at key features -- you can use as few or as many features as you like.
## Records and fields
@ -45,9 +45,9 @@ purple square false 10 91 72.3735 8.2430 596.5747605000001
When we type that, a few things are happening:
* We refer to fields in the input data using a dollar sign and then the field name, e.g. `$quantity`. (If a field name contains special characters like a dot or slash, just use curly braces: `${field.name}`.)
* We refer to fields in the input data using a dollar sign and then the field name, e.g., `$quantity`. (If a field name contains special characters like a dot or slash, just use curly braces: `${field.name}`.)
* The expression `$cost = $quantity * $rate` is executed once per record of the data file. Our [example.csv](./example.csv) has 10 records so this expression was executed 10 times, with the field names `$quantity` and `$rate` each time bound to the current record's values for those fields.
* On the left-hand side we have the new field name `$cost` which didn't come from the input data. Assignments to new variables result in a new field being placed after all the other ones. If we'd assigned to an existing field name, it would have been updated in-place.
* On the left-hand side, we have the new field name `$cost`, which didn't come from the input data. Assignments to new variables result in a new field being placed after all the other ones. If we'd assigned to an existing field name, it would have been updated in place.
* The entire expression is surrounded by single quotes (with an adjustment needed on [Windows](miller-on-windows.md)), to get it past the system shell. Inside those, only double quotes have meaning in Miller's programming language.
## Multi-line statements, and statements-from-file
@ -91,9 +91,9 @@ yellow circle true 9 8700 63.5058 8.3350 529.3208430000001
purple square false 10 9100 72.3735 8.2430 596.5747605000001
</pre>
Anything from a `#` character to end of line is a code comment.
Anything from a `#` character to the end of the line is a code comment.
One of Miller's key features is the ability to express data-transformation right there at the keyboard, interactively. But if you find yourself using expressions repeatedly, you can put everything between the single quotes into a file and refer to that using `put -f`:
One of Miller's key features is the ability to express data transformation right there at the keyboard, interactively. But if you find yourself using expressions repeatedly, you can put everything between the single quotes into a file and refer to that using `put -f`:
<pre class="pre-highlight-in-pair">
<b>cat dsl-example.mlr</b>
@ -120,13 +120,13 @@ yellow circle true 9 8700 63.5058 8.3350 529.3208430000001
purple square false 10 9100 72.3735 8.2430 596.5747605000001
</pre>
This becomes particularly important on Windows. Quite a bit of effort was put into making Miller on Windows be able to handle the kinds of single-quoted expressions we're showing here, but if you get syntax-error messages on Windows using examples in this documentation, you can put the parts between single quotes into a file and refer to that using `mlr put -f` -- or, use the triple-double-quote trick as described in the [Miller on Windows page](miller-on-windows.md).
This becomes particularly important on Windows. Quite a bit of effort was put into making Miller on Windows be able to handle the kinds of single-quoted expressions we're showing here. Still, if you get syntax-error messages on Windows using examples in this documentation, you can put the parts between single quotes into a file and refer to that using `mlr put -f` -- or, use the triple-double-quote trick as described in the [Miller on Windows page](miller-on-windows.md).
## Out-of-stream variables, begin, and end
Above we saw that your expression is executed once per record -- if a file has a million records, your expression will be executed a million times, once for each record. But you can mark statements to only be executed once, either before the record stream begins, or after the record stream is ended. If you know about [AWK](https://en.wikipedia.org/wiki/AWK), you might have noticed that Miller's programming language is loosely inspired by it, including the `begin` and `end` statements.
Above, we saw that your expression is executed once per record: if a file has a million records, your expression will be executed a million times, once for each record. But you can mark statements to be executed only once, either before the record stream begins or after it has ended. If you know about [AWK](https://en.wikipedia.org/wiki/AWK), you might have noticed that Miller's programming language is loosely inspired by it, including the `begin` and `end` statements.
Above we also saw that names like `$quantity` are bound to each record in turn.
Above, we also saw that names like `$quantity` are bound to each record in turn.
To make `begin` and `end` statements useful, we need somewhere to put things that persist across the duration of the record stream, and a way to emit them. Miller uses [**out-of-stream variables**](reference-dsl-variables.md#out-of-stream-variables) (or **oosvars** for short) whose names start with an `@` sigil, along with the [`emit`](reference-dsl-output-statements.md#emit-statements) keyword to write them into the output record stream:
@ -209,8 +209,8 @@ So, take this sum/count example as an indication of the kinds of things you can
Also inspired by [AWK](https://en.wikipedia.org/wiki/AWK), the Miller DSL has the following special [**context variables**](reference-dsl-variables.md#built-in-variables):
* `FILENAME` -- the filename the current record came from. Especially useful in things like `mlr ... *.csv`.
* `FILENUM` -- similarly, but integer 1,2,3,... rather than filenam.e
* `NF` -- the number of fields in the current record. Note that if you assign `$newcolumn = some value` then `NF` will increment.
* `FILENUM` -- similarly, but integer 1,2,3,... rather than filename.
* `NF` -- the number of fields in the current record. Note that if you assign `$newcolumn = some value`, then `NF` will increment.
* `NR` -- starting from 1, counter of how many records processed so far.
* `FNR` -- similar, but resets to 1 at the start of each file.
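
To make the difference between these counters concrete, here is a small standalone sketch (not Miller's implementation) that walks two in-memory "files" and records what `FILENAME`, `FILENUM`, `NR`, and `FNR` would be at each record. The `file` and `context` types and the `contexts` function are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// file is a hypothetical stand-in for an input file and its records.
type file struct {
	name    string
	records []string
}

// context mirrors the DSL variable names; it is not Miller's own type.
type context struct {
	FILENAME string
	FILENUM  int
	NR       int
	FNR      int
}

// contexts returns the counter values seen at each record, in stream order.
func contexts(files []file) []context {
	var out []context
	nr := 0 // counts records across all files
	for i, f := range files {
		fnr := 0 // resets at the start of each file
		for range f.records {
			nr++
			fnr++
			out = append(out, context{f.name, i + 1, nr, fnr})
		}
	}
	return out
}

func main() {
	files := []file{
		{"a.csv", []string{"r1", "r2"}},
		{"b.csv", []string{"r3"}},
	}
	for _, c := range contexts(files) {
		fmt.Printf("FILENAME=%s FILENUM=%d NR=%d FNR=%d\n",
			c.FILENAME, c.FILENUM, c.NR, c.FNR)
	}
	// Prints:
	// FILENAME=a.csv FILENUM=1 NR=1 FNR=1
	// FILENAME=a.csv FILENUM=1 NR=2 FNR=2
	// FILENAME=b.csv FILENUM=2 NR=3 FNR=1
}
```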
@ -290,12 +290,12 @@ purple square false 10 91 72.3735 8.2430 3628800
Note that here we used the `-f` flag to `put` to load our function
definition, and also the `-e` flag to add another statement on the command
line. (We could have also put `$fact = factorial(NR)` inside
`factorial-example.mlr` but that would have made that file less flexible for our
`factorial-example.mlr`, but that would have made that file less flexible for our
future use.)
## If-statements, loops, and local variables
Suppose you want to only compute sums conditionally -- you can use an `if` statement:
Suppose you want to compute sums only conditionally -- you can use an `if` statement:
<pre class="pre-highlight-in-pair">
<b>cat if-example.mlr</b>
@ -331,7 +331,7 @@ page](reference-dsl-control-structures.md#for-loops), Miller has a few kinds of
for-loops. In addition to the usual 3-part `for (i = 0; i < 10; i += 1)` kind
that many programming languages have, Miller also lets you loop over
[maps](reference-main-maps.md) and [arrays](reference-main-arrays.md). We
haven't encountered maps and arrays yet in this introduction, but for now it
haven't encountered maps and arrays yet in this introduction, but for now, it
suffices to know that `$*` is a special variable holding the current record as
a map:
@ -375,14 +375,14 @@ Here we used the local variables `k` and `v`. Now we've seen four kinds of varia
* Local variables like `k`
* Built-in context variables like `NF` and `NR`
If you're curious about scope and extent of local variables, you can read more in the [section on variables](reference-dsl-variables.md).
If you're curious about the scope and extent of local variables, you can read more in the [section on variables](reference-dsl-variables.md).
## Arithmetic
Numbers in Miller's programming language are intended to operate with the principle of least surprise:
* Internally, numbers are either 64-bit signed integers or double-precision floating-point.
* Sums, differences, and products of integers are also integers (so `2*3=6` not `6.0`) -- unless the result of the operation would overflow a 64-bit signed integer in which case the result is automatically converted to float. (If you ever want integer-to-integer arithmetic, use `x .+ y`, `x .* y`, etc.)
* Sums, differences, and products of integers are also integers (so `2*3=6` not `6.0`) -- unless the result of the operation would overflow a 64-bit signed integer, in which case the result is automatically converted to float. (If you ever want integer-to-integer arithmetic, use `x .+ y`, `x .* y`, etc.)
* Quotients of integers are integers if the division is exact, else floating-point: so `6/2=3` but `7/2=3.5`.
You can read more about this in the [arithmetic reference](reference-main-arithmetic.md).
@ -397,7 +397,7 @@ see more in the [null-data reference](reference-main-null-data.md) but the
basic idea is:
* Adding a number to absent gives the number back. This means you don't have to put `@sum = 0` in your `begin` blocks.
* Any variable which has the absent value is not assigned. This means you don't have to check presence of things from one record to the next.
* Any variable that has the absent value is not assigned. This means you don't have to check the presence of things from one record to the next.
For example, you can sum up all the `$a` values across records without having to check whether they're present or not:


@ -1,10 +1,10 @@
# Intro to Miller's programming language
In the [Miller in 10 minutes](10min.md) page we took a tour of some of Miller's most-used [verbs](reference-verbs.md) including `cat`, `head`, `tail`, `cut`, and `sort`. These are analogs of familiar system commands, but empowered by field-name indexing and file-format awareness: the system `sort` command only knows about lines and column names like `1,2,3,4`, while `mlr sort` knows about CSV/TSV/JSON/etc records, and field names like `color,shape,flag,index`.
On the [Miller in 10 minutes](10min.md) page, we took a tour of some of Miller's most-used [verbs](reference-verbs.md), including `cat`, `head`, `tail`, `cut`, and `sort`. These are analogs of familiar system commands, but empowered by field-name indexing and file-format awareness: the system `sort` command only knows about lines and column names like `1,2,3,4`, while `mlr sort` knows about CSV/TSV/JSON/etc records, and field names like `color,shape,flag,index`.
We also caught a glimpse of Miller's `put` and `filter` verbs. These two are special since they let you express statements using Miller's programming language. It's a *embedded domain-specific language* since it's inside Miller: often referred to simply as the *Miller DSL*.
We also caught a glimpse of Miller's `put` and `filter` verbs. These two are special because they allow you to express statements using Miller's programming language. It's an *embedded domain-specific language* since it's inside Miller: often referred to simply as the *Miller DSL*.
In the [DSL reference](reference-dsl.md) page we have a complete reference to Miller's programming language. For now, let's take a quick look at key features -- you can use as few or as many features as you like.
On the [DSL reference](reference-dsl.md) page, we have a complete reference to Miller's programming language. For now, let's take a quick look at key features -- you can use as few or as many features as you like.
## Records and fields
@ -16,9 +16,9 @@ GENMD-EOF
When we type that, a few things are happening:
* We refer to fields in the input data using a dollar sign and then the field name, e.g. `$quantity`. (If a field name contains special characters like a dot or slash, just use curly braces: `${field.name}`.)
* We refer to fields in the input data using a dollar sign and then the field name, e.g., `$quantity`. (If a field name contains special characters like a dot or slash, just use curly braces: `${field.name}`.)
* The expression `$cost = $quantity * $rate` is executed once per record of the data file. Our [example.csv](./example.csv) has 10 records so this expression was executed 10 times, with the field names `$quantity` and `$rate` each time bound to the current record's values for those fields.
* On the left-hand side we have the new field name `$cost` which didn't come from the input data. Assignments to new variables result in a new field being placed after all the other ones. If we'd assigned to an existing field name, it would have been updated in-place.
* On the left-hand side, we have the new field name `$cost`, which didn't come from the input data. Assignments to new variables result in a new field being placed after all the other ones. If we'd assigned to an existing field name, it would have been updated in place.
* The entire expression is surrounded by single quotes (with an adjustment needed on [Windows](miller-on-windows.md)), to get it past the system shell. Inside those, only double quotes have meaning in Miller's programming language.
## Multi-line statements, and statements-from-file
@ -36,9 +36,9 @@ mlr --c2p put '
' example.csv
GENMD-EOF
Anything from a `#` character to end of line is a code comment.
Anything from a `#` character to the end of the line is a code comment.
One of Miller's key features is the ability to express data-transformation right there at the keyboard, interactively. But if you find yourself using expressions repeatedly, you can put everything between the single quotes into a file and refer to that using `put -f`:
One of Miller's key features is the ability to express data transformation right there at the keyboard, interactively. But if you find yourself using expressions repeatedly, you can put everything between the single quotes into a file and refer to that using `put -f`:
GENMD-RUN-COMMAND
cat dsl-example.mlr
@ -48,13 +48,13 @@ GENMD-RUN-COMMAND
mlr --c2p put -f dsl-example.mlr example.csv
GENMD-EOF
This becomes particularly important on Windows. Quite a bit of effort was put into making Miller on Windows be able to handle the kinds of single-quoted expressions we're showing here, but if you get syntax-error messages on Windows using examples in this documentation, you can put the parts between single quotes into a file and refer to that using `mlr put -f` -- or, use the triple-double-quote trick as described in the [Miller on Windows page](miller-on-windows.md).
This becomes particularly important on Windows. Quite a bit of effort was put into making Miller on Windows be able to handle the kinds of single-quoted expressions we're showing here. Still, if you get syntax-error messages on Windows using examples in this documentation, you can put the parts between single quotes into a file and refer to that using `mlr put -f` -- or, use the triple-double-quote trick as described in the [Miller on Windows page](miller-on-windows.md).
## Out-of-stream variables, begin, and end
Above we saw that your expression is executed once per record -- if a file has a million records, your expression will be executed a million times, once for each record. But you can mark statements to only be executed once, either before the record stream begins, or after the record stream is ended. If you know about [AWK](https://en.wikipedia.org/wiki/AWK), you might have noticed that Miller's programming language is loosely inspired by it, including the `begin` and `end` statements.
Above, we saw that your expression is executed once per record: if a file has a million records, your expression will be executed a million times, once for each record. But you can mark statements to be executed only once, either before the record stream begins or after it has ended. If you know about [AWK](https://en.wikipedia.org/wiki/AWK), you might have noticed that Miller's programming language is loosely inspired by it, including the `begin` and `end` statements.
Above we also saw that names like `$quantity` are bound to each record in turn.
Above, we also saw that names like `$quantity` are bound to each record in turn.
To make `begin` and `end` statements useful, we need somewhere to put things that persist across the duration of the record stream, and a way to emit them. Miller uses [**out-of-stream variables**](reference-dsl-variables.md#out-of-stream-variables) (or **oosvars** for short) whose names start with an `@` sigil, along with the [`emit`](reference-dsl-output-statements.md#emit-statements) keyword to write them into the output record stream:
@ -94,8 +94,8 @@ So, take this sum/count example as an indication of the kinds of things you can
Also inspired by [AWK](https://en.wikipedia.org/wiki/AWK), the Miller DSL has the following special [**context variables**](reference-dsl-variables.md#built-in-variables):
* `FILENAME` -- the filename the current record came from. Especially useful in things like `mlr ... *.csv`.
* `FILENUM` -- similarly, but integer 1,2,3,... rather than filenam.e
* `NF` -- the number of fields in the current record. Note that if you assign `$newcolumn = some value` then `NF` will increment.
* `FILENUM` -- similarly, but integer 1,2,3,... rather than filename.
* `NF` -- the number of fields in the current record. Note that if you assign `$newcolumn = some value`, then `NF` will increment.
* `NR` -- starting from 1, counter of how many records processed so far.
* `FNR` -- similar, but resets to 1 at the start of each file.
@ -130,12 +130,12 @@ GENMD-EOF
Note that here we used the `-f` flag to `put` to load our function
definition, and also the `-e` flag to add another statement on the command
line. (We could have also put `$fact = factorial(NR)` inside
`factorial-example.mlr` but that would have made that file less flexible for our
`factorial-example.mlr`, but that would have made that file less flexible for our
future use.)
## If-statements, loops, and local variables
Suppose you want to only compute sums conditionally -- you can use an `if` statement:
Suppose you want to compute sums only conditionally -- you can use an `if` statement:
GENMD-RUN-COMMAND
cat if-example.mlr
@ -152,7 +152,7 @@ page](reference-dsl-control-structures.md#for-loops), Miller has a few kinds of
for-loops. In addition to the usual 3-part `for (i = 0; i < 10; i += 1)` kind
that many programming languages have, Miller also lets you loop over
[maps](reference-main-maps.md) and [arrays](reference-main-arrays.md). We
haven't encountered maps and arrays yet in this introduction, but for now it
haven't encountered maps and arrays yet in this introduction, but for now, it
suffices to know that `$*` is a special variable holding the current record as
a map:
@ -175,14 +175,14 @@ Here we used the local variables `k` and `v`. Now we've seen four kinds of varia
* Local variables like `k`
* Built-in context variables like `NF` and `NR`
If you're curious about scope and extent of local variables, you can read more in the [section on variables](reference-dsl-variables.md).
If you're curious about the scope and extent of local variables, you can read more in the [section on variables](reference-dsl-variables.md).
## Arithmetic
Numbers in Miller's programming language are intended to operate with the principle of least surprise:
* Internally, numbers are either 64-bit signed integers or double-precision floating-point.
* Sums, differences, and products of integers are also integers (so `2*3=6` not `6.0`) -- unless the result of the operation would overflow a 64-bit signed integer in which case the result is automatically converted to float. (If you ever want integer-to-integer arithmetic, use `x .+ y`, `x .* y`, etc.)
* Sums, differences, and products of integers are also integers (so `2*3=6` not `6.0`) -- unless the result of the operation would overflow a 64-bit signed integer, in which case the result is automatically converted to float. (If you ever want integer-to-integer arithmetic, use `x .+ y`, `x .* y`, etc.)
* Quotients of integers are integers if the division is exact, else floating-point: so `6/2=3` but `7/2=3.5`.
You can read more about this in the [arithmetic reference](reference-main-arithmetic.md).
@ -197,7 +197,7 @@ see more in the [null-data reference](reference-main-null-data.md) but the
basic idea is:
* Adding a number to absent gives the number back. This means you don't have to put `@sum = 0` in your `begin` blocks.
* Any variable which has the absent value is not assigned. This means you don't have to check presence of things from one record to the next.
* Any variable that has the absent value is not assigned. This means you don't have to check the presence of things from one record to the next.
For example, you can sum up all the `$a` values across records without having to check whether they're present or not:

docs/src/missings.csv Normal file

@ -0,0 +1,4 @@
a,x,z,w
red,7,,
green,,242,zdatsyg
blue,9,,

docs/src/missings.json Normal file

@ -0,0 +1,5 @@
[
{ "a": "red", "x": 7 },
{ "a": "green", "z": 242, "w": "zdatsyg" },
{ "a": "blue", "x": 9 }
]


@ -722,7 +722,7 @@ Passes through input records with specified fields included/excluded.
-r Treat field names as regular expressions. "ab", "a.*b" will
match any field name containing the substring "ab" or matching
"a.*b", respectively; anchors of the form "^ab$", "^a.*b$" may
be used. The -o flag is ignored when -r is present.
be used.
Examples:
mlr cut -f hostname,status
mlr cut -x -f hostname,status


@ -24,43 +24,23 @@ TL;DRs: [install](installing-miller.md), [binaries](https://github.com/johnkerl/
### Performance
Performance is on par with Miller 5 for simple processing, and is far better than Miller 5 for
complex processing chains -- the latter due to improved multicore utilization. CSV I/O is notably
improved. See the [Performance benchmarks](#performance-benchmarks) section at the bottom of this
page for details.
Performance is on par with Miller 5 for simple processing, and is far better than Miller 5 for complex processing chains -- the latter due to improved multicore utilization. CSV I/O is notably improved. See the [Performance benchmarks](#performance-benchmarks) section at the bottom of this page for details.
### Documentation improvements
Documentation (what you're reading here) and online help (`mlr --help`) have been completely reworked.
In the initial release, the focus was convincing users already familiar with
`awk`/`grep`/`cut` that Miller was a viable alternative -- but over time it's
become clear that many Miller users aren't expert with those tools. The focus
has shifted toward a higher quantity of more introductory/accessible material
for command-line data processing.
In the initial release, the focus was on convincing users already familiar with `awk`, `grep`, and `cut` that Miller was a viable alternative; however, over time, it has become clear that many Miller users aren't experts with those tools. The focus has shifted toward a higher quantity of more introductory/accessible material for command-line data processing.
Similarly, the FAQ/recipe material has been expanded to include more, and
simpler, use-cases including resolved questions from
[Miller Issues](https://github.com/johnkerl/miller/issues)
and
[Miller Discussions](https://github.com/johnkerl/miller/discussions);
more complex/niche material has been pushed farther down. The long reference
pages have been split up into separate pages. (See also
[Structure of these documents](structure-of-these-documents.md).)
Similarly, the FAQ/recipe material has been expanded to include more, and simpler, use-cases, including resolved questions from [Miller Issues](https://github.com/johnkerl/miller/issues) and [Miller Discussions](https://github.com/johnkerl/miller/discussions); more complex/niche material has been pushed farther down. The lengthy reference pages have been divided into separate pages. (See also [Structure of these documents](structure-of-these-documents.md).)
One of the main feedback themes from the 2021 Miller User Survey was that some
things should be easier to find. Namely, on each doc page there's now a banner
across the top with things that should be one click away from the landing page
(or any page): command-line flags, verbs, functions, glossary/acronyms, and a
finder for docs by release.
One of the main feedback themes from the 2021 Miller User Survey was that some things should be easier to find. Namely, on each doc page, there's now a banner across the top with things that should be one click away from the landing page (or any page): command-line flags, verbs, functions, glossary/acronyms, and a finder for docs by release.
Since CSV is overwhelmingly the most popular data format for Miller, it is
now discussed first, and more examples use CSV.
Since CSV is overwhelmingly the most popular data format for Miller, it is now discussed first, and more examples use CSV.
### Improved Windows experience
Stronger support for Windows (with or without MSYS2), with a couple of
exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
Stronger support for Windows (with or without MSYS2), with a couple of exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
Binaries are reliably available using GitHub Actions: see also [Installation](installing-miller.md).
@ -89,9 +69,7 @@ Parse error on token ">" at line 63 column 7.
### Scripting
Scripting is now easier -- support for `#!` with `sh`, as always, along with now support for `#!` with `mlr -s`. For
Windows, `mlr -s` can also be used. These help reduce backslash-clutter and let you do more while typing less.
See the [scripting page](scripting.md).
Scripting is now easier: `#!` with `sh` is supported as always, and `#!` with `mlr -s` is now supported as well. `mlr -s` can also be used on Windows. These help reduce backslash clutter and let you do more while typing less. See the [scripting page](scripting.md).
### REPL
@ -143,7 +121,7 @@ the `TZ` environment variable. Please see [DSL datetime/timezone functions](refe
### In-process support for compressed input
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. Similarly for `.z` and `.bz2` files. Please see the page on [Compressed data](reference-main-compressed-data.md) for more information.
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. The same applies to `.z`, `.bz2`, and `.zst` files. Please refer to the page on [Compressed Data](reference-main-compressed-data.md) for more information.
### Support for reading web URLs
@ -171,9 +149,7 @@ purple,triangle,false,7,65,80.1405,5.8240
### Improved JSON / JSON Lines support, and arrays
Arrays are now supported in Miller's `put`/`filter` programming language, as
described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is
now a keyword so this is no longer usable as a local-variable or UDF name.)
Arrays are now supported in Miller's `put`/`filter` programming language, as described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is now a keyword, so this is no longer usable as a local variable or UDF name.)
JSON support is improved:
@ -196,24 +172,13 @@ See also the [Arrays reference](reference-main-arrays.md) for more information.
### Improved numeric conversion
The most central part of Miller 6 is a deep refactor of how data values are parsed
from file contents, how types are inferred, and how they're converted back to
text into output files.
The most central part of Miller 6 is a deep refactor of how data values are parsed from file contents, how types are inferred, and how they're converted back to text into output files.
This was all initiated by [https://github.com/johnkerl/miller/issues/151](https://github.com/johnkerl/miller/issues/151).
In Miller 5 and below, all values were stored as strings, then only converted
to int/float as-needed, for example when a particular field was referenced in
the `stats1` or `put` verbs. This led to awkwardnesses such as the `-S`
and `-F` flags for `put` and `filter`.
In Miller 5 and below, all values were stored as strings, then only converted to int/float as needed, for example, when a particular field was referenced in the `stats1` or `put` verbs. This led to awkwardnesses such as the `-S` and `-F` flags for `put` and `filter`.
In Miller 6, things parseable as int/float are treated as such from the moment
the input data is read, and these are passed along through the verb chain. All
values are typed from when they're read, and their types are passed along.
Meanwhile the original string representation of each value is also retained. If
a numeric field isn't modified during the processing chain, it's printed out
the way it arrived. Also, quoted values in JSON strings are flagged as being
strings throughout the processing chain.
In Miller 6, values parseable as integers or floating-point numbers are treated as such from the moment the input data is read, and these are passed along through the verb chain. All values are typed from when they're read, and their types are passed along. Meanwhile, the original string representation of each value is also retained. If a numeric field isn't modified during the processing chain, it's printed out the way it arrived. Additionally, quoted values in JSON strings are consistently flagged as strings throughout the processing chain.
For example (see [https://github.com/johnkerl/miller/issues/178](https://github.com/johnkerl/miller/issues/178)) you can now do
@ -242,30 +207,21 @@ For example (see [https://github.com/johnkerl/miller/issues/178](https://github.
### Deduping of repeated field names
By default, field names are deduped for all file formats except JSON / JSON Lines. So if you
have an input record with `x=8,x=9` then the second field's key is renamed to
`x_2` and so on -- the record scans as `x=8,x_2=9`. Use `mlr
--no-dedupe-field-names` to suppress this, and have the record be scanned as
`x=9`.
By default, field names are deduplicated for all file formats except JSON / JSON Lines. So if you have an input record with `x=8,x=9`, then the second field's key is renamed to `x_2` and so on -- the record scans as `x=8,x_2=9`. Use `mlr --no-dedupe-field-names` to suppress this, and have the record be scanned as `x=9`.
For JSON and JSON Lines, the last duplicated key in an input record is always retained,
regardless of `mlr --no-dedupe-field-names`: `{"x":8,"x":9}` scans as if it
were `{"x":9}`.
For JSON and JSON Lines, the last duplicated key in an input record is always retained, regardless of `mlr --no-dedupe-field-names`: `{"x":8,"x":9}` scans as if it were `{"x":9}`.
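
As a rough illustration of the renaming rule for non-JSON formats, here is a Python sketch of a DKVP-style line parser. The function `parse_dkvp` and its parameters are hypothetical, and extending the `_2` suffix to `_3` for a third repeat is an assumption based on the "and so on" above; Miller's actual reader is not shown here.

```python
def parse_dkvp(line, dedupe_field_names=True, ifs=",", ips="="):
    """Parse 'k=v,k=v' text into a dict, optionally renaming repeated keys."""
    record = {}
    counts = {}  # how many times each original key has been seen
    for pair in line.split(ifs):
        key, _, value = pair.partition(ips)
        if dedupe_field_names and key in record:
            # Second occurrence becomes key_2, third key_3 (assumed), and so on.
            counts[key] = counts.get(key, 1) + 1
            key = f"{key}_{counts[key]}"
        record[key] = value  # without deduping, the last value wins
    return record


deduped = parse_dkvp("x=8,x=9")
not_deduped = parse_dkvp("x=8,x=9", dedupe_field_names=False)
print(deduped)
print(not_deduped)
```

Values are kept as strings here for brevity; the sketch only concerns the keys.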
### Regex support for IFS and IPS
You can now split fields on whitespace when whitespace is a mix of tabs and
spaces. As well, you can use regular expressions for the input field-separator
and the input pair-separator. Please see the section on
[multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
You can now split fields on whitespace when whitespace is a mix of tabs and spaces. You can also use regular expressions for the input field-separator and the input pair-separator. Please see the section on [multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
In particular, for NIDX format, the default IFS now allows splitting on one or more of space or tab.
In particular, for NIDX format, the default `IFS` now allows splitting on one or more of space or tab.
### Case-folded sorting options
The [sort](reference-verbs.md#sort) verb now accepts `-c` and `-cr` options for case-folded ascending/descending sort, respetively.
The [sort](reference-verbs.md#sort) verb now accepts `-c` and `-cr` options for case-folded ascending/descending sort, respectively.
### New DSL functions / operators
### New DSL functions and operators
* Higher-order functions [`select`](reference-dsl-builtin-functions.md#select), [`apply`](reference-dsl-builtin-functions.md#apply), [`reduce`](reference-dsl-builtin-functions.md#reduce), [`fold`](reference-dsl-builtin-functions.md#fold), and [`sort`](reference-dsl-builtin-functions.md#sort). See the [sorting page](sorting.md) and the [higher-order-functions page](reference-dsl-higher-order-functions.md) for more information.
@ -293,30 +249,30 @@ The following differences are rather technical. If they don't sound familiar to
### Line endings
The `--auto` flag is now ignored. Before, if a file had CR/LF (Windows-style) line endings on input (on any platform), it would have the same on output; likewise, LF (Unix-style) line endings. Now, files with CR/LF or LF line endings are processed on any platform, but the output line-ending is for the platform. E.g. reading CR/LF files on Linux will now produce LF output.
The `--auto` flag is now ignored. Before, if a file had CR/LF (Windows-style) line endings on input (on any platform), it would have the same on output; likewise, LF (Unix-style) line endings. Now, files with CR/LF or LF line endings are processed on any platform, but the output line ending is for the platform. E.g., reading CR/LF files on Linux will now produce LF output.
### IFS and IPS as regular expressions
IFS and IPS can be regular expressions now. Please see the section on [multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
IFS and IPS can now be regular expressions. Please see the section on [multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
### JSON and JSON Lines formatting
* `--jknquoteint` and `--jquoteall` are ignored; they were workarounds for the (now much-improved) type-inference and type-tracking in Miller 6.
* `--json-fatal-arrays-on-input`, `--json-map-arrays-on-input`, and `--json-skip-arrays-on-input` are ignored; Miller 6 now supports arrays fully.
* See also `mlr help legacy-flags` or the [legacy-flags reference](reference-main-flag-list.md#legacy-flags).
* Miller 5 accepted input records either with or without enclosing `[...]`; on output, by default it produced single-line records without outermost `[...]`. Miller 5 let you customize output formatting using `--jvstack` (multi-line records) and `--jlistwrap` (write outermost `[...]`). _Thus, Miller 5's JSON output format, with default flags, was in fact [JSON Lines](file-formats.md#json-lines) all along._
* Miller 5 accepted input records either with or without enclosing `[...]`; on output, by default, it produced single-line records without outermost `[...]`. Miller 5 let you customize output formatting using `--jvstack` (multi-line records) and `--jlistwrap` (write outermost `[...]`). _Thus, Miller 5's JSON output format, with default flags, was in fact [JSON Lines](file-formats.md#json-lines) all along._
* In Miller 6, [JSON Lines](file-formats.md#json-lines) is acknowledged explicitly.
* On input, your records are accepted whether or not they have outermost `[...]`, and regardless of line breaks, whether the specified input format is JSON or JSON Lines. (This is similar to [jq](https://stedolan.github.io/jq/).)
* With `--ojson`, output records are written multiline (pretty-printed), with outermost `[...]`.
* With `--ojsonl`, output records are written single-line, without outermost `[...]`.
* This makes `--jvstack` and `--jlistwrap` unnecessary. However, if you want outermost `[...]` with single-line records, you can use `--ojson --no-jvstack`.
* Miller 5 tolerated trailing commas, which are not compliant with the JSON specification: for example, `{"x":1,"y":2,}`. Miller 6 uses a JSON parser which is compliant with the JSON specification and does not accept trailing commas.
* Miller 5 tolerated trailing commas, which are not compliant with the JSON specification: for example, `{"x":1,"y":2,}`. Miller 6 uses a JSON parser that is compliant with the JSON specification and does not accept trailing commas.
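
The difference between the two output shapes, and the lenient input handling, can be sketched in Python with the standard `json` module. This is only an approximation of the behavior described in the list above: the helper names are hypothetical, and the exact whitespace Miller emits for `--ojson` may differ from `json.dumps(..., indent=2)`.

```python
import json


def write_json(records):
    """Multi-line records wrapped in outermost [...], roughly like --ojson."""
    return json.dumps(records, indent=2)


def write_json_lines(records):
    """One single-line record per line, no outermost [...], like --ojsonl."""
    return "\n".join(json.dumps(record) for record in records)


def read_records(text):
    """Accept records with or without outermost [...], regardless of line breaks."""
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)
    decoder = json.JSONDecoder()
    records, pos = [], 0
    while pos < len(text):
        record, pos = decoder.raw_decode(text, pos)
        records.append(record)
        # raw_decode does not skip whitespace between concatenated records.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return records


records = [{"a": "red", "x": 7}, {"a": "blue", "x": 9}]
as_json = write_json(records)
as_json_lines = write_json_lines(records)
print(as_json)
print(as_json_lines)
```

Like a spec-compliant parser, Python's `json` module rejects the trailing-comma input `{"x":1,"y":2,}`.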
### Type-inference
* The `-S` and `-F` flags to `mlr put` and `mlr filter` are ignored, since type-inference is no longer done in `mlr put` and `mlr filter`, but rather, when records are first read. You can use `mlr -S` and `mlr -A`, respectively, instead to control type-inference within the record-readers.
* Octal numbers like `0123` and `07` are type-inferred as string. Use `mlr -O` to infer them as octal integers. Note that `08` and `09` will then infer as decimal integers.
* Any numbers prefix with `0o`, e.g. `0o377`, are already treated as octal regardless of `mlr -O` -- `mlr -O` only affects how leading-zero integers are handled.
* Any numbers prefixed with `0o`, e.g. `0o377`, are already treated as octal, regardless of `mlr -O` -- `mlr -O` only affects how leading-zero integers are handled.
* See also the [miscellaneous-flags reference](reference-main-flag-list.md#miscellaneous-flags).
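
The integer-inference rules in the bullets above can be summarized as a small decision procedure. The Python below is a sketch under the assumption that those bullets are the complete rule set; `infer_int` and its `octal_flag` parameter (standing in for `mlr -O`) are hypothetical names, and float, hex, and binary inference are deliberately left out.

```python
import re

_DECIMAL = re.compile(r"^[+-]?[0-9]+$")
_LEADING_ZERO = re.compile(r"^[+-]?0[0-9]+$")
_OCTAL_PREFIX = re.compile(r"^[+-]?0o[0-7]+$")


def infer_int(text, octal_flag=False):
    """Return an int when the text infers as one, else the original string."""
    if _OCTAL_PREFIX.match(text):
        return int(text, 8)  # 0o... is octal with or without the flag
    if _LEADING_ZERO.match(text):
        if not octal_flag:
            return text  # leading-zero values stay strings by default
        try:
            return int(text, 8)
        except ValueError:
            return int(text, 10)  # 08 and 09 are not valid octal
    if _DECIMAL.match(text):
        return int(text)
    return text


for sample in ["0123", "07", "08", "0o377", "42"]:
    print(sample, repr(infer_int(sample)), repr(infer_int(sample, octal_flag=True)))
```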
### Emit statements
@ -341,13 +297,12 @@ This works in Miller 6 (and worked in Miller 5 as well) and is supported:
input=1
</pre>
Please see the [section on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf)
for more information.
Please see the [section on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf) for more information.
## Developer-specific aspects
* Miller has been ported from C to Go. Developer notes: [https://github.com/johnkerl/miller/blob/main/README-dev.md](https://github.com/johnkerl/miller/blob/main/README-dev.md).
* Regression testing has been completely reworked, including regression-testing now running fully on Windows (alongside Linux and Mac) [on each GitHub commit](https://github.com/johnkerl/miller/actions).
* Regression testing has been completely reworked, including regression-testing now running fully on Windows (alongside Linux and Mac) [on each GitHub commit](https://github.com/johnkerl/miller/actions).
## Performance benchmarks


@ -8,43 +8,23 @@ TL;DRs: [install](installing-miller.md), [binaries](https://github.com/johnkerl/
### Performance
Performance is on par with Miller 5 for simple processing, and is far better than Miller 5 for
complex processing chains -- the latter due to improved multicore utilization. CSV I/O is notably
improved. See the [Performance benchmarks](#performance-benchmarks) section at the bottom of this
page for details.
Performance is on par with Miller 5 for simple processing, and is far better than Miller 5 for complex processing chains -- the latter due to improved multicore utilization. CSV I/O is notably improved. See the [Performance benchmarks](#performance-benchmarks) section at the bottom of this page for details.
### Documentation improvements
Documentation (what you're reading here) and online help (`mlr --help`) have been completely reworked.
In the initial release, the focus was convincing users already familiar with
`awk`/`grep`/`cut` that Miller was a viable alternative -- but over time it's
become clear that many Miller users aren't expert with those tools. The focus
has shifted toward a higher quantity of more introductory/accessible material
for command-line data processing.
In the initial release, the focus was on convincing users already familiar with `awk`, `grep`, and `cut` that Miller was a viable alternative; however, over time, it has become clear that many Miller users aren't experts with those tools. The focus has shifted toward a higher quantity of more introductory/accessible material for command-line data processing.
Similarly, the FAQ/recipe material has been expanded to include more, and
simpler, use-cases including resolved questions from
[Miller Issues](https://github.com/johnkerl/miller/issues)
and
[Miller Discussions](https://github.com/johnkerl/miller/discussions);
more complex/niche material has been pushed farther down. The long reference
pages have been split up into separate pages. (See also
[Structure of these documents](structure-of-these-documents.md).)
Similarly, the FAQ/recipe material has been expanded to include more, and simpler, use-cases, including resolved questions from [Miller Issues](https://github.com/johnkerl/miller/issues) and [Miller Discussions](https://github.com/johnkerl/miller/discussions); more complex/niche material has been pushed farther down. The lengthy reference pages have been divided into separate pages. (See also [Structure of these documents](structure-of-these-documents.md).)
One of the main feedback themes from the 2021 Miller User Survey was that some
things should be easier to find. Namely, on each doc page there's now a banner
across the top with things that should be one click away from the landing page
(or any page): command-line flags, verbs, functions, glossary/acronyms, and a
finder for docs by release.
One of the main feedback themes from the 2021 Miller User Survey was that some things should be easier to find. Namely, on each doc page, there's now a banner across the top with things that should be one click away from the landing page (or any page): command-line flags, verbs, functions, glossary/acronyms, and a finder for docs by release.
Since CSV is overwhelmingly the most popular data format for Miller, it is
now discussed first, and more examples use CSV.
Since CSV is overwhelmingly the most popular data format for Miller, it is now discussed first, and more examples use CSV.
### Improved Windows experience
Stronger support for Windows (with or without MSYS2), with a couple of
exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
Stronger support for Windows (with or without MSYS2), with a couple of exceptions. See [Miller on Windows](miller-on-windows.md) for more information.
Binaries are reliably available using GitHub Actions: see also [Installation](installing-miller.md).
@ -73,9 +53,7 @@ GENMD-EOF
### Scripting
Scripting is now easier -- support for `#!` with `sh`, as always, along with now support for `#!` with `mlr -s`. For
Windows, `mlr -s` can also be used. These help reduce backslash-clutter and let you do more while typing less.
See the [scripting page](scripting.md).
Scripting is now easier: `#!` with `sh` is supported as always, and `#!` with `mlr -s` is now supported as well. `mlr -s` can also be used on Windows. These help reduce backslash clutter and let you do more while typing less. See the [scripting page](scripting.md).
### REPL
@ -125,7 +103,7 @@ the `TZ` environment variable. Please see [DSL datetime/timezone functions](refe
### In-process support for compressed input
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. Similarly for `.z` and `.bz2` files. Please see the page on [Compressed data](reference-main-compressed-data.md) for more information.
In addition to `--prepipe gunzip`, you can now use the `--gzin` flag. In fact, if your files end in `.gz` you don't even need to do that -- Miller will autodetect by file extension and automatically uncompress `mlr --csv cat foo.csv.gz`. The same applies to `.z`, `.bz2`, and `.zst` files. Please refer to the page on [Compressed Data](reference-main-compressed-data.md) for more information.
### Support for reading web URLs
@ -140,9 +118,7 @@ GENMD-EOF
### Improved JSON / JSON Lines support, and arrays
Arrays are now supported in Miller's `put`/`filter` programming language, as
described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is
now a keyword so this is no longer usable as a local-variable or UDF name.)
Arrays are now supported in Miller's `put`/`filter` programming language, as described in the [Arrays reference](reference-main-arrays.md). (Also, `array` is now a keyword, so this is no longer usable as a local variable or UDF name.)
JSON support is improved:
@ -165,24 +141,13 @@ See also the [Arrays reference](reference-main-arrays.md) for more information.
### Improved numeric conversion
The most central part of Miller 6 is a deep refactor of how data values are parsed
from file contents, how types are inferred, and how they're converted back to
text into output files.
The most central part of Miller 6 is a deep refactor of how data values are parsed from file contents, how types are inferred, and how they're converted back to text into output files.
This was all initiated by [https://github.com/johnkerl/miller/issues/151](https://github.com/johnkerl/miller/issues/151).
In Miller 5 and below, all values were stored as strings, then only converted
to int/float as-needed, for example when a particular field was referenced in
the `stats1` or `put` verbs. This led to awkwardnesses such as the `-S`
and `-F` flags for `put` and `filter`.
In Miller 5 and below, all values were stored as strings, then only converted to int/float as needed, for example, when a particular field was referenced in the `stats1` or `put` verbs. This led to awkwardnesses such as the `-S` and `-F` flags for `put` and `filter`.
In Miller 6, things parseable as int/float are treated as such from the moment
the input data is read, and these are passed along through the verb chain. All
values are typed from when they're read, and their types are passed along.
Meanwhile the original string representation of each value is also retained. If
a numeric field isn't modified during the processing chain, it's printed out
the way it arrived. Also, quoted values in JSON strings are flagged as being
strings throughout the processing chain.
In Miller 6, values parseable as integers or floating-point numbers are treated as such from the moment the input data is read, and these are passed along through the verb chain. All values are typed from when they're read, and their types are passed along. Meanwhile, the original string representation of each value is also retained. If a numeric field isn't modified during the processing chain, it's printed out the way it arrived. Additionally, quoted values in JSON strings are consistently flagged as strings throughout the processing chain.
For example (see [https://github.com/johnkerl/miller/issues/178](https://github.com/johnkerl/miller/issues/178)) you can now do
@ -196,30 +161,21 @@ GENMD-EOF
### Deduping of repeated field names
By default, field names are deduped for all file formats except JSON / JSON Lines. So if you
have an input record with `x=8,x=9` then the second field's key is renamed to
`x_2` and so on -- the record scans as `x=8,x_2=9`. Use `mlr
--no-dedupe-field-names` to suppress this, and have the record be scanned as
`x=9`.
By default, field names are deduplicated for all file formats except JSON / JSON Lines. So if you have an input record with `x=8,x=9`, then the second field's key is renamed to `x_2` and so on -- the record scans as `x=8,x_2=9`. Use `mlr --no-dedupe-field-names` to suppress this, and have the record be scanned as `x=9`.
For JSON and JSON Lines, the last duplicated key in an input record is always retained,
regardless of `mlr --no-dedupe-field-names`: `{"x":8,"x":9}` scans as if it
were `{"x":9}`.
For JSON and JSON Lines, the last duplicated key in an input record is always retained, regardless of `mlr --no-dedupe-field-names`: `{"x":8,"x":9}` scans as if it were `{"x":9}`.
### Regex support for IFS and IPS
You can now split fields on whitespace when whitespace is a mix of tabs and
spaces. As well, you can use regular expressions for the input field-separator
and the input pair-separator. Please see the section on
[multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
You can now split fields on whitespace when whitespace is a mix of tabs and spaces. You can also use regular expressions for the input field-separator and the input pair-separator. Please see the section on [multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
In particular, for NIDX format, the default IFS now allows splitting on one or more of space or tab.
In particular, for NIDX format, the default `IFS` now allows splitting on one or more of space or tab.
### Case-folded sorting options
The [sort](reference-verbs.md#sort) verb now accepts `-c` and `-cr` options for case-folded ascending/descending sort, respetively.
The [sort](reference-verbs.md#sort) verb now accepts `-c` and `-cr` options for case-folded ascending/descending sort, respectively.
### New DSL functions / operators
### New DSL functions and operators
* Higher-order functions [`select`](reference-dsl-builtin-functions.md#select), [`apply`](reference-dsl-builtin-functions.md#apply), [`reduce`](reference-dsl-builtin-functions.md#reduce), [`fold`](reference-dsl-builtin-functions.md#fold), and [`sort`](reference-dsl-builtin-functions.md#sort). See the [sorting page](sorting.md) and the [higher-order-functions page](reference-dsl-higher-order-functions.md) for more information.
@ -247,30 +203,30 @@ The following differences are rather technical. If they don't sound familiar to
### Line endings
The `--auto` flag is now ignored. Before, if a file had CR/LF (Windows-style) line endings on input (on any platform), it would have the same on output; likewise, LF (Unix-style) line endings. Now, files with CR/LF or LF line endings are processed on any platform, but the output line-ending is for the platform. E.g. reading CR/LF files on Linux will now produce LF output.
The `--auto` flag is now ignored. Before, if a file had CR/LF (Windows-style) line endings on input (on any platform), it would have the same on output; likewise, LF (Unix-style) line endings. Now, files with CR/LF or LF line endings are processed on any platform, but the output line ending is for the platform. E.g., reading CR/LF files on Linux will now produce LF output.
### IFS and IPS as regular expressions
IFS and IPS can be regular expressions now. Please see the section on [multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
IFS and IPS can now be regular expressions. Please see the section on [multi-character and regular-expression separators](reference-main-separators.md#multi-character-and-regular-expression-separators).
### JSON and JSON Lines formatting
* `--jknquoteint` and `--jquoteall` are ignored; they were workarounds for the (now much-improved) type-inference and type-tracking in Miller 6.
* `--json-fatal-arrays-on-input`, `--json-map-arrays-on-input`, and `--json-skip-arrays-on-input` are ignored; Miller 6 now supports arrays fully.
* See also `mlr help legacy-flags` or the [legacy-flags reference](reference-main-flag-list.md#legacy-flags).
* Miller 5 accepted input records either with or without enclosing `[...]`; on output, by default it produced single-line records without outermost `[...]`. Miller 5 let you customize output formatting using `--jvstack` (multi-line records) and `--jlistwrap` (write outermost `[...]`). _Thus, Miller 5's JSON output format, with default flags, was in fact [JSON Lines](file-formats.md#json-lines) all along._
* Miller 5 accepted input records either with or without enclosing `[...]`; on output, by default, it produced single-line records without outermost `[...]`. Miller 5 let you customize output formatting using `--jvstack` (multi-line records) and `--jlistwrap` (write outermost `[...]`). _Thus, Miller 5's JSON output format, with default flags, was in fact [JSON Lines](file-formats.md#json-lines) all along._
* In Miller 6, [JSON Lines](file-formats.md#json-lines) is acknowledged explicitly.
* On input, your records are accepted whether or not they have outermost `[...]`, and regardless of line breaks, whether the specified input format is JSON or JSON Lines. (This is similar to [jq](https://stedolan.github.io/jq/).)
* With `--ojson`, output records are written multiline (pretty-printed), with outermost `[...]`.
* With `--ojsonl`, output records are written single-line, without outermost `[...]`.
* This makes `--jvstack` and `--jlistwrap` unnecessary. However, if you want outermost `[...]` with single-line records, you can use `--ojson --no-jvstack`.
* Miller 5 tolerated trailing commas, which are not compliant with the JSON specification: for example, `{"x":1,"y":2,}`. Miller 6 uses a JSON parser which is compliant with the JSON specification and does not accept trailing commas.
* Miller 5 tolerated trailing commas, which are not compliant with the JSON specification: for example, `{"x":1,"y":2,}`. Miller 6 uses a JSON parser that is compliant with the JSON specification and does not accept trailing commas.
### Type-inference
* The `-S` and `-F` flags to `mlr put` and `mlr filter` are ignored, since type-inference is no longer done in `mlr put` and `mlr filter`, but rather, when records are first read. You can use `mlr -S` and `mlr -A`, respectively, instead to control type-inference within the record-readers.
* Octal numbers like `0123` and `07` are type-inferred as string. Use `mlr -O` to infer them as octal integers. Note that `08` and `09` will then infer as decimal integers.
* Any numbers prefix with `0o`, e.g. `0o377`, are already treated as octal regardless of `mlr -O` -- `mlr -O` only affects how leading-zero integers are handled.
* Any numbers prefixed with `0o`, e.g. `0o377`, are already treated as octal, regardless of `mlr -O` -- `mlr -O` only affects how leading-zero integers are handled.
* See also the [miscellaneous-flags reference](reference-main-flag-list.md#miscellaneous-flags).
### Emit statements
@ -290,13 +246,12 @@ GENMD-RUN-COMMAND
mlr -n put 'end {@input={"a":1}; emit1 {"input":@input["a"]}}'
GENMD-EOF
Please see the [section on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf)
for more information.
Please see the [section on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf) for more information.
## Developer-specific aspects
* Miller has been ported from C to Go. Developer notes: [https://github.com/johnkerl/miller/blob/main/README-dev.md](https://github.com/johnkerl/miller/blob/main/README-dev.md).
* Regression testing has been completely reworked, including regression-testing now running fully on Windows (alongside Linux and Mac) [on each GitHub commit](https://github.com/johnkerl/miller/actions).
* Regression testing has been completely reworked, including regression-testing now running fully on Windows (alongside Linux and Mac) [on each GitHub commit](https://github.com/johnkerl/miller/actions).
## Performance benchmarks


@ -55,6 +55,7 @@ Flags:
mlr help comments-in-data-flags
mlr help compressed-data-flags
mlr help csv/tsv-only-flags
mlr help dkvp-only-flags
mlr help file-format-flags
mlr help flatten-unflatten-flags
mlr help format-conversion-keystroke-saver-flags
@ -86,6 +87,7 @@ Other:
mlr help mlrrc
mlr help output-colorization
mlr help type-arithmetic-info
mlr help type-arithmetic-info-extended
Shorthands:
mlr -g = mlr help flags
mlr -l = mlr help list-verbs
@ -143,6 +145,9 @@ gmt2localtime (class=time #args=1,2) Convert from a GMT-time string to a local-
Examples:
gmt2localtime("1999-12-31T22:00:00Z") = "2000-01-01 00:00:00" with TZ="Asia/Istanbul"
gmt2localtime("1999-12-31T22:00:00Z", "Asia/Istanbul") = "2000-01-01 00:00:00"
gmt2nsec (class=time #args=1) Parses GMT timestamp as integer nanoseconds since the epoch.
Example:
gmt2nsec("2001-02-03T04:05:06Z") = 981173106000000000
gmt2sec (class=time #args=1) Parses GMT timestamp as integer seconds since the epoch.
Example:
gmt2sec("2001-02-03T04:05:06Z") = 981173106
@ -150,6 +155,14 @@ localtime2gmt (class=time #args=1,2) Convert from a local-time string to a GMT-
Examples:
localtime2gmt("2000-01-01 00:00:00") = "1999-12-31T22:00:00Z" with TZ="Asia/Istanbul"
localtime2gmt("2000-01-01 00:00:00", "Asia/Istanbul") = "1999-12-31T22:00:00Z"
nsec2gmt (class=time #args=1,2) Formats integer nanoseconds since epoch as GMT timestamp. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
nsec2gmt(1234567890000000000) = "2009-02-13T23:31:30Z"
nsec2gmt(1234567890123456789) = "2009-02-13T23:31:30Z"
nsec2gmt(1234567890123456789, 6) = "2009-02-13T23:31:30.123456Z"
nsec2gmtdate (class=time #args=1) Formats integer nanoseconds since epoch as GMT timestamp with year-month-date. Leaves non-numbers as-is.
Example:
nsec2gmtdate(1440768801700000000) = "2015-08-28".
sec2gmt (class=time #args=1,2) Formats seconds since epoch as GMT timestamp. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
sec2gmt(1234567890) = "2009-02-13T23:31:30Z"
@ -218,6 +231,7 @@ Options:
-nf {comma-separated field names} Same as -n
-nr {comma-separated field names} Numerical descending; nulls sort first
-t {comma-separated field names} Natural ascending
-b Move sort fields to start of record, as in reorder -b
-tr|-rt {comma-separated field names} Natural descending
-h|--help Show this message.


@ -274,8 +274,6 @@ array will have [null-gaps](reference-main-arrays.md) in it:
"value": 54
}
]
[
]
</pre>
You can index `@records` by `@count` rather than `NR` to get a contiguous array:


@ -16,7 +16,7 @@ Quick links:
</div>
# How original is Miller?
It isn't. Miller is one of many, many participants in the online-analytical-processing culture. Other key participants include `awk`, SQL, spreadsheets, etc. etc. etc. Far from being an original concept, Miller explicitly strives to imitate several existing tools:
It isn't. Miller is just one of many participants in the online analytical processing culture. Other key participants include `awk`, SQL, spreadsheets, etc. etc. etc. Far from being an original concept, Miller explicitly strives to imitate several existing tools:
**The Unix toolkit**: Intentional similarities as described in [Unix-toolkit Context](unix-toolkit-context.md).
@ -26,7 +26,7 @@ Recipes abound for command-line data analysis using the Unix toolkit. Here are j
* [http://www.gregreda.com/2013/07/15/unix-commands-for-data-science](http://www.gregreda.com/2013/07/15/unix-commands-for-data-science)
* [https://github.com/dbohdan/structured-text-tools](https://github.com/dbohdan/structured-text-tools)
**RecordStream**: Miller owes particular inspiration to [RecordStream](https://github.com/benbernard/RecordStream). The key difference is that RecordStream is a Perl-based tool for manipulating JSON (including requiring it to separately manipulate other formats such as CSV into and out of JSON), while Miller is fast Go which handles its formats natively. The similarities include the `sort`, `stats1` (analog of RecordStream's `collate`), and `delta` operations, as well as `filter` and `put`, and pretty-print formatting.
**RecordStream**: Miller owes particular inspiration to [RecordStream](https://github.com/benbernard/RecordStream). The key difference is that RecordStream is a Perl-based tool for manipulating JSON (including requiring it to separately manipulate other formats such as CSV into and out of JSON), while Miller is a fast Go tool that handles its formats natively. The similarities include the `sort`, `stats1` (analogous to RecordStream's `collate`), and `delta` operations, as well as `filter` and `put`, and the use of pretty-print formatting.
**stats_m**: A third source of lineage is my Python [stats_m](https://github.com/johnkerl/scripts-math/tree/master/stats) module. This includes simple single-pass algorithms which form Miller's `stats1` and `stats2` subcommands.
@ -35,21 +35,21 @@ Recipes abound for command-line data analysis using the Unix toolkit. Here are j
**Added value**: Miller's added values include:
* Name-indexing, compared to the Unix toolkit's positional indexing.
* Raw speed, compared to `awk`, RecordStream, `stats_m`, or various other kinds of Python/Ruby/etc. scripts one can easily create.
* Raw speed, compared to `awk`, RecordStream, `stats_m`, or various other kinds of Python/Ruby/etc. scripts that one can easily create.
* Compact keystroking for many common tasks, with a decent amount of flexibility.
* Ability to handle text files on the Unix pipe, without need for creating database tables, compared to SQL databases.
* Ability to handle text files on the Unix pipe, without the need for creating database tables, compared to SQL databases.
* Various file formats, and on-the-fly format conversion.
**jq**: Miller does for name-indexed text what [jq](https://stedolan.github.io/jq/) does for JSON. If you're not already familiar with `jq`, please check it out!
**What about similar tools?**
Here's a comprehensive list: [https://github.com/dbohdan/structured-text-tools](https://github.com/dbohdan/structured-text-tools). Last I knew it doesn't mention [rows](https://github.com/turicas/rows) so here's a plug for that as well. As it turns out, I learned about most of these after writing Miller.
Here's a comprehensive list: [https://github.com/dbohdan/structured-text-tools](https://github.com/dbohdan/structured-text-tools). Last I knew, it doesn't mention [rows](https://github.com/turicas/rows) so here's a plug for that as well. As it turns out, I learned about most of these after writing Miller.
**What about DOTADIW?** One of the key points of the [Unix philosophy](http://en.wikipedia.org/wiki/Unix_philosophy) is that a tool should do one thing and do it well. Hence `sort` and `cut` do just one thing. Why does Miller put `awk`-like processing, a few SQL-like operations, and statistical reduction all into one tool? This is a fair question. First note that many standard tools, such as `awk` and `perl`, do quite a few things -- as does `jq`. But I could have pushed for putting format awareness and name-indexing options into `cut`, `awk`, and so on (so you could do `cut -f hostname,uptime` or `awk '{sum += $x*$y}END{print sum}'`). Patching `cut`, `sort`, etc. on multiple operating systems is a non-starter in terms of uptake. Moreover, it makes sense for me to have Miller be a tool which collects together format-aware record-stream processing into one place, with good reuse of Miller-internal library code for its various features.
**What about DOTADIW?** One of the key points of the [Unix philosophy](http://en.wikipedia.org/wiki/Unix_philosophy) is that a tool should do one thing and do it well. Hence, `sort` and `cut` do just one thing. Why does Miller put `awk`-like processing, a few SQL-like operations, and statistical reduction all into one tool? This is a fair question. First, note that many standard tools, such as `awk` and `perl`, do quite a few things -- as does `jq`. But I could have pushed for putting format awareness and name-indexing options into `cut`, `awk`, and so on (so you could do `cut -f hostname,uptime` or `awk '{sum += $x*$y}END{print sum}'`). Patching `cut`, `sort`, etc., on multiple operating systems is a non-starter in terms of uptake. Moreover, it makes sense for me to have Miller be a tool that collects together format-aware record-stream processing into one place, with good reuse of Miller's internal library code for its various features.
**Why not use Perl/Python/Ruby etc.?** Maybe you should. With those tools you'll get far more expressive power, and sufficiently quick turnaround time for small-to-medium-sized data. Using Miller you'll get something less than a complete programming language, but which is fast, with moderate amounts of flexibility and much less keystroking.
**Why not use Perl/Python/Ruby, etc.?** Maybe you should. With those tools, you'll gain significantly more expressive power and a sufficiently quick turnaround time for small to medium-sized datasets. Using Miller, you'll get something less than a complete programming language, but which is fast, with moderate amounts of flexibility and much less keystroking.
When I was first developing Miller I made a survey of several languages. Using low-level implementation languages like C, Go, Rust, and Nim, I'd need to create my own domain-specific language (DSL) which would always be less featured than a full programming language, but I'd get better performance. Using high-level interpreted languages such as Perl/Python/Ruby I'd get the language's `eval` for free and I wouldn't need a DSL; Miller would have mainly been a set of format-specific I/O hooks. If I'd gotten good enough performance from the latter I'd have done it without question and Miller would be far more flexible. But low-level languages win the performance criteria by a landslide so we have Miller in Go with a custom DSL.
When I was first developing Miller, I made a survey of several languages. Using low-level implementation languages like C, Go, Rust, and Nim, I'd need to create my own domain-specific language (DSL), which would always be less featured than a full programming language, but I'd get better performance. Using high-level interpreted languages such as Perl/Python/Ruby, I'd get the language's `eval` for free and I wouldn't need a DSL; Miller would have mainly been a set of format-specific I/O hooks. If I'd gotten good enough performance from the latter, I'd have done it without question, and Miller would be far more flexible. But low-level languages win the performance criteria by a landslide, so we have Miller in Go with a custom DSL.
**No, really, why one more command-line data-manipulation tool?** I wrote Miller because I was frustrated with tools like `grep`, `sed`, and so on being *line-aware* without being *format-aware*. The single most poignant example I can think of is seeing people grep data lines out of their CSV files and sadly losing their header lines. While some lighter-than-SQL processing is very nice to have, at core I wanted the format-awareness of [RecordStream](https://github.com/benbernard/RecordStream) combined with the raw speed of the Unix toolkit. Miller does precisely that.
**No, really, why one more command-line data-manipulation tool?** I wrote Miller because I was frustrated with tools like `grep`, `sed`, and so on being *line-aware* without being *format-aware*. The single most poignant example I can think of is seeing people grep data lines from their CSV files and sadly losing their header lines. While some lighter-than-SQL processing is very nice to have, at core I wanted the format-awareness of [RecordStream](https://github.com/benbernard/RecordStream) combined with the raw speed of the Unix toolkit. Miller does precisely that.


@ -1,6 +1,6 @@
# How original is Miller?
It isn't. Miller is one of many, many participants in the online-analytical-processing culture. Other key participants include `awk`, SQL, spreadsheets, etc. etc. etc. Far from being an original concept, Miller explicitly strives to imitate several existing tools:
It isn't. Miller is just one of many participants in the online analytical processing culture. Other key participants include `awk`, SQL, spreadsheets, etc. etc. etc. Far from being an original concept, Miller explicitly strives to imitate several existing tools:
**The Unix toolkit**: Intentional similarities as described in [Unix-toolkit Context](unix-toolkit-context.md).
@ -10,7 +10,7 @@ Recipes abound for command-line data analysis using the Unix toolkit. Here are j
* [http://www.gregreda.com/2013/07/15/unix-commands-for-data-science](http://www.gregreda.com/2013/07/15/unix-commands-for-data-science)
* [https://github.com/dbohdan/structured-text-tools](https://github.com/dbohdan/structured-text-tools)
**RecordStream**: Miller owes particular inspiration to [RecordStream](https://github.com/benbernard/RecordStream). The key difference is that RecordStream is a Perl-based tool for manipulating JSON (including requiring it to separately manipulate other formats such as CSV into and out of JSON), while Miller is fast Go which handles its formats natively. The similarities include the `sort`, `stats1` (analog of RecordStream's `collate`), and `delta` operations, as well as `filter` and `put`, and pretty-print formatting.
**RecordStream**: Miller owes particular inspiration to [RecordStream](https://github.com/benbernard/RecordStream). The key difference is that RecordStream is a Perl-based tool for manipulating JSON (including requiring it to separately manipulate other formats such as CSV into and out of JSON), while Miller is a fast Go tool that handles its formats natively. The similarities include the `sort`, `stats1` (analogous to RecordStream's `collate`), and `delta` operations, as well as `filter` and `put`, and the use of pretty-print formatting.
**stats_m**: A third source of lineage is my Python [stats_m](https://github.com/johnkerl/scripts-math/tree/master/stats) module. This includes simple single-pass algorithms which form Miller's `stats1` and `stats2` subcommands.
@ -19,21 +19,21 @@ Recipes abound for command-line data analysis using the Unix toolkit. Here are j
**Added value**: Miller's added values include:
* Name-indexing, compared to the Unix toolkit's positional indexing.
* Raw speed, compared to `awk`, RecordStream, `stats_m`, or various other kinds of Python/Ruby/etc. scripts one can easily create.
* Raw speed, compared to `awk`, RecordStream, `stats_m`, or various other kinds of Python/Ruby/etc. scripts that one can easily create.
* Compact keystroking for many common tasks, with a decent amount of flexibility.
* Ability to handle text files on the Unix pipe, without need for creating database tables, compared to SQL databases.
* Ability to handle text files on the Unix pipe, without the need for creating database tables, compared to SQL databases.
* Various file formats, and on-the-fly format conversion.
**jq**: Miller does for name-indexed text what [jq](https://stedolan.github.io/jq/) does for JSON. If you're not already familiar with `jq`, please check it out!
**What about similar tools?**
Here's a comprehensive list: [https://github.com/dbohdan/structured-text-tools](https://github.com/dbohdan/structured-text-tools). Last I knew it doesn't mention [rows](https://github.com/turicas/rows) so here's a plug for that as well. As it turns out, I learned about most of these after writing Miller.
Here's a comprehensive list: [https://github.com/dbohdan/structured-text-tools](https://github.com/dbohdan/structured-text-tools). Last I knew, it doesn't mention [rows](https://github.com/turicas/rows) so here's a plug for that as well. As it turns out, I learned about most of these after writing Miller.
**What about DOTADIW?** One of the key points of the [Unix philosophy](http://en.wikipedia.org/wiki/Unix_philosophy) is that a tool should do one thing and do it well. Hence `sort` and `cut` do just one thing. Why does Miller put `awk`-like processing, a few SQL-like operations, and statistical reduction all into one tool? This is a fair question. First note that many standard tools, such as `awk` and `perl`, do quite a few things -- as does `jq`. But I could have pushed for putting format awareness and name-indexing options into `cut`, `awk`, and so on (so you could do `cut -f hostname,uptime` or `awk '{sum += $x*$y}END{print sum}'`). Patching `cut`, `sort`, etc. on multiple operating systems is a non-starter in terms of uptake. Moreover, it makes sense for me to have Miller be a tool which collects together format-aware record-stream processing into one place, with good reuse of Miller-internal library code for its various features.
**What about DOTADIW?** One of the key points of the [Unix philosophy](http://en.wikipedia.org/wiki/Unix_philosophy) is that a tool should do one thing and do it well. Hence, `sort` and `cut` do just one thing. Why does Miller put `awk`-like processing, a few SQL-like operations, and statistical reduction all into one tool? This is a fair question. First, note that many standard tools, such as `awk` and `perl`, do quite a few things -- as does `jq`. But I could have pushed for putting format awareness and name-indexing options into `cut`, `awk`, and so on (so you could do `cut -f hostname,uptime` or `awk '{sum += $x*$y}END{print sum}'`). Patching `cut`, `sort`, etc., on multiple operating systems is a non-starter in terms of uptake. Moreover, it makes sense for me to have Miller be a tool that collects together format-aware record-stream processing into one place, with good reuse of Miller's internal library code for its various features.
**Why not use Perl/Python/Ruby etc.?** Maybe you should. With those tools you'll get far more expressive power, and sufficiently quick turnaround time for small-to-medium-sized data. Using Miller you'll get something less than a complete programming language, but which is fast, with moderate amounts of flexibility and much less keystroking.
**Why not use Perl/Python/Ruby, etc.?** Maybe you should. With those tools, you'll gain significantly more expressive power and a sufficiently quick turnaround time for small to medium-sized datasets. Using Miller, you'll get something less than a complete programming language, but which is fast, with moderate amounts of flexibility and much less keystroking.
When I was first developing Miller I made a survey of several languages. Using low-level implementation languages like C, Go, Rust, and Nim, I'd need to create my own domain-specific language (DSL) which would always be less featured than a full programming language, but I'd get better performance. Using high-level interpreted languages such as Perl/Python/Ruby I'd get the language's `eval` for free and I wouldn't need a DSL; Miller would have mainly been a set of format-specific I/O hooks. If I'd gotten good enough performance from the latter I'd have done it without question and Miller would be far more flexible. But low-level languages win the performance criteria by a landslide so we have Miller in Go with a custom DSL.
When I was first developing Miller, I made a survey of several languages. Using low-level implementation languages like C, Go, Rust, and Nim, I'd need to create my own domain-specific language (DSL), which would always be less featured than a full programming language, but I'd get better performance. Using high-level interpreted languages such as Perl/Python/Ruby, I'd get the language's `eval` for free and I wouldn't need a DSL; Miller would have mainly been a set of format-specific I/O hooks. If I'd gotten good enough performance from the latter, I'd have done it without question, and Miller would be far more flexible. But low-level languages win the performance criteria by a landslide, so we have Miller in Go with a custom DSL.
**No, really, why one more command-line data-manipulation tool?** I wrote Miller because I was frustrated with tools like `grep`, `sed`, and so on being *line-aware* without being *format-aware*. The single most poignant example I can think of is seeing people grep data lines out of their CSV files and sadly losing their header lines. While some lighter-than-SQL processing is very nice to have, at core I wanted the format-awareness of [RecordStream](https://github.com/benbernard/RecordStream) combined with the raw speed of the Unix toolkit. Miller does precisely that.
**No, really, why one more command-line data-manipulation tool?** I wrote Miller because I was frustrated with tools like `grep`, `sed`, and so on being *line-aware* without being *format-aware*. The single most poignant example I can think of is seeing people grep data lines from their CSV files and sadly losing their header lines. While some lighter-than-SQL processing is very nice to have, at core I wanted the format-awareness of [RecordStream](https://github.com/benbernard/RecordStream) combined with the raw speed of the Unix toolkit. Miller does precisely that.


@ -50,7 +50,7 @@ described below:
* Suppression/unsuppression:
* `export MLR_NO_COLOR=true` means Miller won't color even when it normally would.
* `export MLR_NO_COLOR=true` or `export NO_COLOR=true` means Miller won't color even when it normally would.
* `export MLR_ALWAYS_COLOR=true` means Miller will color even when it normally would not. For example, you might want to use this when piping `mlr` output to `less -r`.
* Command-line flags `--no-color` or `-M`, `--always-color` or `-C`.
* On Windows, replace `export` with `set`


@ -34,7 +34,7 @@ described below:
* Suppression/unsuppression:
* `export MLR_NO_COLOR=true` means Miller won't color even when it normally would.
* `export MLR_NO_COLOR=true` or `export NO_COLOR=true` means Miller won't color even when it normally would.
* `export MLR_ALWAYS_COLOR=true` means Miller will color even when it normally would not. For example, you might want to use this when piping `mlr` output to `less -r`.
* Command-line flags `--no-color` or `-M`, `--always-color` or `-C`.
* On Windows, replace `export` with `set`


@ -118,9 +118,7 @@ However, if we ask for left-unpaireds, since there's no `color` column, we get a
id,code,color
4,ff0000,red
2,00ff00,green
id,code
3,0000ff
3,0000ff,
</pre>
To fix this, we can use **unsparsify**:


@ -16,12 +16,11 @@ Quick links:
</div>
# Record-heterogeneity
We think of CSV tables as rectangular: if there are 17 columns in the header
then there are 17 columns for every row, else the data have a formatting error.
We think of CSV tables as rectangular: if there are 17 columns in the header, then there are 17 columns for every row, else the data has a formatting error.
But heterogeneous data abound -- log-file entries, JSON documents, no-SQL
databases such as MongoDB, etc. -- not to mention **data-cleaning
opportunities** we'll look at in this page. Miller offers several ways to
opportunities** we'll look at on this page. Miller offers several ways to
handle data heterogeneity.
## Terminology, examples, and solutions
@ -56,7 +55,7 @@ It has three records (written here using JSON Lines formatting):
Here every row has the same keys, in the same order: `a,b,c`.
These are also sometimes called **rectangular** since if we pretty-print them we get a nice rectangle:
These are also sometimes called **rectangular** since if we pretty-print them, we get a nice rectangle:
<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint cat data/het/hom.csv</b>
@ -94,7 +93,7 @@ a,b,c
This example is still homogeneous, though: every row has the same keys, in the same order: `a,b,c`.
Empty values don't make the data heterogeneous.
Note however that we can use the [`fill-empty`](reference-verbs.md#fill-empty) verb to make these
Note, however, that we can use the [`fill-empty`](reference-verbs.md#fill-empty) verb to make these
values non-empty, if we like:
<pre class="pre-highlight-in-pair">
@ -109,7 +108,7 @@ filler 8 9
### Ragged data
Next let's look at non-well-formed CSV files. For a third example:
Next, let's look at non-well-formed CSV files. For a third example:
<pre class="pre-highlight-in-pair">
<b>cat data/het/ragged.csv</b>
@ -127,18 +126,14 @@ If you `mlr --csv cat` this, you'll get an error message:
<b>mlr --csv cat data/het/ragged.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b,c
1,2,3
mlr: mlr: CSV header/data length mismatch 3 != 2 at filename data/het/ragged.csv row 3.
.
</pre>
There are two kinds of raggedness here. Since CSVs form records by zipping the
keys from the header line together with the values from each data line, the
second record has a missing value for key `c` (which ought to be fillable),
while the third record has a value `10` with no key for it.
There are two kinds of raggedness here. Since CSVs form records by zipping the keys from the header line together with the values from each data line, the second record has a missing value for key `c` (which ought to be fillable), while the third record has a value `10` with no key for it.
Using the [`--allow-ragged-csv-input` flag](reference-main-flag-list.md#csv-only-flags)
we can fill values in too-short rows, and provide a key (column number starting
with 1) for too-long rows:
Using the [`--allow-ragged-csv-input` flag](reference-main-flag-list.md#csv-only-flags), we can fill values in too-short rows and provide a key (column number starting with 1) for too-long rows:
<pre class="pre-highlight-in-pair">
<b>mlr --icsv --ojson --allow-ragged-csv-input cat data/het/ragged.csv</b>
@ -152,8 +147,7 @@ with 1) for too-long rows:
},
{
"a": 4,
"b": 5,
"c": ""
"b": 5
},
{
"a": 7,
@ -186,7 +180,7 @@ This kind of data arises often in practice. One reason is that, while many
programming languages (including the Miller DSL) [preserve insertion
order](reference-main-maps.md#insertion-order-is-preserved) in maps; others do
not. So someone might have written `{"a":4,"b":5,"c":6}` in the source code,
but the data may not have printed that way into a given data file.
but the data may not have been printed that way into a given data file.
We can use the [`regularize`](reference-verbs.md#regularize) or
[`sort-within-records`](reference-verbs.md#sort-within-records) verb to order
@ -203,13 +197,13 @@ the keys:
The `regularize` verb tries to re-order subsequent rows to look like the first
(whatever order that is); the `sort-within-records` verb simply uses
alphabetical order (which is the same in the above example where the first
alphabetical order (which is the same in the above example, where the first
record has keys in the order `a,b,c`).
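To make the difference between the two verbs concrete, here is a minimal Python sketch of the behavior just described. It is a model under stated assumptions, not Miller's implementation: the function names simply mirror the verb names, and records are represented as insertion-ordered Python dicts.

```python
def regularize(records):
    """Re-order each record's keys to match the first record seen
    with the same set of keys (whatever order that was)."""
    first_seen_order = {}
    output = []
    for record in records:
        # The sorted key tuple identifies "same keys, any order".
        signature = tuple(sorted(record))
        order = first_seen_order.setdefault(signature, list(record))
        output.append({key: record[key] for key in order})
    return output


def sort_within_records(records):
    """Re-order each record's keys alphabetically."""
    return [dict(sorted(record.items())) for record in records]


if __name__ == "__main__":
    records = [
        {"c": 3, "a": 1, "b": 2},
        {"a": 4, "b": 5, "c": 6},
    ]
    print(regularize(records))
    print(sort_within_records(records))
```

With a first record ordered `c,a,b`, the sketch's `regularize` keeps that order for later records, while `sort_within_records` always produces `a,b,c`. The two agree only when the first record is already in alphabetical order.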
### Sparse data
Here's another frequently occurring situation -- quite often, systems will log
data for items which are present, but won't log data for items which aren't.
data for items that are present, but won't log data for items that aren't.
<pre class="pre-highlight-in-pair">
<b>mlr --json cat data/het/sparse.json</b>
@ -236,8 +230,7 @@ data for items which are present, but won't log data for items which aren't.
This data is called **sparse** (from the [data-storage term](https://en.wikipedia.org/wiki/Sparse_matrix)).
We can use the [`unsparsify`](reference-verbs.md#unsparsify) verb to make sure
every record has the same keys:
We can use the [`unsparsify`](reference-verbs.md#unsparsify) verb to make sure every record has the same keys:
<pre class="pre-highlight-in-pair">
<b>mlr --json unsparsify data/het/sparse.json</b>
@ -282,12 +275,11 @@ xy55.east - /dev/sda1 failover true
## Reading and writing heterogeneous data
In the previous sections we saw different kinds of data heterogeneity, and ways
to transform the data to make it homogeneous.
In the previous sections, we saw different kinds of data heterogeneity and ways to transform the data to make it homogeneous.
### Non-rectangular file formats: JSON, XTAB, NIDX, DKVP
For these formats, record-heterogeneity comes naturally:
For these formats, record heterogeneity comes naturally:
<pre class="pre-highlight-in-pair">
<b>cat data/het/sparse.json</b>
@ -371,16 +363,15 @@ record_count=150,resource=/path/to/second/file
### Rectangular file formats: CSV and pretty-print
CSV and pretty-print formats expect rectangular structure. But Miller lets you
CSV and pretty-print formats expect a rectangular structure. But Miller lets you
process non-rectangular data using CSV and pretty-print.
Miller simply prints a newline and a new header when there is a schema change
-- where by _schema_ we mean simply the list of record keys in the order they
are encountered. When there is no schema change, you get CSV per se as a
special case. Likewise, Miller reads heterogeneous CSV or pretty-print input
the same way. The difference between CSV and CSV-lite is that the former is
[RFC-4180-compliant](file-formats.md#csvtsvasvusvetc), while the latter readily
handles heterogeneous data (which is non-compliant). For example:
For CSV-lite and TSV-lite, Miller prints a newline and a new header when there is a schema
change -- where by _schema_ we mean the list of record keys in the order they are
encountered. When there is no schema change, you get CSV per se as a special case. Likewise, Miller
reads heterogeneous CSV or pretty-print input the same way. The difference between CSV and CSV-lite
is that the former is [RFC-4180-compliant](file-formats.md#csvtsvasvusvetc), while the latter
readily handles heterogeneous data (which is non-compliant). For example:
<pre class="pre-highlight-in-pair">
<b>cat data/het.json</b>
@ -445,28 +436,52 @@ record_count resource
150 /path/to/second/file
</pre>
Miller handles explicit header changes as just shown. If your CSV input contains ragged data -- if there are implicit header changes (no intervening blank line and new header line) as seen above -- you can use `--allow-ragged-csv-input` (or keystroke-saver `--ragged`).
<pre class="pre-highlight-in-pair">
<b>mlr --ijson --ocsvlite group-like data/het.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
resource,loadsec,ok
/path/to/file,0.45,true
/path/to/second/file,0.32,true
/some/other/path,0.97,false
record_count,resource
100,/path/to/file
150,/path/to/second/file
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --csv --ragged cat data/het/ragged.csv</b>
<b>mlr --ijson --ocsv group-like data/het.json</b>
</pre>
<pre class="pre-non-highlight-in-pair">
resource,loadsec,ok
/path/to/file,0.45,true
/path/to/second/file,0.32,true
/some/other/path,0.97,false
mlr: CSV schema change: first keys "resource,loadsec,ok"; current keys "record_count,resource"
mlr: exiting due to data error.
</pre>
Miller handles explicit header changes as shown. If your CSV input contains ragged data -- if there are implicit header changes (no intervening blank line and new header line) as seen above -- you can use `--allow-ragged-csv-input` (or keystroke-saver `--ragged`).
<pre class="pre-highlight-in-pair">
<b>mlr --csv --allow-ragged-csv-input cat data/het/ragged.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
a,b,c
1,2,3
4,5,
a,b,c,4
7,8,9,10
</pre>
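The CSV-lite schema-change rule described in this section can be sketched in a few lines of Python. This is an illustrative model, not Miller's writer: `write_csvlite` is a hypothetical name, values are assumed to be plain strings, and field quoting is deliberately left out.

```python
def write_csvlite(records):
    """Render records as CSV-lite text.

    A blank line and a new header are emitted whenever the list of keys
    (in order) differs from the previous record's. No quoting is done,
    so values containing commas or newlines are out of scope here.
    """
    lines = []
    previous_keys = None
    for record in records:
        keys = list(record)
        if keys != previous_keys:
            if previous_keys is not None:
                lines.append("")  # blank line separates schema blocks
            lines.append(",".join(keys))
            previous_keys = keys
        lines.append(",".join(str(value) for value in record.values()))
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    records = [
        {"resource": "/path/to/file", "loadsec": "0.45", "ok": "true"},
        {"resource": "/some/other/path", "loadsec": "0.97", "ok": "false"},
        {"record_count": "100", "resource": "/path/to/file"},
    ]
    print(write_csvlite(records), end="")
```

When every record has the same keys in the same order, only one header is written, which is the ordinary CSV case.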
## Processing heterogeneous data
Above we saw how to make heterogeneous data homogeneous, and then how to print heterogeneous data.
As for other processing, record-heterogeneity is not a problem for Miller.
As for other processing, record heterogeneity is not a problem for Miller.
Miller operates on specified fields and takes the rest along: for example, if
you are sorting on the `count` field then all records in the input stream must
have a `count` field but the other fields can vary, and moreover the sorted-on
you are sorting on the `count` field, then all records in the input stream must
have a `count` field, but the other fields can vary---and moreover the sorted-on
field name(s) don't need to be in the same position on each line:
<pre class="pre-highlight-in-pair">

View file

@ -1,11 +1,10 @@
# Record-heterogeneity
We think of CSV tables as rectangular: if there are 17 columns in the header
then there are 17 columns for every row, else the data have a formatting error.
We think of CSV tables as rectangular: if there are 17 columns in the header, then there are 17 columns for every row, else the data has a formatting error.
But heterogeneous data abound -- log-file entries, JSON documents, no-SQL
databases such as MongoDB, etc. -- not to mention **data-cleaning
opportunities** we'll look at in this page. Miller offers several ways to
opportunities** we'll look at on this page. Miller offers several ways to
handle data heterogeneity.
## Terminology, examples, and solutions
@ -29,7 +28,7 @@ GENMD-EOF
Here every row has the same keys, in the same order: `a,b,c`.
These are also sometimes called **rectangular** since if we pretty-print them we get a nice rectangle:
These are also sometimes called **rectangular** since if we pretty-print them, we get a nice rectangle:
GENMD-RUN-COMMAND
mlr --icsv --opprint cat data/het/hom.csv
@ -50,7 +49,7 @@ GENMD-EOF
This example is still homogeneous, though: every row has the same keys, in the same order: `a,b,c`.
Empty values don't make the data heterogeneous.
Note however that we can use the [`fill-empty`](reference-verbs.md#fill-empty) verb to make these
Note, however, that we can use the [`fill-empty`](reference-verbs.md#fill-empty) verb to make these
values non-empty, if we like:
GENMD-RUN-COMMAND
@ -59,7 +58,7 @@ GENMD-EOF
### Ragged data
Next let's look at non-well-formed CSV files. For a third example:
Next, let's look at non-well-formed CSV files. For a third example:
GENMD-RUN-COMMAND
cat data/het/ragged.csv
@ -71,14 +70,9 @@ GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr --csv cat data/het/ragged.csv
GENMD-EOF
There are two kinds of raggedness here. Since CSVs form records by zipping the
keys from the header line together with the values from each data line, the
second record has a missing value for key `c` (which ought to be fillable),
while the third record has a value `10` with no key for it.
There are two kinds of raggedness here. Since CSVs form records by zipping the keys from the header line, together with the values from each data line, the second record has a missing value for key `c` (which ought to be fillable), while the third record has a value `10` with no key for it.
Using the [`--allow-ragged-csv-input` flag](reference-main-flag-list.md#csv-only-flags)
we can fill values in too-short rows, and provide a key (column number starting
with 1) for too-long rows:
Using the [`--allow-ragged-csv-input` flag](reference-main-flag-list.md#csv-only-flags), we can fill values in too-short rows and provide a key (column number starting with 1) for too-long rows:
GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr --icsv --ojson --allow-ragged-csv-input cat data/het/ragged.csv
@ -101,7 +95,7 @@ This kind of data arises often in practice. One reason is that, while many
programming languages (including the Miller DSL) [preserve insertion
order](reference-main-maps.md#insertion-order-is-preserved) in maps, others do
not. So someone might have written `{"a":4,"b":5,"c":6}` in the source code,
but the data may not have printed that way into a given data file.
but the data may not have been printed that way into a given data file.
We can use the [`regularize`](reference-verbs.md#regularize) or
[`sort-within-records`](reference-verbs.md#sort-within-records) verb to order
@ -113,13 +107,13 @@ GENMD-EOF
The `regularize` verb tries to re-order subsequent rows to look like the first
(whatever order that is); the `sort-within-records` verb simply uses
alphabetical order (which is the same in the above example where the first
alphabetical order (which is the same in the above example, where the first
record has keys in the order `a,b,c`).
### Sparse data
Here's another frequently occurring situation -- quite often, systems will log
data for items which are present, but won't log data for items which aren't.
data for items that are present, but won't log data for items that aren't.
GENMD-RUN-COMMAND
mlr --json cat data/het/sparse.json
@ -127,8 +121,7 @@ GENMD-EOF
This data is called **sparse** (from the [data-storage term](https://en.wikipedia.org/wiki/Sparse_matrix)).
We can use the [`unsparsify`](reference-verbs.md#unsparsify) verb to make sure
every record has the same keys:
We can use the [`unsparsify`](reference-verbs.md#unsparsify) verb to make sure every record has the same keys:
GENMD-RUN-COMMAND
mlr --json unsparsify data/het/sparse.json
@ -142,12 +135,11 @@ GENMD-EOF
## Reading and writing heterogeneous data
In the previous sections we saw different kinds of data heterogeneity, and ways
to transform the data to make it homogeneous.
In the previous sections, we saw different kinds of data heterogeneity and ways to transform the data to make it homogeneous.
### Non-rectangular file formats: JSON, XTAB, NIDX, DKVP
For these formats, record-heterogeneity comes naturally:
For these formats, record heterogeneity comes naturally:
GENMD-RUN-COMMAND
cat data/het/sparse.json
@ -177,16 +169,15 @@ GENMD-EOF
### Rectangular file formats: CSV and pretty-print
CSV and pretty-print formats expect rectangular structure. But Miller lets you
CSV and pretty-print formats expect a rectangular structure. But Miller lets you
process non-rectangular data using CSV and pretty-print.
Miller simply prints a newline and a new header when there is a schema change
-- where by _schema_ we mean simply the list of record keys in the order they
are encountered. When there is no schema change, you get CSV per se as a
special case. Likewise, Miller reads heterogeneous CSV or pretty-print input
the same way. The difference between CSV and CSV-lite is that the former is
[RFC-4180-compliant](file-formats.md#csvtsvasvusvetc), while the latter readily
handles heterogeneous data (which is non-compliant). For example:
For CSV-lite and TSV-lite, Miller prints a newline and a new header when there is a schema
change -- where by _schema_ we mean the list of record keys in the order they are
encountered. When there is no schema change, you get CSV per se as a special case. Likewise, Miller
reads heterogeneous CSV or pretty-print input the same way. The difference between CSV and CSV-lite
is that the former is [RFC-4180-compliant](file-formats.md#csvtsvasvusvetc), while the latter
readily handles heterogeneous data (which is non-compliant). For example:
GENMD-RUN-COMMAND
cat data/het.json
@ -200,20 +191,28 @@ GENMD-RUN-COMMAND
mlr --ijson --opprint group-like data/het.json
GENMD-EOF
Miller handles explicit header changes as just shown. If your CSV input contains ragged data -- if there are implicit header changes (no intervening blank line and new header line) as seen above -- you can use `--allow-ragged-csv-input` (or keystroke-saver `--ragged`).
GENMD-RUN-COMMAND
mlr --ijson --ocsvlite group-like data/het.json
GENMD-EOF
GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr --csv --ragged cat data/het/ragged.csv
mlr --ijson --ocsv group-like data/het.json
GENMD-EOF
Miller handles explicit header changes as shown. If your CSV input contains ragged data -- if there are implicit header changes (no intervening blank line and new header line) as seen above -- you can use `--allow-ragged-csv-input` (or keystroke-saver `--ragged`).
GENMD-RUN-COMMAND
mlr --csv --allow-ragged-csv-input cat data/het/ragged.csv
GENMD-EOF
## Processing heterogeneous data
Above we saw how to make heterogeneous data homogeneous, and then how to print heterogeneous data.
As for other processing, record-heterogeneity is not a problem for Miller.
As for other processing, record heterogeneity is not a problem for Miller.
Miller operates on specified fields and takes the rest along: for example, if
you are sorting on the `count` field then all records in the input stream must
have a `count` field but the other fields can vary, and moreover the sorted-on
you are sorting on the `count` field, then all records in the input stream must
have a `count` field, but the other fields can vary---and moreover the sorted-on
field name(s) don't need to be in the same position on each line:
GENMD-RUN-COMMAND

View file

@ -16,9 +16,7 @@ Quick links:
</div>
# DSL built-in functions
These are functions in the [Miller programming language](miller-programming-language.md)
that you can call when you use `mlr put` and `mlr filter`. For example, when you type
These are functions in the [Miller programming language](miller-programming-language.md) that you can call when you use `mlr put` and `mlr filter`. For example, when you type
<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv put '</b>
<b> $color = toupper($color);</b>
@ -43,26 +41,13 @@ the `toupper` and `gsub` bits are _functions_.
## Overview
At the command line, you can use `mlr -f` and `mlr -F` for information much
like what's on this page.
At the command line, you can use `mlr -f` and `mlr -F` for information much like what's on this page.
Each function takes a specific number of arguments, as shown below, except for
functions marked as variadic such as `min` and `max`. (The latter compute min
and max of any number of arguments.) There is no notion of optional or
default-on-absent arguments. All argument-passing is positional rather than by
name; arguments are passed by value, not by reference.
Each function takes a specific number of arguments, as shown below, except for functions marked as variadic, such as `min` and `max`. (The latter compute the min and max of any number of arguments.) There is no notion of optional or default-on-absent arguments. All argument-passing is positional rather than by name; arguments are passed by value, not by reference.
At the command line, you can get a list of all functions using `mlr -f`, with
details using `mlr -F`. (Or, `mlr help usage-functions-by-class` to get
details in the order shown on this page.) You can get detail for a given
function using `mlr help function namegoeshere`, e.g. `mlr help function
gsub`.
At the command line, you can get a list of all functions using `mlr -f`, with details using `mlr -F`. (Or, `mlr help usage-functions-by-class` to get details in the order shown on this page.) You can get details for a given function using `mlr help function namegoeshere`, e.g., `mlr help function gsub`.
Operators are listed here along with functions. In this case, the
argument-count is the number of items involved in the infix operator, e.g. we
say `x+y` so the details for the `+` operator say that its number of arguments
is 2. Unary operators such as `!` and `~` show argument-count of 1; the ternary
`? :` operator shows an argument-count of 3.
Operators are listed here along with functions. In this case, the argument count refers to the number of items involved in the infix operator. For example, we say `x+y`, so the details for the `+` operator indicate that it has two arguments. Unary operators such as `!` and `~` show an argument count of 1; the ternary `? :` operator shows an argument count of 3.
## Functions by class
@ -74,9 +59,10 @@ is 2. Unary operators such as `!` and `~` show argument-count of 1; the ternary
* [**Hashing functions**](#hashing-functions): [md5](#md5), [sha1](#sha1), [sha256](#sha256), [sha512](#sha512).
* [**Higher-order-functions functions**](#higher-order-functions-functions): [any](#any), [apply](#apply), [every](#every), [fold](#fold), [reduce](#reduce), [select](#select), [sort](#sort).
* [**Math functions**](#math-functions): [abs](#abs), [acos](#acos), [acosh](#acosh), [asin](#asin), [asinh](#asinh), [atan](#atan), [atan2](#atan2), [atanh](#atanh), [cbrt](#cbrt), [ceil](#ceil), [cos](#cos), [cosh](#cosh), [erf](#erf), [erfc](#erfc), [exp](#exp), [expm1](#expm1), [floor](#floor), [invqnorm](#invqnorm), [log](#log), [log10](#log10), [log1p](#log1p), [logifit](#logifit), [max](#max), [min](#min), [qnorm](#qnorm), [round](#round), [roundm](#roundm), [sgn](#sgn), [sin](#sin), [sinh](#sinh), [sqrt](#sqrt), [tan](#tan), [tanh](#tanh), [urand](#urand), [urand32](#urand32), [urandelement](#urandelement), [urandint](#urandint), [urandrange](#urandrange).
* [**String functions**](#string-functions): [capitalize](#capitalize), [clean_whitespace](#clean_whitespace), [collapse_whitespace](#collapse_whitespace), [format](#format), [gssub](#gssub), [gsub](#gsub), [index](#index), [latin1_to_utf8](#latin1_to_utf8), [leftpad](#leftpad), [lstrip](#lstrip), [regextract](#regextract), [regextract_or_else](#regextract_or_else), [rightpad](#rightpad), [rstrip](#rstrip), [ssub](#ssub), [strip](#strip), [strlen](#strlen), [sub](#sub), [substr](#substr), [substr0](#substr0), [substr1](#substr1), [tolower](#tolower), [toupper](#toupper), [truncate](#truncate), [unformat](#unformat), [unformatx](#unformatx), [utf8_to_latin1](#utf8_to_latin1), [\.](#dot).
* [**System functions**](#system-functions): [exec](#exec), [hostname](#hostname), [os](#os), [system](#system), [version](#version).
* [**Time functions**](#time-functions): [dhms2fsec](#dhms2fsec), [dhms2sec](#dhms2sec), [fsec2dhms](#fsec2dhms), [fsec2hms](#fsec2hms), [gmt2localtime](#gmt2localtime), [gmt2sec](#gmt2sec), [hms2fsec](#hms2fsec), [hms2sec](#hms2sec), [localtime2gmt](#localtime2gmt), [localtime2sec](#localtime2sec), [sec2dhms](#sec2dhms), [sec2gmt](#sec2gmt), [sec2gmtdate](#sec2gmtdate), [sec2hms](#sec2hms), [sec2localdate](#sec2localdate), [sec2localtime](#sec2localtime), [strftime](#strftime), [strftime_local](#strftime_local), [strptime](#strptime), [strptime_local](#strptime_local), [systime](#systime), [systimeint](#systimeint), [uptime](#uptime).
* [**Stats functions**](#stats-functions): [antimode](#antimode), [count](#count), [distinct_count](#distinct_count), [kurtosis](#kurtosis), [maxlen](#maxlen), [mean](#mean), [meaneb](#meaneb), [median](#median), [minlen](#minlen), [mode](#mode), [null_count](#null_count), [percentile](#percentile), [percentiles](#percentiles), [skewness](#skewness), [sort_collection](#sort_collection), [stddev](#stddev), [sum](#sum), [sum2](#sum2), [sum3](#sum3), [sum4](#sum4), [variance](#variance).
* [**String functions**](#string-functions): [capitalize](#capitalize), [clean_whitespace](#clean_whitespace), [collapse_whitespace](#collapse_whitespace), [contains](#contains), [format](#format), [gssub](#gssub), [gsub](#gsub), [index](#index), [latin1_to_utf8](#latin1_to_utf8), [leftpad](#leftpad), [lstrip](#lstrip), [regextract](#regextract), [regextract_or_else](#regextract_or_else), [rightpad](#rightpad), [rstrip](#rstrip), [ssub](#ssub), [strip](#strip), [strlen](#strlen), [strmatch](#strmatch), [strmatchx](#strmatchx), [sub](#sub), [substr](#substr), [substr0](#substr0), [substr1](#substr1), [tolower](#tolower), [toupper](#toupper), [truncate](#truncate), [unformat](#unformat), [unformatx](#unformatx), [utf8_to_latin1](#utf8_to_latin1), [\.](#dot).
* [**System functions**](#system-functions): [exec](#exec), [hostname](#hostname), [os](#os), [stat](#stat), [system](#system), [version](#version).
* [**Time functions**](#time-functions): [dhms2fsec](#dhms2fsec), [dhms2sec](#dhms2sec), [fsec2dhms](#fsec2dhms), [fsec2hms](#fsec2hms), [gmt2localtime](#gmt2localtime), [gmt2nsec](#gmt2nsec), [gmt2sec](#gmt2sec), [hms2fsec](#hms2fsec), [hms2sec](#hms2sec), [localtime2gmt](#localtime2gmt), [localtime2nsec](#localtime2nsec), [localtime2sec](#localtime2sec), [nsec2gmt](#nsec2gmt), [nsec2gmtdate](#nsec2gmtdate), [nsec2localdate](#nsec2localdate), [nsec2localtime](#nsec2localtime), [sec2dhms](#sec2dhms), [sec2gmt](#sec2gmt), [sec2gmtdate](#sec2gmtdate), [sec2hms](#sec2hms), [sec2localdate](#sec2localdate), [sec2localtime](#sec2localtime), [strfntime](#strfntime), [strfntime_local](#strfntime_local), [strftime](#strftime), [strftime_local](#strftime_local), [strpntime](#strpntime), [strpntime_local](#strpntime_local), [strptime](#strptime), [strptime_local](#strptime_local), [sysntime](#sysntime), [systime](#systime), [systimeint](#systimeint), [upntime](#upntime), [uptime](#uptime).
* [**Typing functions**](#typing-functions): [asserting_absent](#asserting_absent), [asserting_array](#asserting_array), [asserting_bool](#asserting_bool), [asserting_boolean](#asserting_boolean), [asserting_empty](#asserting_empty), [asserting_empty_map](#asserting_empty_map), [asserting_error](#asserting_error), [asserting_float](#asserting_float), [asserting_int](#asserting_int), [asserting_map](#asserting_map), [asserting_nonempty_map](#asserting_nonempty_map), [asserting_not_array](#asserting_not_array), [asserting_not_empty](#asserting_not_empty), [asserting_not_map](#asserting_not_map), [asserting_not_null](#asserting_not_null), [asserting_null](#asserting_null), [asserting_numeric](#asserting_numeric), [asserting_present](#asserting_present), [asserting_string](#asserting_string), [is_absent](#is_absent), [is_array](#is_array), [is_bool](#is_bool), [is_boolean](#is_boolean), [is_empty](#is_empty), [is_empty_map](#is_empty_map), [is_error](#is_error), [is_float](#is_float), [is_int](#is_int), [is_map](#is_map), [is_nan](#is_nan), [is_nonempty_map](#is_nonempty_map), [is_not_array](#is_not_array), [is_not_empty](#is_not_empty), [is_not_map](#is_not_map), [is_not_null](#is_not_null), [is_null](#is_null), [is_numeric](#is_numeric), [is_present](#is_present), [is_string](#is_string), [typeof](#typeof).
## Arithmetic functions
@ -533,9 +519,14 @@ $* = fmtifnum($*, "%.6f") formats numeric fields in the current record, leaving
### fmtnum
<pre class="pre-non-highlight-non-pair">
fmtnum (class=conversion #args=2) Convert int/float/bool to string using printf-style format string (https://pkg.go.dev/fmt), e.g. '$s = fmtnum($n, "%08d")' or '$t = fmtnum($n, "%.6e")'. This function recurses on array and map values.
Example:
$x = fmtnum($x, "%.6f")
fmtnum (class=conversion #args=2) Convert int/float/bool to string using printf-style format string (https://pkg.go.dev/fmt), e.g. '$s = fmtnum($n, "%08d")' or '$t = fmtnum($n, "%.6e")'. Miller-specific extension: "%_d" and "%_f" for comma-separated thousands. This function recurses on array and map values.
Examples:
$y = fmtnum($x, "%.6f")
$o = fmtnum($n, "%d")
$o = fmtnum($n, "%12d")
$y = fmtnum($x, "%.6_f")
$o = fmtnum($n, "%_d")
$o = fmtnum($n, "%12_d")
</pre>
@ -877,13 +868,13 @@ logifit (class=math #args=3) Given m and b from logistic regression, compute fi
### max
<pre class="pre-non-highlight-non-pair">
max (class=math #args=variadic) Max of n numbers; null loses.
max (class=math #args=variadic) Max of n numbers; null loses. The min and max functions also recurse into arrays and maps, so they can be used to get min/max stats on array/map values.
</pre>
### min
<pre class="pre-non-highlight-non-pair">
min (class=math #args=variadic) Min of n numbers; null loses.
min (class=math #args=variadic) Min of n numbers; null loses. The min and max functions also recurse into arrays and maps, so they can be used to get min/max stats on array/map values.
</pre>
@ -972,6 +963,231 @@ urandint (class=math #args=2) Integer uniformly distributed between inclusive i
urandrange (class=math #args=2) Floating-point numbers uniformly distributed on the interval [a, b).
</pre>
## Stats functions
### antimode
<pre class="pre-non-highlight-non-pair">
antimode (class=stats #args=1) Returns the least frequently occurring value in an array or map. Returns error for non-array/non-map types. Values are stringified for comparison, so for example string "1" and integer 1 are not distinct. In cases of ties, first-found wins.
Examples:
antimode([3,3,4,4,4]) is 3
antimode([3,3,4,4]) is 3
</pre>
### count
<pre class="pre-non-highlight-non-pair">
count (class=stats #args=1) Returns the length of an array or map. Returns error for non-array/non-map types.
Examples:
count([7,8,9]) is 3
count({"a":7,"b":8,"c":9}) is 3
</pre>
### distinct_count
<pre class="pre-non-highlight-non-pair">
distinct_count (class=stats #args=1) Returns the number of distinct values in an array or map. Returns error for non-array/non-map types. Values are stringified for comparison, so for example string "1" and integer 1 are not distinct.
Examples:
distinct_count([7,8,9,7]) is 3
distinct_count([1,"1"]) is 1
distinct_count([1,1.0]) is 2
</pre>
### kurtosis
<pre class="pre-non-highlight-non-pair">
kurtosis (class=stats #args=1) Returns the sample kurtosis of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
kurtosis([4,5,9,10,11]) is -1.6703688
</pre>
### maxlen
<pre class="pre-non-highlight-non-pair">
maxlen (class=stats #args=1) Returns the maximum string length of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
maxlen(["año", "alto"]) is 4
</pre>
### mean
<pre class="pre-non-highlight-non-pair">
mean (class=stats #args=1) Returns the arithmetic mean of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types.
Example:
mean([4,5,7,10]) is 6.5
</pre>
### meaneb
<pre class="pre-non-highlight-non-pair">
meaneb (class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
meaneb([4,5,7,10]) is 1.3228756
</pre>
### median
<pre class="pre-non-highlight-non-pair">
median (class=stats #args=1,2) Returns the median of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. Please see the percentiles function for information on optional flags, and on performance for large inputs.
Examples:
median([3,4,5,6,9,10]) is 6
median([3,4,5,6,9,10],{"interpolate_linearly":true}) is 5.5
median(["abc", "def", "ghi", "ghi"]) is "ghi"
</pre>
### minlen
<pre class="pre-non-highlight-non-pair">
minlen (class=stats #args=1) Returns the minimum string length of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
minlen(["año", "alto"]) is 3
</pre>
### mode
<pre class="pre-non-highlight-non-pair">
mode (class=stats #args=1) Returns the most frequently occurring value in an array or map. Returns error for non-array/non-map types. Values are stringified for comparison, so for example string "1" and integer 1 are not distinct. In cases of ties, first-found wins.
Examples:
mode([3,3,4,4,4]) is 4
mode([3,3,4,4]) is 3
</pre>
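The "stringified for comparison" and "first-found wins" rules shared by `mode`, `antimode`, and `distinct_count` can be sketched in Python as follows. This is a rough model only: the helper `_tally` is hypothetical, and Python's `str()` stands in for Miller's own stringification, which may format some numbers differently.

```python
from collections import Counter


def _tally(values):
    """Count values by string form, remembering the first value seen for each."""
    counts, first_seen = Counter(), {}
    for value in values:
        key = str(value)          # "1" and 1 share the same key
        counts[key] += 1          # Counter keeps keys in first-seen order
        first_seen.setdefault(key, value)
    return counts, first_seen


def mode(values):
    counts, first_seen = _tally(values)
    # max() returns the first maximal key, so ties go to the first-found value.
    return first_seen[max(counts, key=counts.get)]


def antimode(values):
    counts, first_seen = _tally(values)
    # min() likewise returns the first minimal key.
    return first_seen[min(counts, key=counts.get)]


def distinct_count(values):
    counts, _ = _tally(values)
    return len(counts)


print(mode([3, 3, 4, 4, 4]), antimode([3, 3, 4, 4, 4]), distinct_count([1, "1"]))
```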
### null_count
<pre class="pre-non-highlight-non-pair">
null_count (class=stats #args=1) Returns the number of values in an array or map which are empty-string (AKA void) or JSON null. Returns error for non-array/non-map types. Values are stringified for comparison, so for example string "1" and integer 1 are not distinct.
Example:
null_count(["a", "", "c"]) is 1
</pre>
### percentile
<pre class="pre-non-highlight-non-pair">
percentile (class=stats #args=2,3) Returns the given percentile of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. Please see the percentiles function for information on optional flags, and on performance for large inputs.
Examples:
percentile([3,4,5,6,9,10], 90) is 10
percentile([3,4,5,6,9,10], 90, {"interpolate_linearly":true}) is 9.5
percentile(["abc", "def", "ghi", "ghi"], 90) is "ghi"
</pre>
### percentiles
<pre class="pre-non-highlight-non-pair">
percentiles (class=stats #args=2,3) Returns the given percentiles of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. See examples for information on the three option flags.
Examples:
Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort the input before computing percentiles:
percentiles([3,4,5,6,9,10], [25,75]) is { "25": 4, "75": 9 }
percentiles(["abc", "def", "ghi", "ghi"], [25,75]) is { "25": "def", "75": "ghi" }
Use "output_array_not_map" (or shorthand "oa") to get the outputs as an array:
percentiles([3,4,5,6,9,10], [25,75], {"output_array_not_map":true}) is [4, 9]
Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces error values on string inputs:
percentiles([3,4,5,6,9,10], [25,75], {"interpolate_linearly":true}) is { "25": 4.25, "75": 8.25 }
The percentiles function always sorts its inputs before computing percentiles. If you know your input is already sorted -- see also the sort_collection function -- then computation will be faster on large input if you pass in "array_is_sorted" (shorthand: "ais"):
x = [6,5,9,10,4,3]
percentiles(x, [25,75], {"ais":true}) gives { "25": 5, "75": 4 } which is incorrect
x = sort_collection(x)
percentiles(x, [25,75], {"ais":true}) gives { "25": 4, "75": 9 } which is correct
You can also leverage this feature to compute percentiles on a sort of your choosing. For example:
Non-sorted input:
x = splitax("the quick brown fox jumped loquaciously over the lazy dogs", " ")
x is: ["the", "quick", "brown", "fox", "jumped", "loquaciously", "over", "the", "lazy", "dogs"]
Percentiles are taken over the original positions of the words in the array -- "dogs" is last and hence appears as p99:
percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "dogs"]
With sorting done inside percentiles, "the" is alphabetically last and is therefore the p99:
percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]
With default sorting done outside percentiles, the same:
x = sort(x) # or x = sort_collection(x)
x is: ["brown", "dogs", "fox", "jumped", "lazy", "loquaciously", "over", "quick", "the", "the"]
percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "the"]
percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]
Now sorting by word length, "loquaciously" is longest and hence is the p99:
x = sort(x, func(a,b) { return strlen(a) <=> strlen(b) } )
x is: ["fox", "the", "the", "dogs", "lazy", "over", "brown", "quick", "jumped", "loquaciously"]
percentiles(x, [50, 99], {"oa":true, "ais":true})
["over", "loquaciously"]
</pre>
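For readers who want to see how the non-interpolated and interpolated results above can arise, here is a small Python sketch. The index formulas are an assumption chosen because they reproduce every example in the help text above; they are not taken from Miller's source code. The sketch also assumes the input values are mutually comparable (all numbers or all strings).

```python
def percentile(values, p, interpolate_linearly=False, array_is_sorted=False):
    """Return the p-th percentile (0..100) of values."""
    xs = list(values) if array_is_sorted else sorted(values)
    n = len(xs)
    if n == 0:
        return ""  # stands in for Miller's empty/void result
    if not interpolate_linearly:
        # Pick one existing element: index floor(p/100 * n), clamped to range.
        i = int(p / 100 * n)
        return xs[min(max(i, 0), n - 1)]
    # Interpolate between the two neighbors of fractional index p/100 * (n-1).
    findex = max(p / 100 * (n - 1), 0.0)
    i = int(findex)
    if i + 1 >= n:
        return xs[-1]
    frac = findex - i
    return xs[i] + frac * (xs[i + 1] - xs[i])


data = [3, 4, 5, 6, 9, 10]
print({p: percentile(data, p) for p in (25, 75)})
print({p: percentile(data, p, interpolate_linearly=True) for p in (25, 75)})
```

With `array_is_sorted=True` the sketch skips sorting, so the result depends on whatever order the caller supplies, which is the same trade-off the `"ais"` flag describes.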
### skewness
<pre class="pre-non-highlight-non-pair">
skewness (class=stats #args=1) Returns the sample skewness of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
skewness([4,5,9,10,11]) is -0.2097285
</pre>
### sort_collection
<pre class="pre-non-highlight-non-pair">
sort_collection (class=stats #args=1) This is a helper function for the percentiles function; please see its online help for details.
</pre>
### stddev
<pre class="pre-non-highlight-non-pair">
stddev (class=stats #args=1) Returns the sample standard deviation of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
stddev([4,5,9,10,11]) is 3.1144823
</pre>
### sum
<pre class="pre-non-highlight-non-pair">
sum (class=stats #args=1) Returns the sum of values in an array or map. Returns error for non-array/non-map types.
Example:
sum([1,2,3,4,5]) is 15
</pre>
### sum2
<pre class="pre-non-highlight-non-pair">
sum2 (class=stats #args=1) Returns the sum of squares of values in an array or map. Returns error for non-array/non-map types.
Example:
sum2([1,2,3,4,5]) is 55
</pre>
### sum3
<pre class="pre-non-highlight-non-pair">
sum3 (class=stats #args=1) Returns the sum of cubes of values in an array or map. Returns error for non-array/non-map types.
Example:
sum3([1,2,3,4,5]) is 225
</pre>
### sum4
<pre class="pre-non-highlight-non-pair">
sum4 (class=stats #args=1) Returns the sum of fourth powers of values in an array or map. Returns error for non-array/non-map types.
Example:
sum4([1,2,3,4,5]) is 979
</pre>
### variance
<pre class="pre-non-highlight-non-pair">
variance (class=stats #args=1) Returns the sample variance of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
variance([4,5,9,10,11]) is 9.7
</pre>
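As a cross-check on the documented example values, the following Python sketch recomputes `variance`, `stddev`, and `meaneb` with the standard library. It assumes the usual sample (n-1) definitions, and it assumes `meaneb` is the square root of the sample variance divided by n; that reading matches the example above but is not confirmed here against Miller's source.

```python
import math
import statistics

xs = [4, 5, 9, 10, 11]
n = len(xs)
mean = sum(xs) / n

# Sample variance two ways: directly, and from the power sums (like sum and sum2).
variance = statistics.variance(xs)
sum2 = sum(x**2 for x in xs)
variance_from_sums = (sum2 - n * mean**2) / (n - 1)

stddev = math.sqrt(variance)

# Error bar of the mean for a second sample (assumed formula: sqrt(var / n)).
ys = [4, 5, 7, 10]
meaneb = math.sqrt(statistics.variance(ys) / len(ys))

print(variance, variance_from_sums, stddev, meaneb)
```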
## String functions
@ -983,7 +1199,7 @@ capitalize (class=string #args=1) Convert string's first character to uppercase
### clean_whitespace
<pre class="pre-non-highlight-non-pair">
clean_whitespace (class=string #args=1) Same as collapse_whitespace and strip.
clean_whitespace (class=string #args=1) Same as collapse_whitespace and strip, followed by type inference.
</pre>
@ -993,6 +1209,17 @@ collapse_whitespace (class=string #args=1) Strip repeated whitespace from strin
</pre>
### contains
<pre class="pre-non-highlight-non-pair">
contains (class=string #args=2) Returns true if the first argument contains the second as a substring. This is like saying `index(arg1, arg2) >= 0` but with less keystroking.
Examples:
contains("abcde", "e") gives true
contains("abcde", "x") gives false
contains(12345, 34) gives true
contains("forêt", "ê") gives true
</pre>
### format
<pre class="pre-non-highlight-non-pair">
format (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
@ -1028,7 +1255,7 @@ gsub("prefix4529:suffix8567", "(....ix)([0-9]+)", "[\1 : \2]") gives "[prefix :
index (class=string #args=2) Returns the index (1-based) of the second argument within the first. Returns -1 if the second argument isn't a substring of the first. Stringifies non-string inputs. Uses UTF-8 encoding to count characters, not bytes.
Examples:
index("abcde", "e") gives 5
index("abcde", "x") gives 01
index("abcde", "x") gives -1
index(12345, 34) gives 3
index("forêt", "t") gives 5
</pre>
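The relationship between `index` and `contains` can be modeled in Python as below. This is a sketch of the documented behavior (1-based, character-counting, inputs stringified), not Miller's code; Python's `str.find` counts Unicode code points, which agrees with the UTF-8 examples above.

```python
def index(haystack, needle):
    """1-based position of needle within haystack, or -1 if absent."""
    position = str(haystack).find(str(needle))  # 0-based, -1 if absent
    return position + 1 if position >= 0 else -1


def contains(haystack, needle):
    """True when needle occurs in haystack; same as index(...) >= 0."""
    return index(haystack, needle) >= 0


print(index("forêt", "t"), index("abcde", "x"), contains(12345, 34))
```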
@ -1113,6 +1340,46 @@ strlen (class=string #args=1) String length.
</pre>
### strmatch
<pre class="pre-non-highlight-non-pair">
strmatch (class=string #args=2) Boolean yes/no for whether the stringable first argument matches the regular-expression second argument. No regex captures are provided; please see `strmatchx`.
Examples:
strmatch("a", "abc") is false
strmatch("abc", "a") is true
strmatch("abc", "a[a-z]c") is true
strmatch("abc", "(a).(c)") is true
strmatch(12345, "34") is true
</pre>
### strmatchx
<pre class="pre-non-highlight-non-pair">
strmatchx (class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in contrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
Examples:
strmatchx("a", "abc") returns:
{
"matched": false
}
strmatchx("abc", "a") returns:
{
"matched": true,
"full_capture": "a",
"full_start": 1,
"full_end": 1
}
strmatchx("[zy:3458]", "([a-z]+):([0-9]+)") returns:
{
"matched": true,
"full_capture": "zy:3458",
"full_start": 2,
"full_end": 8,
"captures": ["zy", "3458"],
"starts": [2, 5],
"ends": [3, 8]
}
</pre>
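The shape of the `strmatchx` return map, with 1-based inclusive start and end positions, can be imitated with Python's `re` module. This is an illustrative sketch only: Python and Go regular-expression syntax differ in places, the positions here are character offsets (the examples above are ASCII, so they cannot distinguish characters from bytes), and the handling of optional groups that do not participate in a match is not modeled.

```python
import re


def strmatchx(text, pattern):
    """Return a map describing the first regex match, in the style shown above."""
    m = re.search(pattern, str(text))
    if m is None:
        return {"matched": False}
    result = {
        "matched": True,
        "full_capture": m.group(0),
        "full_start": m.start(0) + 1,  # 0-based start -> 1-based
        "full_end": m.end(0),          # exclusive end == 1-based inclusive end
    }
    if m.re.groups > 0:
        groups = range(1, m.re.groups + 1)
        result["captures"] = [m.group(i) for i in groups]
        result["starts"] = [m.start(i) + 1 for i in groups]
        result["ends"] = [m.end(i) for i in groups]
    return result


def strmatch(text, pattern):
    """Boolean-only variant: did the pattern match at all?"""
    return strmatchx(text, pattern)["matched"]


print(strmatchx("[zy:3458]", "([a-z]+):([0-9]+)"))
```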
### sub
<pre class="pre-non-highlight-non-pair">
sub (class=string #args=3) '$name = sub($name, "old", "new")': replace once (first match, if there are multiple matches), with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to sub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io.
@ -1220,6 +1487,21 @@ os (class=system #args=0) Returns the operating-system name as a string.
</pre>
### stat
<pre class="pre-non-highlight-non-pair">
stat (class=system #args=1) Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.
Examples:
stat("./mlr") gives {
"name": "mlr",
"size": 38391584,
"mode": 0755,
"modtime": 1715207874,
"isdir": false
}
stat("./mlr")["size"] gives 38391584
</pre>
### system
<pre class="pre-non-highlight-non-pair">
system (class=system #args=1) Run command string, yielding its stdout minus final carriage return.
@ -1267,6 +1549,14 @@ gmt2localtime("1999-12-31T22:00:00Z", "Asia/Istanbul") = "2000-01-01 00:00:00"
</pre>
### gmt2nsec
<pre class="pre-non-highlight-non-pair">
gmt2nsec (class=time #args=1) Parses GMT timestamp as integer nanoseconds since the epoch.
Example:
gmt2nsec("2001-02-03T04:05:06Z") = 981173106000000000
</pre>
### gmt2sec
<pre class="pre-non-highlight-non-pair">
gmt2sec (class=time #args=1) Parses GMT timestamp as integer seconds since the epoch.
@ -1296,6 +1586,15 @@ localtime2gmt("2000-01-01 00:00:00", "Asia/Istanbul") = "1999-12-31T22:00:00Z"
</pre>
### localtime2nsec
<pre class="pre-non-highlight-non-pair">
localtime2nsec (class=time #args=1,2) Parses local timestamp as integer nanoseconds since the epoch. Consults $TZ environment variable, unless second argument is supplied.
Examples:
localtime2nsec("2001-02-03 04:05:06") = 981165906000000000 with TZ="Asia/Istanbul"
localtime2nsec("2001-02-03 04:05:06", "Asia/Istanbul") = 981165906000000000
</pre>
### localtime2sec
<pre class="pre-non-highlight-non-pair">
localtime2sec (class=time #args=1,2) Parses local timestamp as integer seconds since the epoch. Consults $TZ environment variable, unless second argument is supplied.
@ -1305,6 +1604,44 @@ localtime2sec("2001-02-03 04:05:06", "Asia/Istanbul") = 981165906"
</pre>
### nsec2gmt
<pre class="pre-non-highlight-non-pair">
nsec2gmt (class=time #args=1,2) Formats integer nanoseconds since epoch as GMT timestamp. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
nsec2gmt(1234567890000000000) = "2009-02-13T23:31:30Z"
nsec2gmt(1234567890123456789) = "2009-02-13T23:31:30Z"
nsec2gmt(1234567890123456789, 6) = "2009-02-13T23:31:30.123456Z"
</pre>
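The arithmetic behind `nsec2gmt` can be sketched in Python: split the integer nanoseconds into whole seconds and a nanosecond remainder, format the seconds as UTC, and keep the first n digits of the remainder. This is a sketch, not Miller's implementation. The name `nsec2gmt_sketch` is hypothetical, only integer inputs are treated as numbers, and truncation (rather than rounding) of the fractional part is inferred from the third example above.

```python
from datetime import datetime, timezone


def nsec2gmt_sketch(nsec, ndecimals=0):
    """Format integer nanoseconds since the epoch as a GMT timestamp."""
    if not isinstance(nsec, int) or isinstance(nsec, bool):
        return nsec  # anything that is not an integer is returned unchanged
    secs, frac = divmod(nsec, 1_000_000_000)
    stamp = datetime.fromtimestamp(secs, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    if ndecimals > 0:
        # Zero-pad the remainder to 9 digits, then truncate to n places.
        stamp += "." + f"{frac:09d}"[:ndecimals]
    return stamp + "Z"


print(nsec2gmt_sketch(1234567890123456789, 6))  # 2009-02-13T23:31:30.123456Z
```

Keeping the value as an integer throughout avoids the precision loss that would come from converting 19-digit nanosecond counts to floating point.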
### nsec2gmtdate
<pre class="pre-non-highlight-non-pair">
nsec2gmtdate (class=time #args=1) Formats integer nanoseconds since epoch as GMT timestamp with year-month-date. Leaves non-numbers as-is.
Example:
nsec2gmtdate(1440768801700000000) = "2015-08-28".
</pre>
### nsec2localdate
<pre class="pre-non-highlight-non-pair">
nsec2localdate (class=time #args=1,2) Formats integer nanoseconds since epoch as local timestamp with year-month-date. Leaves non-numbers as-is. Consults $TZ environment variable unless second argument is supplied.
Examples:
nsec2localdate(1440768801700000000) = "2015-08-28" with TZ="Asia/Istanbul"
nsec2localdate(1440768801700000000, "Asia/Istanbul") = "2015-08-28"
</pre>
### nsec2localtime
<pre class="pre-non-highlight-non-pair">
nsec2localtime (class=time #args=1,2,3) Formats integer nanoseconds since epoch as local timestamp. Consults $TZ environment variable unless third argument is supplied. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
nsec2localtime(1234567890000000000) = "2009-02-14 01:31:30" with TZ="Asia/Istanbul"
nsec2localtime(1234567890123456789) = "2009-02-14 01:31:30" with TZ="Asia/Istanbul"
nsec2localtime(1234567890123456789, 6) = "2009-02-14 01:31:30.123456" with TZ="Asia/Istanbul"
nsec2localtime(1234567890123456789, 6, "Asia/Istanbul") = "2009-02-14 01:31:30.123456"
</pre>
### sec2dhms
<pre class="pre-non-highlight-non-pair">
sec2dhms (class=time #args=1) Formats integer seconds as in sec2dhms(500000) = "5d18h53m20s"
@ -1355,6 +1692,27 @@ sec2localtime(1234567890.123456, 6, "Asia/Istanbul") = "2009-02-14 01:31:30.1234
</pre>
### strfntime
<pre class="pre-non-highlight-non-pair">
strfntime (class=time #args=2) Formats integer nanoseconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strfntime_local.
Examples:
strfntime(1440768801123456789,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strfntime(1440768801123456789,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.123Z"
strfntime(1440768801123456789,"%Y-%m-%dT%H:%M:%6SZ") = "2015-08-28T13:33:21.123456Z"
</pre>
### strfntime_local
<pre class="pre-non-highlight-non-pair">
strfntime_local (class=time #args=2,3) Like strfntime but consults the $TZ environment variable to get local time zone.
Examples:
strfntime_local(1440768801123456789, "%Y-%m-%d %H:%M:%S %z") = "2015-08-28 16:33:21 +0300" with TZ="Asia/Istanbul"
strfntime_local(1440768801123456789, "%Y-%m-%d %H:%M:%3S %z") = "2015-08-28 16:33:21.123 +0300" with TZ="Asia/Istanbul"
strfntime_local(1440768801123456789, "%Y-%m-%d %H:%M:%3S %z", "Asia/Istanbul") = "2015-08-28 16:33:21.123 +0300"
strfntime_local(1440768801123456789, "%Y-%m-%d %H:%M:%9S %z", "Asia/Istanbul") = "2015-08-28 16:33:21.123456789 +0300"
</pre>
### strftime
<pre class="pre-non-highlight-non-pair">
strftime (class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
@ -1374,6 +1732,28 @@ strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z", "Asia/Istanbul") = "2015-0
</pre>
### strpntime
<pre class="pre-non-highlight-non-pair">
strpntime (class=time #args=2) strpntime: Parses timestamp as integer nanoseconds since the epoch. See also strpntime_local.
Examples:
strpntime("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801000000000
strpntime("2015-08-28T13:33:21.345Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801345000000
strpntime("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z") = 14400000000000
strpntime("1970-01-01 00:00:00 +0200", "%Y-%m-%d %H:%M:%S %z") = -7200000000000
</pre>
### strpntime_local
<pre class="pre-non-highlight-non-pair">
strpntime_local (class=time #args=2,3) Like strpntime but consults the $TZ environment variable to get local time zone.
Examples:
strpntime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001000000000 with TZ="Asia/Istanbul"
strpntime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001345000000 with TZ="Asia/Istanbul"
strpntime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S") = 1440758001000000000 with TZ="Asia/Istanbul"
strpntime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S", "Asia/Istanbul") = 1440758001000000000
</pre>
### strptime
<pre class="pre-non-highlight-non-pair">
strptime (class=time #args=2) strptime: Parses timestamp as floating-point seconds since the epoch. See also strptime_local.
@ -1381,13 +1761,13 @@ Examples:
strptime("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801.000000
strptime("2015-08-28T13:33:21.345Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801.345000
strptime("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z") = 14400
strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
strptime("1970-01-01 00:00:00 +0200", "%Y-%m-%d %H:%M:%S %z") = -7200
</pre>
### strptime_local
<pre class="pre-non-highlight-non-pair">
strptime_local (class=time #args=2,3) Like strftime but consults the $TZ environment variable to get local time zone.
strptime_local (class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"
@ -1396,6 +1776,12 @@ strptime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S", "Asia/Istanbul")
</pre>
### sysntime
<pre class="pre-non-highlight-non-pair">
sysntime (class=time #args=0) Returns the system time in 64-bit nanoseconds since the epoch.
</pre>
### systime
<pre class="pre-non-highlight-non-pair">
systime (class=time #args=0) Returns the system time in floating-point seconds since the epoch.
@ -1408,6 +1794,12 @@ systimeint (class=time #args=0) Returns the system time in integer seconds sinc
</pre>
### upntime
<pre class="pre-non-highlight-non-pair">
upntime (class=time #args=0) Returns the time in 64-bit nanoseconds since the current Miller program was started.
</pre>
### uptime
<pre class="pre-non-highlight-non-pair">
uptime (class=time #args=0) Returns the time in floating-point seconds since the current Miller program was started.


@ -1,8 +1,6 @@
# DSL built-in functions
These are functions in the [Miller programming language](miller-programming-language.md)
that you can call when you use `mlr put` and `mlr filter`. For example, when you type
These are functions in the [Miller programming language](miller-programming-language.md) that you can call when you use `mlr put` and `mlr filter`. For example, when you type
GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv put '
$color = toupper($color);
@ -14,25 +12,12 @@ the `toupper` and `gsub` bits are _functions_.
## Overview
At the command line, you can use `mlr -f` and `mlr -F` for information much
like what's on this page.
At the command line, you can use `mlr -f` and `mlr -F` for information much like what's on this page.
Each function takes a specific number of arguments, as shown below, except for
functions marked as variadic such as `min` and `max`. (The latter compute min
and max of any number of arguments.) There is no notion of optional or
default-on-absent arguments. All argument-passing is positional rather than by
name; arguments are passed by value, not by reference.
Each function takes a specific number of arguments, as shown below, except for functions marked as variadic, such as `min` and `max`. (The latter compute the min and max of any number of arguments.) There is no notion of optional or default-on-absent arguments. All argument-passing is positional rather than by name; arguments are passed by value, not by reference.
At the command line, you can get a list of all functions using `mlr -f`, with
details using `mlr -F`. (Or, `mlr help usage-functions-by-class` to get
details in the order shown on this page.) You can get detail for a given
function using `mlr help function namegoeshere`, e.g. `mlr help function
gsub`.
At the command line, you can get a list of all functions using `mlr -f`, with details using `mlr -F`. (Or, `mlr help usage-functions-by-class` to get details in the order shown on this page.) You can get details for a given function using `mlr help function namegoeshere`, e.g., `mlr help function gsub`.
Operators are listed here along with functions. In this case, the
argument-count is the number of items involved in the infix operator, e.g. we
say `x+y` so the details for the `+` operator say that its number of arguments
is 2. Unary operators such as `!` and `~` show argument-count of 1; the ternary
`? :` operator shows an argument-count of 3.
Operators are listed here along with functions. In this case, the argument count refers to the number of items involved in the infix operator. For example, we say `x+y`, so the details for the `+` operator indicate that it has two arguments. Unary operators such as `!` and `~` show an argument count of 1; the ternary `? :` operator shows an argument count of 3.
GENMD-RUN-CONTENT-GENERATOR(./mk-func-info.rb)


@ -16,34 +16,9 @@ Quick links:
</div>
# A note on the complexity of Miller's expression language
One of Miller's strengths is its brevity: it's much quicker -- and less
error-prone -- to type `mlr stats1 -a sum -f x,y -g a,b` than having to track
summation variables as in `awk`, or using Miller's [out-of-stream
variables](reference-dsl-variables.md#out-of-stream-variables). And the more
language features Miller's put-DSL has (for-loops, if-statements, nested
control structures, user-defined functions, etc.) then the *less* powerful it
begins to seem: because of the other programming-language features it *doesn't*
have (classes, exceptions, and so on).
One of Miller's strengths is its brevity: it's much quicker -- and less error-prone -- to type `mlr stats1 -a sum -f x,y -g a,b` than having to track summation variables as in `awk`, or using Miller's [out-of-stream variables](reference-dsl-variables.md#out-of-stream-variables). And the more language features Miller's put-DSL has (for-loops, if-statements, nested control structures, user-defined functions, etc.), then the *less* powerful it begins to seem: because of the other programming-language features it *doesn't* have (classes, exceptions, and so on).
When I was originally prototyping Miller in 2015, the primary decision I had
was whether to hand-code in a low-level language like C or Rust or Go, with my
own hand-rolled DSL, or whether to use a higher-level language (like Python or
Lua or Nim) and let the `put` statements be handled by the implementation
language's own `eval`: the implementation language would take the place of a
DSL. Multiple performance experiments showed me I could get better throughput
using the former, by a wide margin. So Miller is Go under the hood with a
hand-rolled DSL.
When I was initially prototyping Miller in 2015, the primary decision I had was whether to hand-code in a low-level language like C or Rust or Go, with my hand-rolled DSL, or whether to use a higher-level language (like Python or Lua or Nim) and let the `put` statements be handled by the implementation language's own `eval`: the implementation language would take the place of a DSL. Multiple performance experiments showed me I could get better throughput using the former, by a wide margin. So Miller is Go under the hood with a hand-rolled DSL.
I do want to keep focusing on what Miller is good at -- concise notation, low
latency, and high throughput -- and not add too much in terms of
high-level-language features to the DSL. That said, some sort of
customizability is a basic thing to want. As of 4.1.0 we have recursive
`for`/`while`/`if` [structures](reference-dsl-control-structures.md) on about
the same complexity level as `awk`; as of 5.0.0 we have [user-defined
functions](reference-dsl-user-defined-functions.md) and [map-valued
variables](reference-dsl-variables.md), again on about the same complexity level
as `awk` along with optional type-declaration syntax; as of Miller 6 we have
full support for [arrays](reference-main-arrays.md). While I'm excited by these
powerful language features, I hope to keep new features focused on Miller's
sweet spot which is speed plus simplicity.
I want to continue focusing on what Miller excels at — concise notation, low latency, and high throughput — and not add too many high-level language features to the DSL. That said, some customizability is a basic thing to want. As of 4.1.0, we have recursive `for`/`while`/`if` [structures](reference-dsl-control-structures.md) on about the same complexity level as `awk`; as of 5.0.0, we have [user-defined functions](reference-dsl-user-defined-functions.md) and [map-valued variables](reference-dsl-variables.md), again on about the same complexity level as `awk` along with optional type-declaration syntax; as of Miller 6, we have full support for [arrays](reference-main-arrays.md). While I'm excited by these powerful language features, I hope to keep new features focused on Miller's sweet spot, which is speed plus simplicity.


@ -1,33 +1,8 @@
# A note on the complexity of Miller's expression language
One of Miller's strengths is its brevity: it's much quicker -- and less
error-prone -- to type `mlr stats1 -a sum -f x,y -g a,b` than having to track
summation variables as in `awk`, or using Miller's [out-of-stream
variables](reference-dsl-variables.md#out-of-stream-variables). And the more
language features Miller's put-DSL has (for-loops, if-statements, nested
control structures, user-defined functions, etc.) then the *less* powerful it
begins to seem: because of the other programming-language features it *doesn't*
have (classes, exceptions, and so on).
One of Miller's strengths is its brevity: it's much quicker -- and less error-prone -- to type `mlr stats1 -a sum -f x,y -g a,b` than having to track summation variables as in `awk`, or using Miller's [out-of-stream variables](reference-dsl-variables.md#out-of-stream-variables). And the more language features Miller's put-DSL has (for-loops, if-statements, nested control structures, user-defined functions, etc.), then the *less* powerful it begins to seem: because of the other programming-language features it *doesn't* have (classes, exceptions, and so on).
When I was originally prototyping Miller in 2015, the primary decision I had
was whether to hand-code in a low-level language like C or Rust or Go, with my
own hand-rolled DSL, or whether to use a higher-level language (like Python or
Lua or Nim) and let the `put` statements be handled by the implementation
language's own `eval`: the implementation language would take the place of a
DSL. Multiple performance experiments showed me I could get better throughput
using the former, by a wide margin. So Miller is Go under the hood with a
hand-rolled DSL.
When I was initially prototyping Miller in 2015, the primary decision I had was whether to hand-code in a low-level language like C or Rust or Go, with my hand-rolled DSL, or whether to use a higher-level language (like Python or Lua or Nim) and let the `put` statements be handled by the implementation language's own `eval`: the implementation language would take the place of a DSL. Multiple performance experiments showed me I could get better throughput using the former, by a wide margin. So Miller is Go under the hood with a hand-rolled DSL.
I do want to keep focusing on what Miller is good at -- concise notation, low
latency, and high throughput -- and not add too much in terms of
high-level-language features to the DSL. That said, some sort of
customizability is a basic thing to want. As of 4.1.0 we have recursive
`for`/`while`/`if` [structures](reference-dsl-control-structures.md) on about
the same complexity level as `awk`; as of 5.0.0 we have [user-defined
functions](reference-dsl-user-defined-functions.md) and [map-valued
variables](reference-dsl-variables.md), again on about the same complexity level
as `awk` along with optional type-declaration syntax; as of Miller 6 we have
full support for [arrays](reference-main-arrays.md). While I'm excited by these
powerful language features, I hope to keep new features focused on Miller's
sweet spot which is speed plus simplicity.
I want to continue focusing on what Miller excels at — concise notation, low latency, and high throughput — and not add too many high-level language features to the DSL. That said, some customizability is a basic thing to want. As of 4.1.0, we have recursive `for`/`while`/`if` [structures](reference-dsl-control-structures.md) on about the same complexity level as `awk`; as of 5.0.0, we have [user-defined functions](reference-dsl-user-defined-functions.md) and [map-valued variables](reference-dsl-variables.md), again on about the same complexity level as `awk` along with optional type-declaration syntax; as of Miller 6, we have full support for [arrays](reference-main-arrays.md). While I'm excited by these powerful language features, I hope to keep new features focused on Miller's sweet spot, which is speed plus simplicity.


@ -18,7 +18,7 @@ Quick links:
## Pattern-action blocks
These are reminiscent of `awk` syntax. They can be used to allow assignments to be done only when appropriate -- e.g. for math-function domain restrictions, regex-matching, and so on:
These are reminiscent of `awk` syntax. They can be used to allow assignments to be done only when appropriate -- e.g., for math-function domain restrictions, regex-matching, and so on:
<pre class="pre-highlight-in-pair">
<b>mlr cat data/put-gating-example-1.dkvp</b>
@ -64,7 +64,7 @@ a=some other name
a=xyz_789,b=left_xyz,c=right_789
</pre>
This produces heteregenous output which Miller, of course, has no problems with (see [Record Heterogeneity](record-heterogeneity.md)). But if you want homogeneous output, the curly braces can be replaced with a semicolon between the expression and the body statements. This causes `put` to evaluate the boolean expression (along with any side effects, namely, regex-captures `\1`, `\2`, etc.) but doesn't use it as a criterion for whether subsequent assignments should be executed. Instead, subsequent assignments are done unconditionally:
This produces heterogeneous output which Miller, of course, has no problems with (see [Record Heterogeneity](record-heterogeneity.md)). But if you want homogeneous output, the curly braces can be replaced with a semicolon between the expression and the body statements. This causes `put` to evaluate the boolean expression (along with any side effects, namely, regex-captures `\1`, `\2`, etc.) but doesn't use it as a criterion for whether subsequent assignments should be executed. Instead, subsequent assignments are done unconditionally:
<pre class="pre-highlight-in-pair">
<b>mlr --opprint put '</b>
@ -172,7 +172,7 @@ records](operating-on-all-records.md) for some options.
## For-loops
While Miller's `while` and `do-while` statements are much as in many other languages, `for` loops are more idiosyncratic to Miller. They are loops over key-value pairs, whether in stream records, out-of-stream variables, local variables, or map-literals: more reminiscent of `foreach`, as in (for example) PHP. There are **for-loops over map keys** and **for-loops over key-value tuples**. Additionally, Miller has a **C-style triple-for loop** with initialize, test, and update statements. Each is described below.
While Miller's `while` and `do-while` statements are much like those in many other languages, `for` loops are more idiosyncratic to Miller. They are loops over key-value pairs, whether in stream records, out-of-stream variables, local variables, or map-literals: more reminiscent of `foreach`, as in (for example) PHP. There are **for-loops over map keys** and **for-loops over key-value tuples**. Additionally, Miller has a **C-style triple-for loop** with initialize, test, and update statements. Each is described below.
As with `while` and `do-while`, a `break` or `continue` within nested control structures will propagate to the innermost loop enclosing them, if any, and a `break` or `continue` outside a loop is a syntax error that will be flagged as soon as the expression is parsed, before any input records are ingested.
@ -260,11 +260,9 @@ value: true valuetype: bool
### Key-value for-loops
For [maps](reference-main-maps.md), the first loop variable is the key and the
second is the value; for [arrays](reference-main-arrays.md), the first loop
variable is the (1-up) array index and the second is the value.
For [maps](reference-main-maps.md), the first loop variable is the key, and the second is the value. For [arrays](reference-main-arrays.md), the first loop variable is the (1-based) array index, and the second is the value.
Single-level keys may be gotten at using either `for(k,v)` or `for((k),v)`; multi-level keys may be gotten at using `for((k1,k2,k3),v)` and so on. The `v` variable will be bound to a scalar value (non-array/non-map) if the map stops at that level, or to a map-valued or array-valued variable if the map goes deeper. If the map isn't deep enough then the loop body won't be executed.
Single-level keys may be obtained using either `for(k,v)` or `for((k),v)`; multi-level keys may be obtained using `for((k1,k2,k3),v)` and so on. The `v` variable will be bound to a scalar value (non-array/non-map) if the map stops at that level, or to a map-valued or array-valued variable if the map goes deeper. If the map isn't deep enough then the loop body won't be executed.
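For readers who think in terms of nested dictionaries, the multi-level form `for((k1,k2),v)` can be pictured with the Python sketch below. It is an analogy, not Miller's implementation: `walk_keys` and the sample `sums` data are hypothetical, arrays are not modeled, and the choice to silently skip branches that are too shallow follows the sentence above.

```python
def walk_keys(mapping, depth):
    """Yield (keys, value) pairs, one per path of exactly `depth` keys."""
    if depth == 1:
        for k, v in mapping.items():
            yield (k,), v  # v may be a scalar or a deeper map
        return
    for k, sub in mapping.items():
        if isinstance(sub, dict):  # too-shallow branches contribute nothing
            for keys, v in walk_keys(sub, depth - 1):
                yield (k,) + keys, v


sums = {"pan": {"x": 0.34, "y": 0.72}, "eks": {"x": 0.75}, "count": 3}

# Analogous to: for ((k1, k2), v in @sums) { ... }
for (k1, k2), v in walk_keys(sums, 2):
    print(k1, k2, v)
```

With two key variables, the scalar `count` entry is never visited because it has no second level; with a single key variable, it would be visited with `v` bound to 3.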
<pre class="pre-highlight-in-pair">
<b>cat data/for-srec-example.tbl</b>
@ -333,7 +331,7 @@ eks wye 4 0.381399 0.134188 4.515587 18.062348
wye pan 5 0.573288 0.863624 6.4369119999999995 25.747647999999998
</pre>
It can be confusing to modify the stream record while iterating over a copy of it, so instead you might find it simpler to use a local variable in the loop and only update the stream record after the loop:
It can be confusing to modify the stream record while iterating over a copy of it, so instead, you might find it simpler to use a local variable in the loop and only update the stream record after the loop:
<pre class="pre-highlight-in-pair">
<b>mlr --from data/small --opprint put '</b>
@ -355,7 +353,7 @@ eks wye 4 0.381399 0.134188 4.515587
wye pan 5 0.573288 0.863624 6.4369119999999995
</pre>
You can also start iterating on sub-maps of an out-of-stream or local variable; you can loop over nested keys; you can loop over all out-of-stream variables. The bound variables are bound to a copy of the sub-map as it was before the loop started. The sub-map is specified by square-bracketed indices after `in`, and additional deeper indices are bound to loop key-variables. The terminal values are bound to the loop value-variable whenever the keys are not too shallow. The value-variable may refer to a terminal (string, number) or it may be map-valued if the map goes deeper. Example indexing is as follows:
You can also start iterating on sub-maps of an out-of-stream or local variable; you can loop over nested keys; you can loop over all out-of-stream variables. The bound variables are bound to a copy of the sub-map as it was before the loop started. The sub-map is specified by square-bracketed indices after `in`, and additional deeper indices are bound to loop key variables. The terminal values are bound to the loop value variable whenever the keys are not too shallow. The value variable may refer to a terminal (string, number) or it may be map-valued if the map goes deeper. Example indexing is as follows:
<pre class="pre-non-highlight-non-pair">
# Parentheses are optional for single key:
@ -516,15 +514,15 @@ wye pan 5 0.573288 0.863624 15 31
Notes:
* In `for (start; continuation; update) { body }`, the start, continuation, and update statements may be empty, single statements, or multiple comma-separated statements. If the continuation is empty (e.g. `for(i=1;;i+=1)`) it defaults to true.
* In `for (start; continuation; update) { body }`, the start, continuation, and update statements may be empty, single statements, or multiple comma-separated statements. If the continuation is empty (e.g. `for(i=1;;i+=1)`), it defaults to true.
* In particular, you may use `$`-variables and/or `@`-variables in the start, continuation, and/or update steps (as well as the body, of course).
* The typedecls such as `int` or `num` are optional. If a typedecl is provided (for a local variable), it binds a variable scoped to the for-loop regardless of whether a same-name variable is present in outer scope. If a typedecl is not provided, then the variable is scoped to the for-loop if no same-name variable is present in outer scope, or if a same-name variable is present in outer scope then it is modified.
* The typedecls such as `int` or `num` are optional. If a typedecl is provided (for a local variable), it binds a variable scoped to the for-loop regardless of whether a same-name variable is present in the outer scope. If a typedecl is not provided, then the variable is scoped to the for-loop if no same-name variable is present in the outer scope, or if a same-name variable is present in the outer scope, then it is modified.
* Miller has no `++` or `--` operators.
* As with all `for`/`if`/`while` statements in Miller, the curly braces are required even if the body is a single statement, or empty.
* As with all `for`/`if`/`while` statements in Miller, the curly braces are required even if the body is a single statement or empty.
## Begin/end blocks


@ -2,7 +2,7 @@
## Pattern-action blocks
These are reminiscent of `awk` syntax. They can be used to allow assignments to be done only when appropriate -- e.g. for math-function domain restrictions, regex-matching, and so on:
These are reminiscent of `awk` syntax. They can be used to allow assignments to be done only when appropriate -- e.g., for math-function domain restrictions, regex-matching, and so on:
GENMD-RUN-COMMAND
mlr cat data/put-gating-example-1.dkvp
@ -24,7 +24,7 @@ mlr put '
data/put-gating-example-2.dkvp
GENMD-EOF
This produces heteregenous output which Miller, of course, has no problems with (see [Record Heterogeneity](record-heterogeneity.md)). But if you want homogeneous output, the curly braces can be replaced with a semicolon between the expression and the body statements. This causes `put` to evaluate the boolean expression (along with any side effects, namely, regex-captures `\1`, `\2`, etc.) but doesn't use it as a criterion for whether subsequent assignments should be executed. Instead, subsequent assignments are done unconditionally:
This produces heterogeneous output which Miller, of course, has no problems with (see [Record Heterogeneity](record-heterogeneity.md)). But if you want homogeneous output, the curly braces can be replaced with a semicolon between the expression and the body statements. This causes `put` to evaluate the boolean expression (along with any side effects, namely, regex-captures `\1`, `\2`, etc.) but doesn't use it as a criterion for whether subsequent assignments should be executed. Instead, subsequent assignments are done unconditionally:
GENMD-RUN-COMMAND
mlr --opprint put '
@ -120,7 +120,7 @@ records](operating-on-all-records.md) for some options.
## For-loops
While Miller's `while` and `do-while` statements are much as in many other languages, `for` loops are more idiosyncratic to Miller. They are loops over key-value pairs, whether in stream records, out-of-stream variables, local variables, or map-literals: more reminiscent of `foreach`, as in (for example) PHP. There are **for-loops over map keys** and **for-loops over key-value tuples**. Additionally, Miller has a **C-style triple-for loop** with initialize, test, and update statements. Each is described below.
While Miller's `while` and `do-while` statements are much like those in many other languages, `for` loops are more idiosyncratic to Miller. They are loops over key-value pairs, whether in stream records, out-of-stream variables, local variables, or map-literals: more reminiscent of `foreach`, as in (for example) PHP. There are **for-loops over map keys** and **for-loops over key-value tuples**. Additionally, Miller has a **C-style triple-for loop** with initialize, test, and update statements. Each is described below.
As with `while` and `do-while`, a `break` or `continue` within nested control structures will propagate to the innermost loop enclosing them, if any, and a `break` or `continue` outside a loop is a syntax error that will be flagged as soon as the expression is parsed, before any input records are ingested.
@ -165,11 +165,9 @@ GENMD-EOF
### Key-value for-loops
For [maps](reference-main-maps.md), the first loop variable is the key and the
second is the value; for [arrays](reference-main-arrays.md), the first loop
variable is the (1-up) array index and the second is the value.
For [maps](reference-main-maps.md), the first loop variable is the key, and the second is the value. For [arrays](reference-main-arrays.md), the first loop variable is the (1-based) array index, and the second is the value.
Single-level keys may be gotten at using either `for(k,v)` or `for((k),v)`; multi-level keys may be gotten at using `for((k1,k2,k3),v)` and so on. The `v` variable will be bound to a scalar value (non-array/non-map) if the map stops at that level, or to a map-valued or array-valued variable if the map goes deeper. If the map isn't deep enough then the loop body won't be executed.
Single-level keys may be obtained using either `for(k,v)` or `for((k),v)`; multi-level keys may be obtained using `for((k1,k2,k3),v)` and so on. The `v` variable will be bound to a scalar value (non-array/non-map) if the map stops at that level, or to a map-valued or array-valued variable if the map goes deeper. If the map isn't deep enough then the loop body won't be executed.
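The binding rules for multi-level keys can be modeled outside Miller. The following is a rough Python sketch, not Miller code: the function name `for_multikey` is hypothetical, nested `dict`s stand in for Miller maps, and arrays are not modeled. It illustrates that the value is whatever sits at the requested depth (scalar or map), and that branches that are too shallow are skipped.

```python
def for_multikey(data, depth):
    """Yield (keys, value) pairs, descending `depth` levels of nested dicts.

    Hypothetical model of for((k1,...,kn),v): branches shallower than
    `depth` yield nothing, so the loop body would not run for them.
    """
    def walk(node, keys):
        if len(keys) == depth:
            # Enough keys bound: the value may be a scalar or a deeper map.
            yield tuple(keys), node
            return
        if not isinstance(node, dict):
            # The map stops before the requested depth: skip this branch.
            return
        for key, value in node.items():
            yield from walk(value, keys + [key])

    yield from walk(data, [])


record = {"a": {"x": 1, "y": {"p": 2}}, "b": 3}

for keys, value in for_multikey(record, 2):
    print(keys, value)
```

With `depth` set to 2, the branch under `"b"` is skipped because it is only one level deep, while `("a", "y")` is bound to a map-valued `value`.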
GENMD-RUN-COMMAND
cat data/for-srec-example.tbl
@ -210,7 +208,7 @@ mlr --from data/small --opprint put '
'
GENMD-EOF
It can be confusing to modify the stream record while iterating over a copy of it, so instead you might find it simpler to use a local variable in the loop and only update the stream record after the loop:
It can be confusing to modify the stream record while iterating over a copy of it, so instead, you might find it simpler to use a local variable in the loop and only update the stream record after the loop:
GENMD-RUN-COMMAND
mlr --from data/small --opprint put '
@ -224,7 +222,7 @@ mlr --from data/small --opprint put '
'
GENMD-EOF
You can also start iterating on sub-maps of an out-of-stream or local variable; you can loop over nested keys; you can loop over all out-of-stream variables. The bound variables are bound to a copy of the sub-map as it was before the loop started. The sub-map is specified by square-bracketed indices after `in`, and additional deeper indices are bound to loop key-variables. The terminal values are bound to the loop value-variable whenever the keys are not too shallow. The value-variable may refer to a terminal (string, number) or it may be map-valued if the map goes deeper. Example indexing is as follows:
You can also start iterating on sub-maps of an out-of-stream or local variable; you can loop over nested keys; you can loop over all out-of-stream variables. The bound variables are bound to a copy of the sub-map as it was before the loop started. The sub-map is specified by square-bracketed indices after `in`, and additional deeper indices are bound to loop key variables. The terminal values are bound to the loop value variable whenever the keys are not too shallow. The value variable may refer to a terminal (string, number) or it may be map-valued if the map goes deeper. Example indexing is as follows:
GENMD-INCLUDE-ESCAPED(data/for-oosvar-example-0a.txt)
@ -333,15 +331,15 @@ GENMD-EOF
Notes:
* In `for (start; continuation; update) { body }`, the start, continuation, and update statements may be empty, single statements, or multiple comma-separated statements. If the continuation is empty (e.g. `for(i=1;;i+=1)`) it defaults to true.
* In `for (start; continuation; update) { body }`, the start, continuation, and update statements may be empty, single statements, or multiple comma-separated statements. If the continuation is empty (e.g. `for(i=1;;i+=1)`), it defaults to true.
* In particular, you may use `$`-variables and/or `@`-variables in the start, continuation, and/or update steps (as well as the body, of course).
* The typedecls such as `int` or `num` are optional. If a typedecl is provided (for a local variable), it binds a variable scoped to the for-loop regardless of whether a same-name variable is present in outer scope. If a typedecl is not provided, then the variable is scoped to the for-loop if no same-name variable is present in outer scope, or if a same-name variable is present in outer scope then it is modified.
* The typedecls such as `int` or `num` are optional. If a typedecl is provided (for a local variable), it binds a variable scoped to the for-loop regardless of whether a same-name variable is present in the outer scope. If a typedecl is not provided, then the variable is scoped to the for-loop if no same-name variable is present in the outer scope, or if a same-name variable is present in the outer scope, then it is modified.
* Miller has no `++` or `--` operators.
* As with all `for`/`if`/`while` statements in Miller, the curly braces are required even if the body is a single statement, or empty.
* As with all `for`/`if`/`while` statements in Miller, the curly braces are required even if the body is a single statement or empty.
## Begin/end blocks


@ -16,6 +16,55 @@ Quick links:
</div>
# DSL errors and transparency
# Handling data errors
By default, Miller doesn't stop data processing for a single cell error. For example:
<pre class="pre-highlight-in-pair">
<b>mlr --csv --from data-error.csv cat</b>
</pre>
<pre class="pre-non-highlight-in-pair">
x
1
2
3
text
4
</pre>
<pre class="pre-highlight-in-pair">
<b>mlr --csv --from data-error.csv put '$y = log10($x)'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
x,y
1,0
2,0.3010299956639812
3,0.4771212547196624
text,(error)
4,0.6020599913279624
</pre>
If you do want to stop processing, though, you have three options. The first is the `mlr -x` flag:
<pre class="pre-highlight-in-pair">
<b>mlr -x --csv --from data-error.csv put '$y = log10($x)'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
x,y
1,0
2,0.3010299956639812
3,0.4771212547196624
mlr: data error at NR=4 FNR=4 FILENAME=data-error.csv
mlr: field y: log10: unacceptable type string with value "text"
mlr: exiting due to data error.
</pre>
The second is to put `-x` into your [`~/.mlrrc` file](customization.md).
The third is to set the `MLR_FAIL_ON_DATA_ERROR` environment variable, which makes `-x` implicit.
# Common causes of syntax errors
As soon as you have a [programming language](miller-programming-language.md), you start having the problem *What is my code doing, and why?* This includes getting syntax errors -- which are always annoying -- as well as the even more annoying problem of a program which parses without syntax error but doesn't do what you expect.
The syntax-error message gives you line/column position for the syntax that couldn't be parsed. The cause may be clear from that information, or perhaps not. Here are some common causes of syntax errors:
@ -26,7 +75,7 @@ The syntax-error message gives you line/column position for the syntax that coul
* Curly braces are required for the bodies of `if`/`while`/`for` blocks, even when the body is a single statement.
As for transparency:
# Transparency
* As in any language, you can do `print`, or `eprint` to print to stderr. See [Print statements](reference-dsl-output-statements.md#print-statements); see also [Dump statements](reference-dsl-output-statements.md#dump-statements) and [Emit statements](reference-dsl-output-statements.md#emit-statements).


@ -1,5 +1,29 @@
# DSL errors and transparency
# Handling data errors
By default, Miller doesn't stop data processing for a single cell error. For example:
GENMD-RUN-COMMAND
mlr --csv --from data-error.csv cat
GENMD-EOF
GENMD-RUN-COMMAND
mlr --csv --from data-error.csv put '$y = log10($x)'
GENMD-EOF
If you do want to stop processing, though, you have three options. The first is the `mlr -x` flag:
GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr -x --csv --from data-error.csv put '$y = log10($x)'
GENMD-EOF
The second is to put `-x` into your [`~/.mlrrc` file](customization.md).
The third is to set the `MLR_FAIL_ON_DATA_ERROR` environment variable, which makes `-x` implicit.
# Common causes of syntax errors
As soon as you have a [programming language](miller-programming-language.md), you start having the problem *What is my code doing, and why?* This includes getting syntax errors -- which are always annoying -- as well as the even more annoying problem of a program which parses without syntax error but doesn't do what you expect.
The syntax-error message gives you line/column position for the syntax that couldn't be parsed. The cause may be clear from that information, or perhaps not. Here are some common causes of syntax errors:
@ -10,7 +34,7 @@ The syntax-error message gives you line/column position for the syntax that coul
* Curly braces are required for the bodies of `if`/`while`/`for` blocks, even when the body is a single statement.
As for transparency:
# Transparency
* As in any language, you can do `print`, or `eprint` to print to stderr. See [Print statements](reference-dsl-output-statements.md#print-statements); see also [Dump statements](reference-dsl-output-statements.md#dump-statements) and [Emit statements](reference-dsl-output-statements.md#emit-statements).


@ -36,7 +36,7 @@ red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
</pre>
The former, of course, is a little easier to type. For another example:
The former is a little easier to type. For another example:
<pre class="pre-highlight-in-pair">
<b>mlr --csv put '@running_sum += $quantity; filter @running_sum > 500' example.csv</b>


@ -10,7 +10,7 @@ GENMD-RUN-COMMAND
mlr --csv put 'filter NR==2 || NR==3' example.csv
GENMD-EOF
The former, of course, is a little easier to type. For another example:
The former is a little easier to type. For another example:
GENMD-RUN-COMMAND
mlr --csv put '@running_sum += $quantity; filter @running_sum > 500' example.csv


@ -29,23 +29,15 @@ As of [Miller 6](new-in-miller-6.md) you can use
intuitive operations on arrays and maps, as an alternative to things which
would otherwise require for-loops.
See also the [`get_keys`](reference-dsl-builtin-functions.md#get_keys) and
[`get_values`](reference-dsl-builtin-functions.md#get_values) functions which,
when given a map, return an array of its keys or an array of its values,
respectively.
See also the [`get_keys`](reference-dsl-builtin-functions.md#get_keys) and [`get_values`](reference-dsl-builtin-functions.md#get_values) functions which, when given a map, return an array of its keys or an array of its values, respectively.
## select
The [`select`](reference-dsl-builtin-functions.md#select) function takes a map
or array as its first argument and a function as second argument. It includes
each input element in the output if the function returns true.
The [`select`](reference-dsl-builtin-functions.md#select) function takes a map or array as its first argument and a function as its second argument. It includes each input element in the output if the function returns true.
For arrays, that function should take one argument, for array element; for
maps, it should take two, for map-element key and value. In either case it
should return a boolean.
For arrays, that function should take one argument, for an array element; for maps, it should take two, for a map element key and value. In either case, it should return a boolean.
A perhaps helpful analogy: the `select` function is to arrays and maps as the
[`filter`](reference-verbs.md#filter) is to records.
A perhaps helpful analogy: the `select` function is to arrays and maps as the [`filter`](reference-verbs.md#filter) is to records.
Array examples:
@ -123,16 +115,11 @@ Values with last digit >= 5:
## apply
The [`apply`](reference-dsl-builtin-functions.md#apply) function takes a map
or array as its first argument and a function as second argument. It applies
the function to each element of the array or map.
The [`apply`](reference-dsl-builtin-functions.md#apply) function takes a map or array as its first argument and a function as its second argument. It applies the function to each element of the array or map.
For arrays, the function should take one argument, for array element; it should
return a new element. For maps, it should take two, for map-element key and
value. It should return a new key-value pair (i.e. a single-entry map).
For arrays, the function should take one argument, representing an array element, and return a new element. For maps, it should take two, for the map element key and value. It should return a new key-value pair (i.e., a single-entry map).
A perhaps helpful analogy: the `apply` function is to arrays and maps as the
[`put`](reference-verbs.md#put) is to records.
A perhaps helpful analogy: the `apply` function is to arrays and maps as the [`put`](reference-verbs.md#put) is to records.
Array examples:
@ -232,17 +219,11 @@ Same, with upcased keys:
## reduce
The [`reduce`](reference-dsl-builtin-functions.md#reduce) function takes a map
or array as its first argument and a function as second argument. It accumulates entries into a final
output -- for example, sum or product.
The [`reduce`](reference-dsl-builtin-functions.md#reduce) function takes a map or array as its first argument and a function as its second argument. It accumulates entries into a final output, such as a sum or product.
For arrays, the function should take two arguments, for accumulated value and
array element; for maps, it should take four, for accumulated key and value
and map-element key and value. In either case it should return the updated
accumulator.
For arrays, the function should take two arguments, for the accumulated value and the array element; for maps, it should take four, for the accumulated key and value, and the map-element key and value. In either case it should return the updated accumulator.
The start value for the accumulator is the first element for arrays, or the
first element's key-value pair for maps.
The start value for the accumulator is the first element for arrays, or the first element's key-value pair for maps.
<pre class="pre-highlight-in-pair">
<b>mlr -n put '</b>
@ -370,10 +351,7 @@ String-join of values:
## fold
The [`fold`](reference-dsl-builtin-functions.md#fold) function is the same as
`reduce`, except that instead of the starting value for the accumulation being
taken from the first entry of the array/map, you specify it as the third
argument.
The [`fold`](reference-dsl-builtin-functions.md#fold) function is the same as `reduce`, except that instead of the starting value for the accumulation being taken from the first entry of the array/map, you specify it as the third argument.
<pre class="pre-highlight-in-pair">
<b>mlr -n put '</b>
@ -469,22 +447,13 @@ Sum of values with fold and 1000000 initial value:
## sort
The [`sort`](reference-dsl-builtin-functions.md#sort) function takes a map or
array as its first argument, and it can take a function as second argument.
Unlike the other higher-order functions, the second argument can be omitted
when the natural ordering is desired -- ordered by array element for arrays, or by
key for maps.
The [`sort`](reference-dsl-builtin-functions.md#sort) function takes a map or array as its first argument, and it can take a function as its second argument. Unlike the other higher-order functions, the second argument can be omitted when the natural ordering is desired -- ordered by array element for arrays, or by key for maps.
As a second option, character flags such as `r` for reverse or `c` for
case-folded lexical sort can be supplied as the second argument.
As a second option, character flags such as `r` for reverse or `c` for case-folded lexical sort can be supplied as the second argument.
As a third option, a function can be supplied as the second argument.
For arrays, that function should take two arguments `a` and `b`, returning a
negative, zero, or positive number as `a<b`, `a==b`, or `a>b` respectively.
For maps, the function should take four arguments `ak`, `av`, `bk`, and `bv`,
again returning negative, zero, or positive, using `a` and `b`'s keys and
values.
For arrays, that function should take two arguments `a` and `b`, returning a negative, zero, or positive number as `a<b`, `a==b`, or `a>b` respectively. For maps, the function should take four arguments `ak`, `av`, `bk`, and `bv`, again returning negative, zero, or positive, using `a`'s and `b`'s keys and values.
Array examples:
@ -703,9 +672,7 @@ red square false 6 64 77.1991 9.5310
## Combined examples
Using a paradigm from the [page on operating on all
records](operating-on-all-records.md), we can retain a column from the input
data as an array, then apply some higher-order functions to it:
Using a paradigm from the [page on operating on all records](operating-on-all-records.md), we can retain a column from the input data as an array, then apply some higher-order functions to it:
<pre class="pre-highlight-in-pair">
<b>mlr --c2p cat example.csv</b>
@ -776,7 +743,7 @@ Sorted, then cubed, then summed:
### Remember return
From other languages it's easy to accidentally write
Coming from other languages, it's easy to accidentally write
<pre class="pre-highlight-in-pair">
<b>mlr -n put 'end { print select([1,2,3,4,5], func (e) { e >= 3 })}'</b>
@ -833,7 +800,7 @@ but this does:
2187
</pre>
### Built-in functions currently unsupported as arguments
### Built-in functions are currently unsupported as arguments
[Built-in functions](reference-dsl-user-defined-functions.md) are, as of
September 2021, a bit separate from [user-defined


@ -13,23 +13,15 @@ As of [Miller 6](new-in-miller-6.md) you can use
intuitive operations on arrays and maps, as an alternative to things which
would otherwise require for-loops.
See also the [`get_keys`](reference-dsl-builtin-functions.md#get_keys) and
[`get_values`](reference-dsl-builtin-functions.md#get_values) functions which,
when given a map, return an array of its keys or an array of its values,
respectively.
See also the [`get_keys`](reference-dsl-builtin-functions.md#get_keys) and [`get_values`](reference-dsl-builtin-functions.md#get_values) functions which, when given a map, return an array of its keys or an array of its values, respectively.
## select
The [`select`](reference-dsl-builtin-functions.md#select) function takes a map
or array as its first argument and a function as second argument. It includes
each input element in the output if the function returns true.
The [`select`](reference-dsl-builtin-functions.md#select) function takes a map or array as its first argument and a function as its second argument. It includes each input element in the output if the function returns true.
For arrays, that function should take one argument, for array element; for
maps, it should take two, for map-element key and value. In either case it
should return a boolean.
For arrays, that function should take one argument, for an array element; for maps, it should take two, for a map element key and value. In either case, it should return a boolean.
A perhaps helpful analogy: the `select` function is to arrays and maps as the
[`filter`](reference-verbs.md#filter) is to records.
A perhaps helpful analogy: the `select` function is to arrays and maps as the [`filter`](reference-verbs.md#filter) is to records.
Array examples:
@ -75,16 +67,11 @@ GENMD-EOF
## apply
The [`apply`](reference-dsl-builtin-functions.md#apply) function takes a map
or array as its first argument and a function as second argument. It applies
the function to each element of the array or map.
The [`apply`](reference-dsl-builtin-functions.md#apply) function takes a map or array as its first argument and a function as its second argument. It applies the function to each element of the array or map.
For arrays, the function should take one argument, for array element; it should
return a new element. For maps, it should take two, for map-element key and
value. It should return a new key-value pair (i.e. a single-entry map).
For arrays, the function should take one argument, representing an array element, and return a new element. For maps, it should take two, for the map element key and value. It should return a new key-value pair (i.e., a single-entry map).
A perhaps helpful analogy: the `apply` function is to arrays and maps as the
[`put`](reference-verbs.md#put) is to records.
A perhaps helpful analogy: the `apply` function is to arrays and maps as the [`put`](reference-verbs.md#put) is to records.
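As a rough illustration of the calling conventions described above, here is a Python model of `select` and `apply`. It is a sketch of the semantics only, not Miller code and not Miller's implementation: Python lists and dicts stand in for Miller arrays and maps, and no type checking of the callback's return value is attempted.

```python
def select(collection, f):
    """Keep the elements for which f returns True."""
    if isinstance(collection, dict):
        # Map case: f takes the key and the value.
        return {k: v for k, v in collection.items() if f(k, v)}
    # Array case: f takes the element.
    return [e for e in collection if f(e)]


def apply(collection, f):
    """Replace each element with the result of f."""
    if isinstance(collection, dict):
        out = {}
        for k, v in collection.items():
            # Map case: f returns a single-entry map, merged into the output.
            out.update(f(k, v))
        return out
    # Array case: f returns the new element.
    return [f(e) for e in collection]


values = [1, 2, 3, 4, 5]
prices = {"apple": 3, "kiwi": 8}

print(select(values, lambda e: e >= 3))
print(select(prices, lambda k, v: v > 5))
print(apply(values, lambda e: e ** 2))
print(apply(prices, lambda k, v: {k.upper(): v * 10}))
```

Note that the map callback for `apply` returns a whole key-value pair, which is why it can rename keys as well as change values.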
Array examples:
@ -134,17 +121,11 @@ GENMD-EOF
## reduce
The [`reduce`](reference-dsl-builtin-functions.md#reduce) function takes a map
or array as its first argument and a function as second argument. It accumulates entries into a final
output -- for example, sum or product.
The [`reduce`](reference-dsl-builtin-functions.md#reduce) function takes a map or array as its first argument and a function as its second argument. It accumulates entries into a final output, such as a sum or product.
For arrays, the function should take two arguments, for accumulated value and
array element; for maps, it should take four, for accumulated key and value
and map-element key and value. In either case it should return the updated
accumulator.
For arrays, the function should take two arguments, for the accumulated value and the array element; for maps, it should take four, for the accumulated key and value, and the map-element key and value. In either case it should return the updated accumulator.
The start value for the accumulator is the first element for arrays, or the
first element's key-value pair for maps.
The start value for the accumulator is the first element for arrays, or the first element's key-value pair for maps.
GENMD-RUN-COMMAND
mlr -n put '
@ -213,10 +194,7 @@ GENMD-EOF
## fold
The [`fold`](reference-dsl-builtin-functions.md#fold) function is the same as
`reduce`, except that instead of the starting value for the accumulation being
taken from the first entry of the array/map, you specify it as the third
argument.
The [`fold`](reference-dsl-builtin-functions.md#fold) function is the same as `reduce`, except that instead of the starting value for the accumulation being taken from the first entry of the array/map, you specify it as the third argument.
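The relationship between `reduce` and `fold` can be sketched in Python. This is a model of the semantics described above under stated assumptions, not Miller code: lists and dicts stand in for arrays and maps, the input is assumed to be non-empty for `reduce`, and for maps the accumulator is assumed to be a single-entry dict.

```python
def fold(collection, f, initial):
    """Accumulate over the collection, starting from an explicit value."""
    acc = initial
    if isinstance(collection, dict):
        for k, v in collection.items():
            # Map case: the accumulator is a single-entry map, unpacked
            # into its key and value before each call.
            acc_key, acc_value = next(iter(acc.items()))
            acc = f(acc_key, acc_value, k, v)
        return acc
    for e in collection:
        # Array case: f takes the accumulated value and the element.
        acc = f(acc, e)
    return acc


def reduce(collection, f):
    """Like fold, but the first entry supplies the starting value."""
    if isinstance(collection, dict):
        items = list(collection.items())
        first_key, first_value = items[0]
        return fold(dict(items[1:]), f, {first_key: first_value})
    return fold(collection[1:], f, collection[0])


values = [1, 2, 3, 4, 5]
counts = {"a": 1, "b": 2, "c": 3}

print(reduce(values, lambda acc, e: acc + e))
print(fold(values, lambda acc, e: acc + e, 10))
print(reduce(counts, lambda acck, accv, ek, ev: {"sum": accv + ev}))
print(fold(counts, lambda acck, accv, ek, ev: {"sum": accv + ev}, {"sum": 1000000}))
```

Writing `reduce` in terms of `fold` makes the difference explicit: only the source of the starting accumulator changes.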
GENMD-RUN-COMMAND
mlr -n put '
@ -269,22 +247,13 @@ GENMD-EOF
## sort
The [`sort`](reference-dsl-builtin-functions.md#sort) function takes a map or
array as its first argument, and it can take a function as second argument.
Unlike the other higher-order functions, the second argument can be omitted
when the natural ordering is desired -- ordered by array element for arrays, or by
key for maps.
The [`sort`](reference-dsl-builtin-functions.md#sort) function takes a map or array as its first argument, and it can take a function as its second argument. Unlike the other higher-order functions, the second argument can be omitted when the natural ordering is desired -- ordered by array element for arrays, or by key for maps.
As a second option, character flags such as `r` for reverse or `c` for
case-folded lexical sort can be supplied as the second argument.
As a second option, character flags such as `r` for reverse or `c` for case-folded lexical sort can be supplied as the second argument.
As a third option, a function can be supplied as the second argument.
For arrays, that function should take two arguments `a` and `b`, returning a
negative, zero, or positive number as `a<b`, `a==b`, or `a>b` respectively.
For maps, the function should take four arguments `ak`, `av`, `bk`, and `bv`,
again returning negative, zero, or positive, using `a` and `b`'s keys and
values.
For arrays, that function should take two arguments `a` and `b`, returning a negative, zero, or positive number as `a<b`, `a==b`, or `a>b` respectively. For maps, the function should take four arguments `ak`, `av`, `bk`, and `bv`, again returning negative, zero, or positive, using `a`'s and `b`'s keys and values.
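The comparator convention can be modeled in Python with `functools.cmp_to_key`. This is a sketch of the semantics only, not Miller code: the names `sort_array` and `sort_map` are hypothetical, the character-flag form is not modeled, and the natural ordering here is Python's, which assumes elements of a single comparable type.

```python
from functools import cmp_to_key


def sort_array(arr, cmp=None):
    """Sort a list, naturally or with a two-argument comparator."""
    if cmp is None:
        return sorted(arr)
    # cmp(a, b) returns negative, zero, or positive.
    return sorted(arr, key=cmp_to_key(cmp))


def sort_map(m, cmp=None):
    """Sort a dict, by key or with a four-argument comparator."""
    if cmp is None:
        return dict(sorted(m.items()))
    # cmp(ak, av, bk, bv) sees both entries' keys and values.
    pair_cmp = lambda a, b: cmp(a[0], a[1], b[0], b[1])
    return dict(sorted(m.items(), key=cmp_to_key(pair_cmp)))


values = [5, 2, 3, 1, 4]
counts = {"c": 1, "a": 3, "b": 2}

print(sort_array(values))
print(sort_array(values, lambda a, b: b - a))
print(sort_map(counts))
print(sort_map(counts, lambda ak, av, bk, bv: av - bv))
```

The four-argument form is what allows a map to be ordered by value rather than by key.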
Array examples:
@ -379,9 +348,7 @@ GENMD-EOF
## Combined examples
Using a paradigm from the [page on operating on all
records](operating-on-all-records.md), we can retain a column from the input
data as an array, then apply some higher-order functions to it:
Using a paradigm from the [page on operating on all records](operating-on-all-records.md), we can retain a column from the input data as an array, then apply some higher-order functions to it:
GENMD-RUN-COMMAND
mlr --c2p cat example.csv
@ -426,7 +393,7 @@ GENMD-EOF
### Remember return
From other languages it's easy to accidentally write
Coming from other languages, it's easy to accidentally write
GENMD-RUN-COMMAND-TOLERATING-ERROR
mlr -n put 'end { print select([1,2,3,4,5], func (e) { e >= 3 })}'
@ -465,7 +432,7 @@ mlr -n put '
'
GENMD-EOF
### Built-in functions currently unsupported as arguments
### Built-in functions are currently unsupported as arguments
[Built-in functions](reference-dsl-user-defined-functions.md) are, as of
September 2021, a bit separate from [user-defined


@ -22,7 +22,7 @@ Operators are listed on the [DSL built-in functions page](reference-dsl-builtin-
## Operator precedence
Operators are listed in order of decreasing precedence, highest first.
Operators are listed in order of decreasing precedence, from highest to lowest.
| Operators | Associativity |
|-------------------------------|---------------|
@ -46,14 +46,13 @@ Operators are listed in order of decreasing precedence, highest first.
| `? :` | right to left |
| `=` | N/A for Miller (there is no $a=$b=$c) |
See also the [section on parsing and operator precedence in the REPL](repl.md#parsing-and-operator-precedence)
for information on how to examine operator precedence interactively.
See also the [section on parsing and operator precedence in the REPL](repl.md#parsing-and-operator-precedence) for information on how to examine operator precedence interactively.
## Operator and function semantics
* Functions are often pass-throughs straight to the system-standard Go libraries.
* The [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max) functions are different from other multi-argument functions which return null if any of their inputs are null: for [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max), by contrast, if one argument is absent-null, the other is returned. Empty-null loses min or max against numeric or boolean; empty-null is less than any other string.
* The [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max) functions are different from other multi-argument functions, which return null if any of their inputs are null: for [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max), by contrast, if one argument is absent-null, the other is returned. Empty-null loses min or max against numeric or boolean; empty-null is less than any other string.
* Symmetrically with respect to the bitwise OR, AND, and XOR operators
[`|`](reference-dsl-builtin-functions.md#bitwise-or),
@ -71,7 +70,7 @@ for information on how to examine operator precedence interactively.
The main use for the `.` operator is for string concatenation: `"abc" . "def"` is `"abcdef"`.
However, in Miller 6 it has optional use for map traversal. Example:
However, in Miller 6, it has an optional use for map traversal. Example:
<pre class="pre-highlight-in-pair">
<b>cat data/server-log.json</b>
@ -109,8 +108,6 @@ However, in Miller 6 it has optional use for map traversal. Example:
<pre class="pre-non-highlight-in-pair">
bar.baz
bar.baz
[
]
</pre>
This also works on the left-hand sides of assignment statements:
@ -148,7 +145,7 @@ This also works on the left-hand sides of assignment statements:
A few caveats:
* This is why `.` has higher precedece than `+` in the table above -- in Miller 5 and below, where `.` was only used for concatenation, it had the same precedence as `+`. So you can now do this:
* This is why `.` has higher precedence than `+` in the table above -- in Miller 5 and below, where `.` was only used for concatenation, it had the same precedence as `+`. So you can now do this:
<pre class="pre-highlight-in-pair">
<b>mlr --json --from data/server-log.json put -q '</b>
@ -157,8 +154,6 @@ A few caveats:
</pre>
<pre class="pre-non-highlight-in-pair">
6989
[
]
</pre>
* However (awkwardly), if you want to use `.` for map-traversal as well as string-concatenation in the same statement, you'll need to insert parentheses, as the default associativity is left-to-right:
@ -170,8 +165,6 @@ A few caveats:
</pre>
<pre class="pre-non-highlight-in-pair">
(error)
[
]
</pre>
<pre class="pre-highlight-in-pair">
@ -181,6 +174,4 @@ A few caveats:
</pre>
<pre class="pre-non-highlight-in-pair">
GET -- api/check
[
]
</pre>


@ -6,7 +6,7 @@ Operators are listed on the [DSL built-in functions page](reference-dsl-builtin-
## Operator precedence
Operators are listed in order of decreasing precedence, highest first.
Operators are listed in order of decreasing precedence, from highest to lowest.
| Operators | Associativity |
|-------------------------------|---------------|
@ -30,14 +30,13 @@ Operators are listed in order of decreasing precedence, highest first.
| `? :` | right to left |
| `=` | N/A for Miller (there is no $a=$b=$c) |
See also the [section on parsing and operator precedence in the REPL](repl.md#parsing-and-operator-precedence)
for information on how to examine operator precedence interactively.
See also the [section on parsing and operator precedence in the REPL](repl.md#parsing-and-operator-precedence) for information on how to examine operator precedence interactively.
## Operator and function semantics
* Functions are often pass-throughs straight to the system-standard Go libraries.
* The [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max) functions are different from other multi-argument functions which return null if any of their inputs are null: for [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max), by contrast, if one argument is absent-null, the other is returned. Empty-null loses min or max against numeric or boolean; empty-null is less than any other string.
* The [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max) functions are different from other multi-argument functions, which return null if any of their inputs are null: for [`min`](reference-dsl-builtin-functions.md#min) and [`max`](reference-dsl-builtin-functions.md#max), by contrast, if one argument is absent-null, the other is returned. Empty-null loses min or max against numeric or boolean; empty-null is less than any other string.
* Symmetrically with respect to the bitwise OR, AND, and XOR operators
[`|`](reference-dsl-builtin-functions.md#bitwise-or),
@ -55,7 +54,7 @@ for information on how to examine operator precedence interactively.
The main use for the `.` operator is for string concatenation: `"abc" . "def"` is `"abcdef"`.
However, in Miller 6 it has optional use for map traversal. Example:
However, in Miller 6, it has an optional use for map traversal. Example:
GENMD-RUN-COMMAND
cat data/server-log.json
@ -78,7 +77,7 @@ GENMD-EOF
A few caveats:
* This is why `.` has higher precedece than `+` in the table above -- in Miller 5 and below, where `.` was only used for concatenation, it had the same precedence as `+`. So you can now do this:
* This is why `.` has higher precedence than `+` in the table above -- in Miller 5 and below, where `.` was only used for concatenation, it had the same precedence as `+`. So you can now do this:
GENMD-RUN-COMMAND
mlr --json --from data/server-log.json put -q '


@ -22,15 +22,15 @@ You can **output** variable-values or expressions in **five ways**:
* Use **emit1**/**emit**/**emitp**/**emitf** to send out-of-stream variables' current values to the output record stream, e.g. `@sum += $x; emit1 @sum` which produces an extra record such as `sum=3.1648382`. These records, just like records from input file(s), participate in downstream [then-chaining](reference-main-then-chaining.md) to other verbs.
* Use the **print** or **eprint** keywords which immediately print an expression *directly to standard output or standard error*, respectively. Note that `dump`, `edump`, `print`, and `eprint` don't output records which participate in `then`-chaining; rather, they're just immediate prints to stdout/stderr. The `printn` and `eprintn` keywords are the same except that they don't print final newlines. Additionally, you can print to a specified file instead of stdout/stderr.
* Use the **print** or **eprint** keywords, which immediately print an expression *directly to standard output or standard error*, respectively. Note that `dump`, `edump`, `print`, and `eprint` don't output records that participate in `then`-chaining; rather, they're just immediate prints to stdout/stderr. The `printn` and `eprintn` keywords are the same except that they don't print final newlines. Additionally, you can print to a specified file instead of stdout/stderr.
* Use the **dump** or **edump** keywords, which *immediately print all out-of-stream variables as a JSON data structure to the standard output or standard error* (respectively).
* Use **tee** which formats the current stream record (not just an arbitrary string as with **print**) to a specific file.
* Use **tee**, which formats the current stream record (not just an arbitrary string as with **print**) to a specific file.
For the first two options you are populating the output-records stream which feeds into the next verb in a `then`-chain (if any), or which otherwise is formatted for output using `--o...` flags.
For the first two options, you are populating the output-records stream, which feeds into the next verb in a `then`-chain (if any) or is otherwise formatted for output using `--o...` flags.
For the last three options you are sending output directly to standard output, standard error, or a file.
For the last three options, you are sending output directly to standard output, standard error, or a file.
## Print statements
@ -38,7 +38,7 @@ The `print` statement is perhaps self-explanatory, but with a few light caveats:
* There are four variants: `print` goes to stdout with final newline, `printn` goes to stdout without final newline (you can include one using "\n" in your output string), `eprint` goes to stderr with final newline, and `eprintn` goes to stderr without final newline.
* Output goes directly to stdout/stderr, respectively: data produced this way do not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* Output goes directly to stdout/stderr, respectively: data produced this way does not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* Print statements are for strings (`print "hello"`), or things which can be made into strings: numbers (`print 3`, `print $a + $b`), or concatenations thereof (`print "a + b = " . ($a + $b)`). Maps (in `$*`, map-valued out-of-stream or local variables, and map literals) as well as arrays are printed as JSON.
@ -62,9 +62,9 @@ The `dump` statement is for printing expressions, including maps, directly to st
* There are two variants: `dump` prints to stdout; `edump` prints to stderr.
* Output goes directly to stdout/stderr, respectively: data produced this way do not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* Output goes directly to stdout/stderr, respectively: data produced this way does not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* You can use `dump` to output single strings, numbers, or expressions including map-valued data. Map-valued data are printed as JSON.
* You can use `dump` to output single strings, numbers, or expressions including map-valued data. Map-valued data is printed as JSON.
* If you use `dump` (or `edump`) with no arguments, you get a JSON structure representing the current values of all out-of-stream variables.
@ -76,7 +76,7 @@ The `dump` statement is for printing expressions, including maps, directly to st
Records produced by a `mlr put` go downstream to the next verb in your `then`-chain, if any, or otherwise to standard output. If you want to additionally copy out records to files, you can do that using `tee`.
The syntax is, by example:
The syntax is, for example:
<pre class="pre-highlight-non-pair">
<b>mlr --from myfile.dat put 'tee > "tap.dat", $*' then sort -n index</b>
@ -84,8 +84,7 @@ The syntax is, by example:
First is `tee >`, then the filename expression (which can be an expression such as `"tap.".$a.".dat"`), then a comma, then `$*`. (Nothing else but `$*` is teeable.)
You can also write to a variable file name -- for example, you can split a
single file into multiple ones on field names:
You can also write to a variable file name -- for example, you can split a single file into multiple ones on field names:
<pre class="pre-highlight-in-pair">
<b>mlr --csv cat example.csv</b>
@ -324,26 +323,12 @@ There are four variants: `emit1`, `emitf`, `emit`, and `emitp`. These are used
to insert new records into the record stream -- or, optionally, redirect them
to files.
Keep in mind that out-of-stream variables are a nested, multi-level
[map](reference-main-maps.md) (directly viewable as JSON using `dump`), while
Miller record values are as well during processing -- but records may be
flattened down for output to tabular formats. See the page [Flatten/unflatten:
JSON vs. tabular formats](flatten-unflatten.md) for more information.
Keep in mind that out-of-stream variables are a nested, multi-level [map](reference-main-maps.md) (directly viewable as JSON using `dump`), while Miller record values are as well during processing -- but records may be flattened down for output to tabular formats. See the page [Flatten/unflatten: JSON vs. tabular formats](flatten-unflatten.md) for more information.
* You can use `emit1` to emit any map-valued expression, including `$*`,
map-valued out-of-stream variables, the entire out-of-stream-variable
collection `@*`, map-valued local variables, map literals, or map-valued
function return values.
* For `emit`, `emitp`, and `emitf`, you can emit map-valued local variables,
map-valued field attributes (with `$`), map-va out-of-stream variables (with
`@`), `$*`, `@*`, or map literals (with outermost `{...}`) -- but not arbitrary
expressions which evaluate to map (such as function return values).
* You can use `emit1` to emit any map-valued expression, including `$*`, map-valued out-of-stream variables, the entire out-of-stream-variable collection `@*`, map-valued local variables, map literals, or map-valued function return values.
* For `emit`, `emitp`, and `emitf`, you can emit map-valued local variables, map-valued field attributes (with `$`), map-valued out-of-stream variables (with `@`), `$*`, `@*`, or map literals (with outermost `{...}`) -- but not arbitrary expressions that evaluate to a map (such as function return values).
The reason for this is part historical and part technical. As we'll see below,
you can do lots of syntactical things with `emit`, `emitp`, and `emitf`,
including printing them side-by-side, index them, redirect the output to files,
etc. What this means syntactically is that Miller's parser needs to handle all
sorts of commas, parentheses, and so on:
The reason for this is partly historical and partly technical. As we'll see below, you can do lots of syntactical things with `emit`, `emitp`, and `emitf`, including printing them side-by-side, indexing them, redirecting the output to files, etc. What this means syntactically is that Miller's parser needs to handle all sorts of commas, parentheses, and so on:
<pre class="pre-non-highlight-non-pair">
emitf @count, @sum
@ -352,12 +337,7 @@ sorts of commas, parentheses, and so on:
# etc
</pre>
When we try to allow `emitf`/`emit`/`emitp` to handle arbitrary map-valued
expressions, like `mapexcept($*, mymap)` and so on, this inserts more syntactic
complexity in terms of commas, parentheses, and so on. The technical term is
_LR-1 shift-reduce conflicts_, but we can simply think of this in terms of the
parser not being able to efficiently disambiguate all the punctuational
opportunities.
When we try to allow `emitf`/`emit`/`emitp` to handle arbitrary map-valued expressions, like `mapexcept($*, mymap)` and so on, this inserts more syntactic complexity in terms of commas, parentheses, and so on. The technical term is _LR-1 shift-reduce conflicts_, but we can think of this in terms of the parser being unable to efficiently disambiguate all the punctuational opportunities.
So, `emit1` can handle syntactic richness in the one thing being emitted;
`emitf`, `emit`, and `emitp` can handle syntactic richness in the side-by-side
@ -365,7 +345,7 @@ placement, indexing, and redirection.
(Mnemonic: If all you want is to insert a new record into the record stream, `emit1` is probably the _one_ you want.)
What this means is that if you want to emit an expression which evaluates to a map, you can do quite simply
What this means is that if you want to emit an expression that evaluates to a map, you can do it quite simply:
<pre class="pre-highlight-in-pair">
<b>mlr --c2p --from example.csv put -q '</b>
@ -386,7 +366,7 @@ id color shape flag k index quantity rate
10 purple square false 10 91 72.3735 8.2430
</pre>
And if you want indexing, redirects, etc., just assign to a temporary variable and use one of the other emit variants:
And if you want indexing, redirects, etc., just assign to a temporary variable and use one of the other `emit` variants:
<pre class="pre-highlight-in-pair">
<b>mlr --c2p --from example.csv put -q '</b>
@ -410,7 +390,7 @@ id color shape flag k index quantity rate
## Emitf statements
Use **emitf** to output several out-of-stream variables side-by-side in the same output record. For `emitf` these mustn't have indexing using `@name[...]`. Example:
Use **emitf** to output several out-of-stream variables side-by-side in the same output record. For `emitf`, these mustn't have indexing using `@name[...]`. Example:
<pre class="pre-highlight-in-pair">
<b>mlr put -q '</b>
@ -426,7 +406,7 @@ count=5,x_sum=2.26476,y_sum=2.585083
## Emit statements
Use **emit** to output an out-of-stream variable. If it's non-indexed you'll get a simple key-value pair:
Use **emit** to output an out-of-stream variable. If it's non-indexed, you'll get a simple key-value pair:
<pre class="pre-highlight-in-pair">
<b>cat data/small</b>
@ -455,7 +435,7 @@ a=wye,b=pan,i=5,x=0.573288,y=0.863624
sum=2.26476
</pre>
If it's indexed then use as many names after `emit` as there are indices:
If it's indexed, then use as many names after `emit` as there are indices:
<pre class="pre-highlight-in-pair">
<b>mlr put -q '@sum[$a] += $x; end { dump }' data/small</b>
@ -624,8 +604,7 @@ sum.wye.wye 0.204603
sum.wye.pan 0.573288
</pre>
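The flattened keys in the output above (such as `sum.wye.pan`) are the nested map keys joined with a separator. As a rough illustration only, here is a minimal Python sketch of that joining step. It is not Miller's implementation: the function name `flatten` is hypothetical, and edge cases such as empty maps are not modeled.

```python
def flatten(nested, sep=".", prefix=""):
    """Join the keys of a nested dict into single-level keys using sep.

    Illustrative only: a hypothetical helper, not Miller's own code.
    """
    flat = {}
    for key, value in nested.items():
        # Build the joined name, e.g. "sum" + "." + "wye"
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Descend into nested maps, carrying the joined prefix along
            flat.update(flatten(value, sep, name))
        else:
            flat[name] = value
    return flat


# Values taken from the emitp output shown above
sums = {"sum": {"wye": {"wye": 0.204603, "pan": 0.573288}}}
print(flatten(sums))           # keys joined with "."
print(flatten(sums, sep="/"))  # keys joined with "/", as with --flatsep /
```

The separator is a parameter here so that the same walk covers both the `.`-joined keys above and the `/`-joined keys produced with `--flatsep /`.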
Use **--flatsep** to specify the character which joins multilevel
keys for `emitp` (it defaults to a colon):
Use **--flatsep** to specify the character that joins multilevel keys for `emitp` (in Miller 6 it defaults to a period, `.`):
<pre class="pre-highlight-in-pair">
<b>mlr --flatsep / put -q '@sum[$a][$b] += $x; end { emitp @sum, "a" }' data/small</b>
@ -703,11 +682,11 @@ hat hat 182.8535323148762 381 0.47993053101017374
hat pan 168.5538067327806 363 0.4643355557376876
</pre>
What this does is walk through the first out-of-stream variable (`@x_sum` in this example) as usual, then for each keylist found (e.g. `pan,wye`), include the values for the remaining out-of-stream variables (here, `@x_count` and `@x_mean`). You should use this when all out-of-stream variables in the emit statement have **the same shape and the same keylists**.
What this does is walk through the first out-of-stream variable (`@x_sum` in this example) as usual, then for each keylist found (e.g., `pan,wye`), include the values for the remaining out-of-stream variables (here, `@x_count` and `@x_mean`). You should use this when all out-of-stream variables in the emit statement have **the same shape and the same keylists**.
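The walk described above can be sketched in Python. This is an illustration under stated assumptions, not Miller's implementation: the function name `emit_lashed` is hypothetical, the numbers are made up, and every variable is assumed to have the same shape and keylists as the first.

```python
def emit_lashed(variables, names):
    """Walk the first variable; for each keylist, gather values from all.

    variables: dict of variable name -> nested dict (all the same shape)
    names:     output field names for the key levels, e.g. ["a", "b"]
    Illustrative only: a hypothetical helper, not Miller's own code.
    """
    first = next(iter(variables.values()))
    records = []

    def walk(node, keys):
        if len(keys) == len(names):
            # One output record per keylist: key fields, then one value
            # per lashed variable, looked up with the same keylist.
            record = dict(zip(names, keys))
            for var_name, var in variables.items():
                value = var
                for k in keys:
                    value = value[k]
                record[var_name] = value
            records.append(record)
            return
        for k, child in node.items():
            walk(child, keys + [k])

    walk(first, [])
    return records


# Made-up accumulators keyed by fields a and b
x_sum = {"pan": {"wye": 3.0}, "hat": {"pan": 4.0}}
x_count = {"pan": {"wye": 2}, "hat": {"pan": 8}}
for rec in emit_lashed({"x_sum": x_sum, "x_count": x_count}, ["a", "b"]):
    print(rec)
```

If a later variable lacks a keylist found in the first one, this sketch raises a `KeyError`, which is one way to see why the variables should share the same shape and keylists.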
## Emit-all statements
Use **emit all** (or `emit @*` which is synonymous) to output all out-of-stream variables. You can use the following idiom to get various accumulators output side-by-side (reminiscent of `mlr stats1`):
Use **emit all** (or `emit @*`, which is synonymous) to output all out-of-stream variables. You can use the following idiom to get various accumulators' output side-by-side (reminiscent of `mlr stats1`):
<pre class="pre-highlight-in-pair">
<b>mlr --from data/small --opprint put -q '</b>

@ -6,15 +6,15 @@ You can **output** variable-values or expressions in **five ways**:
* Use **emit1**/**emit**/**emitp**/**emitf** to send out-of-stream variables' current values to the output record stream, e.g. `@sum += $x; emit1 @sum` which produces an extra record such as `sum=3.1648382`. These records, just like records from input file(s), participate in downstream [then-chaining](reference-main-then-chaining.md) to other verbs.
* Use the **print** or **eprint** keywords which immediately print an expression *directly to standard output or standard error*, respectively. Note that `dump`, `edump`, `print`, and `eprint` don't output records which participate in `then`-chaining; rather, they're just immediate prints to stdout/stderr. The `printn` and `eprintn` keywords are the same except that they don't print final newlines. Additionally, you can print to a specified file instead of stdout/stderr.
* Use the **print** or **eprint** keywords, which immediately print an expression *directly to standard output or standard error*, respectively. Note that `dump`, `edump`, `print`, and `eprint` don't output records that participate in `then`-chaining; rather, they're just immediate prints to stdout/stderr. The `printn` and `eprintn` keywords are the same except that they don't print final newlines. Additionally, you can print to a specified file instead of stdout/stderr.
* Use the **dump** or **edump** keywords, which *immediately print all out-of-stream variables as a JSON data structure to the standard output or standard error* (respectively).
* Use **tee** which formats the current stream record (not just an arbitrary string as with **print**) to a specific file.
* Use **tee**, which formats the current stream record (not just an arbitrary string as with **print**) to a specific file.
For the first two options you are populating the output-records stream which feeds into the next verb in a `then`-chain (if any), or which otherwise is formatted for output using `--o...` flags.
For the first two options, you are populating the output-records stream, which feeds into the next verb in a `then`-chain (if any) or is otherwise formatted for output using `--o...` flags.
For the last three options you are sending output directly to standard output, standard error, or a file.
For the last three options, you are sending output directly to standard output, standard error, or a file.
## Print statements
@ -22,7 +22,7 @@ The `print` statement is perhaps self-explanatory, but with a few light caveats:
* There are four variants: `print` goes to stdout with final newline, `printn` goes to stdout without final newline (you can include one using "\n" in your output string), `eprint` goes to stderr with final newline, and `eprintn` goes to stderr without final newline.
* Output goes directly to stdout/stderr, respectively: data produced this way do not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* Output goes directly to stdout/stderr, respectively: data produced this way does not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* Print statements are for strings (`print "hello"`), or things which can be made into strings: numbers (`print 3`, `print $a + $b`), or concatenations thereof (`print "a + b = " . ($a + $b)`). Maps (in `$*`, map-valued out-of-stream or local variables, and map literals) as well as arrays are printed as JSON.
@ -46,9 +46,9 @@ The `dump` statement is for printing expressions, including maps, directly to st
* There are two variants: `dump` prints to stdout; `edump` prints to stderr.
* Output goes directly to stdout/stderr, respectively: data produced this way do not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* Output goes directly to stdout/stderr, respectively: data produced this way does not go downstream to the next verb in a `then`-chain. (Use `emit` for that.)
* You can use `dump` to output single strings, numbers, or expressions including map-valued data. Map-valued data are printed as JSON.
* You can use `dump` to output single strings, numbers, or expressions including map-valued data. Map-valued data is printed as JSON.
* If you use `dump` (or `edump`) with no arguments, you get a JSON structure representing the current values of all out-of-stream variables.
@ -60,7 +60,7 @@ The `dump` statement is for printing expressions, including maps, directly to st
Records produced by a `mlr put` go downstream to the next verb in your `then`-chain, if any, or otherwise to standard output. If you want to additionally copy out records to files, you can do that using `tee`.
The syntax is, by example:
The syntax is, for example:
GENMD-CARDIFY-HIGHLIGHT-ONE
mlr --from myfile.dat put 'tee > "tap.dat", $*' then sort -n index
@ -68,8 +68,7 @@ GENMD-EOF
First is `tee >`, then the filename expression (which can be an expression such as `"tap.".$a.".dat"`), then a comma, then `$*`. (Nothing else but `$*` is teeable.)
You can also write to a variable file name -- for example, you can split a
single file into multiple ones on field names:
You can also write to a variable file name -- for example, you can split a single file into multiple ones on field names:
GENMD-RUN-COMMAND
mlr --csv cat example.csv
@ -135,26 +134,12 @@ There are four variants: `emit1`, `emitf`, `emit`, and `emitp`. These are used
to insert new records into the record stream -- or, optionally, redirect them
to files.
Keep in mind that out-of-stream variables are a nested, multi-level
[map](reference-main-maps.md) (directly viewable as JSON using `dump`), while
Miller record values are as well during processing -- but records may be
flattened down for output to tabular formats. See the page [Flatten/unflatten:
JSON vs. tabular formats](flatten-unflatten.md) for more information.
Keep in mind that out-of-stream variables are a nested, multi-level [map](reference-main-maps.md) (directly viewable as JSON using `dump`), while Miller record values are as well during processing -- but records may be flattened down for output to tabular formats. See the page [Flatten/unflatten: JSON vs. tabular formats](flatten-unflatten.md) for more information.
* You can use `emit1` to emit any map-valued expression, including `$*`,
map-valued out-of-stream variables, the entire out-of-stream-variable
collection `@*`, map-valued local variables, map literals, or map-valued
function return values.
* For `emit`, `emitp`, and `emitf`, you can emit map-valued local variables,
map-valued field attributes (with `$`), map-va out-of-stream variables (with
`@`), `$*`, `@*`, or map literals (with outermost `{...}`) -- but not arbitrary
expressions which evaluate to map (such as function return values).
* You can use `emit1` to emit any map-valued expression, including `$*`, map-valued out-of-stream variables, the entire out-of-stream-variable collection `@*`, map-valued local variables, map literals, or map-valued function return values.
* For `emit`, `emitp`, and `emitf`, you can emit map-valued local variables, map-valued field attributes (with `$`), map-valued out-of-stream variables (with `@`), `$*`, `@*`, or map literals (with outermost `{...}`) -- but not arbitrary expressions that evaluate to a map (such as function return values).
The reason for this is part historical and part technical. As we'll see below,
you can do lots of syntactical things with `emit`, `emitp`, and `emitf`,
including printing them side-by-side, index them, redirect the output to files,
etc. What this means syntactically is that Miller's parser needs to handle all
sorts of commas, parentheses, and so on:
The reason for this is partly historical and partly technical. As we'll see below, you can do lots of syntactical things with `emit`, `emitp`, and `emitf`, including printing them side-by-side, indexing them, redirecting the output to files, etc. What this means syntactically is that Miller's parser needs to handle all sorts of commas, parentheses, and so on:
GENMD-CARDIFY
emitf @count, @sum
@ -163,12 +148,7 @@ GENMD-CARDIFY
# etc
GENMD-EOF
When we try to allow `emitf`/`emit`/`emitp` to handle arbitrary map-valued
expressions, like `mapexcept($*, mymap)` and so on, this inserts more syntactic
complexity in terms of commas, parentheses, and so on. The technical term is
_LR-1 shift-reduce conflicts_, but we can simply think of this in terms of the
parser not being able to efficiently disambiguate all the punctuational
opportunities.
When we try to allow `emitf`/`emit`/`emitp` to handle arbitrary map-valued expressions, like `mapexcept($*, mymap)` and so on, this inserts more syntactic complexity in terms of commas, parentheses, and so on. The technical term is _LR-1 shift-reduce conflicts_, but we can think of this in terms of the parser being unable to efficiently disambiguate all the punctuational opportunities.
So, `emit1` can handle syntactic richness in the one thing being emitted;
`emitf`, `emit`, and `emitp` can handle syntactic richness in the side-by-side
@ -176,7 +156,7 @@ placement, indexing, and redirection.
(Mnemonic: If all you want is to insert a new record into the record stream, `emit1` is probably the _one_ you want.)
What this means is that if you want to emit an expression which evaluates to a map, you can do quite simply
What this means is that if you want to emit an expression that evaluates to a map, you can do it quite simply:
GENMD-RUN-COMMAND
mlr --c2p --from example.csv put -q '
@ -184,7 +164,7 @@ mlr --c2p --from example.csv put -q '
'
GENMD-EOF
And if you want indexing, redirects, etc., just assign to a temporary variable and use one of the other emit variants:
And if you want indexing, redirects, etc., just assign to a temporary variable and use one of the other `emit` variants:
GENMD-RUN-COMMAND
mlr --c2p --from example.csv put -q '
@ -195,7 +175,7 @@ GENMD-EOF
## Emitf statements
Use **emitf** to output several out-of-stream variables side-by-side in the same output record. For `emitf` these mustn't have indexing using `@name[...]`. Example:
Use **emitf** to output several out-of-stream variables side-by-side in the same output record. For `emitf`, these mustn't have indexing using `@name[...]`. Example:
GENMD-RUN-COMMAND
mlr put -q '
@ -208,7 +188,7 @@ GENMD-EOF
## Emit statements
Use **emit** to output an out-of-stream variable. If it's non-indexed you'll get a simple key-value pair:
Use **emit** to output an out-of-stream variable. If it's non-indexed, you'll get a simple key-value pair:
GENMD-RUN-COMMAND
cat data/small
@ -222,7 +202,7 @@ GENMD-RUN-COMMAND
mlr put -q '@sum += $x; end { emit @sum }' data/small
GENMD-EOF
If it's indexed then use as many names after `emit` as there are indices:
If it's indexed, then use as many names after `emit` as there are indices:
GENMD-RUN-COMMAND
mlr put -q '@sum[$a] += $x; end { dump }' data/small
@ -277,8 +257,7 @@ GENMD-RUN-COMMAND
mlr --oxtab put -q '@sum[$a][$b] += $x; end { emitp @sum }' data/small
GENMD-EOF
Use **--flatsep** to specify the character which joins multilevel
keys for `emitp` (it defaults to a colon):
Use **--flatsep** to specify the character that joins multilevel keys for `emitp` (in Miller 6 it defaults to a period, `.`):
GENMD-RUN-COMMAND
mlr --flatsep / put -q '@sum[$a][$b] += $x; end { emitp @sum, "a" }' data/small
@ -313,11 +292,11 @@ mlr --from data/medium --opprint put -q '
'
GENMD-EOF
What this does is walk through the first out-of-stream variable (`@x_sum` in this example) as usual, then for each keylist found (e.g. `pan,wye`), include the values for the remaining out-of-stream variables (here, `@x_count` and `@x_mean`). You should use this when all out-of-stream variables in the emit statement have **the same shape and the same keylists**.
What this does is walk through the first out-of-stream variable (`@x_sum` in this example) as usual, then for each keylist found (e.g., `pan,wye`), include the values for the remaining out-of-stream variables (here, `@x_count` and `@x_mean`). You should use this when all out-of-stream variables in the emit statement have **the same shape and the same keylists**.
## Emit-all statements
Use **emit all** (or `emit @*` which is synonymous) to output all out-of-stream variables. You can use the following idiom to get various accumulators output side-by-side (reminiscent of `mlr stats1`):
Use **emit all** (or `emit @*`, which is synonymous) to output all out-of-stream variables. You can use the following idiom to get various accumulators' output side-by-side (reminiscent of `mlr stats1`):
GENMD-RUN-COMMAND
mlr --from data/small --opprint put -q '

@ -63,7 +63,7 @@ hat wye 10002 0.321507044286237609 0.568893318795083758 5 9 4 2 data/s
pan zee 10003 0.272054845593895200 0.425789896597056627 5 10 5 2 data/small2
</pre>
Anything from a `#` character to end of line is a code comment.
Anything from a `#` character to the end of the line is a code comment.
<pre class="pre-highlight-in-pair">
<b>mlr --opprint filter '($x > 0.5 && $y < 0.5) || ($x < 0.5 && $y > 0.5)' \</b>
@ -147,11 +147,11 @@ a=eks,b=wye,i=4,x=0.381399,y=0.134188,xy=0.40431623334340655
a=wye,b=pan,i=5,x=0.573288,y=0.863624,xy=1.036583592538489
</pre>
A suggested use-case here is defining functions in files, and calling them from command-line expressions.
A suggested use case here is defining functions in files and calling them from command-line expressions.
Another suggested use-case is putting default parameter values in files, e.g. using `begin{@count=is_present(@count)?@count:10}` in the file, where you can precede that using `begin{@count=40}` using `-e`.
Another suggested use case is putting default parameter values in files, e.g., using `begin{@count=is_present(@count)?@count:10}` in the file, which you can precede with `begin{@count=40}` supplied via `-e` to override the default.
Moreover, you can have one or more `-f` expressions (maybe one function per file, for example) and one or more `-e` expressions on the command line. If you mix `-f` and `-e` then the expressions are evaluated in the order encountered.
Moreover, you can have one or more `-f` expressions (maybe one function per file, for example) and one or more `-e` expressions on the command line. If you mix `-f` and `-e`, then the expressions are evaluated in the order encountered.
## Semicolons, commas, newlines, and curly braces
@ -180,7 +180,7 @@ x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
x=1,y=2,3=,4=,5=,6=,7=,8=,9=,10=,foo=bar
</pre>
Semicolons are required between statements even if those statements are on separate lines. **Newlines** are for your convenience but have no syntactic meaning: line endings do not terminate statements. For example, adjacent assignment statements must be separated by semicolons even if those statements are on separate lines:
Semicolons are required between statements, even if those statements are on separate lines. **Newlines** are for your convenience but have no syntactic meaning: line endings do not terminate statements. For example, adjacent assignment statements must be separated by semicolons even if those statements are on separate lines:
<pre class="pre-non-highlight-non-pair">
mlr put '

@ -21,7 +21,7 @@ mlr --opprint put '
' data/small data/small2
GENMD-EOF
Anything from a `#` character to end of line is a code comment.
Anything from a `#` character to the end of the line is a code comment.
GENMD-RUN-COMMAND
mlr --opprint filter '($x > 0.5 && $y < 0.5) || ($x < 0.5 && $y > 0.5)' \
@ -62,11 +62,11 @@ GENMD-RUN-COMMAND
mlr --from data/small put -f data/fe-example-4.mlr -e '$xy = f($x, $y)'
GENMD-EOF
A suggested use-case here is defining functions in files, and calling them from command-line expressions.
A suggested use case here is defining functions in files and calling them from command-line expressions.
Another suggested use-case is putting default parameter values in files, e.g. using `begin{@count=is_present(@count)?@count:10}` in the file, where you can precede that using `begin{@count=40}` using `-e`.
Another suggested use case is putting default parameter values in files, e.g., using `begin{@count=is_present(@count)?@count:10}` in the file, which you can precede with `begin{@count=40}` supplied via `-e` to override the default.
Moreover, you can have one or more `-f` expressions (maybe one function per file, for example) and one or more `-e` expressions on the command line. If you mix `-f` and `-e` then the expressions are evaluated in the order encountered.
Moreover, you can have one or more `-f` expressions (maybe one function per file, for example) and one or more `-e` expressions on the command line. If you mix `-f` and `-e`, then the expressions are evaluated in the order encountered.
## Semicolons, commas, newlines, and curly braces
@ -84,7 +84,7 @@ GENMD-RUN-COMMAND
echo x=1,y=2 | mlr put 'while (NF < 10) { $[NF+1] = ""}; $foo = "bar"'
GENMD-EOF
Semicolons are required between statements even if those statements are on separate lines. **Newlines** are for your convenience but have no syntactic meaning: line endings do not terminate statements. For example, adjacent assignment statements must be separated by semicolons even if those statements are on separate lines:
Semicolons are required between statements, even if those statements are on separate lines. **Newlines** are for your convenience but have no syntactic meaning: line endings do not terminate statements. For example, adjacent assignment statements must be separated by semicolons even if those statements are on separate lines:
GENMD-INCLUDE-ESCAPED(data/newline-example.txt)

Some files were not shown because too many files have changed in this diff.