mirror of https://github.com/johnkerl/miller.git
synced 2026-01-23 02:14:13 +00:00
Note IANA TSV support (#1582)
* Note IANA TSV support
* run `make docs`
parent 202a79d0e2
commit dc21fa3cd5
2 changed files with 24 additions and 12 deletions
@@ -106,17 +106,23 @@ When `mlr` is invoked with the `--csv` or `--csvlite` option, key names are foun
 Miller has record separator `RS` and field separator `FS`, just as `awk` does. (See also the [separators page](reference-main-separators.md).)
 
-**TSV (tab-separated values):** `FS` is tab and `RS` is newline (or carriage return + linefeed for
-Windows). On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return,
-newline, tab, and backslash, respectively. On output, the reverse is done -- for example, if a field
-has an embedded newline, that newline is replaced by `\n`.
+**CSV (comma-separated values):** Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180).
+
+* This includes CRLF line-terminators by default, regardless of platform.
+* Any cell containing a comma or a carriage return within it must be double-quoted.
+
+**TSV (tab-separated values):** Miller's `--tsv` supports [IANA TSV](https://www.iana.org/assignments/media-types/text/tab-separated-values).
+
+* `FS` is tab and `RS` is newline (or carriage return + linefeed for Windows).
+* On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return, newline, tab, and backslash, respectively.
+* On output, the reverse is done -- for example, if a field has an embedded newline, that newline is replaced by `\n`.
+* A tab within a cell must be encoded as `\t`.
+* A carriage return within a cell must be encoded as `\n`.
 
 **ASV (ASCII-separated values):** the flags `--asv`, `--iasv`, `--oasv`, `--asvlite`, `--iasvlite`, and `--oasvlite` are analogous except they use ASCII FS and RS `0x1f` and `0x1e`, respectively.
 
 **USV (Unicode-separated values):** likewise, the flags `--usv`, `--iusv`, `--ousv`, `--usvlite`, `--iusvlite`, and `--ousvlite` use Unicode FS and RS `U+241F` (UTF-8 `0x0xe2909f`) and `U+241E` (UTF-8 `0xe2909e`), respectively.
 
-Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180). This includes CRLF line-terminators by default, regardless of platform.
-
 Here are the differences between CSV and CSV-lite:
 
 * CSV-lite naively splits lines on newline, and fields on comma -- embedded commas and newlines are not escaped in any way.
 
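The TSV escaping rules described in the hunk above (decode `\r`, `\n`, `\t`, and `\\` on input, apply the reverse on output) can be illustrated with a minimal Python sketch. This is not Miller's implementation (Miller is written in Go); the function names `tsv_decode` and `tsv_encode` are hypothetical, and the behavior for unrecognized escapes (left untouched) is an assumption made for the example.

```python
# Hypothetical helpers illustrating the TSV escape rules as stated in the docs text.
# They are a sketch, not Miller's actual code.

# Escape letter -> decoded character, per the four sequences named in the docs.
_DECODE = {"r": "\r", "n": "\n", "t": "\t", "\\": "\\"}
# Decoded character -> two-character escape sequence (the reverse mapping).
_ENCODE = {char: "\\" + letter for letter, char in _DECODE.items()}


def tsv_decode(field: str) -> str:
    """Turn backslash sequences in a TSV field into the characters they stand for."""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field) and field[i + 1] in _DECODE:
            out.append(_DECODE[field[i + 1]])
            i += 2
        else:
            # Assumption: anything that is not a recognized escape passes through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)


def tsv_encode(field: str) -> str:
    """Replace CR, LF, tab, and backslash with their backslash sequences."""
    return "".join(_ENCODE.get(ch, ch) for ch in field)


if __name__ == "__main__":
    raw = "line one\nline two\twith tab"
    encoded = tsv_encode(raw)
    print(encoded)
    print(tsv_decode(encoded) == raw)
```

Scanning left to right and consuming two characters per escape keeps an encoded backslash followed by a literal `t` from being misread as a tab, which a chain of naive string replacements would get wrong.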
@@ -18,17 +18,23 @@ When `mlr` is invoked with the `--csv` or `--csvlite` option, key names are foun
 Miller has record separator `RS` and field separator `FS`, just as `awk` does. (See also the [separators page](reference-main-separators.md).)
 
-**TSV (tab-separated values):** `FS` is tab and `RS` is newline (or carriage return + linefeed for
-Windows). On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return,
-newline, tab, and backslash, respectively. On output, the reverse is done -- for example, if a field
-has an embedded newline, that newline is replaced by `\n`.
+**CSV (comma-separated values):** Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180).
+
+* This includes CRLF line-terminators by default, regardless of platform.
+* Any cell containing a comma or a carriage return within it must be double-quoted.
+
+**TSV (tab-separated values):** Miller's `--tsv` supports [IANA TSV](https://www.iana.org/assignments/media-types/text/tab-separated-values).
+
+* `FS` is tab and `RS` is newline (or carriage return + linefeed for Windows).
+* On input, if fields have `\r`, `\n`, `\t`, or `\\`, those are decoded as carriage return, newline, tab, and backslash, respectively.
+* On output, the reverse is done -- for example, if a field has an embedded newline, that newline is replaced by `\n`.
+* A tab within a cell must be encoded as `\t`.
+* A carriage return within a cell must be encoded as `\n`.
 
 **ASV (ASCII-separated values):** the flags `--asv`, `--iasv`, `--oasv`, `--asvlite`, `--iasvlite`, and `--oasvlite` are analogous except they use ASCII FS and RS `0x1f` and `0x1e`, respectively.
 
 **USV (Unicode-separated values):** likewise, the flags `--usv`, `--iusv`, `--ousv`, `--usvlite`, `--iusvlite`, and `--ousvlite` use Unicode FS and RS `U+241F` (UTF-8 `0x0xe2909f`) and `U+241E` (UTF-8 `0xe2909e`), respectively.
 
-Miller's `--csv` flag supports [RFC-4180 CSV](https://tools.ietf.org/html/rfc4180). This includes CRLF line-terminators by default, regardless of platform.
-
 Here are the differences between CSV and CSV-lite:
 
 * CSV-lite naively splits lines on newline, and fields on comma -- embedded commas and newlines are not escaped in any way.
 
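For the ASV and USV separators mentioned in the context lines of both hunks, a small Python sketch may help show what the byte values mean in practice. The helper name `split_records` and the sample rows are hypothetical, and the splitting is a plain illustration of field and record separators rather than a description of how Miller parses these formats. For reference, the UTF-8 encodings of `U+241F` and `U+241E` are the three-byte sequences `e2 90 9f` and `e2 90 9e`.

```python
# Separator characters as named in the docs text.
ASV_FS, ASV_RS = "\x1f", "\x1e"      # ASCII unit separator, record separator
USV_FS, USV_RS = "\u241f", "\u241e"  # Unicode symbols for unit/record separator


def split_records(text: str, fs: str, rs: str) -> list[list[str]]:
    """Split text into records on rs, then each record into fields on fs.

    Hypothetical helper for illustration; empty trailing records are dropped.
    """
    return [record.split(fs) for record in text.split(rs) if record]


if __name__ == "__main__":
    rows = [["a", "b"], ["1", "2"]]  # made-up sample data

    # Build the same two records with each separator pair.
    asv_text = "".join(ASV_FS.join(row) + ASV_RS for row in rows)
    usv_text = "".join(USV_FS.join(row) + USV_RS for row in rows)

    print(split_records(asv_text, ASV_FS, ASV_RS))
    print(split_records(usv_text, USV_FS, USV_RS))

    # Show the UTF-8 bytes of the USV separators as hex.
    print(USV_FS.encode("utf-8").hex(), USV_RS.encode("utf-8").hex())
```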