mirror of
https://github.com/johnkerl/miller.git
synced 2026-01-23 02:14:13 +00:00
parent
fafff68c20
commit
93862f16f9
12 changed files with 75 additions and 45 deletions
5
Makefile
@@ -189,6 +189,9 @@ dev:
 	make -C docs
 	@echo DONE

+docs:
+	make -C docs
+
 # ----------------------------------------------------------------
 # Keystroke-savers
 it: build check
@@ -216,4 +219,4 @@ release_tarball: build check

 # ================================================================
 # Go does its own dependency management, outside of make.
-.PHONY: build mlr mprof mprof2 mprof3 mprof4 mprof5 check unit_test regression_test fmt dev
+.PHONY: build mlr mprof mprof2 mprof3 mprof4 mprof5 check unit_test regression_test fmt dev docs

@@ -488,10 +488,11 @@ MISCELLANEOUS FLAGS
 slight performance benefit.
 --infer-int-as-float or -A
 Cast all integers in data files to floats.
---infer-no-octal or -O Treat numbers like 0123 in data files as string
-"0123", not octal for decimal 83 etc.
 --infer-none or -S Don't treat values like 123 or 456.7 in data files as
 int/float; leave them as strings.
+--infer-octal or -O Treat numbers like 0123 in data files as numeric;
+default is string. Note that 00--07 etc scan as int;
+08-09 scan as float.
 --load {filename} Load DSL script file for all put/filter operations on
 the command line. If the name following `--load` is a
 directory, load all `*.mlr` files in that directory.
@@ -3006,5 +3007,5 @@ SEE ALSO



-2021-12-15 MILLER(1)
+2021-12-22 MILLER(1)
 </pre>

@@ -467,10 +467,11 @@ MISCELLANEOUS FLAGS
 slight performance benefit.
 --infer-int-as-float or -A
 Cast all integers in data files to floats.
---infer-no-octal or -O Treat numbers like 0123 in data files as string
-"0123", not octal for decimal 83 etc.
 --infer-none or -S Don't treat values like 123 or 456.7 in data files as
 int/float; leave them as strings.
+--infer-octal or -O Treat numbers like 0123 in data files as numeric;
+default is string. Note that 00--07 etc scan as int;
+08-09 scan as float.
 --load {filename} Load DSL script file for all put/filter operations on
 the command line. If the name following `--load` is a
 directory, load all `*.mlr` files in that directory.
@@ -2985,4 +2986,4 @@ SEE ALSO



-2021-12-15 MILLER(1)
+2021-12-22 MILLER(1)

@@ -251,7 +251,7 @@ The following differences are rather technical. If they don't sound familiar to
 * See also `mlr help legacy-flags` or the [legacy-flags reference](reference-main-flag-list.md#legacy-flags).
 * Type-inference:
     * The `-S` and `-F` flags to `mlr put` and `mlr filter` are ignored, since type-inference is no longer done in `mlr put` and `mlr filter`, but rather, when records are first read. You can use `mlr -S` and `mlr -A`, respectively, instead to control type-inference within the record-readers.
-    * Similarly, use `mlr -O` to force octal-looking strings to remain strings like `"0123"`, not ints like `0123` which is 83 in decimal.
+    * Octal numbers like `0123` and `07` are type-inferred as string. Use `mlr -O` to infer them as octal integers. Note that `08` and `09` will then infer as float.
     * See also the [miscellaneous-flags reference](reference-main-flag-list.md#miscellaneous-flags).
 * Emitting a map-valued expression now requires either a temporary variable or the new `emit1` keyword. Please see the
   [page on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf) for more information.

@@ -209,7 +209,7 @@ The following differences are rather technical. If they don't sound familiar to
 * See also `mlr help legacy-flags` or the [legacy-flags reference](reference-main-flag-list.md#legacy-flags).
 * Type-inference:
     * The `-S` and `-F` flags to `mlr put` and `mlr filter` are ignored, since type-inference is no longer done in `mlr put` and `mlr filter`, but rather, when records are first read. You can use `mlr -S` and `mlr -A`, respectively, instead to control type-inference within the record-readers.
-    * Similarly, use `mlr -O` to force octal-looking strings to remain strings like `"0123"`, not ints like `0123` which is 83 in decimal.
+    * Octal numbers like `0123` and `07` are type-inferred as string. Use `mlr -O` to infer them as octal integers. Note that `08` and `09` will then infer as float.
     * See also the [miscellaneous-flags reference](reference-main-flag-list.md#miscellaneous-flags).
 * Emitting a map-valued expression now requires either a temporary variable or the new `emit1` keyword. Please see the
   [page on emit statements](reference-dsl-output-statements.md#emit1-and-emitemitpemitf) for more information.

@@ -345,10 +345,10 @@ These are flags which don't fit into any other category.
 `: This is an internal parameter which normally does not need to be modified. It controls the mechanism by which Miller accesses fields within records. In general --no-hash-records is faster, and is the default. For specific use-cases involving data having many fields, and many of them being processed during a given processing run, --hash-records might offer a slight performance benefit.
 * `--infer-int-as-float or -A
 `: Cast all integers in data files to floats.
-* `--infer-no-octal or -O
-`: Treat numbers like 0123 in data files as string "0123", not octal for decimal 83 etc.
 * `--infer-none or -S
 `: Don't treat values like 123 or 456.7 in data files as int/float; leave them as strings.
+* `--infer-octal or -O
+`: Treat numbers like 0123 in data files as numeric; default is string. Note that 00--07 etc scan as int; 08-09 scan as float.
 * `--load {filename}
 `: Load DSL script file for all put/filter operations on the command line. If the name following `--load` is a directory, load all `*.mlr` files in that directory. This is just like `put -f` and `filter -f` except it's up-front on the command line, so you can do something like `alias mlr='mlr --load ~/myscripts'` if you like.
 * `--mfrom {filenames}

@@ -2619,11 +2619,12 @@ data having many fields, and many of them being processed during a given process
 		},

 		{
-			name: "--infer-no-octal",
+			name: "--infer-octal",
 			altNames: []string{"-O"},
-			help: `Treat numbers like 0123 in data files as string "0123", not octal for decimal 83 etc.`,
+			help: `Treat numbers like 0123 in data files as numeric; default is string.
+Note that 00--07 etc scan as int; 08-09 scan as float.`,
 			parser: func(args []string, argc int, pargi *int, options *TOptions) {
-				mlrval.SetInferrerNoOctal()
+				mlrval.SetInferrerOctalAsInt()
 				*pargi += 1
 			},
 		},

@@ -7,7 +7,7 @@ import (
 	"github.com/johnkerl/miller/internal/pkg/lib"
 )

-// TODO: no infer-bool from data files. Always false in this path.
+// TODO: comment no infer-bool from data files. Always false in this path.

 // It's essential that we use mv.Type() not mv.mvtype since types are
 // JIT-computed on first access for most data-file values. See type.go for more
@@ -23,14 +23,24 @@ func (mv *Mlrval) Type() MVType {
 // Support for mlr -S, mlr -A, mlr -O.
 type tInferrer func(mv *Mlrval, input string, inferBool bool) *Mlrval

-var packageLevelInferrer tInferrer = inferNormally
+var packageLevelInferrer tInferrer = inferWithOctalAsString

-func SetInferrerNoOctal() {
-	packageLevelInferrer = inferWithOctalSuppress
+// SetInferrerOctalAsInt is for default behavior.
+func SetInferrerOctalAsString() {
+	packageLevelInferrer = inferWithOctalAsString
 }
+
+// SetInferrerOctalAsInt is for mlr -O.
+func SetInferrerOctalAsInt() {
+	packageLevelInferrer = inferWithOctalAsInt
+}
+
+// SetInferrerStringOnly is for mlr -A.
 func SetInferrerIntAsFloat() {
 	packageLevelInferrer = inferWithIntAsFloat
 }
+
+// SetInferrerStringOnly is for mlr -S.
 func SetInferrerStringOnly() {
 	packageLevelInferrer = inferStringOnly
 }
@@ -47,7 +57,24 @@ var downcasedFloatNamesToNotInfer = map[string]bool{
 	"nan": true,
 }

-func inferNormally(mv *Mlrval, input string, inferBool bool) *Mlrval {
+var octalDetector = regexp.MustCompile("^-?0[0-9]+")
+
+// inferWithOctalAsString is for default behavior.
+func inferWithOctalAsString(mv *Mlrval, input string, inferBool bool) *Mlrval {
+	inferWithOctalAsInt(mv, input, inferBool)
+	if mv.mvtype != MT_INT && mv.mvtype != MT_FLOAT {
+		return mv
+	}
+
+	if octalDetector.MatchString(mv.printrep) {
+		return mv.SetFromString(input)
+	} else {
+		return mv
+	}
+}
+
+// inferWithOctalAsInt is for mlr -O.
+func inferWithOctalAsInt(mv *Mlrval, input string, inferBool bool) *Mlrval {
 	if input == "" {
 		return mv.SetFromVoid()
 	}
@@ -73,23 +100,9 @@ func inferNormally(mv *Mlrval, input string, inferBool bool) *Mlrval {
 	return mv.SetFromString(input)
 }

-var octalDetector = regexp.MustCompile("^-?0[0-9]+")
-
-func inferWithOctalSuppress(mv *Mlrval, input string, inferBool bool) *Mlrval {
-	inferNormally(mv, input, inferBool)
-	if mv.mvtype != MT_INT && mv.mvtype != MT_FLOAT {
-		return mv
-	}
-
-	if octalDetector.MatchString(mv.printrep) {
-		return mv.SetFromString(input)
-	} else {
-		return mv
-	}
-}
-
+// inferWithIntAsFloat is for mlr -A.
 func inferWithIntAsFloat(mv *Mlrval, input string, inferBool bool) *Mlrval {
-	inferNormally(mv, input, inferBool)
+	inferWithOctalAsString(mv, input, inferBool)
 	if mv.Type() == MT_INT {
 		mv.floatval = float64(mv.intval)
 		mv.mvtype = MT_FLOAT
@@ -97,6 +110,7 @@ func inferWithIntAsFloat(mv *Mlrval, input string, inferBool bool) *Mlrval {
 	return mv
 }

+// inferStringOnly is for mlr -S.
 func inferStringOnly(mv *Mlrval, input string, inferBool bool) *Mlrval {
 	return mv.SetFromString(input)
 }

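The inference hunks above change the default so that octal-looking values stay strings, with `mlr -O` opting back in to numeric treatment. The following is a minimal, self-contained sketch of that decision, not Miller's implementation: `classify` and `inferNumeric` are hypothetical helpers, and Go's `strconv` parsing is assumed as a stand-in for Miller's own number scanner (so details such as the handling of names like `nan` will differ). Only the regular expression is taken from the diff.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// octalLooking uses the same pattern as octalDetector in the diff: an
// optional minus sign, a leading zero, then one or more decimal digits.
var octalLooking = regexp.MustCompile("^-?0[0-9]+")

// inferNumeric is a hypothetical stand-in for the number scanner. With
// base 0, strconv.ParseInt treats a leading "0" as an octal prefix, so
// "0123" parses as an int while "08" fails and falls through to float.
func inferNumeric(input string) string {
	if input == "" {
		return "empty"
	}
	if _, err := strconv.ParseInt(input, 0, 64); err == nil {
		return "int"
	}
	if _, err := strconv.ParseFloat(input, 64); err == nil {
		return "float"
	}
	return "string"
}

// classify mirrors the two-step shape of inferWithOctalAsString: infer a
// numeric type first, then demote octal-looking input back to string
// unless octalAsInt (the -O behavior) is requested.
func classify(input string, octalAsInt bool) string {
	kind := inferNumeric(input)
	if octalAsInt {
		return kind
	}
	if (kind == "int" || kind == "float") && octalLooking.MatchString(input) {
		return "string"
	}
	return kind
}

func main() {
	for _, s := range []string{"0123", "07", "08", "123", "abc"} {
		fmt.Printf("%q default=%s with-O=%s\n", s, classify(s, false), classify(s, true))
	}
}
```

Under these assumptions, `0123` and `07` classify as string by default and as int with the flag, while `08` classifies as float with the flag, matching the note in the updated help text.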
@@ -19,10 +19,17 @@ func (mv *Mlrval) String() string {
 	if floatOutputFormatter != nil && mv.Type() == MT_FLOAT {
 		// Use the format string from global --ofmt, if supplied
 		return floatOutputFormatter.FormatFloat(mv.floatval)
-	} else {
-		mv.setPrintRep()
-		return mv.printrep
 	}
+
+	// TODO: track dirty-flag checking / somesuch.
+	// At present it's cumbersome to check if an array or map has been modified
+	// and it's safest to always recompute the string-rep.
+	if mv.IsArrayOrMap() {
+		mv.printrepValid = false
+	}
+
+	mv.setPrintRep()
+	return mv.printrep
 }

 // See mlrval.go for more about JIT-formatting of string backings

@@ -467,10 +467,11 @@ MISCELLANEOUS FLAGS
 slight performance benefit.
 --infer-int-as-float or -A
 Cast all integers in data files to floats.
---infer-no-octal or -O Treat numbers like 0123 in data files as string
-"0123", not octal for decimal 83 etc.
 --infer-none or -S Don't treat values like 123 or 456.7 in data files as
 int/float; leave them as strings.
+--infer-octal or -O Treat numbers like 0123 in data files as numeric;
+default is string. Note that 00--07 etc scan as int;
+08-09 scan as float.
 --load {filename} Load DSL script file for all put/filter operations on
 the command line. If the name following `--load` is a
 directory, load all `*.mlr` files in that directory.
@@ -2985,4 +2986,4 @@ SEE ALSO



-2021-12-15 MILLER(1)
+2021-12-22 MILLER(1)

@@ -2,12 +2,12 @@
 .\" Title: mlr
 .\" Author: [see the "AUTHOR" section]
 .\" Generator: ./mkman.rb
-.\" Date: 2021-12-15
+.\" Date: 2021-12-22
 .\" Manual: \ \&
 .\" Source: \ \&
 .\" Language: English
 .\"
-.TH "MILLER" "1" "2021-12-15" "\ \&" "\ \&"
+.TH "MILLER" "1" "2021-12-22" "\ \&" "\ \&"
 .\" -----------------------------------------------------------------
 .\" * Portability definitions
 .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -586,10 +586,11 @@ These are flags which don't fit into any other category.
 slight performance benefit.
 --infer-int-as-float or -A
 Cast all integers in data files to floats.
---infer-no-octal or -O Treat numbers like 0123 in data files as string
-"0123", not octal for decimal 83 etc.
 --infer-none or -S Don't treat values like 123 or 456.7 in data files as
 int/float; leave them as strings.
+--infer-octal or -O Treat numbers like 0123 in data files as numeric;
+default is string. Note that 00--07 etc scan as int;
+08-09 scan as float.
 --load {filename} Load DSL script file for all put/filter operations on
 the command line. If the name following `--load` is a
 directory, load all `*.mlr` files in that directory.

1
todo.txt
@@ -5,6 +5,7 @@ PUNCHDOWN LIST
 - sort-hof check
 - more linux perf checks
 - mlr -O / abor!
+> doc 07 int 08 float
 - --ifs-regex & --ips-regex -- guessing is not safe as evidence by '.' and '|'
 - big-picture item @ Rmd (csv memes; and beyond); also webdoc intro page
 - function: randsel for arrays; use for example-csv-expander