New format DSL function (#869)

* New format DSL function

* Updated affected test cases involving on-line help on "for" prefix

* doc-build artifacts for previous commit

* regression-test cases
John Kerl 2022-01-12 22:40:59 -05:00 committed by GitHub
parent 8d251161a5
commit a0048f0393
50 changed files with 180 additions and 13 deletions

View file

@ -207,9 +207,9 @@ FUNCTION LIST
asserting_present asserting_string atan atan2 atanh bitcount boolean
capitalize cbrt ceil clean_whitespace collapse_whitespace concat cos cosh
depth dhms2fsec dhms2sec erf erfc every exp expm1 flatten float floor fmtnum
fold fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub haskey
hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array is_bool
is_boolean is_empty is_empty_map is_error is_float is_int is_map
fold format fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub
haskey hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array
is_bool is_boolean is_empty is_empty_map is_error is_float is_int is_map
is_nonempty_map is_not_array is_not_empty is_not_map is_not_null is_null
is_numeric is_present is_string joink joinkv joinv json_parse json_stringify
leafcount length localtime2gmt localtime2sec log log10 log1p logifit lstrip
@ -2139,6 +2139,13 @@ FUNCTIONS FOR FILTER/PUT
Array example: fold([1,2,3,4,5], func(acc,e) {return acc + e**3}, 10000) returns 10225.
Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum": accv+ev**2}}, {"sum":10000}) returns 10035.
format
(class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
fsec2dhms
(class=time #args=1) Formats floating-point seconds as in fsec2dhms(500000.25) = "5d18h53m20.250000s"

View file

@ -186,9 +186,9 @@ FUNCTION LIST
asserting_present asserting_string atan atan2 atanh bitcount boolean
capitalize cbrt ceil clean_whitespace collapse_whitespace concat cos cosh
depth dhms2fsec dhms2sec erf erfc every exp expm1 flatten float floor fmtnum
fold fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub haskey
hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array is_bool
is_boolean is_empty is_empty_map is_error is_float is_int is_map
fold format fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub
haskey hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array
is_bool is_boolean is_empty is_empty_map is_error is_float is_int is_map
is_nonempty_map is_not_array is_not_empty is_not_map is_not_null is_null
is_numeric is_present is_string joink joinkv joinv json_parse json_stringify
leafcount length localtime2gmt localtime2sec log log10 log1p logifit lstrip
@ -2118,6 +2118,13 @@ FUNCTIONS FOR FILTER/PUT
Array example: fold([1,2,3,4,5], func(acc,e) {return acc + e**3}, 10000) returns 10225.
Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum": accv+ev**2}}, {"sum":10000}) returns 10035.
format
(class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
fsec2dhms
(class=time #args=1) Formats floating-point seconds as in fsec2dhms(500000.25) = "5d18h53m20.250000s"

View file

@ -74,7 +74,7 @@ is 2. Unary operators such as `!` and `~` show argument-count of 1; the ternary
* [**Hashing functions**](#hashing-functions): [md5](#md5), [sha1](#sha1), [sha256](#sha256), [sha512](#sha512).
* [**Higher-order-functions functions**](#higher-order-functions-functions): [any](#any), [apply](#apply), [every](#every), [fold](#fold), [reduce](#reduce), [select](#select), [sort](#sort).
* [**Math functions**](#math-functions): [abs](#abs), [acos](#acos), [acosh](#acosh), [asin](#asin), [asinh](#asinh), [atan](#atan), [atan2](#atan2), [atanh](#atanh), [cbrt](#cbrt), [ceil](#ceil), [cos](#cos), [cosh](#cosh), [erf](#erf), [erfc](#erfc), [exp](#exp), [expm1](#expm1), [floor](#floor), [invqnorm](#invqnorm), [log](#log), [log10](#log10), [log1p](#log1p), [logifit](#logifit), [max](#max), [min](#min), [qnorm](#qnorm), [round](#round), [roundm](#roundm), [sgn](#sgn), [sin](#sin), [sinh](#sinh), [sqrt](#sqrt), [tan](#tan), [tanh](#tanh), [urand](#urand), [urand32](#urand32), [urandelement](#urandelement), [urandint](#urandint), [urandrange](#urandrange).
* [**String functions**](#string-functions): [capitalize](#capitalize), [clean_whitespace](#clean_whitespace), [collapse_whitespace](#collapse_whitespace), [gsub](#gsub), [lstrip](#lstrip), [regextract](#regextract), [regextract_or_else](#regextract_or_else), [rstrip](#rstrip), [ssub](#ssub), [strip](#strip), [strlen](#strlen), [sub](#sub), [substr](#substr), [substr0](#substr0), [substr1](#substr1), [tolower](#tolower), [toupper](#toupper), [truncate](#truncate), [\.](#dot).
* [**String functions**](#string-functions): [capitalize](#capitalize), [clean_whitespace](#clean_whitespace), [collapse_whitespace](#collapse_whitespace), [format](#format), [gsub](#gsub), [lstrip](#lstrip), [regextract](#regextract), [regextract_or_else](#regextract_or_else), [rstrip](#rstrip), [ssub](#ssub), [strip](#strip), [strlen](#strlen), [sub](#sub), [substr](#substr), [substr0](#substr0), [substr1](#substr1), [tolower](#tolower), [toupper](#toupper), [truncate](#truncate), [\.](#dot).
* [**System functions**](#system-functions): [hostname](#hostname), [os](#os), [system](#system), [version](#version).
* [**Time functions**](#time-functions): [dhms2fsec](#dhms2fsec), [dhms2sec](#dhms2sec), [fsec2dhms](#fsec2dhms), [fsec2hms](#fsec2hms), [gmt2localtime](#gmt2localtime), [gmt2sec](#gmt2sec), [hms2fsec](#hms2fsec), [hms2sec](#hms2sec), [localtime2gmt](#localtime2gmt), [localtime2sec](#localtime2sec), [sec2dhms](#sec2dhms), [sec2gmt](#sec2gmt), [sec2gmtdate](#sec2gmtdate), [sec2hms](#sec2hms), [sec2localdate](#sec2localdate), [sec2localtime](#sec2localtime), [strftime](#strftime), [strftime_local](#strftime_local), [strptime](#strptime), [strptime_local](#strptime_local), [systime](#systime), [systimeint](#systimeint), [uptime](#uptime).
* [**Typing functions**](#typing-functions): [asserting_absent](#asserting_absent), [asserting_array](#asserting_array), [asserting_bool](#asserting_bool), [asserting_boolean](#asserting_boolean), [asserting_empty](#asserting_empty), [asserting_empty_map](#asserting_empty_map), [asserting_error](#asserting_error), [asserting_float](#asserting_float), [asserting_int](#asserting_int), [asserting_map](#asserting_map), [asserting_nonempty_map](#asserting_nonempty_map), [asserting_not_array](#asserting_not_array), [asserting_not_empty](#asserting_not_empty), [asserting_not_map](#asserting_not_map), [asserting_not_null](#asserting_not_null), [asserting_null](#asserting_null), [asserting_numeric](#asserting_numeric), [asserting_present](#asserting_present), [asserting_string](#asserting_string), [is_absent](#is_absent), [is_array](#is_array), [is_bool](#is_bool), [is_boolean](#is_boolean), [is_empty](#is_empty), [is_empty_map](#is_empty_map), [is_error](#is_error), [is_float](#is_float), [is_int](#is_int), [is_map](#is_map), [is_nonempty_map](#is_nonempty_map), [is_not_array](#is_not_array), [is_not_empty](#is_not_empty), [is_not_map](#is_not_map), [is_not_null](#is_not_null), [is_null](#is_null), [is_numeric](#is_numeric), [is_present](#is_present), [is_string](#is_string), [typeof](#typeof).
@ -927,6 +927,16 @@ collapse_whitespace (class=string #args=1) Strip repeated whitespace from strin
</pre>
### format
<pre class="pre-non-highlight-non-pair">
format (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
</pre>
### gsub
<pre class="pre-non-highlight-non-pair">
gsub (class=string #args=3) '$name=gsub($name, "old", "new")' (replace all).

View file

@ -1,6 +1,7 @@
package bifs
import (
"bytes"
"regexp"
"strconv"
"strings"
@ -287,6 +288,53 @@ func BIF_clean_whitespace(input1 *mlrval.Mlrval) *mlrval.Mlrval {
)
}
// ================================================================
func BIF_format(mlrvals []*mlrval.Mlrval) *mlrval.Mlrval {
if len(mlrvals) == 0 {
return mlrval.VOID
}
formatString, ok := mlrvals[0].GetStringValue()
if !ok { // not a string
return mlrval.ERROR
}
pieces := lib.SplitString(formatString, "{}")
var buffer bytes.Buffer
// Example: format("{}:{}", 8, 9)
//
// * pieces[0] ""
// * pieces[1] ":"
// * pieces[2] ""
// * mlrvals[1] 8
// * mlrvals[2] 9
//
// So:
// * Write pieces[0]
// * Write mlrvals[1]
// * Write pieces[1]
// * Write mlrvals[2]
// * Write pieces[2]
// Q: What if too few arguments for format?
// A: The unmatched "{}" placeholders are rendered as empty strings.
// Q: What if too many arguments for format?
// A: The extra arguments are discarded.
n := len(mlrvals)
for i, piece := range pieces {
if i > 0 {
if i < n {
buffer.WriteString(mlrvals[i].String())
}
}
buffer.WriteString(piece)
}
return mlrval.FromString(buffer.String())
}
// ================================================================
func BIF_hexfmt(input1 *mlrval.Mlrval) *mlrval.Mlrval {
if input1.IsInt() {
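
For readers without the Miller source at hand, the split-and-interleave approach used by BIF_format can be sketched with only the Go standard library. This is a minimal illustration, not the code from this commit: `formatSketch` is a hypothetical name, `strings.Split` and `fmt.Sprint` stand in for `lib.SplitString` and `Mlrval.String()`, and the non-string-format error path is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// formatSketch is a hypothetical stand-in for BIF_format. It splits the
// format string on "{}" and writes one argument before every piece except
// the first. Placeholders with no matching argument contribute nothing;
// arguments with no matching placeholder are never written.
func formatSketch(formatString string, args ...any) string {
	pieces := strings.Split(formatString, "{}")
	var b strings.Builder
	for i, piece := range pieces {
		// Piece i (for i > 0) is preceded by argument i-1, if it exists.
		if i > 0 && i <= len(args) {
			b.WriteString(fmt.Sprint(args[i-1]))
		}
		b.WriteString(piece)
	}
	return b.String()
}

func main() {
	fmt.Println(formatSketch("{}:{}:{}", 1, 2))       // too few arguments
	fmt.Println(formatSketch("{}:{}:{}", 1, 2, 3))    // exact match
	fmt.Println(formatSketch("{}:{}:{}", 1, 2, 3, 4)) // too many arguments
	fmt.Println(formatSketch("<{}:{}>", "abc"))
}
```

Under these assumptions the four calls mirror the documented examples and the `<{}:{}>` regression case. The sketch indexes arguments from 0, whereas BIF_format indexes from 1 because its slice also carries the format string at position 0.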

View file

@ -510,6 +510,19 @@ Arrays are new in Miller 6; the substr function is older.`,
binaryFunc: bifs.BIF_truncate,
},
{
name: "format",
class: FUNC_CLASS_STRING,
help: `Using first argument as format string, interpolate remaining arguments in place of
each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.`,
examples: []string{
`format("{}:{}:{}", 1,2) gives "1:2:".`,
`format("{}:{}:{}", 1,2,3) gives "1:2:3".`,
`format("{}:{}:{}", 1,2,3,4) gives "1:2:3".`,
},
variadicFunc: bifs.BIF_format,
},
// ----------------------------------------------------------------
// FUNC_CLASS_HASHING

View file

@ -186,9 +186,9 @@ FUNCTION LIST
asserting_present asserting_string atan atan2 atanh bitcount boolean
capitalize cbrt ceil clean_whitespace collapse_whitespace concat cos cosh
depth dhms2fsec dhms2sec erf erfc every exp expm1 flatten float floor fmtnum
fold fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub haskey
hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array is_bool
is_boolean is_empty is_empty_map is_error is_float is_int is_map
fold format fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub
haskey hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array
is_bool is_boolean is_empty is_empty_map is_error is_float is_int is_map
is_nonempty_map is_not_array is_not_empty is_not_map is_not_null is_null
is_numeric is_present is_string joink joinkv joinv json_parse json_stringify
leafcount length localtime2gmt localtime2sec log log10 log1p logifit lstrip
@ -2118,6 +2118,13 @@ FUNCTIONS FOR FILTER/PUT
Array example: fold([1,2,3,4,5], func(acc,e) {return acc + e**3}, 10000) returns 10225.
Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum": accv+ev**2}}, {"sum":10000}) returns 10035.
format
(class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
fsec2dhms
(class=time #args=1) Formats floating-point seconds as in fsec2dhms(500000.25) = "5d18h53m20.250000s"

View file

@ -233,9 +233,9 @@ asserting_not_map asserting_not_null asserting_null asserting_numeric
asserting_present asserting_string atan atan2 atanh bitcount boolean
capitalize cbrt ceil clean_whitespace collapse_whitespace concat cos cosh
depth dhms2fsec dhms2sec erf erfc every exp expm1 flatten float floor fmtnum
fold fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub haskey
hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array is_bool
is_boolean is_empty is_empty_map is_error is_float is_int is_map
fold format fsec2dhms fsec2hms get_keys get_values gmt2localtime gmt2sec gsub
haskey hexfmt hms2fsec hms2sec hostname int invqnorm is_absent is_array
is_bool is_boolean is_empty is_empty_map is_error is_float is_int is_map
is_nonempty_map is_not_array is_not_empty is_not_map is_not_null is_null
is_numeric is_present is_string joink joinkv joinv json_parse json_stringify
leafcount length localtime2gmt localtime2sec log log10 log1p logifit lstrip
@ -2963,6 +2963,19 @@ Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum":
.fi
.if n \{\
.RE
.SS "format"
.if n \{\
.RS 0
.\}
.nf
(class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
.fi
.if n \{\
.RE
.SS "fsec2dhms"
.if n \{\
.RS 0

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@

View file

@ -0,0 +1,3 @@
end {
print format()
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
(error)

View file

@ -0,0 +1,3 @@
end {
print format(1)
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@

View file

@ -0,0 +1,3 @@
end {
print format("")
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
abc

View file

@ -0,0 +1,3 @@
end {
print format("abc")
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@

View file

@ -0,0 +1,3 @@
end {
print format("{}")
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
1

View file

@ -0,0 +1,3 @@
end {
print format("{}", 1)
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
1

View file

@ -0,0 +1,3 @@
end {
print format("{}", 1, 2)
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
<abc:>

View file

@ -0,0 +1,3 @@
end {
print format("<{}:{}>", "abc")
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
<abc:def>

View file

@ -0,0 +1,3 @@
end {
print format("<{}:{}>", "abc", "def")
}

View file

@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr

View file

View file

@ -0,0 +1 @@
<abc:def>

View file

@ -0,0 +1,3 @@
end {
print format("<{}:{}>", "abc", "def", "ghi")
}

View file

@ -28,6 +28,11 @@ Options:
with s in them. Undefined behavior results otherwise.
-n Coerce field values autodetected as int to float, and then
apply the float format.
format (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
for: defines a for-loop using one of three styles. The body statements must
be wrapped in curly braces.
For-loop over stream record:

View file

@ -13,3 +13,8 @@ For-loop over out-of-stream variables:
C-style for-loop:
Example: 'for (var i = 0, var b = 1; i < 10; i += 1, b *= 2) { ... }'
format (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".

View file

@ -80,6 +80,8 @@ UX
! bnf fix for '[[' ']]' etc -- make it a nesting of singles. since otherwise no '[[3,4]]' literals :(
! broadly rethink os.Exit, especially as affecting mlr repl
* ?xyz and ??xyz in repl, for :help and :help find respectively
* consider expanding '(error)' to have more useful error-text
* sync-print option; or (yuck) another xprint variant; or ...; emph dump/eprint
* strptime w/ ...00.Z -> error