These are functions in the Miller programming language
that you can call when you use mlr put and mlr filter. For example, when you type
mlr --icsv --opprint --from example.csv put ' $color = toupper($color); $shape = gsub($shape, "[aeiou]", "*"); '
color  shape    flag  k  index quantity rate
YELLOW tr**ngl* true  1  11    43.6498  9.8870
RED    sq**r*   true  2  15    79.2778  0.0130
RED    c*rcl*   true  3  16    13.8103  2.9010
RED    sq**r*   false 4  48    77.5542  7.4670
PURPLE tr**ngl* false 5  51    81.2290  8.5910
RED    sq**r*   false 6  64    77.1991  9.5310
PURPLE tr**ngl* false 7  65    80.1405  5.8240
YELLOW c*rcl*   true  8  73    63.9785  4.2370
YELLOW c*rcl*   true  9  87    63.5058  8.3350
PURPLE sq**r*   false 10 91    72.3735  8.2430
the toupper and gsub bits are functions.
Overview
At the command line, you can use mlr -f and mlr -F for information much
like what's on this page.
Each function takes a specific number of arguments, as shown below, except for
functions marked as variadic such as min and max. (The latter compute min
and max of any number of arguments.) There is no notion of optional or
default-on-absent arguments. All argument-passing is positional rather than by
name; arguments are passed by value, not by reference.
At the command line, you can get a list of all functions using mlr -f, with
details using mlr -F. (Or, mlr help usage-functions-by-class to get
details in the order shown on this page.) You can get detail for a given
function using mlr help function namegoeshere, e.g. mlr help function gsub.
Operators are listed here along with functions. In this case, the
argument-count is the number of items involved in the infix operator, e.g. we
say x+y so the details for the + operator say that its number of arguments
is 2. Unary operators such as ! and ~ show argument-count of 1; the ternary
? : operator shows an argument-count of 3.
Functions by class
- Arithmetic functions: bitcount, madd, mexp, mmul, msub, pow, %, &, *, **, +, -, .*, .+, .-, ./, /, //, <<, >>, >>>, ^, |, ~.
- Boolean functions: !, !=, !=~, &&, <, <=, <=>, ==, =~, >, >=, ?:, ??, ???, ^^, ||.
- Collections functions: append, arrayify, concat, depth, flatten, get_keys, get_values, haskey, json_parse, json_stringify, leafcount, length, mapdiff, mapexcept, mapselect, mapsum, unflatten.
- Conversion functions: boolean, float, fmtifnum, fmtnum, hexfmt, int, joink, joinkv, joinv, splita, splitax, splitkv, splitkvx, splitnv, splitnvx, string.
- Hashing functions: md5, sha1, sha256, sha512.
- Higher-order-functions functions: any, apply, every, fold, reduce, select, sort.
- Math functions: abs, acos, acosh, asin, asinh, atan, atan2, atanh, cbrt, ceil, cos, cosh, erf, erfc, exp, expm1, floor, invqnorm, log, log10, log1p, logifit, max, min, qnorm, round, roundm, sgn, sin, sinh, sqrt, tan, tanh, urand, urand32, urandelement, urandint, urandrange.
- String functions: capitalize, clean_whitespace, collapse_whitespace, format, gssub, gsub, latin1_to_utf8, lstrip, regextract, regextract_or_else, rstrip, ssub, strip, strlen, sub, substr, substr0, substr1, tolower, toupper, truncate, unformat, unformatx, utf8_to_latin1, ..
- System functions: hostname, os, system, version.
- Time functions: dhms2fsec, dhms2sec, fsec2dhms, fsec2hms, gmt2localtime, gmt2sec, hms2fsec, hms2sec, localtime2gmt, localtime2sec, sec2dhms, sec2gmt, sec2gmtdate, sec2hms, sec2localdate, sec2localtime, strftime, strftime_local, strptime, strptime_local, systime, systimeint, uptime.
- Typing functions: asserting_absent, asserting_array, asserting_bool, asserting_boolean, asserting_empty, asserting_empty_map, asserting_error, asserting_float, asserting_int, asserting_map, asserting_nonempty_map, asserting_not_array, asserting_not_empty, asserting_not_map, asserting_not_null, asserting_null, asserting_numeric, asserting_present, asserting_string, is_absent, is_array, is_bool, is_boolean, is_empty, is_empty_map, is_error, is_float, is_int, is_map, is_nan, is_nonempty_map, is_not_array, is_not_empty, is_not_map, is_not_null, is_null, is_numeric, is_present, is_string, typeof.
Arithmetic functions
bitcount
bitcount (class=arithmetic #args=1) Count of 1-bits.
madd
madd (class=arithmetic #args=3) a + b mod m (integers)
mexp
mexp (class=arithmetic #args=3) a ** b mod m (integers)
mmul
mmul (class=arithmetic #args=3) a * b mod m (integers)
msub
msub (class=arithmetic #args=3) a - b mod m (integers)
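The four modular functions above can be mirrored in a few lines of Python to see what each one computes. This is a sketch of the arithmetic only, not Miller's implementation. It assumes integer inputs and a positive modulus m, and the function names are simply borrowed from the entries above.

```python
# Python sketch of madd, msub, mmul, and mexp.
# Assumption: a, b, m are integers and m > 0.

def madd(a: int, b: int, m: int) -> int:
    """(a + b) mod m."""
    return (a + b) % m


def msub(a: int, b: int, m: int) -> int:
    """(a - b) mod m."""
    return (a - b) % m


def mmul(a: int, b: int, m: int) -> int:
    """(a * b) mod m."""
    return (a * b) % m


def mexp(a: int, b: int, m: int) -> int:
    """(a ** b) mod m, using Python's three-argument pow for efficiency."""
    return pow(a, b, m)


print(madd(5, 3, 7))   # 1
print(msub(5, 6, 7))   # 6
print(mmul(3, 4, 7))   # 5
print(mexp(2, 10, 7))  # 2
```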
pow
pow (class=arithmetic #args=2) Exponentiation. Same as **, but as a function.
%
% (class=arithmetic #args=2) Remainder; never negative-valued (pythonic).
&
& (class=arithmetic #args=2) Bitwise AND.
*
* (class=arithmetic #args=2) Multiplication, with integer*integer overflow to float.
**
** (class=arithmetic #args=2) Exponentiation. Same as pow, but as an infix operator.
+
+ (class=arithmetic #args=1,2) Addition as binary operator; unary plus operator.
-
- (class=arithmetic #args=1,2) Subtraction as binary operator; unary negation operator.
.*
.* (class=arithmetic #args=2) Multiplication, with integer-to-integer overflow.
.+
.+ (class=arithmetic #args=2) Addition, with integer-to-integer overflow.
.-
.- (class=arithmetic #args=2) Subtraction, with integer-to-integer overflow.
./
./ (class=arithmetic #args=2) Integer division, rounding toward zero.
/
/ (class=arithmetic #args=2) Division. Integer / integer is integer when exact, else floating-point: e.g. 6/3 is 2 but 6/4 is 1.5.
//
// (class=arithmetic #args=2) Pythonic integer division, rounding toward negative.
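The differences between %, ./, /, and // are easiest to see with negative operands. The Python sketch below mirrors the rules stated in the entries above; the helper names are hypothetical, the inputs are assumed to be integers, and the divisor is assumed to be positive for the remainder case.

```python
# Python sketch of the division-family rules described above.
# Helper names are hypothetical; they are not Miller functions.

def slash(a: int, b: int):
    """Like '/': integer when the division is exact, otherwise float."""
    return a // b if a % b == 0 else a / b


def slash_slash(a: int, b: int) -> int:
    """Like '//': integer division rounding toward negative infinity."""
    return a // b


def dot_slash(a: int, b: int) -> int:
    """Like './': integer division rounding toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def remainder(a: int, m: int) -> int:
    """Like '%': for m > 0 the result is never negative."""
    return a % m


print(slash(6, 3))         # 2
print(slash(6, 4))         # 1.5
print(slash_slash(-7, 2))  # -4
print(dot_slash(-7, 2))    # -3
print(remainder(-7, 5))    # 3
```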
<<
<< (class=arithmetic #args=2) Bitwise left-shift.
>>
>> (class=arithmetic #args=2) Bitwise signed right-shift.
>>>
>>> (class=arithmetic #args=2) Bitwise unsigned right-shift.
^
^ (class=arithmetic #args=2) Bitwise XOR.
|
| (class=arithmetic #args=2) Bitwise OR.
~
~ (class=arithmetic #args=1) Bitwise NOT. Beware '$y=~$x' since =~ is the regex-match operator: try '$y = ~$x'.
Boolean functions
!
! (class=boolean #args=1) Logical negation.
!=
!= (class=boolean #args=2) String/numeric inequality. Mixing number and string results in string compare.
!=~
!=~ (class=boolean #args=2) String (left-hand side) does not match regex (right-hand side), e.g. '$name !=~ "^a.*b$"'.
&&
&& (class=boolean #args=2) Logical AND.
<
< (class=boolean #args=2) String/numeric less-than. Mixing number and string results in string compare.
<=
<= (class=boolean #args=2) String/numeric less-than-or-equals. Mixing number and string results in string compare.
<=>
<=> (class=boolean #args=2) Comparator, nominally for sorting. Given a <=> b, returns <0, 0, >0 as a < b, a == b, or a > b, respectively.
==
== (class=boolean #args=2) String/numeric equality. Mixing number and string results in string compare.
=~
=~ (class=boolean #args=2) String (left-hand side) matches regex (right-hand side), e.g. '$name =~ "^a.*b$"'. Capture groups \1 through \9 are matched from (...) in the right-hand side, and can be used within subsequent DSL statements. See also "Regular expressions" at https://miller.readthedocs.io.
Examples:
With if-statement: if ($url =~ "http.*com") { ... }
Without if-statement: given $line = "index ab09 file", and $line =~ "([a-z][a-z])([0-9][0-9])", then $label = "[\1:\2]", $label is "[ab:09]"
>
> (class=boolean #args=2) String/numeric greater-than. Mixing number and string results in string compare.
>=
>= (class=boolean #args=2) String/numeric greater-than-or-equals. Mixing number and string results in string compare.
?:
?: (class=boolean #args=3) Standard ternary operator.
??
?? (class=boolean #args=2) Absent-coalesce operator. $a ?? 1 evaluates to 1 if $a isn't defined in the current record.
???
??? (class=boolean #args=2) Absent/empty-coalesce operator. $a ??? 1 evaluates to 1 if $a isn't defined in the current record, or has empty value.
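The two coalescing operators differ only in how they treat a field that is present but empty. A minimal Python sketch, modeling a record as a dict and using hypothetical helper names, shows the distinction described above.

```python
# Python sketch of the ?? and ??? operators applied to a record field.
# Assumption: a record is modeled as a dict; an empty value is "".

def absent_coalesce(record: dict, key: str, default):
    """Like `$key ?? default`: fall back only when the field is absent."""
    return record[key] if key in record else default


def absent_empty_coalesce(record: dict, key: str, default):
    """Like `$key ??? default`: fall back when absent or empty."""
    value = record.get(key, "")
    return default if value == "" else value


record = {"a": "", "b": "7"}
print(repr(absent_coalesce(record, "a", 1)))        # ''
print(repr(absent_empty_coalesce(record, "a", 1)))  # 1
print(repr(absent_coalesce(record, "c", 1)))        # 1
print(repr(absent_empty_coalesce(record, "b", 1)))  # '7'
```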
^^
^^ (class=boolean #args=2) Logical XOR.
||
|| (class=boolean #args=2) Logical OR.
Collections functions
append
append (class=collections #args=2) Appends second argument to end of first argument, which must be an array.
arrayify
arrayify (class=collections #args=1) Walks through a nested map/array, converting any map with consecutive keys "1", "2", ... into an array. Useful to wrap the output of unflatten.
concat
concat (class=collections #args=variadic) Returns the array concatenation of the arguments. Non-array arguments are treated as single-element arrays.
Examples:
concat(1,2,3) is [1,2,3]
concat([1,2],3) is [1,2,3]
concat([1,2],[3]) is [1,2,3]
depth
depth (class=collections #args=1) Prints maximum depth of map/array. Scalars have depth 0.
flatten
flatten (class=collections #args=2,3) Flattens multi-level maps to single-level ones. Useful for nested JSON-like structures for non-JSON file formats like CSV. With two arguments, the first argument is a map (maybe $*) and the second argument is the flatten separator. With three arguments, the first argument is prefix, the second is the flatten separator, and the third argument is a map; flatten($*, ".") is the same as flatten("", ".", $*). See "Flatten/unflatten: converting between JSON and tabular formats" at https://miller.readthedocs.io for more information.
Examples:
flatten({"a":[1,2],"b":3}, ".") is {"a.1": 1, "a.2": 2, "b": 3}.
flatten("a", ".", {"b": { "c": 4 }}) is {"a.b.c" : 4}.
flatten("", ".", {"a": { "b": 3 }}) is {"a.b" : 3}.
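The three-argument form of flatten can be sketched in Python as a recursive walk that joins keys with the separator and numbers array elements from 1. This is an illustration of the behavior shown in the examples above, not Miller's code; in particular it does not model how Miller represents empty maps or arrays.

```python
# Python sketch of flatten(prefix, separator, map).
# Assumption: empty maps/arrays are simply dropped here, which may
# differ from what Miller does.

def flatten(prefix: str, sep: str, value) -> dict:
    out = {}

    def walk(path: str, node) -> None:
        if isinstance(node, dict):
            items = node.items()
        elif isinstance(node, list):
            # Array positions are 1-up, as in the "a.1", "a.2" example.
            items = ((str(i), v) for i, v in enumerate(node, start=1))
        else:
            out[path] = node
            return
        for key, child in items:
            walk(f"{path}{sep}{key}" if path else str(key), child)

    walk(prefix, value)
    return out


print(flatten("", ".", {"a": [1, 2], "b": 3}))  # {'a.1': 1, 'a.2': 2, 'b': 3}
print(flatten("a", ".", {"b": {"c": 4}}))       # {'a.b.c': 4}
print(flatten("", ".", {"a": {"b": 3}}))        # {'a.b': 3}
```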
get_keys
get_keys (class=collections #args=1) Returns array of keys of map or array
get_values
get_values (class=collections #args=1) Returns array of values of map or array -- in the latter case, returns a copy of the array
haskey
haskey (class=collections #args=2) True/false if map has/hasn't key, e.g. 'haskey($*, "a")' or 'haskey(mymap, mykey)', or true/false if array index is in bounds / out of bounds. Error if 1st argument is not a map or array. Note -n..-1 alias to 1..n in Miller arrays.
json_parse
json_parse (class=collections #args=1) Converts value from JSON-formatted string.
json_stringify
json_stringify (class=collections #args=1,2) Converts value to JSON-formatted string. Default output is single-line. With optional second boolean argument set to true, produces multiline output.
leafcount
leafcount (class=collections #args=1) Counts total number of terminal values in map/array. For single-level map/array, same as length.
length
length (class=collections #args=1) Counts number of top-level entries in array/map. Scalars have length 1.
mapdiff
mapdiff (class=collections #args=variadic) With 0 args, returns empty map. With 1 arg, returns copy of arg. With 2 or more, returns copy of arg 1 with all keys from any of remaining argument maps removed.
mapexcept
mapexcept (class=collections #args=variadic) Returns a map with keys from remaining arguments, if any, unset. Remaining arguments can be strings or arrays of string. E.g. 'mapexcept({1:2,3:4,5:6}, 1, 5, 7)' is '{3:4}' and 'mapexcept({1:2,3:4,5:6}, [1, 5, 7])' is '{3:4}'.
mapselect
mapselect (class=collections #args=variadic) Returns a map with only keys from remaining arguments set. Remaining arguments can be strings or arrays of string. E.g. 'mapselect({1:2,3:4,5:6}, 1, 5, 7)' is '{1:2,5:6}' and 'mapselect({1:2,3:4,5:6}, [1, 5, 7])' is '{1:2,5:6}'.
mapsum
mapsum (class=collections #args=variadic) With 0 args, returns empty map. With >= 1 arg, returns a map with key-value pairs from all arguments. Rightmost collisions win, e.g. 'mapsum({1:2,3:4},{1:5})' is '{1:5,3:4}'.
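The four map-combining functions above (mapdiff, mapexcept, mapselect, mapsum) have close dict equivalents in Python. The sketch below reproduces the documented examples; it is an approximation for illustration, and key ordering in the results is not guaranteed to match Miller's.

```python
# Python sketch of mapsum, mapdiff, mapexcept, and mapselect.

def mapsum(*maps: dict) -> dict:
    """Union of all maps; on key collisions the rightmost value wins."""
    out = {}
    for m in maps:
        out.update(m)
    return out


def mapdiff(*maps: dict) -> dict:
    """Copy of the first map minus every key found in the remaining maps."""
    if not maps:
        return {}
    out = dict(maps[0])
    for m in maps[1:]:
        for key in m:
            out.pop(key, None)
    return out


def _keys(args) -> set:
    """Accept keys given individually or as lists of keys."""
    keys = set()
    for arg in args:
        keys.update(arg if isinstance(arg, list) else [arg])
    return keys


def mapexcept(m: dict, *args) -> dict:
    drop = _keys(args)
    return {k: v for k, v in m.items() if k not in drop}


def mapselect(m: dict, *args) -> dict:
    keep = _keys(args)
    return {k: v for k, v in m.items() if k in keep}


print(mapsum({1: 2, 3: 4}, {1: 5}))              # {1: 5, 3: 4}
print(mapdiff({1: 2, 3: 4, 5: 6}, {3: 0}))       # {1: 2, 5: 6}
print(mapexcept({1: 2, 3: 4, 5: 6}, 1, 5, 7))    # {3: 4}
print(mapselect({1: 2, 3: 4, 5: 6}, [1, 5, 7]))  # {1: 2, 5: 6}
```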
unflatten
unflatten (class=collections #args=2) Reverses flatten. Useful for nested JSON-like structures for non-JSON file formats like CSV. The first argument is a map, and the second argument is the flatten separator. See also arrayify. See "Flatten/unflatten: converting between JSON and tabular formats" at https://miller.readthedocs.io for more information.
Example:
unflatten({"a.b.c" : 4}, ".") is {"a": {"b": { "c": 4 }}}.
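To illustrate how unflatten rebuilds nesting from separator-joined keys, and how arrayify then turns maps keyed "1", "2", ... into arrays, here is a hedged Python sketch. It assumes no flattened key is both a leaf and a prefix of another key, and it is not Miller's implementation.

```python
# Python sketch of unflatten(map, separator) and arrayify(value).

def unflatten(flat: dict, sep: str) -> dict:
    out = {}
    for key, value in flat.items():
        parts = str(key).split(sep)
        node = out
        for part in parts[:-1]:
            # Create intermediate maps on the way down.
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return out


def arrayify(node):
    """Convert any map whose keys are exactly "1".."n" into a list."""
    if isinstance(node, dict):
        node = {k: arrayify(v) for k, v in node.items()}
        keys = list(node)
        if keys and keys == [str(i) for i in range(1, len(keys) + 1)]:
            return [node[k] for k in keys]
        return node
    if isinstance(node, list):
        return [arrayify(v) for v in node]
    return node


print(unflatten({"a.b.c": 4}, "."))  # {'a': {'b': {'c': 4}}}
print(arrayify(unflatten({"a.1": 1, "a.2": 2, "b": 3}, ".")))
# {'a': [1, 2], 'b': 3}
```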
Conversion functions
boolean
boolean (class=conversion #args=1) Convert int/float/bool/string to boolean.
float
float (class=conversion #args=1) Convert int/float/bool/string to float.
fmtifnum
fmtifnum (class=conversion #args=2) Identical to fmtnum, except returns the first argument as-is if the output would be an error.
Examples:
fmtifnum(3.4, "%.6f") gives 3.400000
fmtifnum("abc", "%.6f") gives abc
$* = fmtifnum($*, "%.6f") formats numeric fields in the current record, leaving non-numeric ones alone
fmtnum
fmtnum (class=conversion #args=2) Convert int/float/bool to string using printf-style format string (https://pkg.go.dev/fmt), e.g. '$s = fmtnum($n, "%08d")' or '$t = fmtnum($n, "%.6e")'. This function recurses on array and map values. Example: $x = fmtnum($x, "%.6f")
hexfmt
hexfmt (class=conversion #args=1) Convert int to hex string, e.g. 255 to "0xff".
int
int (class=conversion #args=1) Convert int/float/bool/string to int.
joink
joink (class=conversion #args=2) Makes string from map/array keys. First argument is map/array; second is separator string.
Examples:
joink({"a":3,"b":4,"c":5}, ",") = "a,b,c".
joink([1,2,3], ",") = "1,2,3".
joinkv
joinkv (class=conversion #args=3) Makes string from map/array key-value pairs. First argument is map/array; second is pair-separator string; third is field-separator string. Mnemonic: the "=" comes before the "," in the output and in the arguments to joinkv.
Examples:
joinkv([3,4,5], "=", ",") = "1=3,2=4,3=5"
joinkv({"a":3,"b":4,"c":5}, ":", ";") = "a:3;b:4;c:5"
joinv
joinv (class=conversion #args=2) Makes string from map/array values. First argument is map/array; second is separator string.
Examples:
joinv([3,4,5], ",") = "3,4,5"
joinv({"a":3,"b":4,"c":5}, ",") = "3,4,5"
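The joink, joinkv, and joinv examples above all follow from one idea: an array behaves like a map whose keys are the 1-up positions. A small Python sketch, offered as an illustration rather than Miller's actual code:

```python
# Python sketch of joink, joinv, and joinkv.

def _pairs(collection):
    """Key-value pairs of a map, or (1-up index, element) pairs of an array."""
    if isinstance(collection, dict):
        return list(collection.items())
    return list(enumerate(collection, start=1))


def joink(collection, sep: str) -> str:
    return sep.join(str(k) for k, _ in _pairs(collection))


def joinv(collection, sep: str) -> str:
    return sep.join(str(v) for _, v in _pairs(collection))


def joinkv(collection, pair_sep: str, field_sep: str) -> str:
    return field_sep.join(f"{k}{pair_sep}{v}" for k, v in _pairs(collection))


print(joink({"a": 3, "b": 4, "c": 5}, ","))        # a,b,c
print(joinv({"a": 3, "b": 4, "c": 5}, ","))        # 3,4,5
print(joinkv([3, 4, 5], "=", ","))                 # 1=3,2=4,3=5
print(joinkv({"a": 3, "b": 4, "c": 5}, ":", ";"))  # a:3;b:4;c:5
```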
splita
splita (class=conversion #args=2) Splits string into array with type inference. First argument is string to split; second is the separator to split on.
Example:
splita("3,4,5", ",") = [3,4,5]
splitax
splitax (class=conversion #args=2) Splits string into array without type inference. First argument is string to split; second is the separator to split on.
Example:
splitax("3,4,5", ",") = ["3","4","5"]
splitkv
splitkv (class=conversion #args=3) Splits string by separators into map with type inference. First argument is string to split; second argument is pair separator; third argument is field separator.
Example:
splitkv("a=3,b=4,c=5", "=", ",") = {"a":3,"b":4,"c":5}
splitkvx
splitkvx (class=conversion #args=3) Splits string by separators into map without type inference (keys and values are strings). First argument is string to split; second argument is pair separator; third argument is field separator.
Example:
splitkvx("a=3,b=4,c=5", "=", ",") = {"a":"3","b":"4","c":"5"}
splitnv
splitnv (class=conversion #args=2) Splits string by separator into integer-indexed map with type inference. First argument is string to split; second argument is separator to split on.
Example:
splitnv("a,b,c", ",") = {"1":"a","2":"b","3":"c"}
splitnvx
splitnvx (class=conversion #args=2) Splits string by separator into integer-indexed map without type inference (values are strings). First argument is string to split; second argument is separator to split on.
Example:
splitnvx("3,4,5", ",") = {"1":"3","2":"4","3":"5"}
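The split functions above come in pairs: the plain name applies type inference to each piece and the x-suffixed name leaves every piece as a string. The Python sketch below models that with a deliberately simplified infer helper (decimal ints and floats only, which is an assumption; Miller's own inference rules are broader). It also assumes every key-value pair contains the pair separator.

```python
# Python sketch of splita/splitax, splitnv/splitnvx, and splitkv/splitkvx.
# Assumption: type inference is reduced to "int, else float, else string".

def infer(text: str):
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def splitax(s: str, sep: str) -> list:
    return s.split(sep)


def splita(s: str, sep: str) -> list:
    return [infer(piece) for piece in s.split(sep)]


def splitnvx(s: str, sep: str) -> dict:
    return {str(i): piece for i, piece in enumerate(s.split(sep), start=1)}


def splitnv(s: str, sep: str) -> dict:
    return {k: infer(v) for k, v in splitnvx(s, sep).items()}


def splitkvx(s: str, pair_sep: str, field_sep: str) -> dict:
    out = {}
    for pair in s.split(field_sep):
        key, _, value = pair.partition(pair_sep)
        out[key] = value
    return out


def splitkv(s: str, pair_sep: str, field_sep: str) -> dict:
    return {k: infer(v) for k, v in splitkvx(s, pair_sep, field_sep).items()}


print(splita("3,4,5", ","))                 # [3, 4, 5]
print(splitax("3,4,5", ","))                # ['3', '4', '5']
print(splitkv("a=3,b=4,c=5", "=", ","))     # {'a': 3, 'b': 4, 'c': 5}
print(splitnvx("3,4,5", ","))               # {'1': '3', '2': '4', '3': '5'}
```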
string
string (class=conversion #args=1) Convert int/float/bool/string/array/map to string.
Hashing functions
md5
md5 (class=hashing #args=1) MD5 hash.
sha1
sha1 (class=hashing #args=1) SHA1 hash.
sha256
sha256 (class=hashing #args=1) SHA256 hash.
sha512
sha512 (class=hashing #args=1) SHA512 hash.
Higher-order-functions functions
any
any (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, yields a boolean true if the argument function returns true for any array/map element, false otherwise. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean.
Examples:
Array example: any([10,20,30], func(e) {return $index == e})
Map example: any({"a": "foo", "b": "bar"}, func(k,v) {return $[k] == v})
apply
apply (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, applies the function to each element of the array/map. For arrays, the function should take one argument, for array element; it should return a new element. For maps, it should take two arguments, for map-element key and value; it should return a new key-value pair (i.e. a single-entry map).
Examples:
Array example: apply([1,2,3,4,5], func(e) {return e ** 3}) returns [1, 8, 27, 64, 125].
Map example: apply({"a":1, "b":3, "c":5}, func(k,v) {return {toupper(k): v ** 2}}) returns {"A": 1, "B":9, "C": 25}.
every
every (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, yields a boolean true if the argument function returns true for every array/map element, false otherwise. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean.
Examples:
Array example: every(["a", "b", "c"], func(e) {return $[e] >= 0})
Map example: every({"a": "foo", "b": "bar"}, func(k,v) {return $[k] == v})
fold
fold (class=higher-order-functions #args=3) Given a map or array as first argument and a function as second argument, accumulates entries into a final output -- for example, sum or product. For arrays, the function should take two arguments, for accumulated value and array element. For maps, it should take four arguments, for accumulated key and value, and map-element key and value; it should return the updated accumulator as a new key-value pair (i.e. a single-entry map). The start value for the accumulator is taken from the third argument.
Examples:
Array example: fold([1,2,3,4,5], func(acc,e) {return acc + e**3}, 10000) returns 10225.
Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum": accv+ev**2}}, {"sum":10000}) returns {"sum": 10035}.
reduce
reduce (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, accumulates entries into a final output -- for example, sum or product. For arrays, the function should take two arguments, for accumulated value and array element, and return the accumulated element. For maps, it should take four arguments, for accumulated key and value, and map-element key and value; it should return the updated accumulator as a new key-value pair (i.e. a single-entry map). The start value for the accumulator is the first element for arrays, or the first element's key-value pair for maps.
Examples:
Array example: reduce([1,2,3,4,5], func(acc,e) {return acc + e**3}) returns 225.
Map example: reduce({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum_of_squares": accv + ev**2}}) returns {"sum_of_squares": 35}.
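The difference between fold and reduce is only where the accumulator starts: fold takes it from the third argument, while reduce takes it from the first element (or first key-value pair). The Python sketch below follows the descriptions above and reproduces the four examples; it assumes non-empty input for reduce and is not Miller's implementation.

```python
# Python sketch of fold and reduce for arrays (lists) and maps (dicts).
from functools import reduce as py_reduce


def fold(collection, f, start):
    if isinstance(collection, dict):
        acc = start  # a single-entry map
        for key, value in collection.items():
            (acc_key, acc_value), = acc.items()
            acc = f(acc_key, acc_value, key, value)
        return acc
    return py_reduce(f, collection, start)


def reduce(collection, f):
    # Assumption: the collection is non-empty.
    if isinstance(collection, dict):
        items = list(collection.items())
        first_key, first_value = items[0]
        return fold(dict(items[1:]), f, {first_key: first_value})
    return py_reduce(f, collection)


print(fold([1, 2, 3, 4, 5], lambda acc, e: acc + e**3, 10000))  # 10225
print(reduce([1, 2, 3, 4, 5], lambda acc, e: acc + e**3))       # 225
print(fold({"a": 1, "b": 3, "c": 5},
           lambda acck, accv, ek, ev: {"sum": accv + ev**2},
           {"sum": 10000}))                                     # {'sum': 10035}
print(reduce({"a": 1, "b": 3, "c": 5},
             lambda acck, accv, ek, ev: {"sum_of_squares": accv + ev**2}))
# {'sum_of_squares': 35}
```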
select
select (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, includes each input element in the output if the function returns true. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean.
Examples:
Array example: select([1,2,3,4,5], func(e) {return e >= 3}) returns [3, 4, 5].
Map example: select({"a":1, "b":3, "c":5}, func(k,v) {return v >= 3}) returns {"b":3, "c": 5}.
sort
sort (class=higher-order-functions #args=1-2) Given a map or array as first argument and string flags or function as optional second argument, returns a sorted copy of the input. With one argument, sorts array elements with numbers first numerically and then strings lexically, and map elements likewise by map keys. If the second argument is a string, it can contain any of "f" for lexical ("n" is for the above default), "c" for case-folded lexical, or "t" for natural sort order. An additional "r" in that string is for reverse. If the second argument is a function, then for arrays it should take two arguments a and b, returning < 0, 0, or > 0 as a < b, a == b, or a > b respectively; for maps the function should take four arguments ak, av, bk, and bv, again returning < 0, 0, or > 0, using a and b's keys and values.
Examples:
Default sorting: sort([3,"A",1,"B",22]) returns [1, 3, 22, "A", "B"].
Note that this is numbers before strings.
Default sorting: sort(["E","a","c","B","d"]) returns ["B", "E", "a", "c", "d"].
Note that this is uppercase before lowercase.
Case-folded ascending: sort(["E","a","c","B","d"], "c") returns ["a", "B", "c", "d", "E"].
Case-folded descending: sort(["E","a","c","B","d"], "cr") returns ["E", "d", "c", "B", "a"].
Natural sorting: sort(["a1","a10","a100","a2","a20","a200"], "t") returns ["a1", "a2", "a10", "a20", "a100", "a200"].
Array with function: sort([5,2,3,1,4], func(a,b) {return b <=> a}) returns [5,4,3,2,1].
Map with function: sort({"c":2,"a":3,"b":1}, func(ak,av,bk,bv) {return bv <=> av}) returns {"a":3,"c":2,"b":1}.
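The sort orders described above can be approximated in Python for arrays. The sketch below covers the default order, the "c" and "cr" flags, one common reading of natural ordering for the "t" flag, and a comparator function. The helper names are hypothetical and the natural-sort key is an assumption about what "natural" means, not a statement of Miller's algorithm.

```python
# Python sketch of several sort modes for arrays.
import re
from functools import cmp_to_key


def default_sort(values: list) -> list:
    """Numbers first (numerically), then strings (lexically)."""
    numbers = sorted(v for v in values if isinstance(v, (int, float)))
    strings = sorted(v for v in values if isinstance(v, str))
    return numbers + strings


def case_folded_sort(values: list, reverse: bool = False) -> list:
    return sorted(values, key=str.casefold, reverse=reverse)


def natural_sort(values: list) -> list:
    """Compare digit runs as numbers, everything else as text."""
    def key(text: str):
        return [int(t) if t.isdigit() else t for t in re.split(r"(\d+)", text)]
    return sorted(values, key=key)


def spaceship(a, b) -> int:
    """Like <=>: negative, zero, or positive."""
    return (a > b) - (a < b)


def sort_with_function(values: list, cmp) -> list:
    return sorted(values, key=cmp_to_key(cmp))


print(default_sort([3, "A", 1, "B", 22]))                # [1, 3, 22, 'A', 'B']
print(case_folded_sort(["E", "a", "c", "B", "d"]))       # ['a', 'B', 'c', 'd', 'E']
print(case_folded_sort(["E", "a", "c", "B", "d"], True)) # ['E', 'd', 'c', 'B', 'a']
print(natural_sort(["a1", "a10", "a100", "a2", "a20", "a200"]))
# ['a1', 'a2', 'a10', 'a20', 'a100', 'a200']
print(sort_with_function([5, 2, 3, 1, 4], lambda a, b: spaceship(b, a)))
# [5, 4, 3, 2, 1]
```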
Math functions
abs
abs (class=math #args=1) Absolute value.
acos
acos (class=math #args=1) Inverse trigonometric cosine.
acosh
acosh (class=math #args=1) Inverse hyperbolic cosine.
asin
asin (class=math #args=1) Inverse trigonometric sine.
asinh
asinh (class=math #args=1) Inverse hyperbolic sine.
atan
atan (class=math #args=1) One-argument arctangent.
atan2
atan2 (class=math #args=2) Two-argument arctangent.
atanh
atanh (class=math #args=1) Inverse hyperbolic tangent.
cbrt
cbrt (class=math #args=1) Cube root.
ceil
ceil (class=math #args=1) Ceiling: nearest integer at or above.
cos
cos (class=math #args=1) Trigonometric cosine.
cosh
cosh (class=math #args=1) Hyperbolic cosine.
erf
erf (class=math #args=1) Error function.
erfc
erfc (class=math #args=1) Complementary error function.
exp
exp (class=math #args=1) Exponential function e**x.
expm1
expm1 (class=math #args=1) e**x - 1.
floor
floor (class=math #args=1) Floor: nearest integer at or below.
invqnorm
invqnorm (class=math #args=1) Inverse of normal cumulative distribution function. Note that invqnorm(urand()) is normally distributed.
log
log (class=math #args=1) Natural (base-e) logarithm.
log10
log10 (class=math #args=1) Base-10 logarithm.
log1p
log1p (class=math #args=1) log(1+x).
logifit
logifit (class=math #args=3) Given m and b from logistic regression, compute fit: $yhat=logifit($x,$m,$b).
max
max (class=math #args=variadic) Max of n numbers; null loses.
min
min (class=math #args=variadic) Min of n numbers; null loses.
qnorm
qnorm (class=math #args=1) Normal cumulative distribution function.
round
round (class=math #args=1) Round to nearest integer.
roundm
roundm (class=math #args=2) Round to nearest multiple of m: roundm($x,$m) is the same as round($x/$m)*$m.
sgn
sgn (class=math #args=1) +1, 0, -1 for positive, zero, negative input respectively.
sin
sin (class=math #args=1) Trigonometric sine.
sinh
sinh (class=math #args=1) Hyperbolic sine.
sqrt
sqrt (class=math #args=1) Square root.
tan
tan (class=math #args=1) Trigonometric tangent.
tanh
tanh (class=math #args=1) Hyperbolic tangent.
urand
urand (class=math #args=0) Floating-point numbers uniformly distributed on the unit interval. Int-valued example: '$n=floor(20+urand()*11)'.
urand32
urand32 (class=math #args=0) Integer uniformly distributed between 0 and 2**32-1 inclusive.
urandelement
urandelement (class=math #args=1) Random sample from the first argument, which must be a non-empty array.
urandint
urandint (class=math #args=2) Integer uniformly distributed between inclusive integer endpoints.
urandrange
urandrange (class=math #args=2) Floating-point numbers uniformly distributed on the interval [a, b).
String functions
capitalize
capitalize (class=string #args=1) Convert string's first character to uppercase.
clean_whitespace
clean_whitespace (class=string #args=1) Same as collapse_whitespace and strip.
collapse_whitespace
collapse_whitespace (class=string #args=1) Strip repeated whitespace from string.
format
format (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2) gives "1:2:".
format("{}:{}:{}", 1,2,3) gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
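The placeholder rule for format (missing arguments become empty strings, surplus arguments are dropped) can be sketched in Python as follows. The name format_values is hypothetical, chosen to avoid shadowing Python's built-in format, and only the bare "{}" placeholder described above is handled.

```python
# Python sketch of format(fmt, args...) with "{}" placeholders.

def format_values(fmt: str, *args) -> str:
    pieces = fmt.split("{}")
    out = [pieces[0]]
    for i, piece in enumerate(pieces[1:]):
        # Too few arguments: substitute the empty string.
        out.append(str(args[i]) if i < len(args) else "")
        out.append(piece)
    # Too many arguments: the extras are never consumed.
    return "".join(out)


print(format_values("{}:{}:{}", 1, 2))        # 1:2:
print(format_values("{}:{}:{}", 1, 2, 3))     # 1:2:3
print(format_values("{}:{}:{}", 1, 2, 3, 4))  # 1:2:3
```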
gssub
gssub (class=string #args=3) Like gsub but does no regexing. No characters are special.
Example:
gssub("ab.d.fg", ".", "X") gives "abXdXfg"
gsub
gsub (class=string #args=3) '$name = gsub($name, "old", "new")': replace all, with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to gsub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io.
Examples:
gsub("ababab", "ab", "XY") gives "XYXYXY"
gsub("abc.def", ".", "X") gives "XXXXXXX"
gsub("abc.def", "\.", "X") gives "abcXdef"
gsub("abcdefg", "[ce]", "X") gives "abXdXfg"
gsub("prefix4529:suffix8567", "(....ix)([0-9]+)", "[\1 : \2]") gives "[prefix : 4529]:[suffix : 8567]"
latin1_to_utf8
latin1_to_utf8 (class=string #args=1) Tries to convert Latin-1-encoded string to UTF-8-encoded string. If argument is array or map, recurses into it.
Examples:
$y = latin1_to_utf8($x)
$* = latin1_to_utf8($*)
lstrip
lstrip (class=string #args=1) Strip leading whitespace from string.
regextract
regextract (class=string #args=2) Extracts a substring (the first, if there are multiple matches), matching a regular expression, from the input. Does not use capture groups; see also the =~ operator which does.
Examples:
regextract("index ab09 file", "[a-z][a-z][0-9][0-9]") gives "ab09"
regextract("index a999 file", "[a-z][a-z][0-9][0-9]") gives (absent), which will result in an assignment not happening.
regextract_or_else
regextract_or_else (class=string #args=3) Like regextract but the third argument is the return value in case the input string (first argument) doesn't match the pattern (second argument).
Examples:
regextract_or_else("index ab09 file", "[a-z][a-z][0-9][0-9]", "nonesuch") gives "ab09"
regextract_or_else("index a999 file", "[a-z][a-z][0-9][0-9]", "nonesuch") gives "nonesuch"
rstrip
rstrip (class=string #args=1) Strip trailing whitespace from string.
ssub
ssub (class=string #args=3) Like sub but does no regexing. No characters are special.
Example:
ssub("abc.def", ".", "X") gives "abcXdef"
strip
strip (class=string #args=1) Strip leading and trailing whitespace from string.
strlen
strlen (class=string #args=1) String length.
sub
sub (class=string #args=3) '$name = sub($name, "old", "new")': replace once (first match, if there are multiple matches), with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to sub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io.
Examples:
sub("ababab", "ab", "XY") gives "XYabab"
sub("abc.def", ".", "X") gives "Xbc.def"
sub("abc.def", "\.", "X") gives "abcXdef"
sub("abcdefg", "[ce]", "X") gives "abXdefg"
sub("prefix4529:suffix8567", "suffix([0-9]+)", "name\1") gives "prefix4529:name8567"
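The relationship among sub, gsub, ssub, and gssub (first match versus all matches, regex versus literal) maps directly onto Python's re.sub and str.replace. The sketch below reproduces several of the examples above. Python's regex dialect is not identical to the one Miller uses, so treat it as an analogy for simple patterns only.

```python
# Python sketch of sub, gsub, ssub, and gssub.
import re


def sub(s: str, pattern: str, replacement: str) -> str:
    """Regex replace, first match only."""
    return re.sub(pattern, replacement, s, count=1)


def gsub(s: str, pattern: str, replacement: str) -> str:
    """Regex replace, all matches."""
    return re.sub(pattern, replacement, s)


def ssub(s: str, old: str, new: str) -> str:
    """Literal replace, first occurrence only."""
    return s.replace(old, new, 1)


def gssub(s: str, old: str, new: str) -> str:
    """Literal replace, all occurrences."""
    return s.replace(old, new)


print(sub("abc.def", ".", "X"))       # Xbc.def
print(gsub("abc.def", ".", "X"))      # XXXXXXX
print(ssub("abc.def", ".", "X"))      # abcXdef
print(gssub("ab.d.fg", ".", "X"))     # abXdXfg
print(gsub("prefix4529:suffix8567", r"(....ix)([0-9]+)", r"[\1 : \2]"))
# [prefix : 4529]:[suffix : 8567]
```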
substr
substr (class=string #args=3) substr is an alias for substr0. See also substr1. Miller is generally 1-up with all array and string indices, but, this is a backward-compatibility issue with Miller 5 and below. Arrays are new in Miller 6; the substr function is older.
substr0
substr0 (class=string #args=3) substr0(s,m,n) gives substring of s from 0-up position m to n inclusive. Negative indices -len .. -1 alias to 0 .. len-1. See also substr and substr1.
substr1
substr1 (class=string #args=3) substr1(s,m,n) gives substring of s from 1-up position m to n inclusive. Negative indices -len .. -1 alias to 1 .. len. See also substr and substr0.
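Because substr0 and substr1 both use inclusive end positions and differ only in their starting index, a short Python sketch can make the off-by-one relationships concrete. It implements the negative-index aliasing stated above and assumes the indices are in range; how Miller treats out-of-range indices is not modeled.

```python
# Python sketch of substr0 and substr1 with inclusive bounds.
# Assumption: after aliasing, m and n fall inside the string.

def substr0(s: str, m: int, n: int) -> str:
    """0-up positions m..n inclusive; -len..-1 alias to 0..len-1."""
    if m < 0:
        m += len(s)
    if n < 0:
        n += len(s)
    return s[m:n + 1]


def substr1(s: str, m: int, n: int) -> str:
    """1-up positions m..n inclusive; -len..-1 alias to 1..len."""
    if m < 0:
        m += len(s) + 1
    if n < 0:
        n += len(s) + 1
    return s[m - 1:n]


print(substr0("hello", 1, 3))    # ell
print(substr1("hello", 1, 3))    # hel
print(substr0("hello", -4, -2))  # ell
print(substr1("hello", -4, -2))  # ell
```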
tolower
tolower (class=string #args=1) Convert string to lowercase.
toupper
toupper (class=string #args=1) Convert string to uppercase.
truncate
truncate (class=string #args=2) Truncates string first argument to max length of int second argument.
unformat
unformat (class=string #args=2) Using first argument as format string, unpacks second argument into an array of matches, with type-inference. On non-match, returns error -- use is_error() to check.
Examples:
unformat("{}:{}:{}", "1:2:3") gives [1, 2, 3].
unformat("{}h{}m{}s", "3h47m22s") gives [3, 47, 22].
is_error(unformat("{}h{}m{}s", "3:47:22")) gives true.
unformatx
unformatx (class=string #args=2) Same as unformat, but without type-inference.
Examples:
unformatx("{}:{}:{}", "1:2:3") gives ["1", "2", "3"].
unformatx("{}h{}m{}s", "3h47m22s") gives ["3", "47", "22"].
is_error(unformatx("{}h{}m{}s", "3:47:22")) gives true.
utf8_to_latin1
utf8_to_latin1 (class=string #args=1) Tries to convert UTF-8-encoded string to Latin-1-encoded string. If argument is array or map, recurses into it.
Examples:
$y = utf8_to_latin1($x)
$* = utf8_to_latin1($*)
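At the byte level, the two encoding conversions (latin1_to_utf8 and utf8_to_latin1) amount to a decode followed by an encode. The Python sketch below shows this on raw bytes and includes the recursion into arrays and maps. Returning the input unchanged when a conversion is impossible is an assumption about what "tries to" means here; the text above does not spell out the failure behavior.

```python
# Python sketch of latin1_to_utf8 and utf8_to_latin1 on byte strings.
# Assumption: on failure the input is returned unchanged.

def latin1_to_utf8(data: bytes) -> bytes:
    # Every byte is a valid Latin-1 character, so decoding cannot fail.
    return data.decode("latin-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    try:
        return data.decode("utf-8").encode("latin-1")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return data  # not valid UTF-8, or not representable in Latin-1


def recurse(value, convert):
    """Apply convert to every byte string inside nested maps and arrays."""
    if isinstance(value, dict):
        return {k: recurse(v, convert) for k, v in value.items()}
    if isinstance(value, list):
        return [recurse(v, convert) for v in value]
    if isinstance(value, bytes):
        return convert(value)
    return value


print(latin1_to_utf8(b"caf\xe9"))      # b'caf\xc3\xa9'
print(utf8_to_latin1(b"caf\xc3\xa9"))  # b'caf\xe9'
print(recurse({"x": [b"caf\xe9"]}, latin1_to_utf8))
# {'x': [b'caf\xc3\xa9']}
```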
.
. (class=string #args=2) String concatenation. Non-strings are coerced, so you can do '"ax".98' etc.
System functions
hostname
hostname (class=system #args=0) Returns the hostname as a string.
os
os (class=system #args=0) Returns the operating-system name as a string.
system
system (class=system #args=1) Run command string, yielding its stdout minus final carriage return.
version
version (class=system #args=0) Returns the Miller version as a string.
Time functions
dhms2fsec
dhms2fsec (class=time #args=1) Recovers floating-point seconds as in dhms2fsec("5d18h53m20.250000s") = 500000.250000
dhms2sec
dhms2sec (class=time #args=1) Recovers integer seconds as in dhms2sec("5d18h53m20s") = 500000
fsec2dhms
fsec2dhms (class=time #args=1) Formats floating-point seconds as in fsec2dhms(500000.25) = "5d18h53m20.250000s"
fsec2hms
fsec2hms (class=time #args=1) Formats floating-point seconds as in fsec2hms(5000.25) = "01:23:20.250000"
gmt2localtime
gmt2localtime (class=time #args=1,2) Convert from a GMT-time string to a local-time string. Consults $TZ unless second argument is supplied.
Examples:
gmt2localtime("1999-12-31T22:00:00Z") = "2000-01-01 00:00:00" with TZ="Asia/Istanbul"
gmt2localtime("1999-12-31T22:00:00Z", "Asia/Istanbul") = "2000-01-01 00:00:00"
gmt2sec
gmt2sec (class=time #args=1) Parses GMT timestamp as integer seconds since the epoch.
Example:
gmt2sec("2001-02-03T04:05:06Z") = 981173106
hms2fsec
hms2fsec (class=time #args=1) Recovers floating-point seconds as in hms2fsec("01:23:20.250000") = 5000.250000
hms2sec
hms2sec (class=time #args=1) Recovers integer seconds as in hms2sec("01:23:20") = 5000
localtime2gmt
localtime2gmt (class=time #args=1,2) Convert from a local-time string to a GMT-time string. Consults $TZ unless second argument is supplied.
Examples:
localtime2gmt("2000-01-01 00:00:00") = "1999-12-31T22:00:00Z" with TZ="Asia/Istanbul"
localtime2gmt("2000-01-01 00:00:00", "Asia/Istanbul") = "1999-12-31T22:00:00Z"
localtime2sec
localtime2sec (class=time #args=1,2) Parses local timestamp as integer seconds since the epoch. Consults $TZ environment variable, unless second argument is supplied.
Examples:
localtime2sec("2001-02-03 04:05:06") = 981165906 with TZ="Asia/Istanbul"
localtime2sec("2001-02-03 04:05:06", "Asia/Istanbul") = 981165906
sec2dhms
sec2dhms (class=time #args=1) Formats integer seconds as in sec2dhms(500000) = "5d18h53m20s"
sec2gmt
sec2gmt (class=time #args=1,2) Formats seconds since epoch as GMT timestamp. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
sec2gmt(1234567890) = "2009-02-13T23:31:30Z"
sec2gmt(1234567890.123456) = "2009-02-13T23:31:30Z"
sec2gmt(1234567890.123456, 6) = "2009-02-13T23:31:30.123456Z"
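The gmt2sec and sec2gmt entries above are inverses for whole seconds, which a short Python sketch can demonstrate using only the standard library. This is an illustration under stated assumptions, not Miller's code: it supports at most 6 decimal places (Python's microsecond resolution) and truncates rather than rounds the fractional digits.

```python
# Python sketch of gmt2sec and sec2gmt.
import calendar
import time
from datetime import datetime, timezone


def gmt2sec(stamp: str) -> int:
    """Parse an ISO-8601 'Z' timestamp into integer epoch seconds."""
    return calendar.timegm(time.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ"))


def sec2gmt(seconds, ndecimals: int = 0):
    """Format epoch seconds as a GMT timestamp; non-numbers pass through."""
    if not isinstance(seconds, (int, float)):
        return seconds
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    text = dt.strftime("%Y-%m-%dT%H:%M:%S")
    if ndecimals > 0:
        # Assumption: at most 6 decimals, truncated from the microseconds.
        text += "." + f"{dt.microsecond:06d}"[:ndecimals]
    return text + "Z"


print(gmt2sec("2001-02-03T04:05:06Z"))  # 981173106
print(sec2gmt(1234567890))              # 2009-02-13T23:31:30Z
print(sec2gmt(1234567890.123456, 6))    # 2009-02-13T23:31:30.123456Z
print(sec2gmt("abc"))                   # abc
```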
sec2gmtdate
sec2gmtdate (class=time #args=1) Formats seconds since epoch (integer part) as GMT timestamp with year-month-date. Leaves non-numbers as-is. Example: sec2gmtdate(1440768801.7) = "2015-08-28".
sec2hms
sec2hms (class=time #args=1) Formats integer seconds as in sec2hms(5000) = "01:23:20"
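The hms2sec and sec2hms functions undo each other, and sec2dhms adds a days component. A minimal sketch using the values quoted in the descriptions above, run with mlr -n so that no input is read:

```shell
mlr -n put 'end {
  print hms2sec("01:23:20");           # 5000
  print sec2hms(hms2sec("01:23:20"));  # 01:23:20
  # 500000 = 5*86400 + 18*3600 + 53*60 + 20
  print sec2dhms(500000);              # 5d18h53m20s
}'
```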
sec2localdate
sec2localdate (class=time #args=1,2) Formats seconds since epoch (integer part) as local timestamp with year-month-date. Leaves non-numbers as-is. Consults $TZ environment variable unless second argument is supplied.
Examples:
sec2localdate(1440768801.7) = "2015-08-28" with TZ="Asia/Istanbul"
sec2localdate(1440768801.7, "Asia/Istanbul") = "2015-08-28"
sec2localtime
sec2localtime (class=time #args=1,2,3) Formats seconds since epoch (integer part) as local timestamp. Consults $TZ environment variable unless third argument is supplied. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
sec2localtime(1234567890) = "2009-02-14 01:31:30" with TZ="Asia/Istanbul"
sec2localtime(1234567890.123456) = "2009-02-14 01:31:30" with TZ="Asia/Istanbul"
sec2localtime(1234567890.123456, 6) = "2009-02-14 01:31:30.123456" with TZ="Asia/Istanbul"
sec2localtime(1234567890.123456, 6, "Asia/Istanbul") = "2009-02-14 01:31:30.123456"
strftime
strftime (class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
strftime_local
strftime_local (class=time #args=2,3) Like strftime but consults the $TZ environment variable to get the local time zone, unless third argument is supplied.
Examples:
strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%S %z") = "2015-08-28 16:33:21 +0300" with TZ="Asia/Istanbul"
strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z") = "2015-08-28 16:33:21.700 +0300" with TZ="Asia/Istanbul"
strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z", "Asia/Istanbul") = "2015-08-28 16:33:21.700 +0300"
strptime
strptime (class=time #args=2) Parses timestamp as floating-point seconds since the epoch. See also strptime_local.
Examples:
strptime("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801.000000
strptime("2015-08-28T13:33:21.345Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801.345000
strptime("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z") = 14400
strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
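Because strptime and strftime take the same format string, parsing a timestamp and formatting the result with that same format should reproduce the input. A minimal sketch, with fmt and t as illustrative local variable names:

```shell
mlr -n put 'end {
  fmt = "%Y-%m-%dT%H:%M:%SZ";
  # Parse to seconds since the epoch, then format back with the same format.
  t = strptime("2015-08-28T13:33:21Z", fmt);
  print strftime(t, fmt);  # 2015-08-28T13:33:21Z
}'
```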
strptime_local
strptime_local (class=time #args=2,3) Like strptime but consults the $TZ environment variable to get the local time zone, unless third argument is supplied.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"
strptime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S", "Asia/Istanbul") = 1440758001
systime
systime (class=time #args=0) Returns the system time in floating-point seconds since the epoch.
systimeint
systimeint (class=time #args=0) Returns the system time in integer seconds since the epoch.
uptime
uptime (class=time #args=0) Returns the time in floating-point seconds since the current Miller program was started.
Typing functions
asserting_absent
asserting_absent (class=typing #args=1) Aborts with an error if is_absent on the argument returns false, else returns its argument.
asserting_array
asserting_array (class=typing #args=1) Aborts with an error if is_array on the argument returns false, else returns its argument.
asserting_bool
asserting_bool (class=typing #args=1) Aborts with an error if is_bool on the argument returns false, else returns its argument.
asserting_boolean
asserting_boolean (class=typing #args=1) Aborts with an error if is_boolean on the argument returns false, else returns its argument.
asserting_empty
asserting_empty (class=typing #args=1) Aborts with an error if is_empty on the argument returns false, else returns its argument.
asserting_empty_map
asserting_empty_map (class=typing #args=1) Aborts with an error if is_empty_map on the argument returns false, else returns its argument.
asserting_error
asserting_error (class=typing #args=1) Aborts with an error if is_error on the argument returns false, else returns its argument.
asserting_float
asserting_float (class=typing #args=1) Aborts with an error if is_float on the argument returns false, else returns its argument.
asserting_int
asserting_int (class=typing #args=1) Aborts with an error if is_int on the argument returns false, else returns its argument.
asserting_map
asserting_map (class=typing #args=1) Aborts with an error if is_map on the argument returns false, else returns its argument.
asserting_nonempty_map
asserting_nonempty_map (class=typing #args=1) Aborts with an error if is_nonempty_map on the argument returns false, else returns its argument.
asserting_not_array
asserting_not_array (class=typing #args=1) Aborts with an error if is_not_array on the argument returns false, else returns its argument.
asserting_not_empty
asserting_not_empty (class=typing #args=1) Aborts with an error if is_not_empty on the argument returns false, else returns its argument.
asserting_not_map
asserting_not_map (class=typing #args=1) Aborts with an error if is_not_map on the argument returns false, else returns its argument.
asserting_not_null
asserting_not_null (class=typing #args=1) Aborts with an error if is_not_null on the argument returns false, else returns its argument.
asserting_null
asserting_null (class=typing #args=1) Aborts with an error if is_null on the argument returns false, else returns its argument.
asserting_numeric
asserting_numeric (class=typing #args=1) Aborts with an error if is_numeric on the argument returns false, else returns its argument.
asserting_present
asserting_present (class=typing #args=1) Aborts with an error if is_present on the argument returns false, else returns its argument.
asserting_string
asserting_string (class=typing #args=1) Aborts with an error if is_string on the argument returns false, else returns its argument.
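The asserting_ functions all follow the same pattern: they return their argument when the corresponding is_ test passes, and abort the program otherwise. The sketch below uses asserting_int as a representative. The exact wording of the error message is not shown here, and the echo fallback is only an illustration of reacting to the non-zero exit status.

```shell
# Passes: 3 is inferred as int, so the value is returned and printed.
mlr -n put 'end { print asserting_int(3) }'

# Fails: "abc" is a string, so Miller aborts with an error and a non-zero exit status.
mlr -n put 'end { print asserting_int("abc") }' || echo "assertion failed"
```

This makes the asserting_ functions usable as inline guards: wrapping a field reference in one of them stops processing at the first record that violates the expectation.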
is_absent
is_absent (class=typing #args=1) False if field is present in input, true otherwise.
is_array
is_array (class=typing #args=1) True if argument is an array.
is_bool
is_bool (class=typing #args=1) True if field is present with boolean value. Synonymous with is_boolean.
is_boolean
is_boolean (class=typing #args=1) True if field is present with boolean value. Synonymous with is_bool.
is_empty
is_empty (class=typing #args=1) True if field is present in input with empty string value, false otherwise.
is_empty_map
is_empty_map (class=typing #args=1) True if argument is a map which is empty.
is_error
is_error (class=typing #args=1) True if argument is an error, such as taking string length of an integer.
is_float
is_float (class=typing #args=1) True if field is present with value inferred to be float.
is_int
is_int (class=typing #args=1) True if field is present with value inferred to be int.
is_map
is_map (class=typing #args=1) True if argument is a map.
is_nan
is_nan (class=typing #args=1) True if the argument is the NaN (not-a-number) floating-point value. Note that NaN has the property that NaN != NaN, so you need 'is_nan(x)' rather than 'x == NaN'.
is_nonempty_map
is_nonempty_map (class=typing #args=1) True if argument is a map which is non-empty.
is_not_array
is_not_array (class=typing #args=1) True if argument is not an array.
is_not_empty
is_not_empty (class=typing #args=1) True if field is present in input with non-empty value, false otherwise.
is_not_map
is_not_map (class=typing #args=1) True if argument is not a map.
is_not_null
is_not_null (class=typing #args=1) False if argument is null (empty, absent, or JSON null), true otherwise.
is_null
is_null (class=typing #args=1) True if argument is null (empty, absent, or JSON null), false otherwise.
is_numeric
is_numeric (class=typing #args=1) True if field is present with value inferred to be int or float.
is_present
is_present (class=typing #args=1) True if field is present in input, false otherwise.
is_string
is_string (class=typing #args=1) True if field is present with string (including empty-string) value.
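The difference between empty, absent, and present is easiest to see on a record with one empty field and one missing field. The following is a small sketch; the input line and the output field names (x_empty and so on) are made up for the illustration.

```shell
# x is present but empty, y is present with a value, z does not appear at all.
echo 'x=,y=3' | mlr put '
  $x_empty     = is_empty($x);
  $z_absent    = is_absent($z);
  $y_present   = is_present($y);
  $x_not_empty = is_not_empty($x);
'
# x=,y=3,x_empty=true,z_absent=true,y_present=true,x_not_empty=false
```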
typeof
typeof (class=typing #args=1) Returns the name of the argument's type as a string (e.g. "str"). Intended for debugging.