miller/docs/src/reference-dsl-builtin-functions.md at 24089151604fa1e65891c4678e0136741eaa3ead

mirror of https://github.com/johnkerl/miller.git synced 2026-01-23 02:14:13 +00:00

DSL functions and verbs for UTF-8 <-> Latin-1 (#997)

* latin1_to_utf8 and utf8_to_latin1 DSL functions

* doc-build artifacts for previous commit

* Test cases for latin1_to_utf8 and utf8_to_latin1

* extend on-line help

* latin1_to_utf8 and utf8_to_latin1 verbs

* unit-test cases for verbs

* Keep with kebab-case naming convention for verbs

* webdocs

2022-03-20 17:29:40 -04:00

# DSL built-in functions

These are functions in the Miller programming language that you can call when you use mlr put and mlr filter. For example, when you type

mlr --icsv --opprint --from example.csv put '
  $color = toupper($color);
  $shape = gsub($shape, "[aeiou]", "*");
'

color  shape    flag  k  index quantity rate
YELLOW tr**ngl* true  1  11    43.6498  9.8870
RED    sq**r*   true  2  15    79.2778  0.0130
RED    c*rcl*   true  3  16    13.8103  2.9010
RED    sq**r*   false 4  48    77.5542  7.4670
PURPLE tr**ngl* false 5  51    81.2290  8.5910
RED    sq**r*   false 6  64    77.1991  9.5310
PURPLE tr**ngl* false 7  65    80.1405  5.8240
YELLOW c*rcl*   true  8  73    63.9785  4.2370
YELLOW c*rcl*   true  9  87    63.5058  8.3350
PURPLE sq**r*   false 10 91    72.3735  8.2430

the toupper and gsub bits are functions.

## Overview

At the command line, you can use mlr -f and mlr -F for information much like what's on this page.

Each function takes a specific number of arguments, as shown below, except for functions marked as variadic such as min and max. (The latter compute min and max of any number of arguments.) There is no notion of optional or default-on-absent arguments. All argument-passing is positional rather than by name; arguments are passed by value, not by reference.

At the command line, you can get a list of all functions using mlr -f, with details using mlr -F. (Or, mlr help usage-functions-by-class to get details in the order shown on this page.) You can get detail for a given function using mlr help function namegoeshere, e.g. mlr help function gsub.

Operators are listed here along with functions. In this case, the argument-count is the number of items involved in the infix operator, e.g. we say x+y so the details for the + operator say that its number of arguments is 2. Unary operators such as ! and ~ show argument-count of 1; the ternary ? : operator shows an argument-count of 3.

## Functions by class

Arithmetic functions: bitcount, madd, mexp, mmul, msub, pow, %, &, *, **, +, -, .*, .+, .-, ./, /, //, <<, >>, >>>, ^, |, ~.
Boolean functions: !, !=, !=~, &&, <, <=, <=>, ==, =~, >, >=, ?:, ??, ???, ^^, ||.
Collections functions: append, arrayify, concat, depth, flatten, get_keys, get_values, haskey, json_parse, json_stringify, leafcount, length, mapdiff, mapexcept, mapselect, mapsum, unflatten.
Conversion functions: boolean, float, fmtifnum, fmtnum, hexfmt, int, joink, joinkv, joinv, splita, splitax, splitkv, splitkvx, splitnv, splitnvx, string.
Hashing functions: md5, sha1, sha256, sha512.
Higher-order-functions functions: any, apply, every, fold, reduce, select, sort.
Math functions: abs, acos, acosh, asin, asinh, atan, atan2, atanh, cbrt, ceil, cos, cosh, erf, erfc, exp, expm1, floor, invqnorm, log, log10, log1p, logifit, max, min, qnorm, round, roundm, sgn, sin, sinh, sqrt, tan, tanh, urand, urand32, urandelement, urandint, urandrange.
String functions: capitalize, clean_whitespace, collapse_whitespace, format, gssub, gsub, latin1_to_utf8, lstrip, regextract, regextract_or_else, rstrip, ssub, strip, strlen, sub, substr, substr0, substr1, tolower, toupper, truncate, unformat, unformatx, utf8_to_latin1, ..
System functions: hostname, os, system, version.
Time functions: dhms2fsec, dhms2sec, fsec2dhms, fsec2hms, gmt2localtime, gmt2sec, hms2fsec, hms2sec, localtime2gmt, localtime2sec, sec2dhms, sec2gmt, sec2gmtdate, sec2hms, sec2localdate, sec2localtime, strftime, strftime_local, strptime, strptime_local, systime, systimeint, uptime.
Typing functions: asserting_absent, asserting_array, asserting_bool, asserting_boolean, asserting_empty, asserting_empty_map, asserting_error, asserting_float, asserting_int, asserting_map, asserting_nonempty_map, asserting_not_array, asserting_not_empty, asserting_not_map, asserting_not_null, asserting_null, asserting_numeric, asserting_present, asserting_string, is_absent, is_array, is_bool, is_boolean, is_empty, is_empty_map, is_error, is_float, is_int, is_map, is_nan, is_nonempty_map, is_not_array, is_not_empty, is_not_map, is_not_null, is_null, is_numeric, is_present, is_string, typeof.

## Arithmetic functions

bitcount

bitcount  (class=arithmetic #args=1) Count of 1-bits.

madd

madd  (class=arithmetic #args=3) a + b mod m (integers)

mexp

mexp  (class=arithmetic #args=3) a ** b mod m (integers)

mmul

mmul  (class=arithmetic #args=3) a * b mod m (integers)

msub

msub  (class=arithmetic #args=3) a - b mod m (integers)
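
As a rough illustration of the four modular functions above, here is a minimal Python sketch. The Python helpers reuse the Miller names for readability but are only a model, not Miller's implementation; it assumes integer arguments and a positive modulus, and does not cover other inputs.

```python
# Illustrative Python model of madd, msub, mmul, and mexp (not Miller's code).
# Assumption: integer arguments and a positive modulus m.

def madd(a: int, b: int, m: int) -> int:
    return (a + b) % m

def msub(a: int, b: int, m: int) -> int:
    return (a - b) % m

def mmul(a: int, b: int, m: int) -> int:
    return (a * b) % m

def mexp(a: int, b: int, m: int) -> int:
    # Three-argument pow does modular exponentiation without
    # building the full power first.
    return pow(a, b, m)

print(madd(5, 3, 7))   # 1
print(msub(5, 6, 7))   # 6
print(mmul(3, 4, 7))   # 5
print(mexp(2, 10, 7))  # 2, since 1024 = 146 * 7 + 2
```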

pow

pow  (class=arithmetic #args=2) Exponentiation. Same as **, but as a function.

%

%  (class=arithmetic #args=2) Remainder; never negative-valued (pythonic).

&

&  (class=arithmetic #args=2) Bitwise AND.

*

*  (class=arithmetic #args=2) Multiplication, with integer*integer overflow to float.

**

**  (class=arithmetic #args=2) Exponentiation. Same as pow, but as an infix operator.

+

+  (class=arithmetic #args=1,2) Addition as binary operator; unary plus operator.

-

-  (class=arithmetic #args=1,2) Subtraction as binary operator; unary negation operator.

.*

.*  (class=arithmetic #args=2) Multiplication, with integer-to-integer overflow.

.+

.+  (class=arithmetic #args=2) Addition, with integer-to-integer overflow.

.-

.-  (class=arithmetic #args=2) Subtraction, with integer-to-integer overflow.

./

./  (class=arithmetic #args=2) Integer division, rounding toward zero.

/

/  (class=arithmetic #args=2) Division. Integer / integer is integer when exact, else floating-point: e.g. 6/3 is 2 but 6/4 is 1.5.

//

//  (class=arithmetic #args=2) Pythonic integer division, rounding toward negative infinity.
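
The remainder and division operators described above differ mainly in how they round. The following Python sketch models the documented behavior for integer operands only; the function names are hypothetical, and a positive divisor is assumed for the remainder.

```python
# Illustrative Python model of Miller's %, ./, // and / on integers.
# Function names are hypothetical; this is not Miller's implementation.

def remainder(a: int, b: int) -> int:
    # % : Python's own % is never negative when b is positive.
    return a % b

def int_divide_toward_zero(a: int, b: int) -> int:
    # ./ : integer division that truncates toward zero.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def floor_divide(a: int, b: int) -> int:
    # // : integer division that rounds toward negative infinity.
    return a // b

def divide(a: int, b: int):
    # / : stays an integer when the division is exact, else a float.
    return a // b if a % b == 0 else a / b

print(remainder(-7, 5))               # 3
print(int_divide_toward_zero(-7, 2))  # -3
print(floor_divide(-7, 2))            # -4
print(divide(6, 3), divide(6, 4))     # 2 1.5
```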

<<

<<  (class=arithmetic #args=2) Bitwise left-shift.

>>

>>  (class=arithmetic #args=2) Bitwise signed right-shift.

>>>

>>>  (class=arithmetic #args=2) Bitwise unsigned right-shift.

^

^  (class=arithmetic #args=2) Bitwise XOR.

|

|  (class=arithmetic #args=2) Bitwise OR.

~

~  (class=arithmetic #args=1) Bitwise NOT. Beware '$y=~$x' since =~ is the regex-match operator: try '$y = ~$x'.
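
The difference between the signed (>>) and unsigned (>>>) right shifts listed above only shows up for negative inputs. The sketch below models it in Python under the assumption that integers are 64-bit two's-complement values; the helper names are hypothetical.

```python
# Illustrative model of >> versus >>> (assumption: 64-bit signed integers).

MASK64 = (1 << 64) - 1

def to_signed64(x: int) -> int:
    # Reinterpret the low 64 bits of x as a signed two's-complement value.
    x &= MASK64
    return x - (1 << 64) if x >= (1 << 63) else x

def shift_right_signed(x: int, n: int) -> int:
    # >> : the sign bit is copied into the vacated high bits.
    return to_signed64(x) >> n

def shift_right_unsigned(x: int, n: int) -> int:
    # >>> : zeros are shifted into the vacated high bits.
    return (x & MASK64) >> n

print(shift_right_signed(-16, 2))    # -4
print(shift_right_unsigned(-16, 2))  # 4611686018427387900
print(shift_right_unsigned(16, 2))   # 4
```

For non-negative inputs the two shifts agree.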

## Boolean functions

!

!  (class=boolean #args=1) Logical negation.

!=

!=  (class=boolean #args=2) String/numeric inequality. Mixing number and string results in string compare.

!=~

!=~  (class=boolean #args=2) String (left-hand side) does not match regex (right-hand side), e.g. '$name !=~ "^a.*b$"'.

&&

&&  (class=boolean #args=2) Logical AND.

<

<  (class=boolean #args=2) String/numeric less-than. Mixing number and string results in string compare.

<=

<=  (class=boolean #args=2) String/numeric less-than-or-equals. Mixing number and string results in string compare.

<=>

<=>  (class=boolean #args=2) Comparator, nominally for sorting. Given a <=> b, returns <0, 0, >0 as a < b, a == b, or a > b, respectively.

==

==  (class=boolean #args=2) String/numeric equality. Mixing number and string results in string compare.

=~

=~  (class=boolean #args=2) String (left-hand side) matches regex (right-hand side), e.g. '$name =~ "^a.*b$"'. Capture groups \1 through \9 are matched from (...) in the right-hand side, and can be used within subsequent DSL statements. See also "Regular expressions" at https://miller.readthedocs.io.
Examples:
With if-statement: if ($url =~ "http.*com") { ... }
Without if-statement: given $line = "index ab09 file", and $line =~ "([a-z][a-z])([0-9][0-9])", then $label = "[\1:\2]", $label is "[ab:09]"
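
The capture-group example above can be mirrored in Python to see which pieces \1 and \2 pick up. This is only an analogy using Python's re module, whose regex dialect is not identical to Miller's.

```python
import re

line = "index ab09 file"

# Two capture groups: two lowercase letters, then two digits.
match = re.search(r"([a-z][a-z])([0-9][0-9])", line)

# group(1) and group(2) play the role of \1 and \2.
label = f"[{match.group(1)}:{match.group(2)}]" if match else None
print(label)  # [ab:09]
```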

>

>  (class=boolean #args=2) String/numeric greater-than. Mixing number and string results in string compare.

>=

>=  (class=boolean #args=2) String/numeric greater-than-or-equals. Mixing number and string results in string compare.

?:

?:  (class=boolean #args=3) Standard ternary operator.

??

??  (class=boolean #args=2) Absent-coalesce operator. $a ?? 1 evaluates to 1 if $a isn't defined in the current record.

???

???  (class=boolean #args=2) Absent/empty-coalesce operator. $a ??? 1 evaluates to 1 if $a isn't defined in the current record, or has empty value.
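
To make the distinction between ?? and ??? concrete, here is a small Python model that treats a record as a dict: a missing key stands for "absent" and the empty string stands for "empty". The helper names are hypothetical, and the model follows only the two descriptions above.

```python
# Illustrative model of the ?? and ??? operators on a record held as a dict.

_ABSENT = object()  # sentinel standing in for Miller's "absent" value

def absent_coalesce(record: dict, field: str, fallback):
    # Models `$field ?? fallback`: fallback only when the field is missing.
    return record.get(field, fallback)

def absent_empty_coalesce(record: dict, field: str, fallback):
    # Models `$field ??? fallback`: fallback when missing or empty.
    value = record.get(field, _ABSENT)
    return fallback if value is _ABSENT or value == "" else value

record = {"a": "", "b": 5}
print(absent_coalesce(record, "c", 1))        # 1  (c is absent)
print(repr(absent_coalesce(record, "a", 1)))  # '' (a is present but empty)
print(absent_empty_coalesce(record, "a", 1))  # 1  (a is empty)
print(absent_empty_coalesce(record, "b", 1))  # 5
```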

^^

^^  (class=boolean #args=2) Logical XOR.

||

||  (class=boolean #args=2) Logical OR.

## Collections functions

append

append  (class=collections #args=2) Appends second argument to end of first argument, which must be an array.

arrayify

arrayify  (class=collections #args=1) Walks through a nested map/array, converting any map with consecutive keys "1", "2", ... into an array. Useful to wrap the output of unflatten.

concat

concat  (class=collections #args=variadic) Returns the array concatenation of the arguments. Non-array arguments are treated as single-element arrays.
Examples:
concat(1,2,3) is [1,2,3]
concat([1,2],3) is [1,2,3]
concat([1,2],[3]) is [1,2,3]

depth

depth  (class=collections #args=1) Returns maximum depth of map/array. Scalars have depth 0.

flatten

flatten  (class=collections #args=2,3) Flattens multi-level maps to single-level ones. Useful for nested JSON-like structures for non-JSON file formats like CSV. With two arguments, the first argument is a map (maybe $*) and the second argument is the flatten separator. With three arguments, the first argument is prefix, the second is the flatten separator, and the third argument is a map; flatten($*, ".") is the same as flatten("", ".", $*). See "Flatten/unflatten: converting between JSON and tabular formats" at https://miller.readthedocs.io for more information.
Examples:
flatten({"a":[1,2],"b":3}, ".") is {"a.1": 1, "a.2": 2, "b": 3}.
flatten("a", ".", {"b": { "c": 4 }}) is {"a.b.c" : 4}.
flatten("", ".", {"a": { "b": 3 }}) is {"a.b" : 3}.
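
The flattening rule in the examples above (join nested keys with the separator, and number array elements from 1) can be sketched in Python as follows. This is a model of the documented examples only; how empty maps and arrays are flattened is not modeled.

```python
# Illustrative model of flatten(map, sep) and flatten(prefix, sep, map).

def flatten(*args):
    if len(args) == 2:
        value, sep = args
        prefix = ""
    elif len(args) == 3:
        prefix, sep, value = args
    else:
        raise TypeError("flatten takes 2 or 3 arguments")
    out = {}

    def walk(node, path):
        if isinstance(node, dict) and node:
            children = node.items()
        elif isinstance(node, list) and node:
            # Array positions are 1-up, as in the first example above.
            children = ((str(i), v) for i, v in enumerate(node, start=1))
        else:
            out[path] = node  # leaf value
            return
        for key, child in children:
            walk(child, f"{path}{sep}{key}" if path else str(key))

    walk(value, prefix)
    return out

print(flatten({"a": [1, 2], "b": 3}, "."))    # {'a.1': 1, 'a.2': 2, 'b': 3}
print(flatten("a", ".", {"b": {"c": 4}}))     # {'a.b.c': 4}
print(flatten("", ".", {"a": {"b": 3}}))      # {'a.b': 3}
```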

get_keys

get_keys  (class=collections #args=1) Returns array of keys of map or array.

get_values

get_values  (class=collections #args=1) Returns array of values of map or array -- in the latter case, returns a copy of the array.

haskey

haskey  (class=collections #args=2) True/false if map has/hasn't key, e.g. 'haskey($*, "a")' or 'haskey(mymap, mykey)', or true/false if array index is in bounds / out of bounds. Error if 1st argument is not a map or array. Note -n..-1 alias to 1..n in Miller arrays.
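
The array half of haskey depends on 1-up indexing with negative aliases. A minimal Python model of that bounds check, using Python lists and dicts as stand-ins for Miller arrays and maps, might look like this:

```python
# Illustrative model of haskey: key lookup for maps, bounds check for arrays.

def haskey(collection, key) -> bool:
    if isinstance(collection, dict):
        return key in collection
    if isinstance(collection, list):
        n = len(collection)
        # Valid indices are 1..n, with -n..-1 as aliases for them.
        return isinstance(key, int) and (1 <= key <= n or -n <= key <= -1)
    raise TypeError("first argument must be a map or array")

items = [10, 20, 30]
print(haskey(items, 3), haskey(items, -3))  # True True
print(haskey(items, 0), haskey(items, 4))   # False False
print(haskey({"a": 1}, "a"))                # True
```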

json_parse

json_parse  (class=collections #args=1) Converts value from JSON-formatted string.

json_stringify

json_stringify  (class=collections #args=1,2) Converts value to JSON-formatted string. Default output is single-line. With optional second boolean argument set to true, produces multiline output.

leafcount

leafcount  (class=collections #args=1) Counts total number of terminal values in map/array. For single-level map/array, same as length.

length

length  (class=collections #args=1) Counts number of top-level entries in array/map. Scalars have length 1.

mapdiff

mapdiff  (class=collections #args=variadic) With 0 args, returns empty map. With 1 arg, returns copy of arg. With 2 or more, returns copy of arg 1 with all keys from any of remaining argument maps removed.

mapexcept

mapexcept  (class=collections #args=variadic) Returns a map with keys from remaining arguments, if any, unset. Remaining arguments can be strings or arrays of string. E.g. 'mapexcept({1:2,3:4,5:6}, 1, 5, 7)' is '{3:4}' and 'mapexcept({1:2,3:4,5:6}, [1, 5, 7])' is '{3:4}'.

mapselect

mapselect  (class=collections #args=variadic) Returns a map with only keys from remaining arguments set. Remaining arguments can be strings or arrays of string. E.g. 'mapselect({1:2,3:4,5:6}, 1, 5, 7)' is '{1:2,5:6}' and 'mapselect({1:2,3:4,5:6}, [1, 5, 7])' is '{1:2,5:6}'.

mapsum

mapsum  (class=collections #args=variadic) With 0 args, returns empty map. With >= 1 arg, returns a map with key-value pairs from all arguments. Rightmost collisions win, e.g. 'mapsum({1:2,3:4},{1:5})' is '{1:5,3:4}'.
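
The four map functions above behave like simple dict operations. The following Python sketch reproduces the documented examples; it keeps integer keys as Python ints, which is a simplification for illustration.

```python
# Illustrative model of mapdiff, mapexcept, mapselect, and mapsum.

def _flat_keys(keys):
    # Accept keys given individually or as arrays of keys.
    out = []
    for key in keys:
        out.extend(key if isinstance(key, list) else [key])
    return out

def mapsum(*maps):
    out = {}
    for m in maps:
        out.update(m)  # rightmost value wins on key collisions
    return out

def mapdiff(*maps):
    if not maps:
        return {}
    out = dict(maps[0])
    for m in maps[1:]:
        for key in m:
            out.pop(key, None)
    return out

def mapexcept(m, *keys):
    drop = set(_flat_keys(keys))
    return {k: v for k, v in m.items() if k not in drop}

def mapselect(m, *keys):
    keep = set(_flat_keys(keys))
    return {k: v for k, v in m.items() if k in keep}

print(mapexcept({1: 2, 3: 4, 5: 6}, 1, 5, 7))    # {3: 4}
print(mapselect({1: 2, 3: 4, 5: 6}, [1, 5, 7]))  # {1: 2, 5: 6}
print(mapsum({1: 2, 3: 4}, {1: 5}))              # {1: 5, 3: 4}
print(mapdiff({1: 2, 3: 4}, {3: 0}))             # {1: 2}
```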

unflatten

unflatten  (class=collections #args=2) Reverses flatten. Useful for nested JSON-like structures for non-JSON file formats like CSV. The first argument is a map, and the second argument is the flatten separator. See also arrayify. See "Flatten/unflatten: converting between JSON and tabular formats" at https://miller.readthedocs.io for more information.
Example:
unflatten({"a.b.c" : 4}, ".") is {"a": {"b": { "c": 4 }}}.
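
As a sketch of how unflatten and arrayify fit together, the Python model below splits each key on the separator to rebuild nesting, then converts maps whose keys are "1", "2", ... into lists. It follows the descriptions above and is not Miller's implementation; conflicting keys (such as "a" alongside "a.b") are not handled.

```python
# Illustrative model of unflatten followed by arrayify.

def unflatten(flat: dict, sep: str) -> dict:
    out = {}
    for key, value in flat.items():
        parts = str(key).split(sep)
        node = out
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend, creating maps as needed
        node[parts[-1]] = value
    return out

def arrayify(node):
    if isinstance(node, dict):
        converted = {k: arrayify(v) for k, v in node.items()}
        keys = list(converted)
        # A map with consecutive keys "1", "2", ... becomes an array.
        if keys and keys == [str(i) for i in range(1, len(keys) + 1)]:
            return list(converted.values())
        return converted
    if isinstance(node, list):
        return [arrayify(v) for v in node]
    return node

print(unflatten({"a.b.c": 4}, "."))
# {'a': {'b': {'c': 4}}}
print(arrayify(unflatten({"a.1": 1, "a.2": 2, "b": 3}, ".")))
# {'a': [1, 2], 'b': 3}
```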

## Conversion functions

boolean

boolean  (class=conversion #args=1) Convert int/float/bool/string to boolean.

float

float  (class=conversion #args=1) Convert int/float/bool/string to float.

fmtifnum

fmtifnum  (class=conversion #args=2) Identical to fmtnum, except returns the first argument as-is if the output would be an error.
Examples:
fmtifnum(3.4, "%.6f") gives 3.400000
fmtifnum("abc", "%.6f") gives abc
$* = fmtifnum($*, "%.6f") formats numeric fields in the current record, leaving non-numeric ones alone

fmtnum

fmtnum  (class=conversion #args=2) Convert int/float/bool to string using printf-style format string (https://pkg.go.dev/fmt), e.g. '$s = fmtnum($n, "%08d")' or '$t = fmtnum($n, "%.6e")'. This function recurses on array and map values.
Example:
$x = fmtnum($x, "%.6f")

hexfmt

hexfmt  (class=conversion #args=1) Convert int to hex string, e.g. 255 to "0xff".

int

int  (class=conversion #args=1) Convert int/float/bool/string to int.

joink

joink  (class=conversion #args=2) Makes string from map/array keys. First argument is map/array; second is separator string.
Examples:
joink({"a":3,"b":4,"c":5}, ",") = "a,b,c".
joink([1,2,3], ",") = "1,2,3".

joinkv

joinkv  (class=conversion #args=3) Makes string from map/array key-value pairs. First argument is map/array; second is pair-separator string; third is field-separator string. Mnemonic: the "=" comes before the "," in the output and in the arguments to joinkv.
Examples:
joinkv([3,4,5], "=", ",") = "1=3,2=4,3=5"
joinkv({"a":3,"b":4,"c":5}, ":", ";") = "a:3;b:4;c:5"

joinv

joinv  (class=conversion #args=2) Makes string from map/array values. First argument is map/array; second is separator string.
Examples:
joinv([3,4,5], ",") = "3,4,5"
joinv({"a":3,"b":4,"c":5}, ",") = "3,4,5"

splita

splita  (class=conversion #args=2) Splits string into array with type inference. First argument is string to split; second is the separator to split on.
Example:
splita("3,4,5", ",") = [3,4,5]

splitax

splitax  (class=conversion #args=2) Splits string into array without type inference. First argument is string to split; second is the separator to split on.
Example:
splitax("3,4,5", ",") = ["3","4","5"]

splitkv

splitkv  (class=conversion #args=3) Splits string by separators into map with type inference. First argument is string to split; second argument is pair separator; third argument is field separator.
Example:
splitkv("a=3,b=4,c=5", "=", ",") = {"a":3,"b":4,"c":5}

splitkvx

splitkvx  (class=conversion #args=3) Splits string by separators into map without type inference (keys and values are strings). First argument is string to split; second argument is pair separator; third argument is field separator.
Example:
splitkvx("a=3,b=4,c=5", "=", ",") = {"a":"3","b":"4","c":"5"}
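
The argument order for the pair and field separators is easy to mix up, so here is a Python sketch of joinkv, splitkv, and splitkvx as described above. The type inference is deliberately simplified (decimal ints and floats only), and fields that lack the pair separator are rejected rather than modeled.

```python
# Illustrative model of joinkv, splitkvx, and splitkv.

def infer(text: str):
    # Simplified type inference: try int, then float, else keep the string.
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def splitkvx(text: str, pair_sep: str, field_sep: str) -> dict:
    out = {}
    for field in text.split(field_sep):
        key, found, value = field.partition(pair_sep)
        if not found:
            raise ValueError(f"no pair separator in {field!r} (not modeled)")
        out[key] = value
    return out

def splitkv(text: str, pair_sep: str, field_sep: str) -> dict:
    return {k: infer(v) for k, v in splitkvx(text, pair_sep, field_sep).items()}

def joinkv(collection, pair_sep: str, field_sep: str) -> str:
    # Arrays are treated as maps keyed 1, 2, 3, ...
    items = collection.items() if isinstance(collection, dict) \
        else enumerate(collection, start=1)
    return field_sep.join(f"{k}{pair_sep}{v}" for k, v in items)

print(splitkv("a=3,b=4,c=5", "=", ","))   # {'a': 3, 'b': 4, 'c': 5}
print(splitkvx("a=3,b=4,c=5", "=", ","))  # {'a': '3', 'b': '4', 'c': '5'}
print(joinkv([3, 4, 5], "=", ","))        # 1=3,2=4,3=5
```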

splitnv

splitnv  (class=conversion #args=2) Splits string by separator into integer-indexed map with type inference. First argument is string to split; second argument is separator to split on.
Example:
splitnv("a,b,c", ",") = {"1":"a","2":"b","3":"c"}

splitnvx

splitnvx  (class=conversion #args=2) Splits string by separator into integer-indexed map without type inference (values are strings). First argument is string to split; second argument is separator to split on.
Example:
splitnvx("3,4,5", ",") = {"1":"3","2":"4","3":"5"}

string

string  (class=conversion #args=1) Convert int/float/bool/string/array/map to string.

## Hashing functions

md5

md5  (class=hashing #args=1) MD5 hash.

sha1

sha1  (class=hashing #args=1) SHA1 hash.

sha256

sha256  (class=hashing #args=1) SHA256 hash.

sha512

sha512  (class=hashing #args=1) SHA512 hash.

## Higher-order-functions functions

any

any  (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, yields a boolean true if the argument function returns true for any array/map element, false otherwise. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean.
Examples:
Array example: any([10,20,30], func(e) {return $index == e})
Map example: any({"a": "foo", "b": "bar"}, func(k,v) {return $[k] == v})

apply

apply  (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, applies the function to each element of the array/map. For arrays, the function should take one argument, for array element; it should return a new element. For maps, it should take two arguments, for map-element key and value; it should return a new key-value pair (i.e. a single-entry map).
Examples:
Array example: apply([1,2,3,4,5], func(e) {return e ** 3}) returns [1, 8, 27, 64, 125].
Map example: apply({"a":1, "b":3, "c":5}, func(k,v) {return {toupper(k): v ** 2}}) returns {"A": 1, "B":9, "C": 25}.

every

every  (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, yields a boolean true if the argument function returns true for every array/map element, false otherwise. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean.
Examples:
Array example: every(["a", "b", "c"], func(e) {return $[e] >= 0})
Map example: every({"a": "foo", "b": "bar"}, func(k,v) {return $[k] == v})

fold

fold  (class=higher-order-functions #args=3) Given a map or array as first argument and a function as second argument, accumulates entries into a final output -- for example, sum or product. For arrays, the function should take two arguments, for accumulated value and array element. For maps, it should take four arguments, for accumulated key and value, and map-element key and value; it should return the updated accumulator as a new key-value pair (i.e. a single-entry map). The start value for the accumulator is taken from the third argument.
Examples:
Array example: fold([1,2,3,4,5], func(acc,e) {return acc + e**3}, 10000) returns 10225.
Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum": accv+ev**2}}, {"sum":10000}) returns {"sum": 10035}.

reduce

reduce  (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, accumulates entries into a final output -- for example, sum or product. For arrays, the function should take two arguments, for accumulated value and array element, and return the accumulated element. For maps, it should take four arguments, for accumulated key and value, and map-element key and value; it should return the updated accumulator as a new key-value pair (i.e. a single-entry map). The start value for the accumulator is the first element for arrays, or the first element's key-value pair for maps.
Examples:
Array example: reduce([1,2,3,4,5], func(acc,e) {return acc + e**3}) returns 225.
Map example: reduce({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum_of_squares": accv + ev**2}}) returns {"sum_of_squares": 35}.
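
The only difference between fold and reduce described above is where the accumulator starts. The Python sketch below models both for arrays and maps, following the rule that map callbacks return a single-entry map; in this model the map forms return that single-entry map. Empty inputs to reduce are not handled.

```python
# Illustrative model of fold and reduce for arrays (lists) and maps (dicts).
from functools import reduce as _reduce

def _accumulate_map(pairs, f, acck, accv):
    for ek, ev in pairs:
        # f returns a single-entry map holding the updated accumulator.
        (acck, accv), = f(acck, accv, ek, ev).items()
    return {acck: accv}

def fold(collection, f, start):
    if isinstance(collection, dict):
        (acck, accv), = start.items()
        return _accumulate_map(collection.items(), f, acck, accv)
    return _reduce(f, collection, start)

def reduce(collection, f):
    if isinstance(collection, dict):
        pairs = iter(collection.items())
        acck, accv = next(pairs)  # first key-value pair seeds the accumulator
        return _accumulate_map(pairs, f, acck, accv)
    return _reduce(f, collection)  # first element seeds the accumulator

nums = [1, 2, 3, 4, 5]
squares = {"a": 1, "b": 3, "c": 5}
print(fold(nums, lambda acc, e: acc + e**3, 10000))  # 10225
print(reduce(nums, lambda acc, e: acc + e**3))       # 225
print(fold(squares, lambda ak, av, ek, ev: {"sum": av + ev**2}, {"sum": 10000}))
# {'sum': 10035}
print(reduce(squares, lambda ak, av, ek, ev: {"sum_of_squares": av + ev**2}))
# {'sum_of_squares': 35}
```

Note that reduce does not cube the first array element (it is the seed), which is why the array result is 225 rather than 10225 minus 10000 by coincidence only when the first element is 1.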

select

select  (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, includes each input element in the output if the function returns true. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean.
Examples:
Array example: select([1,2,3,4,5], func(e) {return e >= 3}) returns [3, 4, 5].
Map example: select({"a":1, "b":3, "c":5}, func(k,v) {return v >= 3}) returns {"b":3, "c": 5}.
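
The any, every, apply, and select functions above share one calling convention: the callback gets one argument for arrays and two (key, value) for maps. A minimal Python model of that convention, with any renamed to any_ to avoid shadowing the Python built-in, could look like this:

```python
# Illustrative model of any, every, apply, and select.

def any_(collection, f) -> bool:
    if isinstance(collection, dict):
        return any(f(k, v) for k, v in collection.items())
    return any(f(e) for e in collection)

def every(collection, f) -> bool:
    if isinstance(collection, dict):
        return all(f(k, v) for k, v in collection.items())
    return all(f(e) for e in collection)

def apply(collection, f):
    if isinstance(collection, dict):
        out = {}
        for k, v in collection.items():
            out.update(f(k, v))  # f returns a single-entry map
        return out
    return [f(e) for e in collection]

def select(collection, f):
    if isinstance(collection, dict):
        return {k: v for k, v in collection.items() if f(k, v)}
    return [e for e in collection if f(e)]

print(apply([1, 2, 3, 4, 5], lambda e: e ** 3))
# [1, 8, 27, 64, 125]
print(apply({"a": 1, "b": 3, "c": 5}, lambda k, v: {k.upper(): v ** 2}))
# {'A': 1, 'B': 9, 'C': 25}
print(select({"a": 1, "b": 3, "c": 5}, lambda k, v: v >= 3))
# {'b': 3, 'c': 5}
print(any_([10, 20, 30], lambda e: e == 20), every([10, 20, 30], lambda e: e > 15))
# True False
```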

sort

sort  (class=higher-order-functions #args=1-2) Given a map or array as first argument and string flags or function as optional second argument, returns a sorted copy of the input. With one argument, sorts array elements with numbers first numerically and then strings lexically, and map elements likewise by map keys. If the second argument is a string, it can contain any of "f" for lexical ("n" is for the above default), "c" for case-folded lexical, or "t" for natural sort order. An additional "r" in that string is for reverse. If the second argument is a function, then for arrays it should take two arguments a and b, returning < 0, 0, or > 0 as a < b, a == b, or a > b respectively; for maps the function should take four arguments ak, av, bk, and bv, again returning < 0, 0, or > 0, using a and b's keys and values.
Examples:
Default sorting: sort([3,"A",1,"B",22]) returns [1, 3, 22, "A", "B"].
  Note that this is numbers before strings.
Default sorting: sort(["E","a","c","B","d"]) returns ["B", "E", "a", "c", "d"].
  Note that this is uppercase before lowercase.
Case-folded ascending: sort(["E","a","c","B","d"], "c") returns ["a", "B", "c", "d", "E"].
Case-folded descending: sort(["E","a","c","B","d"], "cr") returns ["E", "d", "c", "B", "a"].
Natural sorting: sort(["a1","a10","a100","a2","a20","a200"], "t") returns ["a1", "a2", "a10", "a20", "a100", "a200"].
Array with function: sort([5,2,3,1,4], func(a,b) {return b <=> a}) returns [5,4,3,2,1].
Map with function: sort({"c":2,"a":3,"b":1}, func(ak,av,bk,bv) {return bv <=> av}) returns {"a":3,"c":2,"b":1}.
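
The orderings in the sort examples above can be reproduced with Python's sorted and suitable keys. This is a sketch that matches the listed examples; the key functions and the spaceship helper are hypothetical stand-ins, and Miller's own natural-sort rules may differ on inputs not shown here.

```python
# Illustrative model of the sort orderings shown in the examples above.
import re
from functools import cmp_to_key

def default_key(x):
    # Numbers sort before strings; each group uses its own ordering.
    return (isinstance(x, str), x)

def natural_key(s: str):
    # re.split with a capture group alternates text and digit runs;
    # digit runs (odd positions) are compared as integers.
    return [int(t) if i % 2 else t for i, t in enumerate(re.split(r"(\d+)", s))]

def spaceship(a, b) -> int:
    # Stand-in for the <=> comparator: negative, zero, or positive.
    return (a > b) - (a < b)

letters = ["E", "a", "c", "B", "d"]
print(sorted([3, "A", 1, "B", 22], key=default_key))  # [1, 3, 22, 'A', 'B']
print(sorted(letters))                                # ['B', 'E', 'a', 'c', 'd']
print(sorted(letters, key=str.lower))                 # ['a', 'B', 'c', 'd', 'E']
print(sorted(letters, key=str.lower, reverse=True))   # ['E', 'd', 'c', 'B', 'a']
print(sorted(["a1", "a10", "a100", "a2", "a20", "a200"], key=natural_key))
# ['a1', 'a2', 'a10', 'a20', 'a100', 'a200']
descending = sorted([5, 2, 3, 1, 4], key=cmp_to_key(lambda a, b: spaceship(b, a)))
print(descending)                                     # [5, 4, 3, 2, 1]
by_value_desc = dict(sorted({"c": 2, "a": 3, "b": 1}.items(),
                            key=cmp_to_key(lambda x, y: spaceship(y[1], x[1]))))
print(by_value_desc)                                  # {'a': 3, 'c': 2, 'b': 1}
```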

## Math functions

abs

abs  (class=math #args=1) Absolute value.

acos

acos  (class=math #args=1) Inverse trigonometric cosine.

acosh

acosh  (class=math #args=1) Inverse hyperbolic cosine.

asin

asin  (class=math #args=1) Inverse trigonometric sine.

asinh

asinh  (class=math #args=1) Inverse hyperbolic sine.

atan

atan  (class=math #args=1) One-argument arctangent.

atan2

atan2  (class=math #args=2) Two-argument arctangent.

atanh

atanh  (class=math #args=1) Inverse hyperbolic tangent.

cbrt

cbrt  (class=math #args=1) Cube root.

ceil

ceil  (class=math #args=1) Ceiling: nearest integer at or above.

cos

cos  (class=math #args=1) Trigonometric cosine.

cosh

cosh  (class=math #args=1) Hyperbolic cosine.

erf

erf  (class=math #args=1) Error function.

erfc

erfc  (class=math #args=1) Complementary error function.

exp

exp  (class=math #args=1) Exponential function e**x.

expm1

expm1  (class=math #args=1) e**x - 1.

floor

floor  (class=math #args=1) Floor: nearest integer at or below.

invqnorm

invqnorm  (class=math #args=1) Inverse of normal cumulative distribution function. Note that invqnorm(urand()) is normally distributed.

log

log  (class=math #args=1) Natural (base-e) logarithm.

log10

log10  (class=math #args=1) Base-10 logarithm.

log1p

log1p  (class=math #args=1) log(1+x).

logifit

logifit  (class=math #args=3) Given m and b from logistic regression, compute fit: $yhat=logifit($x,$m,$b).

max

max  (class=math #args=variadic) Max of n numbers; null loses.

min

min  (class=math #args=variadic) Min of n numbers; null loses.

qnorm

qnorm  (class=math #args=1) Normal cumulative distribution function.
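
The remark that the inverse normal CDF applied to a uniform random number is normally distributed is the inverse-transform sampling technique. The sketch below checks it numerically in Python, assuming qnorm and invqnorm refer to the standard normal distribution (mean 0, standard deviation 1); Python's statistics.NormalDist stands in for Miller's functions.

```python
# Illustrative check of inverse-transform sampling with the normal CDF.
import random
from statistics import NormalDist, fmean, pstdev

_STD_NORMAL = NormalDist()  # assumption: mean 0, standard deviation 1

def qnorm(x: float) -> float:
    return _STD_NORMAL.cdf(x)

def invqnorm(p: float) -> float:
    # Python raises StatisticsError for p outside the open interval (0, 1).
    return _STD_NORMAL.inv_cdf(p)

rng = random.Random(0)
samples = [invqnorm(rng.random()) for _ in range(10_000)]

print(qnorm(0.0))                    # 0.5
print(round(invqnorm(qnorm(1.5)), 6))  # 1.5
print(fmean(samples), pstdev(samples))  # close to 0 and 1 respectively
```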

round

round  (class=math #args=1) Round to nearest integer.

roundm

roundm  (class=math #args=2) Round to nearest multiple of m: roundm($x,$m) is the same as round($x/$m)*$m.

sgn

sgn  (class=math #args=1) +1, 0, -1 for positive, zero, negative input respectively.

sin

sin  (class=math #args=1) Trigonometric sine.

sinh

sinh  (class=math #args=1) Hyperbolic sine.

sqrt

sqrt  (class=math #args=1) Square root.

tan

tan  (class=math #args=1) Trigonometric tangent.

tanh

tanh  (class=math #args=1) Hyperbolic tangent.

urand

urand  (class=math #args=0) Floating-point numbers uniformly distributed on the unit interval.
Example:
Int-valued example: '$n=floor(20+urand()*11)'.

urand32

urand32  (class=math #args=0) Integer uniformly distributed between 0 and 2**32-1 inclusive.

urandelement

urandelement  (class=math #args=1) Random sample from the first argument, which must be a non-empty array.

urandint

urandint  (class=math #args=2) Integer uniformly distributed between inclusive integer endpoints.

urandrange

urandrange  (class=math #args=2) Floating-point numbers uniformly distributed on the interval [a, b).

## String functions

capitalize

capitalize  (class=string #args=1) Convert string's first character to uppercase.

clean_whitespace

clean_whitespace  (class=string #args=1) Same as collapse_whitespace and strip.

collapse_whitespace

collapse_whitespace  (class=string #args=1) Strip repeated whitespace from string.

format

format  (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded.
Examples:
format("{}:{}:{}", 1,2)     gives "1:2:".
format("{}:{}:{}", 1,2,3)   gives "1:2:3".
format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
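
The padding and truncation rule for format can be sketched in a few lines of Python. The helper name format_values is hypothetical (chosen to avoid shadowing Python's built-in format), and only the plain "{}" placeholder described above is modeled.

```python
# Illustrative model of format: fill each "{}" in order.

def format_values(fmt: str, *args) -> str:
    pieces = fmt.split("{}")
    out = [pieces[0]]
    for i, piece in enumerate(pieces[1:]):
        # Missing arguments become the empty string; extras are ignored.
        out.append(str(args[i]) if i < len(args) else "")
        out.append(piece)
    return "".join(out)

print(format_values("{}:{}:{}", 1, 2))        # 1:2:
print(format_values("{}:{}:{}", 1, 2, 3))     # 1:2:3
print(format_values("{}:{}:{}", 1, 2, 3, 4))  # 1:2:3
```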

gssub

gssub  (class=string #args=3) Like gsub but does no regexing. No characters are special.
Example:
gssub("ab.d.fg", ".", "X") gives "abXdXfg"

gsub

gsub  (class=string #args=3) '$name = gsub($name, "old", "new")': replace all, with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to gsub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io.
Examples:
gsub("ababab", "ab", "XY") gives "XYXYXY"
gsub("abc.def", ".", "X") gives "XXXXXXX"
gsub("abc.def", "\.", "X") gives "abcXdef"
gsub("abcdefg", "[ce]", "X") gives "abXdXfg"
gsub("prefix4529:suffix8567", "(....ix)([0-9]+)", "[\1 : \2]") gives "[prefix : 4529]:[suffix : 8567]"
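
The contrast between gsub (regex) and gssub (literal) in the examples above corresponds to re.sub versus str.replace in Python. This is an analogy only: Python's regex dialect is not identical to Miller's, so treat it as a sketch of the documented examples.

```python
# Illustrative model of gsub (regex replace-all) and gssub (literal replace-all).
import re

def gsub(s: str, pattern: str, replacement: str) -> str:
    # \1 .. \9 in the replacement refer to capture groups in the pattern.
    return re.sub(pattern, replacement, s)

def gssub(s: str, old: str, new: str) -> str:
    # No characters are special: "." matches only a literal dot.
    return s.replace(old, new)

print(gsub("abc.def", ".", "X"))     # XXXXXXX  (regex "." matches any character)
print(gsub("abc.def", r"\.", "X"))   # abcXdef
print(gssub("ab.d.fg", ".", "X"))    # abXdXfg
print(gsub("prefix4529:suffix8567", r"(....ix)([0-9]+)", r"[\1 : \2]"))
# [prefix : 4529]:[suffix : 8567]
```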

latin1_to_utf8

latin1_to_utf8  (class=string #args=1) Tries to convert Latin-1-encoded string to UTF-8-encoded string. If argument is array or map, recurses into it.
Examples:
$y = latin1_to_utf8($x)
$* = latin1_to_utf8($*)

lstrip

lstrip  (class=string #args=1) Strip leading whitespace from string.

regextract

regextract  (class=string #args=2) Extracts a substring (the first, if there are multiple matches), matching a regular expression, from the input. Does not use capture groups; see also the =~ operator which does.
Examples:
regextract("index ab09 file", "[a-z][a-z][0-9][0-9]") gives "ab09"
regextract("index a999 file", "[a-z][a-z][0-9][0-9]") gives (absent), which will result in an assignment not happening.

regextract_or_else

regextract_or_else  (class=string #args=3) Like regextract but the third argument is the return value in case the input string (first argument) doesn't match the pattern (second argument).
Examples:
regextract_or_else("index ab09 file", "[a-z][a-z][0-9][0-9]", "nonesuch") gives "ab09"
regextract_or_else("index a999 file", "[a-z][a-z][0-9][0-9]", "nonesuch") gives "nonesuch"

rstrip

rstrip  (class=string #args=1) Strip trailing whitespace from string.

ssub

ssub  (class=string #args=3) Like sub but does no regexing. No characters are special.
Example:
ssub("abc.def", ".", "X") gives "abcXdef"

strip

strip  (class=string #args=1) Strip leading and trailing whitespace from string.

strlen

strlen  (class=string #args=1) String length.

sub

sub  (class=string #args=3) '$name = sub($name, "old", "new")': replace once (first match, if there are multiple matches), with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to sub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io.
Examples:
sub("ababab", "ab", "XY") gives "XYabab"
sub("abc.def", ".", "X") gives "Xbc.def"
sub("abc.def", "\.", "X") gives "abcXdef"
sub("abcdefg", "[ce]", "X") gives "abXdefg"
sub("prefix4529:suffix8567", "suffix([0-9]+)", "name\1") gives "prefix4529:name8567"

substr

substr  (class=string #args=3) substr is an alias for substr0. See also substr1. Miller is generally 1-up for all array and string indices, but substr is 0-up for backward compatibility with Miller 5 and below: arrays are new in Miller 6, while the substr function is older.

substr0

substr0  (class=string #args=3) substr0(s,m,n) gives substring of s from 0-up position m to n inclusive. Negative indices -len .. -1 alias to 0 .. len-1. See also substr and substr1.

substr1

substr1  (class=string #args=3) substr1(s,m,n) gives substring of s from 1-up position m to n inclusive. Negative indices -len .. -1 alias to 1 .. len. See also substr and substr0.
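
The two indexing conventions are easiest to compare side by side. The Python sketch below models substr0 and substr1 with inclusive end positions and the negative-index aliasing described above; out-of-range indices are not modeled.

```python
# Illustrative model of substr0 (0-up) and substr1 (1-up), both inclusive.

def substr0(s: str, m: int, n: int) -> str:
    length = len(s)
    if m < 0:
        m += length  # -len .. -1 alias to 0 .. len-1
    if n < 0:
        n += length
    return s[m:n + 1]

def substr1(s: str, m: int, n: int) -> str:
    length = len(s)
    if m < 0:
        m += length + 1  # -len .. -1 alias to 1 .. len
    if n < 0:
        n += length + 1
    return s[m - 1:n]

print(substr0("hello", 1, 3))    # ell
print(substr1("hello", 1, 3))    # hel
print(substr0("hello", -4, -2))  # ell
print(substr1("hello", -4, -2))  # ell
```

With negative indices both functions select the same characters, since the aliases count back from the end of the string in either convention.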

tolower

tolower  (class=string #args=1) Convert string to lowercase.

toupper

toupper  (class=string #args=1) Convert string to uppercase.

truncate

truncate  (class=string #args=2) Truncates string first argument to max length of int second argument.

unformat

unformat  (class=string #args=2) Using first argument as format string, unpacks second argument into an array of matches, with type-inference. On non-match, returns error -- use is_error() to check.
Examples:
unformat("{}:{}:{}",  "1:2:3") gives [1, 2, 3].
unformat("{}h{}m{}s", "3h47m22s") gives [3, 47, 22].
is_error(unformat("{}h{}m{}s", "3:47:22")) gives true.

unformatx

unformatx  (class=string #args=2) Same as unformat, but without type-inference.
Examples:
unformatx("{}:{}:{}",  "1:2:3") gives ["1", "2", "3"].
unformatx("{}h{}m{}s", "3h47m22s") gives ["3", "47", "22"].
is_error(unformatx("{}h{}m{}s", "3:47:22")) gives true.
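
One way to think about unformat and unformatx is as the inverse of filling "{}" placeholders: turn the format string into a pattern and read the pieces back. The Python sketch below does this with a regular expression. It reproduces the examples above, but the matching strategy and the simplified type inference are assumptions, and None stands in for Miller's error value.

```python
# Illustrative model of unformatx and unformat using a regex built from
# the format string (assumed strategy, not Miller's implementation).
import re

def _infer(text: str):
    # Simplified type inference: try int, then float, else keep the string.
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def unformatx(fmt: str, text: str):
    # Each "{}" becomes a capture group; literal parts are escaped.
    pattern = "(.*?)".join(re.escape(part) for part in fmt.split("{}"))
    match = re.fullmatch(pattern, text)
    return list(match.groups()) if match else None  # None models the error

def unformat(fmt: str, text: str):
    fields = unformatx(fmt, text)
    return None if fields is None else [_infer(f) for f in fields]

print(unformat("{}:{}:{}", "1:2:3"))       # [1, 2, 3]
print(unformatx("{}h{}m{}s", "3h47m22s"))  # ['3', '47', '22']
print(unformat("{}h{}m{}s", "3:47:22"))    # None
```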

utf8_to_latin1

utf8_to_latin1  (class=string #args=1) Tries to convert UTF-8-encoded string to Latin-1-encoded string. If argument is array or map, recurses into it.
Examples:
$y = utf8_to_latin1($x)
$* = utf8_to_latin1($*)

.

.  (class=string #args=2) String concatenation. Non-strings are coerced, so you can do '"ax".98' etc.

## System functions

hostname

hostname  (class=system #args=0) Returns the hostname as a string.

os

os  (class=system #args=0) Returns the operating-system name as a string.

system

system  (class=system #args=1) Run command string, yielding its stdout minus final carriage return.

version

version  (class=system #args=0) Returns the Miller version as a string.

## Time functions

dhms2fsec

dhms2fsec  (class=time #args=1) Recovers floating-point seconds as in dhms2fsec("5d18h53m20.250000s") = 500000.250000

dhms2sec

dhms2sec  (class=time #args=1) Recovers integer seconds as in dhms2sec("5d18h53m20s") = 500000

fsec2dhms

fsec2dhms  (class=time #args=1) Formats floating-point seconds as in fsec2dhms(500000.25) = "5d18h53m20.250000s"

fsec2hms

fsec2hms  (class=time #args=1) Formats floating-point seconds as in fsec2hms(5000.25) = "01:23:20.250000"

gmt2localtime

gmt2localtime  (class=time #args=1,2) Convert from a GMT-time string to a local-time string. Consults $TZ unless second argument is supplied.
Examples:
gmt2localtime("1999-12-31T22:00:00Z") = "2000-01-01 00:00:00" with TZ="Asia/Istanbul"
gmt2localtime("1999-12-31T22:00:00Z", "Asia/Istanbul") = "2000-01-01 00:00:00"

gmt2sec

gmt2sec  (class=time #args=1) Parses GMT timestamp as integer seconds since the epoch.
Example:
gmt2sec("2001-02-03T04:05:06Z") = 981173106

hms2fsec

hms2fsec  (class=time #args=1) Recovers floating-point seconds as in hms2fsec("01:23:20.250000") = 5000.250000

hms2sec

hms2sec  (class=time #args=1) Recovers integer seconds as in hms2sec("01:23:20") = 5000

localtime2gmt

localtime2gmt  (class=time #args=1,2) Convert from a local-time string to a GMT-time string. Consults $TZ unless second argument is supplied.
Examples:
localtime2gmt("2000-01-01 00:00:00") = "1999-12-31T22:00:00Z" with TZ="Asia/Istanbul"
localtime2gmt("2000-01-01 00:00:00", "Asia/Istanbul") = "1999-12-31T22:00:00Z"

localtime2sec

localtime2sec  (class=time #args=1,2) Parses local timestamp as integer seconds since the epoch. Consults $TZ environment variable, unless second argument is supplied.
Examples:
localtime2sec("2001-02-03 04:05:06") = 981165906 with TZ="Asia/Istanbul"
localtime2sec("2001-02-03 04:05:06", "Asia/Istanbul") = 981165906
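
The local result differs from the GMT reading of the same wall-clock string (981173106, per the gmt2sec example) by exactly 7200 seconds. The Python sketch below reproduces that under a stated assumption: a fixed UTC+2 offset stands in for Asia/Istanbul on this date, as implied by the two documented values. The helper name is hypothetical, and a real time-zone database lookup is deliberately not used.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+2 offset in place of a tz-database lookup for Asia/Istanbul.
ISTANBUL_FEB_2001 = timezone(timedelta(hours=2))


def localtime2sec_sketch(stamp: str, tz: timezone) -> int:
    """Interpret 'YYYY-MM-DD HH:MM:SS' in the given zone; return epoch seconds."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=tz).timestamp())


local = localtime2sec_sketch("2001-02-03 04:05:06", ISTANBUL_FEB_2001)
as_utc = localtime2sec_sketch("2001-02-03 04:05:06", timezone.utc)
print(local)           # 981165906
print(as_utc - local)  # 7200
```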

sec2dhms

sec2dhms  (class=time #args=1) Formats integer seconds as in sec2dhms(500000) = "5d18h53m20s"
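
The d/h/m/s decomposition behind sec2dhms and dhms2sec can be sketched in Python as follows. Both helper names are hypothetical. How Miller pads small values or drops leading zero units is not described above, so this sketch makes no attempt to match that; it only covers non-negative inputs with all four units present.

```python
import re


def sec2dhms_sketch(total: int) -> str:
    """Split non-negative integer seconds into days, hours, minutes, seconds."""
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}d{hours}h{minutes}m{seconds}s"


def dhms2sec_sketch(text: str) -> int:
    """Inverse of the above for strings with all four units present."""
    match = re.fullmatch(r"(\d+)d(\d+)h(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unsupported format: {text!r}")
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


print(sec2dhms_sketch(500000))         # 5d18h53m20s
print(dhms2sec_sketch("5d18h53m20s"))  # 500000
```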

sec2gmt

sec2gmt  (class=time #args=1,2) Formats seconds since epoch as GMT timestamp. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
sec2gmt(1234567890)           = "2009-02-13T23:31:30Z"
sec2gmt(1234567890.123456)    = "2009-02-13T23:31:30Z"
sec2gmt(1234567890.123456, 6) = "2009-02-13T23:31:30.123456Z"

sec2gmtdate

sec2gmtdate  (class=time #args=1) Formats seconds since epoch (integer part) as GMT timestamp with year-month-date. Leaves non-numbers as-is.
Example:
sec2gmtdate(1440768801.7) = "2015-08-28".

sec2hms

sec2hms  (class=time #args=1) Formats integer seconds as in sec2hms(5000) = "01:23:20"

sec2localdate

sec2localdate  (class=time #args=1,2) Formats seconds since epoch (integer part) as local timestamp with year-month-date. Leaves non-numbers as-is. Consults $TZ environment variable unless second argument is supplied.
Examples:
sec2localdate(1440768801.7) = "2015-08-28" with TZ="Asia/Istanbul"
sec2localdate(1440768801.7, "Asia/Istanbul") = "2015-08-28"

sec2localtime

sec2localtime  (class=time #args=1,2,3) Formats seconds since epoch (integer part) as local timestamp. Consults $TZ environment variable unless third argument is supplied. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part.
Examples:
sec2localtime(1234567890)           = "2009-02-14 01:31:30"        with TZ="Asia/Istanbul"
sec2localtime(1234567890.123456)    = "2009-02-14 01:31:30"        with TZ="Asia/Istanbul"
sec2localtime(1234567890.123456, 6) = "2009-02-14 01:31:30.123456" with TZ="Asia/Istanbul"
sec2localtime(1234567890.123456, 6, "Asia/Istanbul") = "2009-02-14 01:31:30.123456"

strftime

strftime  (class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ")  = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
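
Standard strftime implementations have no equivalent of "%3S", so reproducing the second example elsewhere means handling the fractional part by hand. The Python sketch below is one way to do that. The function name is hypothetical, and rounding to the nearest millisecond is an assumption; the text above does not say whether Miller rounds or truncates.

```python
from datetime import datetime, timezone


def strftime_millis_sketch(epoch_seconds: float) -> str:
    """Format epoch seconds as a UTC timestamp with three decimal places."""
    # Assumption: round to the nearest millisecond, then split off whole seconds.
    millis = round(epoch_seconds * 1000)
    whole, ms = divmod(millis, 1000)
    base = datetime.fromtimestamp(whole, tz=timezone.utc)
    return base.strftime("%Y-%m-%dT%H:%M:%S") + f".{ms:03d}Z"


print(strftime_millis_sketch(1440768801.7))  # 2015-08-28T13:33:21.700Z
```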

strftime_local

strftime_local  (class=time #args=2,3) Like strftime but consults the $TZ environment variable to get the local time zone, unless third argument is supplied.
Examples:
strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%S %z")  = "2015-08-28 16:33:21 +0300" with TZ="Asia/Istanbul"
strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z") = "2015-08-28 16:33:21.700 +0300" with TZ="Asia/Istanbul"
strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z", "Asia/Istanbul") = "2015-08-28 16:33:21.700 +0300"

strptime

strptime  (class=time #args=2) Parses timestamp as floating-point seconds since the epoch. See also strptime_local.
Examples:
strptime("2015-08-28T13:33:21Z",      "%Y-%m-%dT%H:%M:%SZ")   = 1440768801.000000
strptime("2015-08-28T13:33:21.345Z",  "%Y-%m-%dT%H:%M:%SZ")   = 1440768801.345000
strptime("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z") = 14400
strptime("1970-01-01 00:00:00 EET",   "%Y-%m-%d %H:%M:%S %Z") = -7200

strptime_local

strptime_local  (class=time #args=2,3) Like strptime but consults the $TZ environment variable to get the local time zone, unless third argument is supplied.
Examples:
strptime_local("2015-08-28T13:33:21Z",    "%Y-%m-%dT%H:%M:%SZ") = 1440758001     with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"
strptime_local("2015-08-28 13:33:21",     "%Y-%m-%d %H:%M:%S")  = 1440758001     with TZ="Asia/Istanbul"
strptime_local("2015-08-28 13:33:21",     "%Y-%m-%d %H:%M:%S", "Asia/Istanbul") = 1440758001

systime

systime  (class=time #args=0) Returns the system time in floating-point seconds since the epoch.

systimeint

systimeint  (class=time #args=0) Returns the system time in integer seconds since the epoch.

uptime

uptime  (class=time #args=0) Returns the time in floating-point seconds since the current Miller program was started.

Typing functions

asserting_absent

asserting_absent  (class=typing #args=1) Aborts with an error if is_absent on the argument returns false, else returns its argument.

asserting_array

asserting_array  (class=typing #args=1) Aborts with an error if is_array on the argument returns false, else returns its argument.

asserting_bool

asserting_bool  (class=typing #args=1) Aborts with an error if is_bool on the argument returns false, else returns its argument.

asserting_boolean

asserting_boolean  (class=typing #args=1) Aborts with an error if is_boolean on the argument returns false, else returns its argument.

asserting_empty

asserting_empty  (class=typing #args=1) Aborts with an error if is_empty on the argument returns false, else returns its argument.

asserting_empty_map

asserting_empty_map  (class=typing #args=1) Aborts with an error if is_empty_map on the argument returns false, else returns its argument.

asserting_error

asserting_error  (class=typing #args=1) Aborts with an error if is_error on the argument returns false, else returns its argument.

asserting_float

asserting_float  (class=typing #args=1) Aborts with an error if is_float on the argument returns false, else returns its argument.

asserting_int

asserting_int  (class=typing #args=1) Aborts with an error if is_int on the argument returns false, else returns its argument.

asserting_map

asserting_map  (class=typing #args=1) Aborts with an error if is_map on the argument returns false, else returns its argument.

asserting_nonempty_map

asserting_nonempty_map  (class=typing #args=1) Aborts with an error if is_nonempty_map on the argument returns false, else returns its argument.

asserting_not_array

asserting_not_array  (class=typing #args=1) Aborts with an error if is_not_array on the argument returns false, else returns its argument.

asserting_not_empty

asserting_not_empty  (class=typing #args=1) Aborts with an error if is_not_empty on the argument returns false, else returns its argument.

asserting_not_map

asserting_not_map  (class=typing #args=1) Aborts with an error if is_not_map on the argument returns false, else returns its argument.

asserting_not_null

asserting_not_null  (class=typing #args=1) Aborts with an error if is_not_null on the argument returns false, else returns its argument.

asserting_null

asserting_null  (class=typing #args=1) Aborts with an error if is_null on the argument returns false, else returns its argument.

asserting_numeric

asserting_numeric  (class=typing #args=1) Aborts with an error if is_numeric on the argument returns false, else returns its argument.

asserting_present

asserting_present  (class=typing #args=1) Aborts with an error if is_present on the argument returns false, else returns its argument.

asserting_string

asserting_string  (class=typing #args=1) Aborts with an error if is_string on the argument returns false, else returns its argument.

is_absent

is_absent  (class=typing #args=1) False if field is present in input, true otherwise

is_array

is_array  (class=typing #args=1) True if argument is an array.

is_bool

is_bool  (class=typing #args=1) True if field is present with boolean value. Synonymous with is_boolean.

is_boolean

is_boolean  (class=typing #args=1) True if field is present with boolean value. Synonymous with is_bool.

is_empty

is_empty  (class=typing #args=1) True if field is present in input with empty string value, false otherwise.

is_empty_map

is_empty_map  (class=typing #args=1) True if argument is a map which is empty.

is_error

is_error  (class=typing #args=1) True if argument is an error, such as taking string length of an integer.

is_float

is_float  (class=typing #args=1) True if field is present with value inferred to be float

is_int

is_int  (class=typing #args=1) True if field is present with value inferred to be int

is_map

is_map  (class=typing #args=1) True if argument is a map.

is_nan

is_nan  (class=typing #args=1) True if the argument is the NaN (not-a-number) floating-point value. Note that NaN has the property that NaN != NaN, so you need 'is_nan(x)' rather than 'x == NaN'.
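
The NaN != NaN property comes from IEEE 754 floating-point arithmetic and is not specific to Miller. A short Python illustration of why an equality test cannot detect NaN and a dedicated predicate is needed:

```python
import math

nan = float("nan")

print(nan == nan)       # False: NaN never compares equal, even to itself
print(nan != nan)       # True
print(math.isnan(nan))  # True: a dedicated predicate is the reliable test
```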

is_nonempty_map

is_nonempty_map  (class=typing #args=1) True if argument is a map which is non-empty.

is_not_array

is_not_array  (class=typing #args=1) True if argument is not an array.

is_not_empty

is_not_empty  (class=typing #args=1) True if field is present in input with non-empty value, false otherwise

is_not_map

is_not_map  (class=typing #args=1) True if argument is not a map.

is_not_null

is_not_null  (class=typing #args=1) False if argument is null (empty, absent, or JSON null), true otherwise.

is_null

is_null  (class=typing #args=1) True if argument is null (empty, absent, or JSON null), false otherwise.

is_numeric

is_numeric  (class=typing #args=1) True if field is present with value inferred to be int or float

is_present

is_present  (class=typing #args=1) True if field is present in input, false otherwise.

is_string

is_string  (class=typing #args=1) True if field is present with string (including empty-string) value

typeof

typeof  (class=typing #args=1) Returns the type of its argument as a string (e.g. "str"). For debugging.
