DSL user-defined functions
As of Miller 5.0.0 you can define your own functions, as well as subroutines.
User-defined functions
Here's the obligatory example of a recursive function, which computes the factorial:
mlr --opprint --from data/small put '
  func f(n) {
    if (is_numeric(n)) {
      if (n > 0) {
        return n * f(n-1);
      } else {
        return 1;
      }
    }
    # implicitly return absent-null if non-numeric
  }
  $ox = f($x + NR);
  $oi = f($i);
'
a   b   i x                   y                   ox                  oi
pan pan 1 0.3467901443380824  0.7268028627434533  0.46705354854811026 1
eks pan 2 0.7586799647899636  0.5221511083334797  3.680838410072862   2
wye wye 3 0.20460330576630303 0.33831852551664776 1.7412511955594865  6
eks wye 4 0.38139939387114097 0.13418874328430463 18.588348778962008  24
wye pan 5 0.5732889198020006  0.8636244699032729  211.38730958519247  120
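The ox values are not whole-number factorials because f is applied to $x + NR, which is not an integer. As a worked check for the first record, where $x + NR is approximately 1.3468 (values rounded to four decimal places here), the recursion keeps multiplying until the argument is no longer positive, at which point f returns 1:

```latex
\begin{aligned}
f(1.3468) &= 1.3468 \cdot f(0.3468) \\
          &= 1.3468 \cdot 0.3468 \cdot f(-0.6532) \\
          &= 1.3468 \cdot 0.3468 \cdot 1 \approx 0.4671
\end{aligned}
```

This agrees with the ox value 0.46705354854811026 in the first output row.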
Properties of user-defined functions:
- Function definitions start with func and a parameter list, and are placed outside of begin, end, or other func or subr blocks. (I.e. the Miller DSL has no nested functions.)
- A function is identified by its name and may not be redefined: you cannot define two user-defined functions with the same name, nor reuse the name of a built-in function. However, functions and subroutines have separate namespaces: you can define a subroutine log (for logging messages to stderr, say) which does not clash with the mathematical log (logarithm) function.
- Functions may be defined either before or after use: there is an object-binding/linkage step at startup. In particular, functions may be either recursive or mutually recursive.
- Functions may be defined and called either within mlr filter or mlr put.
- Argument values may be reassigned: they are not read-only.
- When a function does not explicitly return a value, its return value is absent-null. (In the example above, if there were records for which the argument to f is non-numeric, the assignments would be skipped.) See also the null-data reference page.
- See the section on Local variables for information on scope and extent of arguments, as well as for information on the use of local variables within functions.
- See the section on Expressions from files for information on the use of the -f and -e flags.
User-defined subroutines
Example:
mlr --opprint --from data/small put -q '
  begin {
    @call_count = 0;
  }
  subr s(n) {
    @call_count += 1;
    if (is_numeric(n)) {
      if (n > 1) {
        call s(n-1);
      } else {
        print "numcalls=" . @call_count;
      }
    }
  }
  print "NR=" . NR;
  call s(NR);
'
NR=1
numcalls=1
NR=2
numcalls=3
NR=3
numcalls=6
NR=4
numcalls=10
NR=5
numcalls=15
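The printed counts grow faster than NR because @call_count is set once in the begin block and never reset, so it accumulates across records. Calling s(NR) recurses down to s(1), which makes NR calls for that record, so after record N the running total is:

```latex
\text{call\_count}(N) = \sum_{k=1}^{N} k = \frac{N(N+1)}{2},
\qquad \text{call\_count}(5) = \frac{5 \cdot 6}{2} = 15
```

This matches the sequence 1, 3, 6, 10, 15 in the output above.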
Properties of user-defined subroutines:
- Subroutine definitions start with subr and a parameter list, and are placed outside of begin, end, or other func or subr blocks. (I.e. the Miller DSL has no nested subroutines.)
- A subroutine is identified by its name and may not be redefined. However, functions and subroutines have separate namespaces: you can define a subroutine log which does not clash with the mathematical log function.
- Subroutines may be defined either before or after use: there is an object-binding/linkage step at startup. In particular, subroutines may be either recursive or mutually recursive. Subroutines may call functions.
- Subroutines may be defined and called either within mlr filter or mlr put.
- Subroutines have read/write access to $-variables and @-variables.
- Argument values may be reassigned: they are not read-only.
- See the section on Local variables for information on scope and extent of arguments, as well as for information on the use of local variables within subroutines.
- See the section on Expressions from files for information on the use of the -f and -e flags.