Fix "%%" in strptime; more test cases for strptime (#951)

* strptime debug mode

* Fix strptime %% format code

* internal/pkg/pbnjay-strptime/strptime_test.go; reg-test files

* Web-doc improvements for strftime and strptime

* doc-build artifacts
John Kerl 2022-02-21 12:21:04 -05:00 committed by GitHub
parent 43ff9108ee
commit 47a427b00c
16 changed files with 443 additions and 73 deletions


@ -2561,7 +2561,7 @@ FUNCTIONS FOR FILTER/PUT
ssub("abc.def", ".", "X") gives "abcXdef"
strftime
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also "DSL datetime/timezone functions" at https://miller.readthedocs.io for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
@ -2591,7 +2591,7 @@ FUNCTIONS FOR FILTER/PUT
strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
strptime_local
(class=time #args=2,3) Like stpftime but consults the $TZ environment variable to get local time zone.
(class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"
@ -3183,5 +3183,5 @@ SEE ALSO
2022-02-20 MILLER(1)
2022-02-21 MILLER(1)
</pre>


@ -2540,7 +2540,7 @@ FUNCTIONS FOR FILTER/PUT
ssub("abc.def", ".", "X") gives "abcXdef"
strftime
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also "DSL datetime/timezone functions" at https://miller.readthedocs.io for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
@ -2570,7 +2570,7 @@ FUNCTIONS FOR FILTER/PUT
strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
strptime_local
(class=time #args=2,3) Like stpftime but consults the $TZ environment variable to get local time zone.
(class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"
@ -3162,4 +3162,4 @@ SEE ALSO
2022-02-20 MILLER(1)
2022-02-21 MILLER(1)


@ -1247,7 +1247,7 @@ sec2localtime(1234567890.123456, 6, "Asia/Istanbul") = "2009-02-14 01:31:30.1234
### strftime
<pre class="pre-non-highlight-non-pair">
strftime (class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also "DSL datetime/timezone functions" at https://miller.readthedocs.io for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
strftime (class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
@ -1277,7 +1277,7 @@ strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
### strptime_local
<pre class="pre-non-highlight-non-pair">
strptime_local (class=time #args=2,3) Like stpftime but consults the $TZ environment variable to get local time zone.
strptime_local (class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"


@ -25,7 +25,7 @@ See also the [section on time-related
functions](reference-dsl-builtin-functions.md#time-functions) for
information auto-generated from Miller's online-help strings.
# Epoch seconds
## Epoch seconds
[Seconds since the epoch](https://en.wikipedia.org/wiki/Unix_time), or _Unix
Time_, is seconds (positive, zero, or negative) since midnight January 1 1970
@ -81,7 +81,7 @@ purple square false 10 91 72.3735 8.2430 1634784588.045422
The [systimeint](reference-dsl-builtin-functions.md#systimeint) DSL function
is nothing more than a keystroke-saver for `int(systime())`.
# UTC times with standard format
## UTC times with standard format
One way to make epoch-seconds human-readable, while maintaining some of their
benefits such as being independent of timezone and daylight savings, is to use
@ -115,7 +115,7 @@ We also have [sec2gmtdate](reference-dsl-builtin-functions.md#sec2gmtdate) DSL f
1930-11-18
</pre>
# Local times with standard format; specifying timezones
## Local times with standard format; specifying timezones
You can use similar formatting for dates in your preferred timezone, not just UTC/GMT.
We have the
@ -231,23 +231,98 @@ We also have the
1969-12-31T22:00:00Z
</pre>
# GMT and local times with custom formats
## Custom formats: strptime and strftime
The to-string and from-string functions we've seen so far are low-keystroking:
with a little bit of typing you can convert datetimes to/from epoch seconds.
The minus, however, is flexibility. This is where the
[strftime](reference-dsl-builtin-functions.md#strftime),
[strftime](reference-dsl-builtin-functions.md#strftime) and
[strptime](reference-dsl-builtin-functions.md#strptime) functions come into play.
Notes:
* The names `strftime` and `strptime` far predate Miller; they were chosen for familiarity. The `f` is for _format_: from epoch-seconds to human-readable string. The `p` is for _parse_: for doing the reverse.
* Even though Miller is written in Go as of Miller 6, it still preserves [C-like](https://en.wikipedia.org/wiki/C_date_and_time_functions#strftime) `strftime` and `strptime` semantics.
* For `strftime`, this is thanks to [https://github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime).
* For `stpftime`, this is thanks to [https://github.com/pbnjay/strptime](https://github.com/pbnjay/strptime).
* See [https://devhints.io/strftime](https://devhints.io/strftime) for sample format strings you can use.
* Even though Miller is written in Go as of Miller 6, it still largely preserves [C-like](https://en.wikipedia.org/wiki/C_date_and_time_functions#strftime) `strftime` and `strptime` semantics. As noted below, not all format strings used by the C library are recognized.
* For `strftime`, this is thanks to [https://github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime), with a Miller-specific modification for fractional seconds.
* For `strptime`, this is thanks to [https://github.com/pbnjay/strptime](https://github.com/pbnjay/strptime), with Miller-specific modifications.
Some examples:
Available format strings for `strftime`, taken directly from [https://github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime):
| Pattern | Description |
|---------|-------------|
| `%A` | national representation of the full weekday name |
| `%a` | national representation of the abbreviated weekday |
| `%B` | national representation of the full month name |
| `%b` | national representation of the abbreviated month name |
| `%C` | (year / 100) as decimal number; single digits are preceded by a zero |
| `%c` | national representation of time and date |
| `%D` | equivalent to `%m/%d/%y` |
| `%d` | day of the month as a decimal number (01-31) |
| `%e` | the day of the month as a decimal number (1-31); single digits are preceded by a blank |
| `%F` | equivalent to `%Y-%m-%d` |
| `%H` | the hour (24-hour clock) as a decimal number (00-23) |
| `%h` | same as `%b` |
| `%I` | the hour (12-hour clock) as a decimal number (01-12) |
| `%j` | the day of the year as a decimal number (001-366) |
| `%k` | the hour (24-hour clock) as a decimal number (0-23); single digits are preceded by a blank |
| `%l` | the hour (12-hour clock) as a decimal number (1-12); single digits are preceded by a blank |
| `%M` | the minute as a decimal number (00-59) |
| `%m` | the month as a decimal number (01-12) |
| `%n` | a newline |
| `%p` | national representation of either "ante meridiem" (a.m.) or "post meridiem" (p.m.) as appropriate. |
| `%R` | equivalent to `%H:%M` |
| `%r` | equivalent to `%I:%M:%S %p` |
| `%S` | the second as a decimal number (00-60) |
| `%1S`, ..., `%9S` | the second as a decimal number (00-60) with 1..9 decimal places, respectively |
| `%T` | equivalent to `%H:%M:%S` |
| `%t` | a tab |
| `%U` | the week number of the year (Sunday as the first day of the week) as a decimal number (00-53) |
| `%u` | the weekday (Monday as the first day of the week) as a decimal number (1-7) |
| `%V` | the week number of the year (Monday as the first day of the week) as a decimal number (01-53) |
| `%v` | equivalent to `%e-%b-%Y` |
| `%W` | the week number of the year (Monday as the first day of the week) as a decimal number (00-53) |
| `%w` | the weekday (Sunday as the first day of the week) as a decimal number (0-6) |
| `%X` | national representation of the time |
| `%x` | national representation of the date |
| `%Y` | the year with century as a decimal number |
| `%y` | the year without century as a decimal number (00-99) |
| `%Z` | the time zone name |
| `%z` | the time zone offset from UTC |
| `%%` | a `%` |
Available format strings for `strptime`:
| Pattern | Description |
|---------|-------------|
| `%%` | A literal '%' character. |
| `%b` | Month as locale's abbreviated name. |
| `%B` | Month as locale's full name. |
| `%d` | Day of the month as a zero-padded decimal number. |
| `%f` | Microsecond as a decimal number, zero-padded on the left. |
| `%H` | Hour (24-hour clock) as a zero-padded decimal number. |
| `%I` | Hour (12-hour clock) as a zero-padded decimal number. |
| `%j` | Three-digit day of year, like 004 or 363. |
| `%m` | Month as a zero-padded decimal number. |
| `%M` | Minute as a zero-padded decimal number. |
| `%p` | Locale's equivalent of either AM or PM. |
| `%S` | Second as a zero-padded decimal number. |
| `%y` | Year without century as a zero-padded decimal number. |
| `%Y` | Year with century as a decimal number. |
| `%z` | UTC offset in the form +HHMM or -HHMM. |
| `%Z` | Time zone name. UTC, EST, CST -- only if you're in that timezone. |
Examples:
<pre class="pre-highlight-in-pair">
<b>mlr -n put 'end {</b>
<b> print strftime(0, "%Y-%m-%dT%H:%M:%SZ");</b>
<b> print strftime(0, "%FT%TZ");</b>
<b>}'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
1970-01-01T00:00:00Z
1970-01-01T00:00:00Z
</pre>
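The equivalence of `%FT%TZ` and `%Y-%m-%dT%H:%M:%SZ` above comes from shorthand expansion. The following is a minimal Go sketch of that idea, not Miller's actual code: the function name `expandShorthandsSketch` is hypothetical, and the expansions are taken from the `strftime` table above. Listing `%%` first keeps a literal percent sign from being misread as the start of a shorthand.

```go
package main

import (
	"fmt"
	"strings"
)

// shorthandReplacer rewrites composite format codes into their long forms.
// The "%%" pair comes first so that "%%F" is left alone: at each position
// strings.Replacer tries the old strings in argument order.
var shorthandReplacer = strings.NewReplacer(
	"%%", "%%",
	"%F", "%Y-%m-%d",
	"%T", "%H:%M:%S",
	"%R", "%H:%M",
	"%D", "%m/%d/%y",
	"%r", "%I:%M:%S %p",
)

// expandShorthandsSketch is a hypothetical stand-in for a shorthand expander.
func expandShorthandsSketch(format string) string {
	return shorthandReplacer.Replace(format)
}

func main() {
	fmt.Println(expandShorthandsSketch("%FT%TZ")) // %Y-%m-%dT%H:%M:%SZ
	fmt.Println(expandShorthandsSketch("%D %R"))  // %m/%d/%y %H:%M
	fmt.Println(expandShorthandsSketch("100%%F")) // 100%%F
}
```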
<pre class="pre-highlight-in-pair">
<b>mlr -n put 'end {</b>
@ -267,9 +342,43 @@ Thursday, January 1, 1970
09:33 PM
</pre>
Unfortunately, names from `%A` and `%B` are only available in English, as an
artifact of a design choice in the Go `time` library which Miller (and its
`strftime` / `strptime` supporting packages as noted above) rely on.
Unfortunately, names from `%A` and `%B` are only available in English, as an artifact of a design
choice in the Go `time` library which Miller (and its `strftime` / `strptime` supporting packages as
noted above) rely on.
## A note on timezones
A note on timezones for `strptime`:
* Three-letter timezone names such as `CST` are recognized _only if you're in them_. (`UTC` is an exception.) This is because these aren't globally unique: `CST` can stand for `Central Standard Time`, `Cuba Standard Time`, `China Standard Time`, etc.
* Timezone specifiers which _are_ globally unique are of the form `-0400` and `+0500`.
* Specifiers like `-04:30`, `UTC-8`, and `Asia/Istanbul` were not supported in Miller 5 (which used the C `strptime` library), and are likewise not supported in Miller 6. See however the `TZ` environment-variable examples below.
* If you wish to match a final `Z` in the input, use a final `Z` in the format string. For example (see [ISO8601](https://en.wikipedia.org/wiki/ISO_8601)) you can match the timestamp `1970-01-01T00:00:00Z` using the format string `%FT%TZ`.
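To illustrate why numeric offsets such as `-0400` are unambiguous, here is a small Go sketch using only the standard `time` package. It assumes the Go layout `2006-01-02 15:04:05 -0700` as the counterpart of the format string `%Y-%m-%d %H:%M:%S %z`; the helper name `parseWithOffset` is hypothetical and this is not Miller's own parsing path.

```go
package main

import (
	"fmt"
	"time"
)

// parseWithOffset parses a timestamp carrying a numeric UTC offset and
// returns epoch seconds. The offset in the input fully determines the
// instant, so no local timezone lookup is involved.
func parseWithOffset(input string) (int64, error) {
	t, err := time.Parse("2006-01-02 15:04:05 -0700", input)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

func main() {
	for _, input := range []string{
		"1970-01-01 00:00:00 -0400",
		"1970-01-01 00:00:00 +0500",
	} {
		seconds, err := parseWithOffset(input)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %d\n", input, seconds)
	}
}
```

Midnight at `-0400` is 04:00 UTC, so the first input maps to 14400; midnight at `+0500` is five hours before the epoch, so the second maps to -18000.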
## Fractional seconds
For historical reasons, Miller's `strftime` and `strptime` use different format specifications for fractional seconds. Examples:
<pre class="pre-highlight-in-pair">
<b>mlr -n put 'end {</b>
<b> print strftime(123456.789, "%Y-%m-%d %H:%M:%S");</b>
<b> print strftime(123456.789, "%Y-%m-%d %H:%M:%1S");</b>
<b> print strftime(123456.789, "%Y-%m-%d %H:%M:%3S");</b>
<b> print strftime(123456.789, "%Y-%m-%d %H:%M:%6S");</b>
<b> print strptime("1970-01-02 10:17:36.789000", "%Y-%m-%d %H:%M:%S");</b>
<b> print strptime("1970-01-02 10:17:36.789000", "%Y-%m-%d %H:%M:%S.%f");</b>
<b>}'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
1970-01-02 10:17:36
1970-01-02 10:17:36.7
1970-01-02 10:17:36.789
1970-01-02 10:17:36.789000
(error)
123456.789
</pre>
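The `%1S` through `%9S` outputs above can be reproduced with Go's standard `time` package, which is a reasonable mental model even though it is not Miller's implementation. In this sketch the helper name `formatFractional` is hypothetical, and the timestamp is passed as whole seconds plus nanoseconds to sidestep floating-point rounding.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// formatFractional formats a UTC timestamp with the given number of
// decimal places on the seconds field (0 means no fractional part).
// Go's ".000"-style layout truncates rather than rounds.
func formatFractional(sec, nsec int64, places int) string {
	layout := "2006-01-02 15:04:05"
	if places > 0 {
		layout += "." + strings.Repeat("0", places)
	}
	return time.Unix(sec, nsec).UTC().Format(layout)
}

func main() {
	// 123456.789 seconds after the epoch, split into seconds and nanoseconds.
	for _, places := range []int{0, 1, 3, 6} {
		fmt.Println(formatFractional(123456, 789000000, places))
	}
}
```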
## strptime_local and strftime_local
We also have
[strftimelocal](reference-dsl-builtin-functions.md#strftimelocal) and
@ -327,7 +436,7 @@ Thursday, January 1, 1970
1582992000
</pre>
# Relative times
## Relative times
You can get the seconds since the Miller process start using
[uptime](reference-dsl-builtin-functions.md#uptime):
@ -365,7 +474,7 @@ sec2dhms (class=time #args=1) Formats integer seconds as in sec2dhms(500000) =
sec2hms (class=time #args=1) Formats integer seconds as in sec2hms(5000) = "01:23:20"
</pre>
# References
## References
* List of formatting characters for `strftime` and `strptime`: [https://devhints.io/strftime](https://devhints.io/strftime)
* Non-Miller-specific list of formatting characters for `strftime` and `strptime`: [https://devhints.io/strftime](https://devhints.io/strftime)
* List of valid timezone names: [https://en.wikipedia.org/wiki/List_of_tz_database_time_zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones)


@ -9,7 +9,7 @@ See also the [section on time-related
functions](reference-dsl-builtin-functions.md#time-functions) for
information auto-generated from Miller's online-help strings.
# Epoch seconds
## Epoch seconds
[Seconds since the epoch](https://en.wikipedia.org/wiki/Unix_time), or _Unix
Time_, is seconds (positive, zero, or negative) since midnight January 1 1970
@ -59,7 +59,7 @@ GENMD-EOF
The [systimeint](reference-dsl-builtin-functions.md#systimeint) DSL function
is nothing more than a keystroke-saver for `int(systime())`.
# UTC times with standard format
## UTC times with standard format
One way to make epoch-seconds human-readable, while maintaining some of their
benefits such as being independent of timezone and daylight savings, is to use
@ -84,7 +84,7 @@ mlr -n put 'end {
}'
GENMD-EOF
# Local times with standard format; specifying timezones
## Local times with standard format; specifying timezones
You can use similar formatting for dates in your preferred timezone, not just UTC/GMT.
We have the
@ -163,23 +163,94 @@ mlr -n put 'end {
}'
GENMD-EOF
# GMT and local times with custom formats
## Custom formats: strptime and strftime
The to-string and from-string functions we've seen so far are low-keystroking:
with a little bit of typing you can convert datetimes to/from epoch seconds.
The minus, however, is flexibility. This is where the
[strftime](reference-dsl-builtin-functions.md#strftime),
[strftime](reference-dsl-builtin-functions.md#strftime) and
[strptime](reference-dsl-builtin-functions.md#strptime) functions come into play.
Notes:
* The names `strftime` and `strptime` far predate Miller; they were chosen for familiarity. The `f` is for _format_: from epoch-seconds to human-readable string. The `p` is for _parse_: for doing the reverse.
* Even though Miller is written in Go as of Miller 6, it still preserves [C-like](https://en.wikipedia.org/wiki/C_date_and_time_functions#strftime) `strftime` and `strptime` semantics.
* For `strftime`, this is thanks to [https://github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime).
* For `stpftime`, this is thanks to [https://github.com/pbnjay/strptime](https://github.com/pbnjay/strptime).
* See [https://devhints.io/strftime](https://devhints.io/strftime) for sample format strings you can use.
* Even though Miller is written in Go as of Miller 6, it still largely preserves [C-like](https://en.wikipedia.org/wiki/C_date_and_time_functions#strftime) `strftime` and `strptime` semantics. As noted below, not all format strings used by the C library are recognized.
* For `strftime`, this is thanks to [https://github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime), with a Miller-specific modification for fractional seconds.
* For `strptime`, this is thanks to [https://github.com/pbnjay/strptime](https://github.com/pbnjay/strptime), with Miller-specific modifications.
Some examples:
Available format strings for `strftime`, taken directly from [https://github.com/lestrrat-go/strftime](https://github.com/lestrrat-go/strftime):
| Pattern | Description |
|---------|-------------|
| `%A` | national representation of the full weekday name |
| `%a` | national representation of the abbreviated weekday |
| `%B` | national representation of the full month name |
| `%b` | national representation of the abbreviated month name |
| `%C` | (year / 100) as decimal number; single digits are preceded by a zero |
| `%c` | national representation of time and date |
| `%D` | equivalent to `%m/%d/%y` |
| `%d` | day of the month as a decimal number (01-31) |
| `%e` | the day of the month as a decimal number (1-31); single digits are preceded by a blank |
| `%F` | equivalent to `%Y-%m-%d` |
| `%H` | the hour (24-hour clock) as a decimal number (00-23) |
| `%h` | same as `%b` |
| `%I` | the hour (12-hour clock) as a decimal number (01-12) |
| `%j` | the day of the year as a decimal number (001-366) |
| `%k` | the hour (24-hour clock) as a decimal number (0-23); single digits are preceded by a blank |
| `%l` | the hour (12-hour clock) as a decimal number (1-12); single digits are preceded by a blank |
| `%M` | the minute as a decimal number (00-59) |
| `%m` | the month as a decimal number (01-12) |
| `%n` | a newline |
| `%p` | national representation of either "ante meridiem" (a.m.) or "post meridiem" (p.m.) as appropriate. |
| `%R` | equivalent to `%H:%M` |
| `%r` | equivalent to `%I:%M:%S %p` |
| `%S` | the second as a decimal number (00-60) |
| `%1S`, ..., `%9S` | the second as a decimal number (00-60) with 1..9 decimal places, respectively |
| `%T` | equivalent to `%H:%M:%S` |
| `%t` | a tab |
| `%U` | the week number of the year (Sunday as the first day of the week) as a decimal number (00-53) |
| `%u` | the weekday (Monday as the first day of the week) as a decimal number (1-7) |
| `%V` | the week number of the year (Monday as the first day of the week) as a decimal number (01-53) |
| `%v` | equivalent to `%e-%b-%Y` |
| `%W` | the week number of the year (Monday as the first day of the week) as a decimal number (00-53) |
| `%w` | the weekday (Sunday as the first day of the week) as a decimal number (0-6) |
| `%X` | national representation of the time |
| `%x` | national representation of the date |
| `%Y` | the year with century as a decimal number |
| `%y` | the year without century as a decimal number (00-99) |
| `%Z` | the time zone name |
| `%z` | the time zone offset from UTC |
| `%%` | a `%` |
Available format strings for `strptime`:
| Pattern | Description |
|---------|-------------|
| `%%` | A literal '%' character. |
| `%b` | Month as locale's abbreviated name. |
| `%B` | Month as locale's full name. |
| `%d` | Day of the month as a zero-padded decimal number. |
| `%f` | Microsecond as a decimal number, zero-padded on the left. |
| `%H` | Hour (24-hour clock) as a zero-padded decimal number. |
| `%I` | Hour (12-hour clock) as a zero-padded decimal number. |
| `%j` | Three-digit day of year, like 004 or 363. |
| `%m` | Month as a zero-padded decimal number. |
| `%M` | Minute as a zero-padded decimal number. |
| `%p` | Locale's equivalent of either AM or PM. |
| `%S` | Second as a zero-padded decimal number. |
| `%y` | Year without century as a zero-padded decimal number. |
| `%Y` | Year with century as a decimal number. |
| `%z` | UTC offset in the form +HHMM or -HHMM. |
| `%Z` | Time zone name. UTC, EST, CST -- only if you're in that timezone. |
Examples:
GENMD-RUN-COMMAND
mlr -n put 'end {
print strftime(0, "%Y-%m-%dT%H:%M:%SZ");
print strftime(0, "%FT%TZ");
}'
GENMD-EOF
GENMD-RUN-COMMAND
mlr -n put 'end {
@ -192,9 +263,35 @@ mlr -n put 'end {
}'
GENMD-EOF
Unfortunately, names from `%A` and `%B` are only available in English, as an
artifact of a design choice in the Go `time` library which Miller (and its
`strftime` / `strptime` supporting packages as noted above) rely on.
Unfortunately, names from `%A` and `%B` are only available in English, as an artifact of a design
choice in the Go `time` library which Miller (and its `strftime` / `strptime` supporting packages as
noted above) rely on.
## A note on timezones
A note on timezones for `strptime`:
* Three-letter timezone names such as `CST` are recognized _only if you're in them_. (`UTC` is an exception.) This is because these aren't globally unique: `CST` can stand for `Central Standard Time`, `Cuba Standard Time`, `China Standard Time`, etc.
* Timezone specifiers which _are_ globally unique are of the form `-0400` and `+0500`.
* Specifiers like `-04:30`, `UTC-8`, and `Asia/Istanbul` were not supported in Miller 5 (which used the C `strptime` library), and are likewise not supported in Miller 6. See however the `TZ` environment-variable examples below.
* If you wish to match a final `Z` in the input, use a final `Z` in the format string. For example (see [ISO8601](https://en.wikipedia.org/wiki/ISO_8601)) you can match the timestamp `1970-01-01T00:00:00Z` using the format string `%FT%TZ`.
## Fractional seconds
For historical reasons, Miller's `strftime` and `strptime` use different format specifications for fractional seconds. Examples:
GENMD-RUN-COMMAND
mlr -n put 'end {
print strftime(123456.789, "%Y-%m-%d %H:%M:%S");
print strftime(123456.789, "%Y-%m-%d %H:%M:%1S");
print strftime(123456.789, "%Y-%m-%d %H:%M:%3S");
print strftime(123456.789, "%Y-%m-%d %H:%M:%6S");
print strptime("1970-01-02 10:17:36.789000", "%Y-%m-%d %H:%M:%S");
print strptime("1970-01-02 10:17:36.789000", "%Y-%m-%d %H:%M:%S.%f");
}'
GENMD-EOF
## strptime_local and strftime_local
We also have
[strftimelocal](reference-dsl-builtin-functions.md#strftimelocal) and
@ -230,7 +327,7 @@ mlr -n put 'end {
}'
GENMD-EOF
# Relative times
## Relative times
You can get the seconds since the Miller process start using
[uptime](reference-dsl-builtin-functions.md#uptime):
@ -256,7 +353,7 @@ GENMD-RUN-COMMAND
mlr -F | grep hms
GENMD-EOF
# References
## References
* List of formatting characters for `strftime` and `strptime`: [https://devhints.io/strftime](https://devhints.io/strftime)
* Non-Miller-specific list of formatting characters for `strftime` and `strptime`: [https://devhints.io/strftime](https://devhints.io/strftime)
* List of valid timezone names: [https://en.wikipedia.org/wiki/List_of_tz_database_time_zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones)


@ -548,7 +548,10 @@ func (regtester *RegTester) executeSingleCmdFile(
}
// Write the .should-fail file
if actualExitCode != 0 {
if actualExitCode == 0 {
// Remove it, if it exists.
os.Remove(expectFailFileName)
} else {
err = regtester.storeFile(expectFailFileName, "")
if err != nil {
fmt.Printf("%s: %v\n", expectedStderrFileName, err)
@ -558,7 +561,6 @@ func (regtester *RegTester) executeSingleCmdFile(
fmt.Printf("wrote %s\n", expectedStdoutFileName)
}
}
}
for pe := postCompareExpectedActualPairs.Front(); pe != nil; pe = pe.Next() {


@ -1009,8 +1009,7 @@ is supplied.`,
help: `Formats seconds since the epoch as timestamp. Format strings are as at
https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S"
through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no
decimal places.) See also "DSL datetime/timezone functions" at ` +
lib.DOC_URL + ` for more information on the differences from the C library ("man strftime" on your system).
decimal places.) See also ` + lib.DOC_URL + `/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system).
See also strftime_local.`,
examples: []string{
`strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"`,
@ -1050,7 +1049,7 @@ See also strftime_local.`,
{
name: "strptime_local",
class: FUNC_CLASS_TIME,
help: `Like stpftime but consults the $TZ environment variable to get local time zone.`,
help: `Like strptime but consults the $TZ environment variable to get local time zone.`,
examples: []string{
`strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"`,
`strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"`,


@ -49,6 +49,7 @@ package strptime
import (
"errors"
"fmt"
"os"
"strings"
"time"
@ -56,6 +57,8 @@ import (
const _ignoreUnsupported = false
var _debug = os.Getenv("MLR_DEBUG_STRPTIME") != ""
// Parse accepts a percent-encoded strptime format string, converts it for use with
// time.Parse, and returns the resulting time.Time value. If non-date-related format
// text does not match within the string value, then ErrFormatMismatch will be returned.
@ -95,12 +98,19 @@ func MustParse(value, format string) time.Time {
}
// Check verifies that format is a fully-supported strptime format string for this implementation.
// Not used by Miller.
func Check(format string) error {
format = expandShorthands(format)
parts := strings.Split(format, "%")
for _, ps := range parts {
// since we split on '%', this is the format code
// Since we split on '%', this is the format code
// This is for "%%"
if ps == "" {
continue
}
c := int(ps[0])
if c == '%' {
continue
@ -116,9 +126,18 @@ func Check(format string) error {
func strptime_tz(
strptime_input, strptime_format string, ignoreUnsupported bool, useTZ bool, location *time.Location,
) (time.Time, error) {
if _debug {
fmt.Printf("================================================================ STRPTIME ENTER\n")
fmt.Printf("strptime_input \"%s\"\n", strptime_input)
fmt.Printf("strptime_format \"%s\"\n", strptime_format)
defer fmt.Printf("================================================================ STRPTIME EXIT\n")
}
// E.g. re-write "%F" to "%Y-%m-%d".
strptime_format = expandShorthands(strptime_format)
if _debug {
fmt.Printf("strptime_format \"%s\"\n", strptime_format) // show the format after shorthand expansion
}
// The job of strptime is to map "format strings" like "%Y-%m-%d %H:%M:%S" to
// Go-library "templates" like "2006 01 02 15 04 05".
@ -144,36 +163,69 @@ func strptime_tz(
sii := 0
partsBetweenPercentSigns := strings.Split(strptime_format, "%")
for partsIndex, partBetweenPercentSigns := range partsBetweenPercentSigns {
nparts := len(partsBetweenPercentSigns)
for partsIndex := 0; partsIndex < nparts; /* increment in loop */ {
partBetweenPercentSigns := partsBetweenPercentSigns[partsIndex]
if _debug {
fmt.Printf("\n")
fmt.Printf("partsIndex %d: \"%s\"\n", partsIndex, partBetweenPercentSigns)
}
if partsIndex == 0 {
// Check for prefix text. It must be an exact match, e.g. with input "foo 2021" and
// format "foo %Y", "foo " == "foo ". Or, if the format starts with a "%", we're
// checking "" == "".
if strptime_input[:len(partBetweenPercentSigns)] != partBetweenPercentSigns {
if _debug {
fmt.Printf("\"%s\" != \"%s\"\n",
strptime_input[:len(partBetweenPercentSigns)], partBetweenPercentSigns,
)
}
return time.Time{}, ErrFormatMismatch
}
sii += len(partBetweenPercentSigns)
partsIndex++
continue
}
// Handle %% straight off, as this is a special case.
if partBetweenPercentSigns == "" {
if _debug {
fmt.Printf("formatCode '%c'\n", '%')
}
if strptime_input[sii:sii+1] != "%" {
if _debug {
fmt.Println("did not match %%")
}
return time.Time{}, ErrFormatMismatch
}
if _debug {
fmt.Printf("templateComponent \"%s\"\n", "%")
fmt.Printf("inputComponent \"%s\"\n", "%")
}
sii += 1
partsIndex += 2 // TODO: TYPE ME UP
continue
}
// Since we split on '%', this is the format code
formatCode := int(partBetweenPercentSigns[0])
// TODO: I don't think this is right. And, needs a unit-test case.
if formatCode == '%' { // Handle %% straight off, as this is just a text-match.
if partBetweenPercentSigns != strptime_input[sii:sii+len(partBetweenPercentSigns)] {
return time.Time{}, ErrFormatMismatch
}
sii += len(partBetweenPercentSigns)
continue
}
// Check if the format code is supported, and map the strptime-style format code to the
// Go-library (time.Parse) template component, e.g. 'Y' -> "2006".
templateComponent, supported := formatMap[formatCode]
if !supported && !ignoreUnsupported {
if _debug {
fmt.Printf("formatCode '%c' is unsupported\n", formatCode)
}
return time.Time{}, ErrFormatUnsupported
}
if _debug {
fmt.Printf("formatCode '%c'\n", formatCode)
fmt.Printf("templateComponent \"%s\"\n", templateComponent)
}
// Check the intervening text between format strings, e.g. the ":" in "%Y:%m". There may be
// some edge cases where this isn't quite right but if that's the case you've got other
@ -187,8 +239,14 @@ func strptime_tz(
sil = strings.Index(strptime_input[sii:], partBetweenPercentSigns[1:])
}
if sil == -1 {
if _debug {
fmt.Printf("format/template mismatch 1\n")
}
return time.Time{}, ErrFormatMismatch
}
if _debug {
fmt.Printf("inputComponent \"%s\"\n", strptime_input[sii:sii+sil])
}
if supported {
// Accumulate the go-lib style template and input strings.
@ -198,6 +256,9 @@ func strptime_tz(
} else {
sil = len(templateComponent)
if sil > len(strptime_input)-sii {
if _debug {
fmt.Printf("format/template mismatch 2\n")
}
return time.Time{}, ErrFormatMismatch
}
}
@ -221,13 +282,21 @@ func strptime_tz(
} else {
sii += (len(partBetweenPercentSigns) - 1) + sil
}
partsIndex++
}
if sii < len(strptime_input) {
// Extra text on end of strptime_input
if _debug {
fmt.Printf("Extra text on end of strptime_input\n")
}
return time.Time{}, ErrFormatMismatch
}
if _debug {
fmt.Printf("goLibInput \"%s\"\n", strptime_input)
fmt.Printf("goLibTemplate \"%s\"\n", strptime_format)
}
// Now call the Go time library with template and input formatted the way it wants.
if useTZ {
if location != nil {


@ -0,0 +1,79 @@
package strptime

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

// testDataType describes one Parse case: the input string, the strptime-style
// format, whether a nil error is expected, and the expected epoch seconds.
type testDataType struct {
	input  string
	format string
	errNil bool
	output int64
}

var testData = []testDataType{
	{
		"1970-01-01T00:00:00Z",
		"%Y-%m-%dT%H:%M:%SZ",
		true,
		0,
	},
	{
		"1970-01-01 00:00:00 -0400",
		"%Y-%m-%d %H:%M:%S %z",
		true,
		14400, // 1970-01-01T04:00:00Z
	},
	{
		"1970-01-01%00:00:00Z",
		"%Y-%m-%d%%%H:%M:%SZ",
		true,
		0,
	},
	{
		"1970-01-01T00:00:00Z",
		"%FT%TZ",
		true,
		0,
	},
	{
		"1970:363",
		"%Y:%j",
		true,
		31276800, // 1970-12-29T00:00:00Z
	},
	{
		"1970-01-01 10:20:30 PM",
		"%F %r",
		true,
		80430, // 1970-01-01T22:20:30Z
	},
	{
		"01/02/70 14:20",
		"%D %R",
		true,
		138000, // 1970-01-02T14:20:00Z
	},
	{
		"01/02/70 14:20",
		"%D %X", // no such format code
		false,
		0,
	},
}

func TestStrptime(t *testing.T) {
	for _, item := range testData {
		tval, err := Parse(item.input, item.format)
		if item.errNil {
			assert.Nil(t, err)
			seconds := tval.Unix()
			// testify's argument order is (expected, actual).
			assert.Equal(t, item.output, seconds)
		} else {
			assert.NotNil(t, err)
		}
	}
}


@ -2540,7 +2540,7 @@ FUNCTIONS FOR FILTER/PUT
ssub("abc.def", ".", "X") gives "abcXdef"
strftime
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also "DSL datetime/timezone functions" at https://miller.readthedocs.io for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
@ -2570,7 +2570,7 @@ FUNCTIONS FOR FILTER/PUT
strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
strptime_local
(class=time #args=2,3) Like stpftime but consults the $TZ environment variable to get local time zone.
(class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"
@ -3162,4 +3162,4 @@ SEE ALSO
2022-02-20 MILLER(1)
2022-02-21 MILLER(1)


@ -2,12 +2,12 @@
.\" Title: mlr
.\" Author: [see the "AUTHOR" section]
.\" Generator: ./mkman.rb
.\" Date: 2022-02-20
.\" Date: 2022-02-21
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
.TH "MILLER" "1" "2022-02-20" "\ \&" "\ \&"
.TH "MILLER" "1" "2022-02-21" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Portability definitions
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -3947,7 +3947,7 @@ ssub("abc.def", ".", "X") gives "abcXdef"
.RS 0
.\}
.nf
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also "DSL datetime/timezone functions" at https://miller.readthedocs.io for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
(class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local.
Examples:
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z"
strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
@ -4013,7 +4013,7 @@ strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
.RS 0
.\}
.nf
(class=time #args=2,3) Like stpftime but consults the $TZ environment variable to get local time zone.
(class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone.
Examples:
strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul"
strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul"


@ -0,0 +1 @@
mlr -n put -f ${CASEDIR}/mlr


@ -0,0 +1,10 @@
0
0
14400
0
0
31276800
80430
138000
(error)
31536000.123456


@ -0,0 +1,12 @@
end {
print strptime("1970-01-01T00:00:00Z", "%Y-%m-%dT%H:%M:%SZ");
print strptime("1970-01-01T00:00:00Z", "%Y-%m-%dT%H:%M:%SZ");
print strptime("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z");
print strptime("1970-01-01%00:00:00Z", "%Y-%m-%d%%%H:%M:%SZ");
print strptime("1970-01-01T00:00:00Z", "%FT%TZ");
print strptime("1970:363", "%Y:%j");
print strptime("1970-01-01 10:20:30 PM", "%F %r");
print strptime("01/02/70 14:20", "%D %R");
print strptime("01/02/70 14:20", "%D %X"); # no such format code
print fmtnum(strptime("1971-01-01T00:00:00.123456Z", "%Y-%m-%dT%H:%M:%S.%fZ"), "%.6f");
}


@ -1,20 +1,12 @@
=============================================================== RELEASES
* plan 6.1.0
o strptime/882
- UT-per-se cases
m strptime/strftime tabulate options
- UT case for %% matching
? https://github.com/bykof/gostradamus
? https://golangrepo.com/repo/leekchan-timeutil-go-date-time
? port mlr5 c -> go?
o unsparsify -f CSV by default -- ? into CSV record-writer -- ? caveat that record 1 controls all ...
o mlr join --left-fields a,b,c
o several needs-doc issues
o fmt/unfmt/regex doc
o FAQ/examples reorg
k strptime/882
k fmtifnum, & recursive fmtnum/fmtifnum
k unicode string literals
k natural sort order