diff --git a/docs/content/cdc-ingestion/overview.md b/docs/content/cdc-ingestion/overview.md
index 09def54df2e3..90e2235d3c9b 100644
--- a/docs/content/cdc-ingestion/overview.md
+++ b/docs/content/cdc-ingestion/overview.md
@@ -82,29 +82,29 @@ behaviors of `RENAME TABLE` and `DROP COLUMN` will be ignored, `RENAME COLUMN` w
 ### Temporal Functions
-Temporal functions can convert date and time to another form. A common use case is to generate partition values.
+Temporal functions can convert date and epoch time to another form. A common use case is to generate partition values.
 {{< generated/temporal_functions >}}
-The data type of temporal-column can be one of the following cases:
+The data type of the temporal-column can be one of the following cases:
 1. DATE, DATETIME or TIMESTAMP.
-
 2. Any integer numeric type (such as INT and BIGINT). In this case, the data will be considered as epoch time of `1970-01-01 00:00:00`.
-You should set precision of the value (default is 0). Currently, There are four valid precisions: `0` (for epoch seconds),
-`3` (for epoch milliseconds), `6`(for epoch microseconds) and `9` (for epoch nanoseconds).
-Take the time point `1970-01-01 00:00:00.123456789` as an example, the epoch seconds are 0, the epoch milliseconds are 123,
-the epoch microseconds are 123456, and the epoch nanoseconds are 123456789. The precision should match the input values.
-You can set precision in this way: `date_format(epoch_col, yyyy-MM-dd, 0)`.
-
+You should set precision of the value (default is 0).
 3. STRING. In this case, if you didn't set the time unit, the data will be considered as formatted string of DATE, DATETIME or TIMESTAMP value. Otherwise, the data will be considered as string value of epoch time. So you must set time unit in the latter case.
+The precision represents the unit of the epoch time. Currently, There are four valid precisions: `0` (for epoch seconds),
+`3` (for epoch milliseconds), `6`(for epoch microseconds) and `9` (for epoch nanoseconds). Take the time point
+`1970-01-01 00:00:00.123456789` as an example, the epoch seconds are 0, the epoch milliseconds are 123, the epoch microseconds
+are 123456, and the epoch nanoseconds are 123456789. The precision should match the input values. You can set precision
+in this way: `date_format(epoch_col, yyyy-MM-dd, 0)`.
+
 `date_format` is a flexible function which is able to convert the temporal value to various formats with different format strings. A most common format string is `yyyy-MM-dd HH:mm:ss.SSS`. Another example is `yyyy-ww` which can extract the year and the week-of-the-year from the input. Note that the output is affected by the locale. For example, in some regions the first day of a week is Monday while in others is Sunday, so if you use `date_format(date_col, yyyy-ww)` and the input of
-date_col is 2024/01/07 (Sunday), the output maybe `2024-01` (if the first day of a week is Monday) or `2024-02` (if the
+date_col is 2024-01-07 (Sunday), the output maybe `2024-01` (if the first day of a week is Monday) or `2024-02` (if the
 first day of a week is Sunday).
 ### Other Functions
diff --git a/docs/layouts/shortcodes/generated/temporal_functions.html b/docs/layouts/shortcodes/generated/temporal_functions.html
index 6d0fa3ac4ca6..d652a5a0efc9 100644
--- a/docs/layouts/shortcodes/generated/temporal_functions.html
+++ b/docs/layouts/shortcodes/generated/temporal_functions.html
@@ -26,31 +26,31 @@
-year(time-column [, time-unit])
+year(temporal-column [, precision])
 Extract year from the input. Output is an INT value represent the year.
-month(date-column [, time-unit])
+month(temporal-column [, precision])
 Extract month of year from the input. Output is an INT value represent the month of year.
-day(time-column [, time-unit])
+day(temporal-column [, precision])
 Extract day of month from the input. Output is an INT value represent the day of month.
-hour(time-column [, time-unit])
+hour(temporal-column [, precision])
 Extract hour from the input. Output is an INT value represent the hour.
-minute(time-column [, time-unit])
+minute(temporal-column [, precision])
 Extract minute from the input. Output is an INT value represent the minute.
-second(time-column [, time-unit])
+second(temporal-column [, precision])
 Extract second from the input. Output is an INT value represent the second.
-date_format(time-column, format-string [, time-unit])
+date_format(temporal-column, format-string [, precision])
 Convert the input to desired formatted string. Output type is STRING.
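
The precision rule documented in this change (`0`, `3`, `6`, `9` for epoch seconds, milliseconds, microseconds and nanoseconds) can be illustrated with a small Python sketch. This is not the project's implementation: the helper name `epoch_to_datetime` and the UTC assumption are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch values; UTC is an assumption of this sketch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Precision -> number of epoch units per second, for the four documented precisions.
UNITS_PER_SECOND = {0: 1, 3: 10**3, 6: 10**6, 9: 10**9}


def epoch_to_datetime(value: int, precision: int = 0) -> datetime:
    """Interpret an integer epoch value at the given precision (hypothetical helper)."""
    if precision not in UNITS_PER_SECOND:
        raise ValueError(f"unsupported precision: {precision}")
    units = UNITS_PER_SECOND[precision]
    seconds, remainder = divmod(value, units)
    # datetime only resolves microseconds, so nanosecond input is truncated.
    micros = remainder * 10**6 // units
    return EPOCH + timedelta(seconds=seconds, microseconds=micros)


if __name__ == "__main__":
    # The same instant expressed at two precisions formats to the same date.
    print(epoch_to_datetime(1704585600, 0).strftime("%Y-%m-%d"))
    print(epoch_to_datetime(1704585600000, 3).strftime("%Y-%m-%d"))
```

Passing a millisecond value with precision `0` would be read as seconds and land far in the future, which is why the documentation says the precision should match the input values.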