[FEAT] - Tutorials update (Marco) (#311)
marcopeix authored Apr 29, 2024
1 parent 24ccc22 commit 8fee338
Showing 4 changed files with 277 additions and 38 deletions.
167 changes: 143 additions & 24 deletions nbs/docs/tutorials/0_anomaly_detection.ipynb

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions nbs/docs/tutorials/12_longhorizon.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,13 @@
"nixtla_client = NixtlaClient()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the data"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down Expand Up @@ -231,6 +238,13 @@
"input_seq = Y_df[-1104:-96] # Gets a sequence of 1008 observations (1008 = 42 days * 24h/day)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Forecasting"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down Expand Up @@ -288,6 +302,13 @@
"nixtla_client.plot(Y_df[-168:], fcst_df, models=['TimeGPT'], level=[90], time_col='ds', target_col='y')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down
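
The slice `Y_df[-1104:-96]` in the long-horizon notebook can be checked with a small arithmetic sketch. It uses a plain Python list as a stand-in for `Y_df`. The series length of 3000 and the reading of the last 96 rows as a held-out test period are assumptions made for illustration, not taken from the notebook.

```python
# Stand-in for Y_df: a plain list of hourly observations (hypothetical length).
hours_per_day = 24
series = list(range(3000))

horizon = 96                     # assumed hold-out: the last 96 hourly observations
input_len = 42 * hours_per_day   # 42 days of hourly data = 1008 observations

# Same slice bounds as Y_df[-1104:-96]
start = -(input_len + horizon)   # -1104
input_seq = series[start:-horizon]
held_out = series[-horizon:]

print(start, len(input_seq), len(held_out))
```

The input sequence ends right where the last 96 observations begin, so the two slices do not overlap.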
66 changes: 61 additions & 5 deletions nbs/docs/tutorials/6_multiple_series.ipynb
Expand Up @@ -13,7 +13,11 @@
"id": "752a293c-d477-45e7-93d9-23fc15a23c8f",
"metadata": {},
"source": [
"TimeGPT provides a robust solution for multi-series forecasting, which involves analyzing multiple data series concurrently, rather than a single one. The tool can be fine-tuned using a broad collection of series, enabling you to tailor the model to suit your specific needs or tasks."
"TimeGPT provides a robust solution for multi-series forecasting, which involves analyzing multiple data series concurrently, rather than a single one. The tool can be fine-tuned using a broad collection of series, enabling you to tailor the model to suit your specific needs or tasks.\n",
"\n",
"Note that the forecasts are still univariate. This means that although TimeGPT is a global model, it won't consider the inter-feature relationships within the target series. However, TimeGPT does support the use of exogenous variables such as categorical variables (e.g., category, brand), numerical variables (e.g., temperature, prices), or even special holidays.\n",
"\n",
"Let's see this in action."
]
},
{
Expand Down Expand Up @@ -84,6 +88,14 @@
"load_dotenv()"
]
},
{
"cell_type": "markdown",
"id": "61e6a645",
"metadata": {},
"source": [
"As always, we start off by initializing an instance of `NixtlaClient`."
]
},
{
"cell_type": "code",
"execution_count": null,
Expand Down Expand Up @@ -119,12 +131,24 @@
"nixtla_client = NixtlaClient()"
]
},
{
"cell_type": "markdown",
"id": "4c1519c9",
"metadata": {},
"source": [
"## Load the data"
]
},
{
"cell_type": "markdown",
"id": "2bd0934b-8b12-4c33-be3c-6b8d2bf86f54",
"metadata": {},
"source": [
"The following dataset contains prices of different electricity markets. Let see how can we forecast them. The main argument of the forecast method is the input data frame with the historical values of the time series you want to forecast. This data frame can contain information from many time series. Use the `unique_id` column to identify the different time series of your dataset."
"The following dataset contains prices of different electricity markets in Europe. \n",
"\n",
"Multiple series are automatically detected in TimeGPT using the `unique_id` column. This column contains labels for each series. If there are multiple unique values in that column, then TimeGPT knows it is handling a multi-series scenario.\n",
"\n",
"In this particular case, the `unique_id` column contains the values BE, DE, FR, PJM, and NP."
]
},
{
Expand Down Expand Up @@ -243,12 +267,20 @@
"nixtla_client.plot(df)"
]
},
{
"cell_type": "markdown",
"id": "51d11ba4",
"metadata": {},
"source": [
"## Forecasting"
]
},
{
"cell_type": "markdown",
"id": "1dbe558a-ac0f-475b-abd6-838121863307",
"metadata": {},
"source": [
"We just have to pass the dataframe to create forecasts for all the time series at once. "
"To forecast all series at once, we simply pass the dataframe to the `df` argument. TimeGPT will automatically forecast all series."
]
},
{
Expand Down Expand Up @@ -401,20 +433,30 @@
"nixtla_client.plot(df, timegpt_fcst_multiseries_df, max_insample_length=365, level=[80, 90])"
]
},
{
"cell_type": "markdown",
"id": "bd689e11",
"metadata": {},
"source": [
"From the figure above, we can see that the model effectively generated predictions for each unique series in the dataset."
]
},
{
"cell_type": "markdown",
"id": "32b60af1-fa48-4de8-bcee-73aff1e4e709",
"metadata": {},
"source": [
"#### Historical forecast"
"## Historical forecast"
]
},
{
"cell_type": "markdown",
"id": "2a790ca0-b995-4c1a-a0e4-4e5c8b8df9eb",
"metadata": {},
"source": [
"You can also compute prediction intervals for historical forecasts adding the `add_history=True` parameter as follows:"
"You can also compute prediction intervals for historical forecasts by adding the `add_history=True` argument.\n",
"\n",
"To specify the confidence interval, we use the `level` argument. Here, we pass the list `[80, 90]`. This will compute an 80% and a 90% confidence interval."
]
},
{
Expand Down Expand Up @@ -571,6 +613,20 @@
" level=[80, 90],\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a48df7da",
"metadata": {},
"source": [
"In the figure above, we now see the historical predictions made by TimeGPT for each series, along with the 80% and 90% confidence intervals."
]
},
{
"cell_type": "markdown",
"id": "c2adb0d1",
"metadata": {},
"source": []
}
],
"metadata": {
Expand Down
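
The long format that the multi-series notebook describes, with one `unique_id` label per row, can be sketched with the standard library alone. This is only an illustration of the data layout, not how TimeGPT detects series internally. The helper `split_by_unique_id`, the timestamps, and the prices are all hypothetical.

```python
from collections import defaultdict

# Long-format rows: (unique_id, ds, y). Timestamps and prices are made up.
rows = [
    ("BE", "2016-12-30 00:00:00", 44.30),
    ("BE", "2016-12-30 01:00:00", 44.30),
    ("DE", "2016-12-30 00:00:00", 27.80),
    ("DE", "2016-12-30 01:00:00", 25.10),
    ("FR", "2016-12-30 00:00:00", 50.10),
]


def split_by_unique_id(rows):
    """Group long-format rows into one list of (ds, y) pairs per series label."""
    series = defaultdict(list)
    for unique_id, ds, y in rows:
        series[unique_id].append((ds, y))
    return dict(series)


series = split_by_unique_id(rows)

# More than one distinct label means the data holds several series.
is_multi_series = len(series) > 1

print(sorted(series), is_multi_series)
```

Keeping every series in a single table with a label column is what allows one call to produce forecasts for all of them.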
61 changes: 52 additions & 9 deletions nbs/docs/tutorials/9_cross_validation.ipynb
Expand Up @@ -5,7 +5,7 @@
"id": "6de758ee-a0d2-4b3f-acff-eed419dd17c5",
"metadata": {},
"source": [
"# Cross Validation"
"# Cross-validation"
]
},
{
Expand Down Expand Up @@ -63,6 +63,14 @@
"load_dotenv()"
]
},
{
"cell_type": "markdown",
"id": "ca8110a6",
"metadata": {},
"source": [
"We start off by initializing an instance of `NixtlaClient`."
]
},
{
"cell_type": "code",
"execution_count": null,
Expand Down Expand Up @@ -98,12 +106,22 @@
"nixtla_client = NixtlaClient()"
]
},
{
"cell_type": "markdown",
"id": "fd57a883",
"metadata": {},
"source": [
"## Launching cross-validation"
]
},
{
"cell_type": "markdown",
"id": "937ccb60-8a1b-4a58-9111-d9fb9d8d727c",
"metadata": {},
"source": [
"The `cross_validation` method within the `TimeGPT` class is an advanced functionality crafted to perform systematic validation on time series forecasting models. This method necessitates a dataframe comprising time-ordered data and employs a rolling-window scheme to meticulously evaluate the model's performance across different time periods, thereby ensuring the model's reliability and stability over time. \n",
"The `cross_validation` method of the `NixtlaClient` class performs systematic validation of time series forecasts. It takes a dataframe of time-ordered data and uses a rolling-window scheme to evaluate the model's performance across different time periods, which gives a picture of its reliability and stability over time. The animation below shows how TimeGPT performs cross-validation.\n",
"\n",
"![](https://raw.githubusercontent.com/Nixtla/statsforecast/main/nbs/imgs/ChainedWindows.gif) \n",
"\n",
"Key parameters include `freq`, which denotes the data's frequency and is automatically inferred if not specified. The `id_col`, `time_col`, and `target_col` parameters designate the respective columns for each series' identifier, time step, and target values. The method offers customization through parameters like `n_windows`, indicating the number of separate time windows on which the model is assessed, and `step_size`, determining the gap between these windows. If `step_size` is unspecified, it defaults to the forecast horizon `h`. \n",
"\n",
Expand All @@ -120,6 +138,7 @@
"outputs": [],
"source": [
"pm_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/peyton_manning.csv')\n",
"\n",
"timegpt_cv_df = nixtla_client.cross_validation(\n",
" pm_df, \n",
" h=7, \n",
Expand Down Expand Up @@ -159,12 +178,20 @@
" display(fig)"
]
},
{
"cell_type": "markdown",
"id": "be475644",
"metadata": {},
"source": [
"## Cross-validation with prediction intervals"
]
},
{
"cell_type": "markdown",
"id": "c84e9a89-8de1-462f-a8d8-e45347031d23",
"metadata": {},
"source": [
"To asses the performance of `TimeGPT` with distributional forecasts, you can produce prediction intervals using the `level` argument."
"It is also possible to generate prediction intervals during cross-validation. To do so, we simply use the `level` argument."
]
},
{
Expand Down Expand Up @@ -206,12 +233,28 @@
" display(fig)"
]
},
{
"cell_type": "markdown",
"id": "72b8f68b",
"metadata": {},
"source": [
"## Cross-validation with exogenous variables"
]
},
{
"cell_type": "markdown",
"id": "5c27f048",
"metadata": {},
"source": [
"### Time features"
]
},
{
"cell_type": "markdown",
"id": "84388bb9-54c3-408e-bae2-46e39ffc3ee5",
"metadata": {},
"source": [
"You can also include `date_features` to see their impact in forecasting accuracy:"
"It is possible to include exogenous variables when performing cross-validation. Here we use the `date_features` parameter to create labels for each month. These features are then used by the model to make predictions during cross-validation."
]
},
{
Expand Down Expand Up @@ -256,18 +299,18 @@
},
{
"cell_type": "markdown",
"id": "b2cc956f-2a98-46be-922f-5fec1252c4e8",
"id": "4ca2ffe2",
"metadata": {},
"source": [
"#### Exogenous variables"
"### Dynamic features"
]
},
{
"cell_type": "markdown",
"id": "a95ea323-cd6d-43cb-aed1-f10cf23c5a61",
"metadata": {},
"source": [
"Additionally you can pass exogenous variables to better inform `TimeGPT` about the data. You just simply have to add the exogenous regressors after the target column."
"Additionally, you can pass dynamic exogenous variables to better inform `TimeGPT` about the data. You simply have to add the exogenous regressors after the target column."
]
},
{
Expand Down Expand Up @@ -330,9 +373,9 @@
"id": "77c8c469-bbb5-45ef-bd49-07bfdbc51b6b",
"metadata": {},
"source": [
"#### Compare different models\n",
"## Cross-validation with different TimeGPT instances\n",
"\n",
"Also, you can generate cross validation for different instances of `TimeGPT` using the `model` argument."
"Also, you can run cross-validation for different instances of `TimeGPT` using the `model` argument. Here we use the base model and the model for long-horizon forecasting."
]
},
{
Expand Down
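
The rolling-window scheme from the cross-validation notebook, with its `h`, `n_windows`, and `step_size` parameters, can be sketched as index arithmetic. This is a minimal sketch under stated assumptions, not the library's implementation: the helper `rolling_windows` is hypothetical, it assumes the last window ends at the final observation, and the series length of 100 and `n_windows=5` are made-up values.

```python
from typing import List, Optional, Tuple


def rolling_windows(
    n_obs: int, h: int, n_windows: int, step_size: Optional[int] = None
) -> List[Tuple[int, int]]:
    """Return (train_end, test_end) index pairs, earliest window first.

    Each window trains on observations [0, train_end) and is evaluated on
    [train_end, test_end), where test_end - train_end == h.
    """
    if step_size is None:
        step_size = h  # the notebook states step_size defaults to the horizon h
    needed = h + step_size * (n_windows - 1)
    if needed >= n_obs:
        raise ValueError("series too short for the requested windows")
    windows = []
    for i in range(n_windows):
        # Distance from the end of the series to the start of this test window.
        offset = h + step_size * (n_windows - 1 - i)
        train_end = n_obs - offset
        windows.append((train_end, train_end + h))
    return windows


# Default step_size == h gives back-to-back, non-overlapping test windows.
windows = rolling_windows(n_obs=100, h=7, n_windows=5)
# A smaller step_size makes consecutive test windows overlap.
overlapping = rolling_windows(n_obs=100, h=7, n_windows=3, step_size=1)

print(windows)
print(overlapping)
```

With `step_size` equal to `h`, every observation in the evaluation span is scored exactly once. A smaller `step_size` gives more windows over the same span at the cost of overlapping test sets.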
