Unlock the power of accurate predictions and confidently navigate uncertainty, even under resource limitations. With TimeGPT, you can effortlessly access state-of-the-art models to make data-driven decisions. Whether you’re a bank forecasting market trends or a startup predicting product demand, TimeGPT democratizes access to cutting-edge predictive insights.
Introduction
Nixtla’s TimeGPT is a generative pre-trained forecasting model for time series data. TimeGPT can produce accurate forecasts for new time series without training, using only historical values as inputs. TimeGPT can be used across a plethora of tasks including demand forecasting, anomaly detection, financial forecasting, and more.

The TimeGPT model “reads” time series data much like the way humans read a sentence – from left to right. It looks at windows of past data, which we can think of as “tokens”, and predicts what comes next. This prediction is based on patterns the model identifies in past data and extrapolates into the future.

The API provides an interface to TimeGPT, allowing users to leverage its forecasting capabilities to predict future events. TimeGPT can also be used for other time series-related tasks, such as what-if scenarios, anomaly detection, and more.

Usage
import os

from nixtlats import TimeGPT

You can instantiate the TimeGPT class by providing your credentials.
You can test the validity of your token by calling the validate_token method:
timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
timegpt.validate_token()

INFO:nixtlats.timegpt:Happy Forecasting! :), If you have questions or need support, please email ops@nixtla.io

True

Now you can start making forecasts! Let’s import an example on the classic AirPassengers dataset. This dataset contains the monthly totals of international airline passengers between 1949 and 1960. First, let’s load the dataset and plot it.

Make sure the target variable column does not have missing or non-numeric values.

Do not include gaps/jumps in the datestamps (for the given frequency) between the first and last datestamps. The forecast function will not impute missing dates.

The format of the datestamp column should be readable by pandas (see this link for more details).
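For illustration, here is a minimal sketch of such a dataset built in memory (the first two years of the AirPassengers values; the column names timestamp and value are assumptions, not a fixed requirement):

```python
import pandas as pd

# In-memory stand-in for the AirPassengers data; column names are illustrative.
df = pd.DataFrame({
    'timestamp': pd.date_range('1949-01-01', periods=24, freq='MS'),
    'value': [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
              115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140],
})
print(df.head())
```

Any column names work, as long as you tell the forecast method which column holds the timestamps and which holds the target.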
Next, forecast the next 12 months using the SDK forecast method. Set the following parameters:

df: A pandas dataframe containing the time series data.

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.

TimeGPT-1 is currently optimized for short-horizon forecasting. While the forecast method will accept any positive horizon, the accuracy of the forecasts might degrade for long horizons. We are currently working to improve the accuracy on longer forecasts.
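A hedged sketch of such a forecast call (the input frame, column names, and token lookup are illustrative assumptions; the API call only runs when a token is available):

```python
import os
import pandas as pd

# Monthly series to forecast; column names 'timestamp'/'value' are illustrative.
df = pd.DataFrame({
    'timestamp': pd.date_range('1949-01-01', periods=24, freq='MS'),
    'value': range(100, 124),
})

# The forecast call requires the nixtlats package and a valid token,
# so it is guarded here.
if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    fcst_df = timegpt.forecast(
        df=df, h=12, freq='MS',
        time_col='timestamp', target_col='value',
    )
    print(fcst_df.head())
```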
Using DateTime index to infer frequency

The freq parameter, which indicates the time unit between consecutive data points, is particularly critical. Fortunately, you can pass a DataFrame with a DateTime index to the forecasting method, ensuring that your time series data is equipped with the necessary temporal features. By assigning a suitable freq parameter to the DateTime index of a DataFrame, you inform the model about the consistent interval between observations — be it days (‘D’), months (‘M’), or another suitable frequency.

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Inferred freq: MS
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
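The "Inferred freq: MS" message corresponds to what pandas itself can deduce from a regular DateTime index; a small standalone sketch:

```python
import pandas as pd

# pandas can infer the sampling frequency from a regular DatetimeIndex;
# 'MS' means month-start.
idx = pd.date_range('1949-01-01', periods=36, freq='MS')
print(pd.infer_freq(idx))  # prints: MS
```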
As long as Spark is installed and configured, TimeGPT will be able to use it. If executing on a distributed Spark cluster, make sure the nixtlats library is installed across all the workers.

Executing on Spark

To run the forecasts distributed on Spark, just pass in a Spark DataFrame instead.

Instantiate the TimeGPT class.

from nixtlats import TimeGPT

timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])

Use Spark as an engine.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/11/08 02:44:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/11/08 02:44:31 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Exogenous variables or external factors are crucial in time series forecasting as they provide additional information that might influence the prediction. These variables could include holiday markers, marketing spending, weather data, or any other external data that correlate with the time series data you are forecasting.

For example, if you’re forecasting ice cream sales, temperature data could serve as a useful exogenous variable. On hotter days, ice cream sales may increase.

To incorporate exogenous variables in TimeGPT, you’ll need to pair each point in your time series data with the corresponding external data.

To produce forecasts we have to add the future values of the exogenous variables. Let’s read this dataset. In this case we want to predict 24 steps ahead, therefore each unique_id will have 24 observations.
Anomaly detection in time series data plays a pivotal role in numerous sectors including finance, healthcare, security, and infrastructure. In essence, time series data represents a sequence of data points indexed (or listed or graphed) in time order, often with equal intervals. As systems and processes become increasingly digitized and interconnected, the need to monitor and ensure their normal behavior grows proportionally. Detecting anomalies can indicate potential problems, malfunctions, or even malicious activities. By promptly identifying these deviations from the expected pattern, organizations can take preemptive measures, optimize processes, or protect resources. TimeGPT includes the detect_anomalies method to detect anomalies automatically.

import os

import pandas as pd
from nixtlats import TimeGPT
The detect_anomalies method is designed to process a dataframe containing series and subsequently label each observation based on its anomalous nature. The method evaluates each observation of the input dataframe against its context within the series, using statistical measures to determine its likelihood of being an anomaly. By default, the method identifies anomalies based on a 99 percent prediction interval. Observations that fall outside this interval are considered anomalies. The resultant dataframe will feature an added label, anomaly, that is set to 1 for anomalous observations and 0 otherwise.
While the default behavior of the detect_anomalies method is to operate using a 99 percent prediction interval, users have the flexibility to adjust this threshold to their requirements. This is achieved by modifying the level argument. Decreasing the value of the level argument will result in a narrower prediction interval, subsequently identifying more observations as anomalies. See the next example.
Conversely, increasing the value will make prediction intervals larger, detecting fewer anomalies. This customization allows users to calibrate the sensitivity of the method to align with their specific use case, ensuring the most relevant and actionable insights are derived from the data.
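A guarded sketch of adjusting the sensitivity via level (the input frame and token lookup are illustrative assumptions; the columns follow the library defaults unique_id/ds/y):

```python
import os
import pandas as pd

# Long-format input with the default column names.
df = pd.DataFrame({
    'unique_id': ['series_1'] * 48,
    'ds': pd.date_range('2020-01-01', periods=48, freq='D'),
    'y': [float(i % 7) for i in range(48)],
})

if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    # level=90 narrows the interval relative to the default 99,
    # flagging more observations as anomalous.
    anomalies_df = timegpt.detect_anomalies(df, level=90)
    print(anomalies_df['anomaly'].value_counts())
```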
Additionally, you can pass exogenous variables to better inform TimeGPT about the data. Simply add the exogenous regressors after the target column.

import os

import pandas as pd
from nixtlats import TimeGPT
Let’s see an example of predicting day-ahead electricity prices. The following dataset contains the hourly electricity price (y column) for five markets in Europe and the US, identified by the unique_id column. The columns from Exogenous1 to day_6 are exogenous variables that TimeGPT will use to predict the prices.
To produce forecasts we also have to add the future values of the exogenous variables. Let’s read this dataset. In this case, we want to predict 24 steps ahead, therefore each unique_id will have 24 observations.
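A guarded sketch of passing future exogenous values via the X_df argument (the values, dates, and token lookup are illustrative assumptions):

```python
import os
import pandas as pd

# Historical data: target column y followed by the exogenous columns.
df = pd.DataFrame({
    'unique_id': ['BE'] * 48,
    'ds': pd.date_range('2020-01-01', periods=48, freq='H'),
    'y': [float(i % 24) for i in range(48)],
    'Exogenous1': [1.0] * 48,
})

# Future exogenous values: one row per forecasted step, no target column.
future_ex_df = pd.DataFrame({
    'unique_id': ['BE'] * 24,
    'ds': pd.date_range('2020-01-03', periods=24, freq='H'),
    'Exogenous1': [1.0] * 24,
})

if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    fcst_df = timegpt.forecast(df=df, X_df=future_ex_df, h=24)
    print(fcst_df.head())
```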
Fine-tuning is a powerful process for utilizing TimeGPT more effectively. Foundation models are pre-trained on vast amounts of data, capturing wide-ranging features and patterns. These models can then be specialized for specific contexts or domains. With fine-tuning, the model’s parameters are refined to forecast a new task, allowing it to tailor its vast pre-existing knowledge toward the requirements of the new data. Fine-tuning thus serves as a crucial bridge, linking TimeGPT’s broad capabilities to your task’s specificities.

Concretely, the process of fine-tuning consists of performing a certain number of training iterations on your input data, minimizing the forecasting error. The forecasts will then be produced with the updated model. To control the number of iterations, use the finetune_steps argument of the forecast method.

import os

import pandas as pd
from nixtlats import TimeGPT
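A guarded sketch of a fine-tuned forecast call (the input frame and token lookup are illustrative assumptions):

```python
import os
import pandas as pd

df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=60, freq='D'),
    'y': [float(i % 7) for i in range(60)],
})

if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    # finetune_steps=10: run 10 training iterations on this series
    # before producing the forecast.
    fcst_df = timegpt.forecast(df=df, h=7, finetune_steps=10)
    print(fcst_df.head())
```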
In this code, finetune_steps=10 means the model will go through 10 iterations of training on your time series data.

Keep in mind that fine-tuning can be a bit of trial and error. You might need to adjust the number of finetune_steps based on your specific needs and the complexity of your data. It’s recommended to monitor the model’s performance during fine-tuning and adjust as needed. Be aware that more finetune_steps may lead to longer training times and could potentially lead to overfitting if not managed properly.

Remember, fine-tuning is a powerful feature, but it should be used thoughtfully and carefully.
Our time series model offers a powerful feature that allows users to retrieve historical forecasts alongside the prospective predictions. This functionality is accessible through the forecast method by setting the add_history=True argument.

import os

import pandas as pd
from nixtlats import TimeGPT
Let’s add fitted values. When add_history is set to True, the output DataFrame will include not only the future forecasts determined by the h argument, but also the historical predictions. Currently, the historical forecasts are not affected by h, and have a fixed horizon depending on the frequency of the data. The historical forecasts are produced in a rolling-window fashion and concatenated.
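A guarded sketch of requesting fitted values (the input frame and token lookup are illustrative assumptions):

```python
import os
import pandas as pd

df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=120, freq='D'),
    'y': [float(i % 7) for i in range(120)],
})

if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    # add_history=True returns fitted (historical) forecasts
    # alongside the h future steps.
    fcst_df = timegpt.forecast(df=df, h=7, add_history=True)
    print(fcst_df.head())
```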
Let’s plot the results. This consolidated view of past and future predictions can be invaluable for understanding the model’s behavior and for evaluating its performance over time.
Please note, however, that the initial values of the series are not included in these historical forecasts. This is because our model, TimeGPT, requires a certain number of initial observations to generate reliable forecasts. Therefore, while interpreting the output, it’s important to be aware that the first few observations serve as the basis for the model’s predictions and are not themselves predicted values.
Calendar variables and special dates are one of the most common types of exogenous variables used in forecasting applications. They provide additional context on the current state of the time series, especially for window-based models such as TimeGPT-1. These variables often include adding information on each observation’s month, week, day, or hour. For example, in high-frequency hourly data, providing the current month of the year provides more context than the limited history available in the input window to improve the forecasts.

In this tutorial we will show how to add calendar variables automatically to a dataset using the date_features function.

import os

import pandas as pd
from nixtlats import TimeGPT
Given the predominant usage of calendar variables, we included automatic creation of common calendar variables in the forecast method as a pre-processing step. To automatically add calendar variables, use the date_features argument.
INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.

<AxesSubplot:ylabel='features'>

Here’s a breakdown of how the date_features parameter works:

date_features (bool or list of str or callable): This parameter specifies which date attributes to consider.

If set to True, the model will automatically add the most common date features related to the frequency of the given dataframe (df). For a daily frequency, this could include features like day of the week, month, and year.

If provided a list of strings, it will consider those specific date attributes. For example, date_features=['weekday', 'month'] will only add the day of the week and month as features.

If provided a callable, it should be a function that takes dates as input and returns the desired feature. This gives flexibility in computing custom date features.

date_features_to_one_hot (bool or list of str): After determining the date features, one might want to one-hot encode them, especially if they are categorical in nature (like weekdays). One-hot encoding transforms these categorical features into a binary matrix, making them more suitable for many machine learning algorithms.

If date_features=True, then by default, all computed date features will be one-hot encoded.

If provided a list of strings, only those specific date features will be one-hot encoded.

By leveraging the date_features and date_features_to_one_hot parameters, one can efficiently incorporate the temporal effects of date attributes into their forecasting model, potentially enhancing its accuracy and interpretability.
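To make this concrete, here is a small pandas-only sketch of what computing and one-hot encoding weekday/month features amounts to (independent of TimeGPT itself; the dates are illustrative):

```python
import pandas as pd

# date_features=['weekday', 'month'] boils down to pandas date attributes.
dates = pd.DatetimeIndex(['2023-01-02', '2023-01-03', '2023-01-04'])
features = pd.DataFrame({'weekday': dates.weekday, 'month': dates.month})

# date_features_to_one_hot turns categorical features into a binary matrix,
# similar to pandas get_dummies.
one_hot = pd.get_dummies(features, columns=['weekday'])
print(one_hot)
```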
The first step is to fetch your time series data. The data must include timestamps and the associated values. For instance, you might be working with stock prices, and your data could look something like the following. In this example we use OpenBB.
Note that this dataset has irregular timestamps. The dayofweek attribute from pandas’ DatetimeIndex returns the day of the week with Monday=0, Sunday=6. So, checking if dayofweek > 4 is essentially checking if the date falls on a Saturday (5) or Sunday (6), which are typically non-business days (weekends).

(pltr_df['date'].dt.dayofweek > 4).sum()

0

As we can see the timestamp is irregular. Let’s inspect the Close series.
To forecast this data, you can use our forecast method. Importantly, remember to specify the frequency of the data using the freq argument. In this case, it would be ‘B’ for business days. We also need to define the time_col to select the index of the series (by default ds), and the target_col to forecast our target variable. In this case we will forecast Close:
INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.

fcst_pltr_df.head()

         date    TimeGPT
0  2023-09-25  14.365891
1  2023-09-26  14.460796
2  2023-09-27  14.413015
3  2023-09-28  14.488708
4  2023-09-29  14.470786

Remember, for business days, the frequency is ‘B’. For other frequencies, you can refer to the pandas offset aliases documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases.

By specifying the frequency, you’re helping the forecast method better understand the pattern in your data, resulting in more accurate and reliable forecasts.
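As a quick sanity check of the ‘B’ alias, a date range built with freq='B' skips weekends (dates here are illustrative):

```python
import pandas as pd

# Business-day frequency: Mon 2023-09-25 through Fri 2023-09-29,
# then the following Monday.
bdays = pd.date_range('2023-09-25', periods=6, freq='B')
print(bdays)
# No Saturdays (5) or Sundays (6) appear in the index.
assert (bdays.dayofweek <= 4).all()
```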
INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
INFO:nixtlats.timegpt:Calling Historical Forecast Endpoint...

INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.

But what if we want to predict all the time series at once? We can do that by reshaping our dataframe. Currently, the dataframe is in wide format (each series is a column), but we need it in long format (the series stacked on top of each other). We can do it with:
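The reshape can be done with pandas melt; a sketch with illustrative column and series names:

```python
import pandas as pd

# Wide format: one column per series.
wide_df = pd.DataFrame({
    'ds': pd.date_range('2023-01-01', periods=3, freq='D'),
    'series_a': [1.0, 2.0, 3.0],
    'series_b': [4.0, 5.0, 6.0],
})

# Long format: series stacked, identified by a unique_id column.
long_df = wide_df.melt(id_vars='ds', var_name='unique_id', value_name='y')
print(long_df)
```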
INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
In time series forecasting, the variables that we predict are often influenced not just by their past values, but also by other factors or variables. These external variables, known as exogenous variables, can provide vital additional context that can significantly improve the accuracy of our forecasts. One such factor, and the focus of this tutorial, is the company’s revenue. Revenue figures can provide a key indicator of a company’s financial health and growth potential, both of which can heavily influence its stock price. We can obtain this data from OpenBB.
The first thing we observe in our dataset is that we have information available only up until the end of the first quarter of 2023. Our data is represented in a quarterly frequency, and our goal is to leverage this information to forecast the daily stock prices for the next 14 days beyond this date.

However, to accurately compute such a forecast that includes the revenue as an exogenous variable, we need to have an understanding of the future values of the revenue. This is critical because these future revenue values can significantly influence the stock price.

Since we’re aiming to predict 14 daily stock prices, we only need to forecast the revenue for the upcoming quarter. This approach allows us to create a cohesive forecasting pipeline where the output of one forecast (revenue) is used as an input to another (stock price), thereby leveraging all available information for the most accurate predictions possible.

Continuing from where we left off, the next crucial step in our forecasting pipeline is to adjust the frequency of our data to match the stock prices’ frequency, which is represented on a business day basis. To accomplish this, we need to resample both the historical and future forecasted revenue data.

IMPORTANT NOTE: It’s crucial to highlight that in this process, we are assigning the same revenue value to all days within the given quarter. This simplification is necessary due to the disparity in granularity between quarterly revenue data and daily stock price data. However, it’s vital to treat this assumption with caution in practical applications. The impact of quarterly revenue figures on daily stock prices can vary significantly within the quarter based on a range of factors, including changing market expectations, other financial news, and events. In this tutorial, we use this assumption to illustrate the process of incorporating exogenous variables into our forecasting model, but in real-world scenarios, a more nuanced approach may be needed, depending on the available data and the specific use case.
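The quarterly-to-business-day resampling can be sketched with pandas (values and dates are illustrative; each day inherits its quarter’s revenue via forward fill):

```python
import pandas as pd

# Quarterly revenue, indexed at each quarter's start.
revenue = pd.Series(
    [100.0, 120.0],
    index=pd.PeriodIndex(['2023Q1', '2023Q2'], freq='Q').to_timestamp(),
)

# Reindex onto business days, forward-filling the quarterly value.
bdays = pd.date_range('2023-01-01', '2023-06-30', freq='B')
daily_revenue = revenue.reindex(bdays, method='ffill')
print(daily_revenue.head())
```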
And then we can pass the future revenue in the forecast method using the X_df argument. Since the revenue is in the historic dataframe, that information will be used in the model.
INFO:nixtlats.timegpt:Validating inputs...
INFO:nixtlats.timegpt:Preprocessing dataframes...
INFO:nixtlats.timegpt:Calling Forecast Endpoint...
WARNING:nixtlats.timegpt:The specified horizon "h" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
TimeGPT provides a robust solution for multi-series forecasting, which involves analyzing multiple data series concurrently, rather than a single one. The tool can be fine-tuned using a broad collection of series, enabling you to tailor the model to suit your specific needs or tasks.

import os

import pandas as pd
from nixtlats import TimeGPT
The following dataset contains prices of different electricity markets. Let’s see how we can forecast them. The main argument of the forecast method is the input data frame with the historical values of the time series you want to forecast. This data frame can contain information from many time series. Use the unique_id column to identify the different time series of your dataset.
Prediction intervals provide a measure of the uncertainty in the forecasted values. In time series forecasting, a prediction interval gives an estimated range within which a future observation will fall, based on the level of confidence or uncertainty you set. This level of uncertainty is crucial for making informed decisions, risk assessments, and planning.

For instance, a 95% prediction interval means that 95 out of 100 times, the actual future value will fall within the estimated range. Therefore, a wider interval indicates greater uncertainty about the forecast, while a narrower interval suggests higher confidence.

When using TimeGPT for time series forecasting, you have the option to set the level of prediction intervals according to your requirements. TimeGPT uses conformal prediction to calibrate the intervals.

import os

import pandas as pd
from nixtlats import TimeGPT
When using TimeGPT for time series forecasting, you can set the level (or levels) of prediction intervals according to your requirements. Here’s how you could do it:
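A guarded sketch of requesting intervals via the level argument (the input frame and token lookup are illustrative assumptions):

```python
import os
import pandas as pd

df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=60, freq='D'),
    'y': [float(i % 7) for i in range(60)],
})

if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    # Request 80% and 90% prediction intervals alongside the point forecast.
    fcst_df = timegpt.forecast(df=df, h=7, level=[80, 90])
    print(fcst_df.head())
```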
It’s essential to note that the choice of prediction interval level depends on your specific use case. For high-stakes predictions, you might want a wider interval to account for more uncertainty. For less critical forecasts, a narrower interval might be acceptable.

Historical Forecast

You can also compute prediction intervals for historical forecasts by adding the add_history=True parameter as follows:
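A guarded sketch (the data frame and token lookup are illustrative assumptions):

```python
import os
import pandas as pd

df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=120, freq='D'),
    'y': [float(i % 7) for i in range(120)],
})

if os.environ.get('TIMEGPT_TOKEN'):
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    # Prediction intervals for the fitted (historical) forecasts as well.
    fcst_df = timegpt.forecast(df=df, h=7, level=[90], add_history=True)
    print(fcst_df.tail())
```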
TimeGPT, developed by Nixtla, is a generative pre-trained transformer model specialized in prediction tasks. TimeGPT was trained on the largest collection of data in history – over 100 billion rows of financial, weather, energy, and web data – and democratizes the power of time-series analysis. This tool is capable of discerning patterns and predicting future data points in a matter of seconds.
Install

pip install nixtlats

How to use

Just import the library, set your credentials, and start forecasting in two lines of code!
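A guarded sketch of the two-line pattern (the data frame and token lookup are illustrative assumptions):

```python
import os
import pandas as pd

# Minimal series with the library's default column names.
df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=30, freq='D'),
    'y': [float(i) for i in range(30)],
})

if os.environ.get('TIMEGPT_TOKEN'):
    # The two lines: instantiate, then forecast.
    from nixtlats import TimeGPT
    timegpt = TimeGPT(token=os.environ['TIMEGPT_TOKEN'])
    fcst_df = timegpt.forecast(df=df, h=7)
    print(fcst_df.head())
```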
Constructs all the necessary attributes for the TimeGPT object.
token (str): The authorization token to interact with the TimeGPT API.

environment (Optional, default None): Custom environment. Pass only if provided.

max_retries (int, default 6): The maximum number of attempts to make when calling the API before giving up. It defines how many times the client will retry the API call if it fails. Default value is 6, indicating the client will attempt the API call up to 6 times in total.

retry_interval (int, default 10): The interval in seconds between consecutive retry attempts. This is the waiting period before the client tries to call the API again after a failed attempt. Default value is 10 seconds, meaning the client waits for 10 seconds between retries.

max_wait_time (int, default 360): The maximum total time in seconds that the client will spend on all retry attempts before giving up. This sets an upper limit on the cumulative waiting time for all retry attempts. If this time is exceeded, the client will stop retrying and raise an exception. Default value is 360 seconds, meaning the client will cease retrying if the total time spent on retries exceeds 360 seconds. The client throws a ReadTimeout error after 60 seconds of inactivity. If you want to catch these errors, use max_wait_time >> 60.

TimeGPT.validate_token

TimeGPT.validate_token (log:bool=True)

Returns True if your token is valid.

Now you can start to make forecasts! Let’s import an example:
df (DataFrame): The DataFrame on which the function will operate. Expected to contain at least the following columns: - time_col: Column name in df that contains the time indices of the time series. This is typically a datetime column with regular intervals, e.g., hourly, daily, monthly data points. - target_col: Column name in df that contains the target variable of the time series, i.e., the variable we wish to predict or analyze. Additionally, you can pass multiple time series (stacked in the dataframe) considering an additional column: - id_col: Column name in df that identifies unique time series. Each unique value in this column corresponds to a unique time series.

forecasts_df (Optional, default None): DataFrame with columns [unique_id, ds] and models.

id_col (str, default unique_id): Column that identifies each series.

time_col (str, default ds): Column that identifies each timestep, its values can be timestamps or integers.

target_col (str, default y): Column that contains the target.

unique_ids (Union, default None): Time series to plot. If None, time series are selected randomly.

plot_random (bool, default True): Select time series to plot randomly.

models (Optional, default None): List of models to plot.

level (Optional, default None): List of prediction intervals to plot if passed.

max_insample_length (Optional, default None): Max number of train/insample observations to be plotted.

plot_anomalies (bool, default False): Plot anomalies for each prediction interval.

engine (str, default matplotlib): Library used to plot. ‘plotly’, ‘plotly-resampler’ or ‘matplotlib’.

resampler_kwargs (Optional, default None): Kwargs to be passed to the plotly-resampler constructor. For further customization (“show_dash”) call the method, store the plotting object and add the extra arguments to its show_dash method.
df (DataFrame): The DataFrame on which the function will operate. Expected to contain at least the following columns: - time_col: Column name in df that contains the time indices of the time series. This is typically a datetime column with regular intervals, e.g., hourly, daily, monthly data points. - target_col: Column name in df that contains the target variable of the time series, i.e., the variable we wish to predict or analyze. Additionally, you can pass multiple time series (stacked in the dataframe) considering an additional column: - id_col: Column name in df that identifies unique time series. Each unique value in this column corresponds to a unique time series.

time_col (str, default ds): Column that identifies each timestep, its values can be timestamps or integers.

target_col (str, default y): Column that contains the target.

X_df (Optional, default None): DataFrame with [unique_id, ds] columns and df’s future exogenous.

level (Optional, default None): Confidence levels between 0 and 100 for prediction intervals.

finetune_steps (int, default 0): Number of steps used to finetune TimeGPT in the new data.

clean_ex_first (bool, default True): Clean exogenous signal before making forecasts using TimeGPT.

validate_token (bool, default False): If True, validates token before sending requests.

add_history (bool, default False): Return fitted values of the model.

date_features (Union, default False): Features computed from the dates. Can be pandas date attributes or functions that will take the dates as input. If True automatically adds most used date features for the frequency of df.

model (str, default timegpt-1): Model to use as a string. Options are: timegpt-1, and timegpt-1-long-horizon. We recommend using timegpt-1-long-horizon for forecasting if you want to predict more than one seasonal period given the frequency of your data.

date_features_to_one_hot (Union, default True): Apply one-hot encoding to these date features. If date_features=True, then all date features are one-hot encoded by default.

num_partitions (Optional, default None): Number of partitions to use. Only used in distributed environments (spark, ray, dask). If None, the number of partitions will be equal to the available parallel resources.

Returns (pandas.DataFrame): DataFrame with TimeGPT forecasts for point predictions and probabilistic predictions (if level is not None).
Detect anomalies in your time series using TimeGPT.

df (DataFrame): The DataFrame on which the function will operate. Expected to contain at least the following columns: - time_col: Column name in df that contains the time indices of the time series. This is typically a datetime column with regular intervals, e.g., hourly, daily, monthly data points. - target_col: Column name in df that contains the target variable of the time series, i.e., the variable we wish to predict or analyze. Additionally, you can pass multiple time series (stacked in the dataframe) considering an additional column: - id_col: Column name in df that identifies unique time series. Each unique value in this column corresponds to a unique time series.

time_col (str, default ds): Column that identifies each timestep, its values can be timestamps or integers.

target_col (str, default y): Column that contains the target.

level (Union, default 99): Confidence level between 0 and 100 for detecting the anomalies.

clean_ex_first (bool, default True): Clean exogenous signal before making forecasts using TimeGPT.

validate_token (bool, default False): If True, validates token before sending requests.

date_features (Union, default False): Features computed from the dates. Can be pandas date attributes or functions that will take the dates as input. If True automatically adds most used date features for the frequency of df.

date_features_to_one_hot (Union, default True): Apply one-hot encoding to these date features. If date_features=True, then all date features are one-hot encoded by default.

model (str, default timegpt-1): Model to use as a string. Options are: timegpt-1, and timegpt-1-long-horizon. We recommend using timegpt-1-long-horizon for forecasting if you want to predict more than one seasonal period given the frequency of your data.

Returns (pandas.DataFrame): DataFrame with anomalies flagged with 1 detected by TimeGPT.