New versions of how-to-guides for computing at scale #302

Merged
merged 16 commits into from
Apr 29, 2024
6 changes: 3 additions & 3 deletions README.md
@@ -34,7 +34,7 @@ With `Nixtla`, you can easily interact with TimeGPT through simple API calls, ma
Get `Nixtla` up and running with a simple pip command:

```python
pip install nixtla>=0.1.0
pip install "nixtla>=0.4.0"
```

## 🎈 Quick Start
@@ -45,11 +45,11 @@ Get started with TimeGPT now:
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv')

from nixtla import NixtlaClient
nixtla = NixtlaClient(
nixtla_client = NixtlaClient(
# defaults to os.environ.get("NIXTLA_API_KEY")
api_key = 'my_api_key_provided_by_nixtla'
)
fcst_df = nixtla.forecast(df, h=24, level=[80, 90])
fcst_df = nixtla_client.forecast(df, h=24, level=[80, 90])
```
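
The comment in the snippet above notes that `api_key` defaults to `os.environ.get("NIXTLA_API_KEY")`. A minimal sketch of that lookup order, using only the standard library, might look like the following. `resolve_api_key` is a hypothetical helper written for illustration and is not part of the `nixtla` package.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else fall back to NIXTLA_API_KEY.

    Hypothetical helper: it only illustrates the lookup order described
    in the README comment, not the SDK's internal implementation.
    """
    key = explicit_key or os.environ.get("NIXTLA_API_KEY")
    if not key:
        raise ValueError(
            "No API key found: pass one explicitly or set NIXTLA_API_KEY."
        )
    return key
```

With the key stored in the environment, the key itself never has to appear in source code, which is why the environment-variable route is the one the docs label as secure.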

![](./nbs/img/forecast_readme.png)
Binary file added nbs/assets/2_api_key_process.png
Binary file added nbs/assets/australia_hierarchy.png
Binary file added nbs/assets/australia_tourism.png
@@ -4,23 +4,32 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Setting Up Your Authentication API Key"
"# Setting up your API key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An **API key** is a unique string of characters that serves as a key to authenticate your requests to `TimeGTP`. This tutorial will explain how to set up your API key when using the Nixtla SDK. \n",
"Do you give everyone the keys to your house? Likely not. An **API key** is like a key to a house, and should be kept private. It is a unique string of characters that serves as a key to authenticate your requests to `TimeGPT`. This tutorial will explain how to set up your API key when using the Nixtla SDK. \n",
"\n",
"Upon [registration](https://dashboard.nixtla.io/), you will recibe an email asking you to confirm your signup. After confirming, you will receive access to your dashboard. There, under `API Keys`, you will find your API key. To integrate your API key into your development workflow with the Nixtla SDK, you have two methods. "
"Upon [registration](https://dashboard.nixtla.io/), you will receive an email asking you to confirm your signup. After confirming, you will receive access to your dashboard. There, under `API Keys`, you will find your API key.\n",
"\n",
"You can follow the API key configuration process detailed in this tutorial. A schematic is given below."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"../../assets/2_api_key_process.png?\" alt=\"Figure 1. API key set up process\" width=\"600\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Direct copy and paste \n",
"## 1. Insecure: direct copy and paste \n",
"\n",
"- **Step 1**: Copy the API key found in the `API Keys` section of your [dashboard](https://dashboard.nixtla.io/). \n",
"- **Step 2**: Instantiate the `NixtlaClient` class by directly pasting your API key into the code, as shown below:"
@@ -40,43 +49,26 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This approach is straightforward and best for quick tests or scripts that won’t be shared."
"This approach is straightforward and best for quick tests or scripts that won’t be shared.\n",
"\n",
"::: {.callout-important}\n",
"This approach is considered insecure, as your API key will be part of your source code.\n",
"::: "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Using an environment variable"
"## 2. Secure: using an environment variable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **Step 1:** Store your API key in an environment variable named `NIXTLA_API_KEY`. This can be done for a session or permanently, depending on your preference.\n",
"- **Step 2:** When you instantiate the `NixtlaClient` class, the SDK will automatically look for the `NIXTLA_API_KEY` environment variable and use it to authenticate your requests."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"from dotenv import load_dotenv\n",
"load_dotenv()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from nixtla import NixtlaClient\n",
"nixtla_client = NixtlaClient()"
"- **Step 1**: Store your API key in an environment variable named `NIXTLA_API_KEY`. This can be done (a) temporarily for a session or (b) permanently, depending on your preference.\n",
"- **Step 2**: When you instantiate the `NixtlaClient` class, the SDK will automatically look for the `NIXTLA_API_KEY` environment variable and use it to authenticate your requests."
]
},
{
@@ -92,28 +84,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"There are several ways to set an environment variable. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### a. From the Terminal\n",
"Use the `export` command to set `NIXTLA_API_KEY`. \n",
"### a. Temporary: From the Terminal\n",
"\n",
"This approach is useful if you are working from a terminal and need a temporary solution: the variable lasts only for the current terminal session. \n",
"\n",
"#### Linux / Mac\n",
"Open a terminal and use the `export` command to set `NIXTLA_API_KEY`. \n",
"\n",
"``` bash\n",
"export NIXTLA_API_KEY=your_api_key\n",
"```\n",
"\n",
"#### Windows\n",
"For Windows users, open a PowerShell window and assign `NIXTLA_API_KEY` using the `$env:` prefix. \n",
"``` powershell\n",
"$env:NIXTLA_API_KEY = 'your_api_key'\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### b. Using a `.env` file\n",
"### b. Permanent: Using a `.env` file\n",
"\n",
"For a more persistent solution that can be version-controlled if private, or for ease of use across different projects, place your API key in a `.env` file.\n",
"For a more persistent solution, place your API key in a `.env` file located in the same folder as your Python script.\n",
"\n",
"``` bash\n",
"# Inside a file named .env\n",
@@ -161,7 +156,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Validate your API key\n",
"## 3. Validate your API key\n",
"\n",
"You can always find your API key in the `API Keys` section of your dashboard. To check the status of your API key, use the `validate_api_key` method of the `NixtlaClient` class. This method will return `True` if the API key is valid and `False` otherwise. "
]
105 changes: 105 additions & 0 deletions nbs/docs/how-to-guides/0_computing_at_scale.ipynb
@@ -0,0 +1,105 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Computing at Scale with TimeGPT"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Handling large datasets is a common challenge in time series forecasting. For example, when working with retail data, you may have to forecast sales for thousands of products across hundreds of stores. Similarly, when dealing with electricity consumption data, you may need to predict consumption for thousands of households across various regions.\n",
"\n",
"Nixtla's `TimeGPT` enables you to use several distributed computing frameworks to manage large datasets efficiently. `TimeGPT` currently supports `Spark`, `Dask`, and `Ray` through `Fugue`.\n",
"\n",
"In this notebook, we will explain how to leverage these frameworks using `TimeGPT`. \n",
"\n",
"**Outline:**\n",
"1. [Getting Started](#1-getting-started)\n",
"2. [Forecasting at Scale](#2-forecasting-at-scale) \n",
"3. [Important Considerations](#3-important-considerations) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Getting Started \n",
"\n",
"To use `TimeGPT` with any of the supported distributed computing frameworks, you first need an API Key, just as you would when not using any distributed computing.\n",
"\n",
"Upon [registration](https://dashboard.nixtla.io/), you will receive an email asking you to confirm your signup. After confirming, you will receive access to your dashboard. There, under `API Keys`, you will find your API key. Next, you need to integrate your API key into your development workflow with the Nixtla SDK. For guidance on how to do this, please refer to the [Setting Up Your Authentication Key tutorial](https://docs.nixtla.io/docs/setting_up_your_authentication_api_key)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Forecasting at Scale "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using `TimeGPT` with any of the supported distributed computing frameworks is straightforward, as `TimeGPT` will read a `pandas` DataFrame and then use the corresponding framework. Thus, the usage is almost identical to the non-distributed case. \n",
"\n",
"1. Instantiate a `NixtlaClient` class.\n",
"2. Load your data as a `pandas` DataFrame.\n",
"3. Initialize the distributed computing framework. \n",
" - [Spark](https://docs.nixtla.io/docs/1_computing_at_scale_spark)\n",
" - [Dask](https://docs.nixtla.io/docs/2_computing_at_scale_dask)\n",
" - [Ray](https://docs.nixtla.io/docs/3_computing_at_scale_ray)\n",
"4. Use any of the `NixtlaClient` class methods.\n",
"5. Stop the distributed computing framework, if necessary. \n",
"\n",
"These are the general steps that you will need to follow to use `TimeGPT` with any of the supported distributed computing frameworks. For a detailed explanation and a complete example, please refer to the guide for the specific framework linked above."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"::: {.callout-important}\n",
"Parallelization in these frameworks is done along the various time series within your dataset. Therefore, it is essential that your dataset includes multiple time series, each with a unique id. \n",
":::"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Important Considerations "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### When to Use a Distributed Computing Framework\n",
"\n",
"Consider using a distributed computing framework if your dataset:\n",
"\n",
"- Consists of millions of observations over multiple time series.\n",
"- Is too large to fit into the memory of a single machine.\n",
"- Would be too slow to process on a single machine.\n",
"\n",
"### Choosing the Right Framework\n",
"\n",
"When selecting a distributed computing framework, take into account your existing infrastructure and the skill set of your team. Although `TimeGPT` can be used with any of the supported frameworks with minimal code changes, choosing the right one should align with your specific needs and resources. This will ensure that you leverage the full potential of `TimeGPT` while handling large datasets efficiently."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "python3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
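
The callout in the notebook above says that parallelization happens along the time series in the dataset, so each series needs a unique id. The standard-library sketch below illustrates that idea only: it splits long-format rows by an id field and processes each series independently. The `unique_id` and `y` field names, the `naive_forecast` placeholder model, and the thread pool are all assumptions made for illustration; this is not how `TimeGPT` or `Fugue` distribute work internally.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_series(rows, id_field="unique_id", value_field="y"):
    """Group observations by series id, keeping their original order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[id_field]].append(row[value_field])
    return dict(groups)


def naive_forecast(values, h):
    """Placeholder model: repeat the last observed value h times."""
    return [values[-1]] * h


def forecast_all(rows, h, max_workers=4):
    """Forecast every series independently, one task per series id."""
    series = split_by_series(rows)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with the ids.
        results = pool.map(lambda vals: naive_forecast(vals, h), series.values())
        return dict(zip(series.keys(), results))


# Hypothetical toy data: two series, rows already sorted by time.
rows = [
    {"unique_id": "store_1", "y": 10.0},
    {"unique_id": "store_1", "y": 12.0},
    {"unique_id": "store_2", "y": 5.0},
    {"unique_id": "store_2", "y": 7.0},
]
forecasts = forecast_all(rows, h=2)
```

If every row shared the same id, `split_by_series` would return a single group and there would be nothing to run in parallel, which is the reason the callout asks for multiple series with distinct ids.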