From 2540ebfa85c219fadf3e2951a4d4ae797a3595ff Mon Sep 17 00:00:00 2001 From: Chetan Thote <49151585+chetanthote@users.noreply.github.com> Date: Thu, 11 Jul 2024 19:18:27 +0530 Subject: [PATCH] Added notebooks for load data (#103) * Create load-CSV-data-S3 * Added notebooks for Load data sections of UI * Modified with suggested changes * Modified with suggested changes * Remove extra header --------- Co-authored-by: chetan thote Co-authored-by: Kevin D Smith --- authors/chetan-thote.toml | 4 + notebooks/load-csv-data-s3/meta.toml | 11 + notebooks/load-csv-data-s3/notebook.ipynb | 360 +++++++++++++++++++ notebooks/load-data-kakfa/meta.toml | 12 + notebooks/load-data-kakfa/notebook.ipynb | 404 ++++++++++++++++++++++ 5 files changed, 791 insertions(+) create mode 100644 authors/chetan-thote.toml create mode 100644 notebooks/load-csv-data-s3/meta.toml create mode 100644 notebooks/load-csv-data-s3/notebook.ipynb create mode 100644 notebooks/load-data-kakfa/meta.toml create mode 100644 notebooks/load-data-kakfa/notebook.ipynb diff --git a/authors/chetan-thote.toml b/authors/chetan-thote.toml new file mode 100644 index 00000000..e2519fd7 --- /dev/null +++ b/authors/chetan-thote.toml @@ -0,0 +1,4 @@ +name="Chetan Thote" +title="Product Team" +image="singlestore" +external=false diff --git a/notebooks/load-csv-data-s3/meta.toml b/notebooks/load-csv-data-s3/meta.toml new file mode 100644 index 00000000..1be7078c --- /dev/null +++ b/notebooks/load-csv-data-s3/meta.toml @@ -0,0 +1,11 @@ +[meta] +authors=["chetan-thote"] +title="Sales Data Analysis Dataset From Amazon S3" +description="""\ + The Sales Data Analysis use case demonstrates how to utilize Singlestore's powerful querying capabilities to analyze sales data stored in a CSV file.""" +difficulty="beginner" +tags=["starter", "loaddata", "s3"] +lesson_areas=["Ingest"] +icon="database" +destinations=["spaces"] +minimum_tier="free-shared" diff --git a/notebooks/load-csv-data-s3/notebook.ipynb 
b/notebooks/load-csv-data-s3/notebook.ipynb new file mode 100644 index 00000000..f570ff20 --- /dev/null +++ b/notebooks/load-csv-data-s3/notebook.ipynb @@ -0,0 +1,360 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "97f96c34-81a9-495a-a55d-c565695e87f0", + "metadata": {}, + "source": [ + "
\n", + "
\n", + " \n", + "
\n", + "
\n", + "
SingleStore Notebooks
\n", + "

Sales Data Analysis Dataset From Amazon S3

\n", + "
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "612bd378-f145-42f1-b8ce-32557a4c00cd", + "metadata": {}, + "source": [ + "
\n", + " \n", + "
\n", + "

Note

\n", + "

This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to Start using the left nav. You can also use your existing Standard or Premium workspace with this notebook.

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "481ce5ae-2ee0-4b63-b3f3-a4b53a5bc381", + "metadata": {}, + "source": [ + "The Sales Data Analysis use case demonstrates how to utilize Singlestore's powerful querying capabilities to analyze sales data stored in a CSV file. This demo showcases typical operations that businesses perform to gain insights from their sales data, such as calculating total sales, identifying top-selling products, and analyzing sales trends over time. By working through this example, new users will learn how to load CSV data into Singlestore, execute aggregate functions, and perform time-series analysis, which are essential skills for leveraging the full potential of Singlestore in a business intelligence context." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "72fe6854-5b6e-4b79-a2d0-79bda0e18429", + "metadata": {}, + "source": [ + "

Demo Flow

" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "5ed26ab8-1217-4fbd-be0c-4e7728314671", + "metadata": {}, + "source": [ + "" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "46fb95a8-1402-4b97-b04a-560741f96181", + "metadata": {}, + "source": [ + "## How to use this notebook" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "a701cd90-dd42-4a06-b7a1-e0a2132af558", + "metadata": {}, + "source": [ + "" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "2d22fd53-2c18-40e5-bb38-6d8ebc06f1b8", + "metadata": {}, + "source": [ + "## Create a database\n", + "\n", + "We need to create a database to work with in the following examples." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "1624ccea-0c15-4048-ab2a-fe2178e5912a", + "metadata": {}, + "outputs": [], + "source": [ + "shared_tier_check = %sql show variables like 'is_shared_tier'\n", + "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n", + " %sql DROP DATABASE IF EXISTS SalesAnalysis;\n", + " %sql CREATE DATABASE SalesAnalysis;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "901e6ec1-2530-497a-857e-7973bb9714f1", + "metadata": {}, + "source": [ + "

Create Table

" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "7ac4285d-0d2d-44ec-8b1e-eef7b4f9358c", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "CREATE TABLE `SalesData` (\n", + " `Date` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " `Store_ID` bigint(20) DEFAULT NULL,\n", + " `ProductID` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " `Product_Name` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " `Product_Category` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " `Quantity_Sold` bigint(20) DEFAULT NULL,\n", + " `Price` float DEFAULT NULL,\n", + " `Total_Sales` float DEFAULT NULL\n", + ")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "1de959eb-4f17-45d4-af74-42f45684d67b", + "metadata": {}, + "source": [ + "

Load Data Using Pipelines

" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "84f592b8-a12e-41d8-bff0-fe96175992b9", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "CREATE PIPELINE SalesData_Pipeline AS\n", + "LOAD DATA S3 's3://singlestoreloaddata/SalesData/sales_data.csv'\n", + "CONFIG '{ \\\"region\\\": \\\"ap-south-1\\\" }'\n", + "/*\n", + "CREDENTIALS '{\"aws_access_key_id\": \"\",\n", + " \"aws_secret_access_key\": \"\"}'\n", + " */\n", + "INTO TABLE SalesData\n", + "FIELDS TERMINATED BY ','\n", + "LINES TERMINATED BY '\\r\\n'\n", + "IGNORE 1 lines;\n", + "\n", + "\n", + "START PIPELINE SalesData_Pipeline;" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "352e340a-a613-4ec5-94a5-c4e1f3565757", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT * FROM SalesData LIMIT 10" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "4508d431-7683-4ac9-a4e8-d939c47dd1fc", + "metadata": {}, + "source": [ + "

Sample Queries

\n", + "\n", + "We will now run some analytical queries." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "55ac6134-976c-4f27-bc2b-140835b64f13", + "metadata": {}, + "source": [ + "Top-Selling Products" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d666c04b-ccb0-47cc-a1e7-efaa7a590d27", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT product_name, SUM(quantity_sold) AS total_quantity_sold FROM SalesData\n", + " GROUP BY product_name ORDER BY total_quantity_sold DESC LIMIT 5;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "87c36700-0db8-405f-97c0-e13a6a2ae0cb", + "metadata": {}, + "source": [ + "Sales Trends Over Time" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "b46d72c7-07a3-4e23-8fe4-c238b5517ef6", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT date, SUM(total_sales) AS total_sales FROM SalesData\n", + "GROUP BY date ORDER BY total_sales DESC LIMIT 5;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "e6c232a1-acce-4d25-aebd-1a89aafba47d", + "metadata": {}, + "source": [ + "Total Sales by Store" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "af571f6c-0145-4466-9ed7-000d37e4738f", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT Store_ID, SUM(total_sales) AS total_sales FROM SalesData\n", + "GROUP BY Store_ID ORDER BY total_sales DESC LIMIT 5;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "9bf1d7f3-c636-4ac0-b2be-e48eaca747ef", + "metadata": {}, + "source": [ + "Sales Contribution by Product (Percentage)" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "5613b3e8-72d2-48dc-a7ae-47911df24cd2", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT product_name, SUM(total_sales) * 100.0 / (SELECT SUM(total_sales) FROM SalesData) AS sales_percentage FROM SalesData\n", + " GROUP BY product_name ORDER 
BY sales_percentage DESC limit 5;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "afed201d-d9f2-49cc-8a14-df35103abd4e", + "metadata": {}, + "source": [ + "Top Days with Highest Sale" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "7fd8d785-7861-4570-88b3-0185c2c9c298", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT date, SUM(total_sales) AS total_sales FROM SalesData\n", + " GROUP BY date ORDER BY total_sales DESC LIMIT 5;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "6738b6e4-5e8b-45db-b3dc-ebcb73bcf629", + "metadata": {}, + "source": [ + "## Conclusion\n", + "\n", + "
\n", + " \n", + "
\n", + "

Action Required

\n", + "

If you created a new database in your Standard or Premium Workspace, you can drop the database by running the cell below. Note: this will not drop your database for Free Starter Workspaces. To drop a Free Starter Workspace, terminate the Workspace using the UI.

\n", + "
\n", + "
\n", + "\n", + "We have shown how to load data from Amazon S3 into SingleStoreDB using `Pipelines`. These techniques should enable you to\n", + "integrate your Amazon S3 data with SingleStoreDB." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "d5053a52-5579-4fea-9594-5250f6fcc289", + "metadata": {}, + "outputs": [], + "source": [ + "shared_tier_check = %sql show variables like 'is_shared_tier'\n", + "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n", + " %sql DROP DATABASE IF EXISTS SalesAnalysis;" + ] + }, + { + "cell_type": "markdown", + "id": "2dcc585a-43c2-4598-93bf-888143dd5e29", + "metadata": {}, + "source": [ + "
\n", + "
" + ] + } + ], + "metadata": { + "jupyterlab": { + "notebooks": { + "version_major": 6, + "version_minor": 4 + } + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/load-data-kakfa/meta.toml b/notebooks/load-data-kakfa/meta.toml new file mode 100644 index 00000000..7d895bee --- /dev/null +++ b/notebooks/load-data-kakfa/meta.toml @@ -0,0 +1,12 @@ +[meta] +authors=["chetan-thote"] +title="Real-Time Event Monitoring Dataset From Kafka" +description="""\ + The Real-Time Event Monitoring use case illustrates how to leverage Singlestore's capabilities to process and analyze streaming data from a Kafka data source. + """ +difficulty="beginner" +tags=["starter", "loaddata", "kafka"] +lesson_areas=["Ingest"] +icon="database" +destinations=["spaces"] +minimum_tier="free-shared" diff --git a/notebooks/load-data-kakfa/notebook.ipynb b/notebooks/load-data-kakfa/notebook.ipynb new file mode 100644 index 00000000..ac28ae67 --- /dev/null +++ b/notebooks/load-data-kakfa/notebook.ipynb @@ -0,0 +1,404 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "14762a67-4baa-493e-a182-89de7fcbbaf2", + "metadata": {}, + "source": [ + "
\n", + "
\n", + " \n", + "
\n", + "
\n", + "
SingleStore Notebooks
\n", + "

Real-Time Event Monitoring Dataset From Kafka

\n", + "
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "25c2b147-47cb-4755-8b8f-95c93cc9e35d", + "metadata": {}, + "source": [ + "
\n", + " \n", + "
\n", + "

Note

\n", + "

This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to Start using the left nav. You can also use your existing Standard or Premium workspace with this notebook.

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "ee90231c-d301-4d3b-a72e-99cf5338f0f5", + "metadata": {}, + "source": [ + "

Introduction

" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "f6f20e3f-c17a-4a11-b394-3b02b8fb5d31", + "metadata": {}, + "source": [ + "The Real-Time Event Monitoring use case illustrates how to leverage Singlestore's capabilities to process and analyze streaming data from a Kafka data source. This demo showcases the ability to ingest real-time events, such as application logs or user activities, and perform immediate analysis to gain actionable insights. By working through this example, new users will learn how to set up a Kafka data pipeline, ingest streaming data into Singlestore, and execute real-time queries to monitor event types, user activity patterns, and detect anomalies. This use case highlights the power of Singlestore in providing timely and relevant information for decision-making in dynamic environments." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "2d209d08-ee22-4cdd-81be-51d1f742cb91", + "metadata": {}, + "source": [ + "" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "a7bdf2ca-0ca0-4a67-b860-0df79df38878", + "metadata": {}, + "source": [ + "## How to use this notebook" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "63d529ea-4f84-4ffe-9c93-691e787b5613", + "metadata": {}, + "source": [ + "" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "5f963a4f-0eb0-4282-bc2f-f8bf48eef971", + "metadata": {}, + "source": [ + "## Create a database\n", + "\n", + "We need to create a database to work with in the following examples." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "8ccfe96a-05e7-4547-9df9-97e4ed6b3998", + "metadata": {}, + "outputs": [], + "source": [ + "shared_tier_check = %sql show variables like 'is_shared_tier'\n", + "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n", + " %sql DROP DATABASE IF EXISTS EventAnalysis;\n", + " %sql CREATE DATABASE EventAnalysis;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "a06e69b8-1e19-4ab6-b724-4bd32f235994", + "metadata": {}, + "source": [ + "
\n", + " \n", + "
\n", + "

Action Required

\n", + "

If you already have a Free Starter Workspace deployed, select the database from the drop-down menu at the top of this notebook. Doing so updates the connection_url to connect to that database.

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "8b5ffbab-62f7-4052-a415-c511b5deb7bf", + "metadata": {}, + "source": [ + "

Create Table

" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "f089b404-5907-4236-a05f-ad0e5bf8157a", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "CREATE TABLE `eventsdata` (\n", + " `user_id` varchar(120) DEFAULT NULL,\n", + " `event_name` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n", + " `advertiser` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n", + " `campaign` varchar(110) DEFAULT NULL,\n", + " `gender` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n", + " `income` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n", + " `page_url` varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n", + " `region` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n", + " `country` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL\n", + ")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "057f3cbf-7a49-4954-bd04-f8f42839dfc7", + "metadata": {}, + "source": [ + "

Load Data using Pipeline

" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "7a7163c9-0ca5-40a9-b503-811376e1af2b", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "CREATE PIPELINE `eventsdata`\n", + "AS LOAD DATA KAFKA 'public-kafka.memcompute.com:9092/ad_events'\n", + "BATCH_INTERVAL 2500\n", + "ENABLE OUT_OF_ORDER OPTIMIZATION\n", + "DISABLE OFFSETS METADATA GC\n", + "INTO TABLE `eventsdata`\n", + "FIELDS TERMINATED BY '\\t' ENCLOSED BY '' ESCAPED BY '\\\\'\n", + "LINES TERMINATED BY '\\n' STARTING BY ''\n", + "(\n", + " `events`.`user_id`,\n", + " `events`.`event_name`,\n", + " `events`.`advertiser`,\n", + " `events`.`campaign`,\n", + " `events`.`gender`,\n", + " `events`.`income`,\n", + " `events`.`page_url`,\n", + " `events`.`region`,\n", + " `events`.`country`\n", + ")\n", + "\n", + "START PIPELINE `eventsdata`" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "0b75627d-684c-4900-bb3c-1ec539ac3671", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT COUNT(*) FROM `eventsdata`" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "15366453-7483-4e4f-a67f-439b66dfb4f4", + "metadata": {}, + "source": [ + "

Sample Queries

" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "94c011f2-2662-4c12-b70b-e6601ed7bdca", + "metadata": {}, + "source": [ + "Events by Region" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "3195c978-7356-45ba-8864-832f75ec90c7", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT events.country\n", + "AS `events.country`,\n", + "COUNT(events.country) AS 'events.countofevents'\n", + "FROM eventsdata AS events\n", + "GROUP BY 1 ORDER BY 2 DESC LIMIT 5;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "0a2d68aa-1ea4-49a0-9cbe-04030e754342", + "metadata": {}, + "source": [ + "Events by Top 5 Advertisers" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "890ce930-ebbe-4415-861a-60820fbf631d", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT\n", + " events.advertiser AS `events.advertiser`,\n", + " COUNT(*) AS `events.count`\n", + "FROM eventsdata AS events\n", + "WHERE\n", + " (events.advertiser LIKE '%Subway%' OR events.advertiser LIKE '%McDonalds%' OR events.advertiser LIKE '%Starbucks%' OR events.advertiser LIKE '%Dollar General%' OR events.advertiser LIKE '%YUM! 
Brands%')\n", + "GROUP BY 1\n", + "ORDER BY 2 DESC;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "094a0e46-fbd9-440b-843d-ba5736e48a51", + "metadata": {}, + "source": [ + "Ad visitors by gender and income" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "270a21bd-7166-4f01-9ee0-8f77cc263a30", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SELECT * FROM (\n", + "SELECT *, DENSE_RANK() OVER (ORDER BY z___min_rank) as z___pivot_row_rank, RANK() OVER (PARTITION BY z__pivot_col_rank ORDER BY z___min_rank) as z__pivot_col_ordering, CASE WHEN z___min_rank = z___rank THEN 1 ELSE 0 END AS z__is_highest_ranked_cell FROM (\n", + "SELECT *, MIN(z___rank) OVER (PARTITION BY `events.income`) as z___min_rank FROM (\n", + "SELECT *, RANK() OVER (ORDER BY CASE WHEN z__pivot_col_rank=1 THEN (CASE WHEN `events.count` IS NOT NULL THEN 0 ELSE 1 END) ELSE 2 END, CASE WHEN z__pivot_col_rank=1 THEN `events.count` ELSE NULL END DESC, `events.count` DESC, z__pivot_col_rank, `events.income`) AS z___rank FROM (\n", + "SELECT *, DENSE_RANK() OVER (ORDER BY CASE WHEN `events.gender` IS NULL THEN 1 ELSE 0 END, `events.gender`) AS z__pivot_col_rank FROM (\n", + "SELECT\n", + " events.gender AS `events.gender`,\n", + " events.income AS `events.income`,\n", + " COUNT(*) AS `events.count`\n", + "FROM eventsdata AS events\n", + "WHERE\n", + " (events.income <> 'unknown' OR events.income IS NULL)\n", + "GROUP BY 1,2) ww\n", + ") bb WHERE z__pivot_col_rank <= 16384\n", + ") aa\n", + ") xx\n", + ") zz\n", + "WHERE (z__pivot_col_rank <= 50 OR z__is_highest_ranked_cell = 1) AND (z___pivot_row_rank <= 500 OR z__pivot_col_ordering = 1) ORDER BY z___pivot_row_rank;" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "8716cb1f-b1f4-4ec8-9f74-df48cc7b4154", + "metadata": {}, + "source": [ + "Pipeline will keep pushing data from the kafka topic. 
Once your data is loaded, you can stop the pipeline using the command below." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "35573b60-4d2c-4861-9fad-c53312993dd3", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "STOP PIPELINE eventsdata" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "30a9b5de-79d0-481c-99cb-7321cbad95d9", + "metadata": {}, + "source": [ + "## Conclusion\n", + "\n", + "
\n", + " \n", + "
\n", + "

Action Required

\n", + "

If you created a new database in your Standard or Premium Workspace, you can drop the database by running the cell below. Note: this will not drop your database for Free Starter Workspaces. To drop a Free Starter Workspace, terminate the Workspace using the UI.

\n", + "
\n", + "
\n", + "\n", + "We have shown how to connect to Kafka using `Pipelines` and insert data into SingleStoreDB. These techniques should enable you to\n", + "integrate your Kafka topics with SingleStoreDB." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "ac2472f8-bca5-419a-82e4-0e39ea328522", + "metadata": {}, + "source": [ + "Drop the pipeline using the command below." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "7486de45-9c10-43c4-9f0d-2b9d68671b22", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "DROP PIPELINE eventsdata" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "204475a5-9f22-4ec7-8a61-86e802c52055", + "metadata": {}, + "outputs": [], + "source": [ + "shared_tier_check = %sql show variables like 'is_shared_tier'\n", + "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n", + " %sql DROP DATABASE IF EXISTS EventAnalysis;" + ] + }, + { + "cell_type": "markdown", + "id": "330a667f-19e3-4af8-97d7-1d9d28cfe002", + "metadata": {}, + "source": [ + "
\n", + "
" + ] + } + ], + "metadata": { + "jupyterlab": { + "notebooks": { + "version_major": 6, + "version_minor": 4 + } + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}