
snowBrain

Deploy with Vercel

Demo video: snowbrain_v2_wide.mp4

snowBrain is an open-source prototype that serves as your personal data analyst. It converses in SQL, remembers previous discussions, and even draws data visualizations for you.

This project blends Snowflake, LangChain, OpenAI, Pinecone, Next.js, and FastAPI, among other technologies, to reimagine the simplicity of SQL querying. Dive in and discover a new way of interacting with your data.

Tech Stack

Features

  • Snowflake to Vector Database: Automatic conversion of all Snowflake DDL into a vector database (a minimal sketch of this flow follows the list).
  • Conversational Memory: Maintain context and improve the quality of interactions.
  • Snowflake Integration: Integrate with your Snowflake schema for automatic SQL generation and visualization.
  • Pinecone Vector Database: Leverage Pinecone's managed vector database for efficient similarity search.
  • Secure Authentication: Employ Clerk.dev for secure and hassle-free user authentication.
  • Rate Limit Handling: Utilize Upstash Redis for managing rate limits.
  • FastAPI Backend: High-performance Python web framework for building the API.
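
The DDL-to-vector flow works roughly as follows: fetch each table's DDL from Snowflake, embed the text, and upsert the embeddings into Pinecone so the app can retrieve relevant schema context later. The sketch below is a minimal illustration only, not the repository's actual embed scripts; the index name, embedding model, schema name, and environment variable names are assumptions.

```python
# Minimal sketch of the Snowflake DDL -> embedding -> Pinecone flow.
# Not the repository's actual embed/ scripts; the index name, embedding
# model, schema name, and env var names are assumptions for illustration.
import os

import openai
import pinecone
import snowflake.connector

openai.api_key = os.environ["OPENAI_API_KEY"]
pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_ENVIRONMENT"],
)
index = pinecone.Index("snowbrain-ddl")  # hypothetical index name

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    database=os.environ["SNOWFLAKE_DATABASE"],
)
cur = conn.cursor()

# List the tables in a schema (schema name is a placeholder).
cur.execute("SHOW TABLES IN SCHEMA PUBLIC")
tables = [row[1] for row in cur.fetchall()]

for table in tables:
    # GET_DDL returns the CREATE statement for the table.
    cur.execute(f"SELECT GET_DDL('TABLE', '{table}')")
    ddl_text = cur.fetchone()[0]

    # Embed the DDL and upsert it so relevant schema context can be
    # retrieved at query time.
    embedding = openai.Embedding.create(
        model="text-embedding-ada-002", input=ddl_text
    )["data"][0]["embedding"]
    index.upsert(vectors=[(table, embedding, {"ddl": ddl_text})])
```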

Example Queries

snowBrain is designed to make complex data querying simple. Here are some example queries you can try (a sketch of how a question is turned into SQL follows the list):

  • Total revenue per product category: "Show me the total revenue for each product category."
  • Top customers by sales: "Who are the top 10 customers by sales?"
  • Average order value per region: "What is the average order value for each region?"
  • Order volume: "How many orders were placed last week?"
  • Product price listing: "Display the list of products with their prices."
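
Behind a question like these, the app retrieves the most relevant DDL from the vector database and asks the LLM to write Snowflake SQL against it. Below is a simplified sketch of that retrieve-then-generate flow, building on the DDL embeddings sketched above; it is not snowBrain's actual prompt or API, and the index name, model, and prompt wording are assumptions.

```python
# Simplified sketch of turning a natural-language question into SQL.
# Not snowBrain's actual retrieval or prompt code; the index name, model,
# and prompt wording are assumptions for illustration.
import os

import openai
import pinecone

openai.api_key = os.environ["OPENAI_API_KEY"]
pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_ENVIRONMENT"],
)
index = pinecone.Index("snowbrain-ddl")  # hypothetical index name

question = "Who are the top 10 customers by sales?"

# 1. Embed the question and retrieve the most similar DDL documents.
query_embedding = openai.Embedding.create(
    model="text-embedding-ada-002", input=question
)["data"][0]["embedding"]
results = index.query(vector=query_embedding, top_k=3, include_metadata=True)
schema_context = "\n\n".join(m["metadata"]["ddl"] for m in results["matches"])

# 2. Ask the model to write Snowflake SQL against that schema context.
completion = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": "You write Snowflake SQL. Use only these tables:\n"
            + schema_context,
        },
        {"role": "user", "content": question},
    ],
)
print(completion["choices"][0]["message"]["content"])
```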

Installation

Follow these steps to get snowBrain up and running in your local environment.

  1. Update Environment Variables

    Make sure to update the environment variables as necessary. Refer to the example provided:

    .env.example
  2. Auto-Fetch All Schema DDL

    First install the script's requirements:

    pip3 install -r embed/requirements.txt

    Then fetch the DDL by running:

    python3 embed/snowflake_ddl_fetcher.py
  3. Convert DDL Documents to Vector & Upload to Pinecone

    Use the following command to do this:

    python3 embed/embed.py
  4. Install Dependencies for the Code Plugin

    Navigate to the code plugin directory and install the necessary dependencies using Poetry:

    cd code-plugin && poetry install
  5. Deploy FastAPI to Modal Labs

    Run the following command to deploy the FastAPI app (make sure to add a secrets file in Modal Labs first):

    modal deploy main.py

    After deploying, store the endpoint and auth token in your environment variables (a quick way to verify the deployment is sketched after this list):

    MODAL_API_ENDPOINT=
    MODAL_AUTH_TOKEN=random_secret
  6. Install packages

    Install packages using the following command:

    bun install
  7. Run Locally

    Test the setup locally using the following command:

    bun run dev

    Then test the production build:

    bun run build
  8. Deploy to Vercel

    Finally, when you're ready, deploy the project to Vercel.
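
Once step 5 is complete, you can sanity-check the deployed endpoint before wiring it into the frontend. This is only a sketch: the route path, payload shape, and Authorization header format are assumptions, so check the FastAPI app you deployed in step 5 for the real contract.

```python
# Minimal sketch for sanity-checking the Modal deployment from step 5.
# The route, payload shape, and auth header format are assumptions; the
# actual FastAPI routes may differ.
import os

import requests

endpoint = os.environ["MODAL_API_ENDPOINT"]
token = os.environ["MODAL_AUTH_TOKEN"]

response = requests.post(
    f"{endpoint}/query",  # hypothetical route
    headers={"Authorization": f"Bearer {token}"},
    json={"question": "How many orders were placed last week?"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```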


Note: Vercel builds are automatically skipped for changes to the code-plugin and embed folders and README.md. You can additionally add an ignored-build-step command in Vercel's dashboard.


One-Click Deploy

Deploy with Vercel


Contributing

Contributions are welcome.


Credits

Thanks to @jaredpalmer, @shuding_, @shadcn, @thorwebdev