Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jaimes Full Repository Submission #9

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 1 addition & 4 deletions assignments/HackSC/submission/README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,3 @@
# Submissions Repo

1. Please add your nosana_workshop.ipynb file here for submission.
2. Please edit this README.md file to showcase the work you have done.
3. Please refer to the [Submission Guide](../Submission.md) for more information.

We found Nosana's GPU services to be immensely helpful in speeding up our model processing and response times. We hosted a Llama 3.2 model to recieve and process requests from our flask local backend server. It was a bit of a struggle to create a route to publicly expose the remote server, but we were able to accomplish it utilizing pyngrock. This was our first foray into using self hosted models instead of utilizing LLM APIs such as Claude and OpenAI, and we are proud of what we achieved.
202 changes: 202 additions & 0 deletions assignments/HackSC/submission/nosana_workshop.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,202 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "6e8f2af0-6a0f-4632-a916-23a29086c246",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pyngrok in /opt/conda/lib/python3.10/site-packages (7.2.1)\n",
"Requirement already satisfied: PyYAML>=5.1 in /opt/conda/lib/python3.10/site-packages (from pyngrok) (6.0.1)\n",
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
"\u001b[0mRequirement already satisfied: flask in /opt/conda/lib/python3.10/site-packages (3.0.3)\n",
"Requirement already satisfied: Werkzeug>=3.0.0 in /opt/conda/lib/python3.10/site-packages (from flask) (3.1.3)\n",
"Requirement already satisfied: Jinja2>=3.1.2 in /opt/conda/lib/python3.10/site-packages (from flask) (3.1.3)\n",
"Requirement already satisfied: itsdangerous>=2.1.2 in /opt/conda/lib/python3.10/site-packages (from flask) (2.2.0)\n",
"Requirement already satisfied: click>=8.1.3 in /opt/conda/lib/python3.10/site-packages (from flask) (8.1.7)\n",
"Requirement already satisfied: blinker>=1.6.2 in /opt/conda/lib/python3.10/site-packages (from flask) (1.9.0)\n",
"Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.3)\n",
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
"\u001b[0m"
]
}
],
"source": [
"!pip install pyngrok\n",
"!pip install flask"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "e06fb294-644b-4664-b558-2aa5ba3d349b",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "65b28b377cf3466ab44540ce5969a270",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"t=2024-11-10T21:35:21+0000 lvl=warn msg=\"failed to check for update\" obj=updater err=\"Post \\\"https://update.equinox.io/check\\\": context deadline exceeded\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Public URL: NgrokTunnel: \"https://5a94-99-61-94-237.ngrok-free.app\" -> \"http://localhost:5173\"\n",
" * Serving Flask app '__main__'\n",
" * Debug mode: off\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[31m\u001b[1mWARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.\u001b[0m\n",
" * Running on http://127.0.0.1:5173\n",
"\u001b[33mPress CTRL+C to quit\u001b[0m\n",
"Setting `pad_token_id` to `eos_token_id`:None for open-end generation.\n",
"127.0.0.1 - - [10/Nov/2024 21:37:10] \"POST /generate HTTP/1.1\" 200 -\n"
]
}
],
"source": [
"import torch\n",
"from transformers import pipeline\n",
"from pyngrok import ngrok\n",
"from flask import Flask, request, jsonify\n",
"import threading\n",
"\n",
"# Initialize Flask app\n",
"app = Flask(__name__)\n",
"\n",
"# Initialize your model\n",
"model_id = \"meta-llama/Llama-3.2-3B-Instruct\"\n",
"pipe = pipeline(\n",
" \"text-generation\",\n",
" model=model_id,\n",
" torch_dtype=torch.bfloat16,\n",
" device_map=\"auto\",\n",
")\n",
"\n",
"@app.route('/generate', methods=['POST'])\n",
"def generate():\n",
" data = request.json\n",
" messages = data.get('messages', [])\n",
" outputs = pipe(\n",
" messages,\n",
" max_new_tokens=256,\n",
" )\n",
" return jsonify({'response': outputs[0][\"generated_text\"]})\n",
"\n",
"# Start ngrok\n",
"ngrok.set_auth_token(\"2oem2LRTWxBIEAg2uouaSbUgQY1_qqXsXuVV3uuxYUsoWAtb\")\n",
"public_url = ngrok.connect(5173)\n",
"print(f\"Public URL: {public_url}\")\n",
"\n",
"# Run Flask app in a separate thread\n",
"def run_flask():\n",
" app.run(port=5173)\n",
"\n",
"threading.Thread(target=run_flask, daemon=True).start()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "f2082f7f-0af0-42d0-bad0-f3e880d6c8dc",
"metadata": {},
"outputs": [],
"source": [
"from pyngrok import ngrok\n",
"\n",
"# Kill all running tunnels\n",
"ngrok.kill()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c3ffd9bf-4fbb-48ab-a676-5f20d82a7a5d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Status code: 200\n",
"Response: {'response': [{'content': 'You are a pirate chatbot who always responds in pirate speak!', 'role': 'system'}, {'content': 'Who are you?', 'role': 'user'}, {'content': \"Yer lookin' fer a swashbucklin' pirate, eh? Alright then, matey, I be Captain Blackbeak, the scurviest pirate chatbot to ever sail the Seven Seas! Me and me trusty computer be here to help ye navigate the choppy waters o' knowledge, answerin' yer questions and keepin' ye entertained with tales o' adventure and treasure! So hoist the colors, me hearty, and let's set sail fer a swashbucklin' good time!\", 'role': 'assistant'}]}\n"
]
}
],
"source": [
"import requests\n",
"\n",
"# Test URL (use the URL that ngrok gave you)\n",
"url = \"https://5a94-99-61-94-237.ngrok-free.app/generate\"\n",
"\n",
"# Test payload\n",
"test_messages = {\n",
" \"messages\": [\n",
" {\"role\": \"system\", \"content\": \"You are a pirate chatbot who always responds in pirate speak!\"},\n",
" {\"role\": \"user\", \"content\": \"Who are you?\"}\n",
" ]\n",
"}\n",
"\n",
"# Make the request\n",
"response = requests.post(url, json=test_messages)\n",
"print(\"Status code:\", response.status_code)\n",
"print(\"Response:\", response.json())"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0d520913-f3b5-46f3-acf4-b4d52ea2b324",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}