This repository demonstrates how to use Google's Gemini models with features such as function calling and imported prompt-engineered models to build powerful, interactive AI agents.
The codebase is modular and customizable, allowing you to adapt the AI's behavior and context by modifying key components such as prompts and chat history.
- Gemini Model Integration:
  - Learn how to connect to and interact with Google's Gemini model API.
  - Use any `gemini` model to generate responses and chat with your AI agent.
- Function Calling with LLMs:
  - Explore the "function calling" feature to execute predefined actions based on the AI's reasoning.
- Prompt Engineering:
  - Understand how to guide the AI's behavior with carefully designed prompts.
  - Use the `prompts.py` file to set instructions, available actions, and examples for the AI agent.
- Importing Prompt-Engineered Models:
  - Learn how to retrieve and use a prompt-engineered, trained model from Google AI Studio.
  - Save and reuse chat history or prompts for advanced customization using `content/model_content.py`.
The repository is organized into modular files to make it easy to customize and extend the AI agent.
You can use this repo to build an AI agent for any use case. Below are details about each file and how you can edit it for your use.
- `config.ini`:
  - Add `GEMINI_API_KEY`; you can get one from https://aistudio.google.com.
  - Add `MODEL_NAME`, e.g. `gemini-1.5-flash` or `gemini-1.5-pro`.
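  A minimal `config.ini` might look like the following (the `[DEFAULT]` section name is an assumption; match whatever section `main.py` actually reads):

  ```ini
  [DEFAULT]
  GEMINI_API_KEY = your-api-key-here
  MODEL_NAME = gemini-1.5-flash
  ```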
- `model_content.py` (under `content/`):
  - Stores raw prompt/chat history or data for prompt-engineered models, useful for initializing a chat context.
  - If you have a prompt-trained Gemini model, provide the `content` and `generation_config_b64` values from the "Get Code" option in Google AI Studio.
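  A hypothetical sketch of what `content/model_content.py` could contain (the values shown are placeholders, not real exported data):

  ```python
  # content/model_content.py -- illustrative placeholders; paste the real values
  # exported via the "Get Code" option in Google AI Studio.

  # Chat history used to seed the prompt-engineered model (structure assumed).
  content = [
      {"role": "user", "parts": ["You are a helpful URL-monitoring assistant."]},
      {"role": "model", "parts": ["Understood. Send me a URL and I will check it."]},
  ]

  # Base64-encoded generation config exported from AI Studio (placeholder value).
  generation_config_b64 = "eyJ0ZW1wZXJhdHVyZSI6IDAuN30="  # decodes to {"temperature": 0.7}
  ```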
- `prompts.py`:
  - Contains the system's instructions and examples for interaction.
  - Edit `system_prompt` to define the AI's role, working style, and available actions.
  - Update `user_prompt` to set the initial query for the AI.
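  A shortened sketch of what the two prompts might look like (the repository's exact wording differs; the JSON shape matches the example output below):

  ```python
  # prompts.py -- illustrative sketch, not the repository's exact prompts.

  system_prompt = """
  You run in a loop of Thought, Action, PAUSE, Action_Response.
  Use Thought to reason about the user's question.
  Use Action to call one of the available functions, then output PAUSE and wait.
  Action_Response will contain the result of the function call.

  Available actions:
  get_response_time: returns the response time of a website in seconds.

  Example action:
  {
    "function_name": "get_response_time",
    "function_parms": {"url": "example.com"}
  }
  """.strip()

  user_prompt = "what is the response time of url 'github.com/WaizKhan7'?"
  ```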
- `actions.py`:
  - Contains implementations for the available actions (e.g., `get_response_time`).
  - Extend this file to add more actions and functionality.
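  One possible implementation of the example action (assuming the `requests` package is available; the repository's version may differ):

  ```python
  # actions.py -- illustrative implementation of get_response_time.
  import requests


  def get_response_time(url: str) -> float:
      """Return the response time of a website in seconds."""
      if not url.startswith(("http://", "https://")):
          url = "https://" + url  # allow bare domains like 'github.com/WaizKhan7'
      response = requests.get(url, timeout=10)
      return round(response.elapsed.total_seconds(), 2)
  ```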
- `main.py`:
  - Demonstrates how to use the Gemini model to process user queries and execute actions via function calling.
  - Includes a structured loop where the AI thinks, acts, pauses for results, and provides an answer.
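  A minimal sketch of that Thought / Action / PAUSE / Answer loop using the `google-generativeai` library (names, config keys, and the JSON-extraction step are assumptions, not the repository's exact code):

  ```python
  # Minimal agent-loop sketch; see main.py for the full implementation.
  import configparser
  import json
  import re

  import google.generativeai as genai
  from actions import get_response_time
  from prompts import system_prompt, user_prompt

  config = configparser.ConfigParser()
  config.read("config.ini")
  genai.configure(api_key=config["DEFAULT"]["GEMINI_API_KEY"])

  model = genai.GenerativeModel(config["DEFAULT"]["MODEL_NAME"],
                                system_instruction=system_prompt)
  chat = model.start_chat(history=[])
  available_actions = {"get_response_time": get_response_time}

  message = user_prompt
  for loop in range(1, 6):
      reply = chat.send_message(message).text
      print(f"Loop: {loop}\n{reply}")
      if "Answer:" in reply:               # model produced its final answer
          break
      match = re.search(r"\{.*\}", reply, re.DOTALL)   # grab the JSON action block
      if not match:
          break
      call = json.loads(match.group(1))    # {"function_name": ..., "function_parms": {...}}
      result = available_actions[call["function_name"]](**call["function_parms"])
      message = f"Action_Response: {result}"
  ```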
- `try_prompt_trained_gemini.py`:
  - Shows how to load and use a trained prompt-engineered model.
  - Implements functionality to reuse trained models for generating consistent and improved responses.
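  A rough sketch of how such a script could reuse the saved values from `content/model_content.py` (an assumed approach, not the repository's exact code):

  ```python
  # Sketch: reuse the prompt-engineered chat exported from Google AI Studio.
  import base64
  import configparser
  import json

  import google.generativeai as genai
  from content.model_content import content, generation_config_b64

  config = configparser.ConfigParser()
  config.read("config.ini")
  genai.configure(api_key=config["DEFAULT"]["GEMINI_API_KEY"])

  # Decode the base64-encoded generation config exported from AI Studio.
  generation_config = json.loads(base64.b64decode(generation_config_b64))

  model = genai.GenerativeModel(config["DEFAULT"]["MODEL_NAME"],
                                generation_config=generation_config)
  chat = model.start_chat(history=content)   # seed the chat with the saved history
  print(chat.send_message("what is the response time of url 'github.com/WaizKhan7'?").text)
  ```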
- Python 3.10 or higher.
- A Gemini API key to use the `google-generativeai` library.
- Install the required Python package by running:

  ```bash
  pip install "google-generativeai>=0.8.2"
  ```
- Clone the repository:

  ```bash
  git clone https://github.com/WaizKhan7/AI-agent.git
  cd AI-agent
  ```

- Set `GEMINI_API_KEY` and the `MODEL_NAME` you want to use, e.g. `gemini-1.5-flash` or `gemini-1.5-pro`, in `config.ini`.
- Run `main.py` to interact with the AI agent:

  ```bash
  python3 main.py
  ```

- To use a trained prompt-engineered model, set up the `content/model_content.py` file and run:

  ```bash
  python3 try_prompt_trained_gemini.py
  ```
User: what is the response time of url 'github.com/WaizKhan7'?
Loop: 1
----------------------
Model Response:
Thought: I need to determine the response time for the given URL. I'll use the `get_response_time` function.
Action:
```json
{
"function_name": "get_response_time",
"function_parms": {
"url": "github.com/WaizKhan7"
}
}
```
PAUSE
-- running get_response_time {'url': 'github.com/WaizKhan7'}
({'url': 'github.com/WaizKhan7'}) - Action_Response: 0.2
Loop: 2
----------------------
Model Response:
Thought: The Action_Response provides the response time for the given URL. I can now formulate an answer.
Answer: The response time for github.com/WaizKhan7 is 0.2 seconds.
- API Quota: The Gemini model may have usage limits or quotas. Use sleep intervals in loops to avoid errors.
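  For example, a brief pause between iterations of the agent loop (the interval shown is arbitrary) helps stay under free-tier rate limits:

  ```python
  import time

  # Wait a couple of seconds between Gemini calls to respect rate limits.
  time.sleep(2)
  ```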
- Error Handling: Ensure robust handling for invalid JSON, unexpected function calls, or API request failures.
- Customization Effort: Users need to modify prompts and actions according to their specific use cases.
- Add more predefined actions and extend functionality.
- Integrate with other APIs or services for enhanced interactivity.
- Provide a web-based interface for easier interaction with the AI agent.