Current Technical Limitations Consideration/Discussion #81
Replies: 1 comment
-
While this is thoughtful and helpful, you did not engage with existing conversations and struggles. This is a meta conversation that just adds to the noise. Please review the rules of the community:
-
Hello Everyone
After following the project from its inception, it has become clear that it's moving fast (as intended) and breaking things (as expected).
However, I want to point out that, in building the framework for a "Swarm-Style" agent system, we must consider the technical challenges we currently face, because doing so will help when building the tools and sets of tools that will make this possible.
My main concern is the context window for GPTs (the current tech we're discussing) and how that impacts the SOB and Executive tiers as they attempt to do their job within the swarm.
There are limits on both the length of the instructions a GPT can follow (I believe 8,000 characters in the "Configuration" prompt currently) and the number of documentation files you can upload, which I believe is currently 10 at a time, with 20 as the max.
The inevitable truth is that, at some point, the context is lost, and the agent performing a task (regardless of tier) will suffer performance degradation, to a degree that depends on the Swarm's task. This is especially alarming for tier 0 and tier 1 agents.
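To make the limits concrete, here is a minimal sketch of a pre-flight check that flags an agent configuration before it is pasted into a GPT. The two constants are just the figures I quoted above (unverified), and `AgentConfig` and `check_limits` are hypothetical names for illustration, not part of any OpenAI API.

```python
from dataclasses import dataclass

# Assumed limits, taken from the figures quoted above; they are not verified.
MAX_INSTRUCTION_CHARS = 8_000
MAX_FILES = 20


@dataclass
class AgentConfig:
    """Hypothetical description of one agent's GPT configuration."""
    name: str
    instructions: str
    files: list[str]


def check_limits(config: AgentConfig) -> list[str]:
    """Return a list of human-readable problems; empty means the config fits."""
    problems = []
    if len(config.instructions) > MAX_INSTRUCTION_CHARS:
        over = len(config.instructions) - MAX_INSTRUCTION_CHARS
        problems.append(f"instructions exceed the limit by {over} characters")
    if len(config.files) > MAX_FILES:
        problems.append(f"{len(config.files)} files attached, limit is {MAX_FILES}")
    return problems


if __name__ == "__main__":
    # An SOB-style agent whose instructions are 500 characters too long.
    sob = AgentConfig("SOB", "x" * 8_500, [f"doc{i}.md" for i in range(5)])
    for problem in check_limits(sob):
        print(f"{sob.name}: {problem}")
```

A check like this only catches the hard limits; it says nothing about the softer degradation that sets in as a long conversation fills the context window.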
I've experienced this when creating GPTs for simple tasks, like learning your email writing style and tone so that it can draft new emails in that style for a specific scenario (in my case, business).
Another technical limitation that I have run into so far is the inability to initiate a GPT from anywhere other than the chat window. While there is technically a way to do this with the new "Assistants API", these Assistants and GPTs seem to be separate layers, and the Assistants API has a very limited usage rate per minute from OpenAI. It looks to me like OpenAI has built this as a bridge from the older "Plugins" model to the new Agent/Assistant model as they transition into the "GPTs Store" era.
The good news is that, through the use of "Actions" in GPTs, they are able to perform API functions with external systems.
Having said all of this, I am curious to hear what your thoughts are on getting around these aforementioned limitations. It is possible that the solutions are already out there, and I'm not aware of them.
And, finally, speaking of solutions that are "already out there" - should we consider NOT building this on top of GPTs alone?
I'm sure many of you are familiar with MemGPT (which tries to get around the context issue), systems like "ChatDev" (which already have inter-agent communication), and "AutoGPT" (which already encourages agent autonomy, to a certain degree, for a very specific singular task).
I've played around with all of these tools, and what I can tell you is this: if we can get people who deeply understand these systems and can dissect, step by step, how to navigate the contextual-integrity challenges, then we can achieve the ultimate goal of this project.
I'll give you an example:
From the current technical limitations, it is clear that the SOB directive will face the challenge of contextual integrity. Therefore, it seems to me that the best course of action is for each SOB "Agent" to be not a single agent but a collection of agents with its own checks and balances, which together form the "entity" of that SOB Agent. That entity would carry out its tasks through specialized task entities that then report back on the task, and a separate entity would check each task with its own context window intact, for maximum performance.
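Here is a rough sketch of that "entity" idea, under heavy assumptions. An agent is modeled as a plain function from a prompt to a reply; in practice it would wrap a model call, which is left out. `run_entity`, the "OK" approval convention, and the retry limit are all hypothetical choices for illustration, not anything the project has defined.

```python
from typing import Callable

# An "agent" here is just a function from a prompt to a reply. In practice it
# would wrap a model call; that wiring is deliberately left out.
Agent = Callable[[str], str]


def run_entity(task: str, worker: Agent, checker: Agent, max_attempts: int = 3) -> str:
    """Have a worker do the task and a separate checker approve the result.

    The checker receives only the task and the result, never the worker's
    history, so its own context stays small.
    """
    feedback = ""
    for _ in range(max_attempts):
        prompt = task if not feedback else f"{task}\nReviewer feedback: {feedback}"
        result = worker(prompt)
        verdict = checker(f"Task: {task}\nResult: {result}")
        # Assumed convention: the checker starts its reply with "OK" to approve.
        if verdict.strip().upper().startswith("OK"):
            return result
        feedback = verdict
    raise RuntimeError(f"no approved result after {max_attempts} attempts")


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model access.
    def toy_worker(prompt: str) -> str:
        return "DONE" if "feedback" in prompt else "draft"

    def toy_checker(prompt: str) -> str:
        return "OK" if "Result: DONE" in prompt else "Result is only a draft"

    print(run_entity("summarise the report", toy_worker, toy_checker))
```

The point of the design is that the checker is handed a fresh, minimal prompt on every pass, so its view of the task does not erode the way the worker's might over a long exchange.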
So, when brainstorming the different aspects of "The Swarm", my suggestion is that the next step should be to break them down into "things that can be done within the current technological constraints". That will help us actually get it done, either by combining different available technologies or by changing the currently proposed technology altogether.
Would love to hear your thoughts on this.
Cheers!