
Feature Request: Presets Features #16

Open · CogNetSys opened this issue Aug 11, 2023 · 22 comments
Labels: enhancement (New feature or request)

Comments

@CogNetSys (Contributor)

Developers and AI companies are slowly starting to realize that all of the power is in the prompt. I call these prompts Personas.

I'd like to see a few features implemented for the Presets sidebar:

1. "Add New Folder" Feature:
For sorting presets and chats. There should be a parent folder called "Personas" in which all presets reside, presets (or personas) maintain lists of their associated child chat sessions like you have them now. The button should add top level or parent level folders, at the same level as the Personas folder, for projects, issues, routines, or whatever. It should permit nesting of folders too.

2. Custom Folder and Chat Session Features:
Custom icons and colors to provide an additional layer of visual sorting. There should be both custom icons and multiple colors: change the color, change the icon, and rename any or all chats, personas, and folders. We should provide an icon library; I think I saw one already in GPT-Runner somewhere.

3. "Add New Preset" button:
Add new persona quickly and easily. It should just copy an existing xxx.gpt.md template (the template is so generic that I see no need to offer the user the ability to see or edit it). It's nice to name a preset so that it can be identified and changed but if we can do that part on the frontend then the name of the file is less relevant and could be named a datestamp or something incrementable.

  1. "Copy Preset":
    Multiple versions of the same preset but with slight differences. Also to create custom workflows by dragging and dropping copies of presets to folders and assigning them to a developer who can access them through the local LAN. Another one of your awesome ideas of creating a website available through localhost. Brilliant!

  2. "Export Chat Session":
    I think I saw you have this on your TODO list somewhere, but yeah, export history seems like a necessary feature.
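To make item 3 concrete, here is a minimal sketch of copying a generic template to a datestamped file, assuming Node.js and TypeScript; the directory layout and the `createPresetFromTemplate` helper are hypothetical, not GPT-Runner's actual code:

```ts
import { copyFile } from 'node:fs/promises'
import { join } from 'node:path'

// Hypothetical helper: clone the generic template into the presets
// directory under a timestamp-based file name, so the display name
// can live in frontend metadata instead of the file name.
async function createPresetFromTemplate(presetsDir: string): Promise<string> {
  const stamp = new Date().toISOString().replace(/[:.]/g, '-')
  const target = join(presetsDir, `${stamp}.gpt.md`)
  await copyFile(join(presetsDir, 'template.gpt.md'), target)
  return target
}
```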

Of course, I'd also like to recommend changing the name from "presets" to "personas" or "AI Personas". The system prompt is very specific and creates a true persona with complex features and responses, so it seems more appropriate to call them "Personas". "Personas" also makes them feel more human, whereas "presets" de-humanizes them. Just my opinion, though.

@CogNetSys (Contributor, Author)

I would be willing to contribute my library of Personas to your store idea if and when you implement it.

@2214962083 (Collaborator)

Good idea! I am refactoring the preset file tree data structure and will support this, but it may take a long time. Thank you!
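For illustration, a minimal sketch of what a nestable tree node could look like, assuming TypeScript; the `PresetTreeNode` name and its fields (including the icon/color customization requested above) are hypothetical, not the actual refactored data structure:

```ts
// Hypothetical node shape for a nestable preset/folder tree.
// Folders can contain other folders, presets, or chat sessions,
// which covers "Personas" as a top-level folder alongside user folders.
interface PresetTreeNode {
  id: string
  name: string
  type: 'folder' | 'preset' | 'chat'
  icon?: string   // key into an icon library
  color?: string  // e.g. '#ff8800', for visual sorting
  children?: PresetTreeNode[] // only meaningful for folders and presets
}

const tree: PresetTreeNode = {
  id: 'root',
  name: 'Personas',
  type: 'folder',
  children: [
    {
      id: 'p1',
      name: 'Doc (planning)',
      type: 'preset',
      children: [{ id: 'c1', name: 'Sprint 12 plan', type: 'chat' }],
    },
  ],
}
```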

2214962083 added the "enhancement" (New feature or request) label on Aug 14, 2023
@CogNetSys (Contributor, Author) commented Aug 16, 2023

...

@CogNetSys (Contributor, Author)

Well, technically the Voice App is a standalone application (an AppImage) that plugs into GPT-Runner via API calls, and it controls any aspect of GPTR that we expose a keyboard shortcut for. It works through keyboard shortcuts and tabbing. My Voice App can run ANY application via keyboard shortcuts. There is absolutely nothing inside of GPT-Runner that needs to change to support my voice app besides adding accessibility features. I do have to craft a script to add all of GPT-Runner's keyboard shortcuts into the Voice App, but that will only take me one day to do.

@2214962083 (Collaborator)

Hi @CogNetSys

Your suggestions are very good, especially your design diagram; I think we can strive in this direction.

Your voice control over the entire application is an interesting feature. It means we need to collect all features into an enumeration, so that each entry in the enumeration can be customized with shortcut keys and voice keywords.

I have some doubts:

For the design drawings, I can understand the agent, but what are the application tools and machine learning for?

To provide some context: I am skilled in front-end development and am not familiar with machine learning, so are you confident in implementing the machine learning features?

@2214962083
Copy link
Collaborator

BTW, I sincerely invite you to become a maintainer of the GPT-Runner repository.
Here is the invitation link: https://github.com/nicepkg/gpt-runner/invitations

@CogNetSys (Contributor, Author)

Thank you for the invite. I believe I have a lot to contribute, and I truly look forward to working with your team.

> It means we need to collect all features into an enumeration, so that each entry in the enumeration can be customized with shortcut keys and voice keywords.

Precisely. Exposing commands as keyboard shortcuts creates an API-like interface that can be used for automation and voice control, as well as for screen readers, and it opens up the option of using AI's multimodal capabilities to assist people with disabilities (for example, AI can capture all sounds and convert them to text for those who are hearing impaired, or describe the contents of a video, graphic, or audio clip).
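As a rough illustration of that "API-like interface", here is a minimal command-registry sketch in TypeScript; the `CommandRegistry` class and the example command IDs are hypothetical, not GPT-Runner's actual API:

```ts
// Hypothetical registry: every user-facing action is registered once,
// then keyboard shortcuts, voice keywords, and screen readers all
// dispatch through the same command IDs.
type CommandHandler = () => void | Promise<void>

class CommandRegistry {
  private commands = new Map<string, CommandHandler>()

  register(id: string, handler: CommandHandler): void {
    this.commands.set(id, handler)
  }

  async execute(id: string): Promise<void> {
    const handler = this.commands.get(id)
    if (!handler) throw new Error(`Unknown command: ${id}`)
    await handler()
  }

  list(): string[] {
    return [...this.commands.keys()] // the enumeration to map shortcuts/voice keywords onto
  }
}

const registry = new CommandRegistry()
registry.register('preset.new', () => console.log('create preset'))
registry.register('chat.export', () => console.log('export chat'))
```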

@CogNetSys (Contributor, Author) commented Aug 17, 2023

I'm a 54 y/o Business and Technology Consultant with over 20 years of experience working with entrepreneurs, converting their business ideas into reality (since 1995). I don't just propose architectures, designs, or system solutions; I actually implement them too. The depth and breadth of my knowledge and experience permeates every nook-and-cranny of the IT world. Information Technology is my "bitch". There is absolutely NOTHING in all of technology that I cannot do or learn. I do front-end, back-end, requirements gathering from customers, mockups, and graphic design with Photoshop and Illustrator; I build websites; I build mobile apps; I can write in any programming language, use any IDE, and learn even the most complex new technologies as quickly as they can be learned. I do computer repairs and builds, and I design and implement complex computer networks and networking equipment, including PBX and VoIP. Security is also one of my domains, as I pursued CISSP certification many years ago. There is nothing in tech that I can't do, including ML, LLMs, prompting, agents, etc. They are my bitches. Stand-alone software apps in C++, Java, or Python? No problem. Databases? I speak SQL fluently: MySQL, Postgres, etc. I could go on and on and on.

I do not think I am a know-it-all; I am very much aware of the extent of my ignorance and would never assert that I know everything. I am confident in my ability to do or learn anything, though.

@CogNetSys (Contributor, Author)

So as far as the ML... it's simple. I am going to keep logs of how the program is being used, format them for training after anonymizing them, and then upload them to my server for ML training. I'm splitting the logs into relevant sections for proper training. They will be split into logical segments: 1) Planning, 2) Generating, 3) Editing and Testing, and 4) Quality Assurance and Product Release. Four stages, four AIs: #1 is Doc (planning), #2 is Jenny (she generates), #3 is Eddy (he edits), #4 is Inspector (he inspects and releases). Splitting up the workflow creates a team of AI units that specialize in their domains yet work together to accomplish the work.
How will they work? They will monitor your activities as you work and interject to make suggestions such as "I wrote a macro to automate that task for you; would you like to use it?" or "According to the Corporate Style Guide, the main background color for the app should be #ffffff, or white."
They examine how you are using the app and make suggestions on how to improve it, such as automations, or redesigns such as switching from Angular to React, or...? Let's find out together what they can do for us.
The point of the ML section is to create models, interact with them, and customize them so we can do our jobs more effectively and efficiently.
The task of trying to train ONE AI to plan, generate, edit, and inspect is just way too hard. Breaking it up into logical domains and scaffolding those domains is how true AGI will be achieved. I'm not striving for TRUE AGI, just AGI for software developers.
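For illustration only, a minimal sketch of what an anonymized, stage-tagged log record could look like, assuming TypeScript; the `UsageLogRecord` shape and the `anonymize` helper are hypothetical, not an actual implementation:

```ts
// Hypothetical record for the anonymized training logs described above.
// Each record is tagged with one of the four workflow stages so the
// corresponding specialist model (Doc, Jenny, Eddy, Inspector) can be
// trained on its own slice of the data.
type Stage = 'planning' | 'generating' | 'editing-testing' | 'qa-release'

interface UsageLogRecord {
  stage: Stage
  timestamp: string // ISO 8601
  action: string    // e.g. 'prompt.sent', 'file.edited'
  payload: string   // content, anonymized before upload
}

function anonymize(text: string): string {
  // Placeholder scrubbing: real anonymization would also need to strip
  // names, paths, secrets, and other identifying data.
  return text.replace(/\b[\w.+-]+@[\w-]+\.\w+\b/g, '<email>')
}
```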

@CogNetSys (Contributor, Author)

AGI, as in Artificial General Intelligence. The holy grail of AI.

@CogNetSys (Contributor, Author) commented Aug 17, 2023

...

@2214962083 (Collaborator)

> Precisely. Exposing commands as keyboard shortcuts creates an API-like interface that can be used for automation and voice control, as well as for screen readers [...]

So do you mean something like EventEmitter?
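For reference, a minimal Node.js EventEmitter sketch of the idea; the command bus and event names are hypothetical:

```ts
import { EventEmitter } from 'node:events'

// Hypothetical command bus: keyboard shortcuts, voice keywords, and
// automation scripts all emit the same events, and GPT-Runner reacts
// to them without caring where the trigger came from.
const commandBus = new EventEmitter()

commandBus.on('command', (id: string) => {
  console.log(`Executing command: ${id}`)
})

// A keyboard shortcut handler and a voice keyword handler would both
// end up calling something like this:
commandBus.emit('command', 'preset.new')
```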

@2214962083 (Collaborator) commented Aug 18, 2023

> I'm a 54 y/o Business and Technology Consultant with over 20 years of experience working with entrepreneurs, converting their business ideas into reality (since 1995). [...]

Wow, it's unbelievable; your experience is so rich. I've only worked for 4 years, all in the front-end domain. I'm so glad we have such an experienced engineer like you joining us.

@2214962083 (Collaborator)

> So as far as the ML... it's simple. I am going to keep logs of how the program is being used, format them for training after anonymizing them, and then upload them to my server for ML training. [...]

This sounds like a great approach, essentially embedding company information for storage or using it to train models, but there are some points to consider:

  1. The starting point of GPT-Runner's design is no-server, meaning we won't provide a server or deploy anything, as that would make the program unreliable and insecure. (We can't afford expensive server fees, and many companies don't allow internet-connected development.)

  2. GPT-Runner is designed with a focus on pragmatism, so we won't explore AGI until autoGPT shows significant results. GPT-Runner should be able to assist with both old and new project development.

  3. GPT-Runner is designed with efficiency in mind. We can integrate any AI feature that boosts developer productivity, but these features must be accessible to developers of any language.

  4. GPT-Runner is designed with safety as a priority. Another goal is that, once open LLM coding reaches the level of GPT-3.5, all businesses can locally deploy LLM models connected to GPT-Runner to increase employee development efficiency. Yes, it can be used offline. In fact, I've already used GPT-Runner with some open LLMs completely offline.

Therefore, our feature points should follow the above directions of development. Training models and paying for servers are a huge challenge for us, so we can focus on other things first.

@2214962083 (Collaborator)

> Eventually GPT-R could have a powerful API, powerful web GUI, powerful events, and powerful accessibility functions.
>
> Events are a new technology; I don't know if you are familiar with Flow Architecture or not. This is probably something I am going to have to add to GPT-R if you want it, or just to my app if you don't.
>
> It is a technology that will overtake "http requests" but not replace them altogether, because they actually work together quite nicely, each with its own use case. Adding events to our app now future-proofs it.

I'm not quite sure what you mean by the "flow" architecture. I'm guessing it's something similar to EventEmitter, RxJS, WebSockets, or React-Flow. Which one is closest to what you're referring to?

@CogNetSys (Contributor, Author) commented Aug 18, 2023

...

@CogNetSys (Contributor, Author)

Flow Architecture/EDA is likely a GPTR 3.0/4.0 technology.

@2214962083 (Collaborator)

Sorry, I was pretty busy over the weekend, so I'm only responding now.

I agree that we should build an event-driven streaming application, with the frontend and backend performing their respective tasks through mutual event subscriptions.

I took some time to understand Kafka, and I think it's more suitable for distributed systems. For GPT-Runner, our servers are local, so WebSocket + RxJS might be more suitable for us.

WebSocket gives us bidirectional event communication and subscription between the frontend and backend, and RxJS has many operators, so we can easily implement batched data reads/writes to disk.
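A minimal sketch of the batched-write idea, assuming RxJS v7; the event shape and the two-second window are hypothetical choices:

```ts
import { Subject, bufferTime, filter } from 'rxjs'
import { appendFile } from 'node:fs/promises'

// Hypothetical event stream: WebSocket messages from the frontend are
// pushed into a Subject, buffered for 2 seconds, and flushed to disk
// in one batch instead of one write per event.
interface ChatEvent {
  type: string
  payload: unknown
}

const events$ = new Subject<ChatEvent>()

events$
  .pipe(
    bufferTime(2000),
    filter(batch => batch.length > 0),
  )
  .subscribe(async batch => {
    const lines = batch.map(e => JSON.stringify(e)).join('\n') + '\n'
    await appendFile('chat-events.log', lines)
  })

// A WebSocket 'message' handler would simply do:
// events$.next(JSON.parse(rawMessage))
```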

@2214962083 (Collaborator)

At present, we don't need microservices. If we need to integrate third-party LLMs, we can ask users to deploy projects similar to LocalAI themselves.

These projects wrap a third-party LLM API in an OpenAI-like API. If additional feature requirements come up, we can provide a Docker microservice for local deployment in the future to enhance GPT-Runner.
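A minimal sketch of what connecting to such an OpenAI-compatible local endpoint might look like, assuming the openai Node.js SDK (v4); the port and model name are assumptions, not verified LocalAI defaults:

```ts
import OpenAI from 'openai'

// Point the standard OpenAI client at a locally deployed
// OpenAI-compatible server (e.g. LocalAI) instead of api.openai.com.
const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1', // assumed local server address
  apiKey: 'not-needed-locally',        // local servers usually ignore this
})

const completion = await client.chat.completions.create({
  model: 'local-model', // whatever model the local server exposes
  messages: [{ role: 'user', content: 'Hello from GPT-Runner' }],
})

console.log(completion.choices[0].message.content)
```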

Besides this, we should not use microservices at the moment, as they would be too difficult for us to maintain.

However, we can design a plugin system that allows plugins to customize server events and client events, as well as the client's view area.

This is similar to how VSCode extensions work within VSCode, and it would help decouple our code.
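A minimal sketch of what such a plugin contract might look like, assuming TypeScript; the `GptRunnerPlugin` interface and hook names are hypothetical:

```ts
// Hypothetical plugin contract, loosely modeled on how VSCode
// extensions contribute commands and views.
interface EventBus {
  on(event: string, handler: (data: unknown) => void): void
  emit(event: string, data: unknown): void
}

interface GptRunnerPlugin {
  name: string
  activateServer?(bus: EventBus): void // customize server events
  activateClient?(bus: EventBus, registerView: (id: string, render: () => HTMLElement) => void): void // customize client events and view areas
}

const helloPlugin: GptRunnerPlugin = {
  name: 'hello-plugin',
  activateServer(bus) {
    bus.on('chat.created', data => bus.emit('hello.greeted', data))
  },
}
```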

What do you think?

@CogNetSys (Contributor, Author)

I feel like I am imposing my vision on GPT-Runner. I will be starting a discussion on the fork of GPT-R at github.com/CogNetSys for DevAGI. I am a motormouth and sometimes write small books instead of simple communications. I don't think that filling your discussion forum with my posts is the right approach. I hope that we can continue to work together toward a shared vision, but I feel that lengthy posts about my vision for DevAGI belong in my own discussion forum.

@CogNetSys (Contributor, Author)

I shared information about DevAGI on my GitHub Discussions forum. It communicates my basic ideas so far, specifically for DevAGI. DevAGI is an ecosystem of components that collectively empower and streamline the entire SDLC to produce software as effectively, efficiently, and cost-effectively as possible.
My fork of GPT-Runner is one of many tools that collectively provide an end-to-end solution for software development: a solution through which I can streamline and track use of the system, use that usage data to build training data for ML models, and then build multiple models and scaffold them to achieve AGI for developers.

@Meathelix1

> GPT-Runner is designed with safety as a priority. Another goal is that, once open LLM coding reaches the level of GPT-3.5, all businesses can locally deploy LLM models connected to GPT-Runner to increase employee development efficiency. Yes, it can be used offline. In fact, I've already used GPT-Runner with some open LLMs completely offline.

I would love to know how you got that running, as small, local LLMs are getting better at specific tasks.
