
Separate models for each agent #5

Open
R-Dson opened this issue Apr 11, 2024 · 4 comments

Comments

@R-Dson

R-Dson commented Apr 11, 2024

Nice project. Open-source models are usually good at one task, such as coding or writing, so it would be interesting if we could set a parameter in the .env to specify which model each agent should use.

As an example, this could be done by adding a variable to .env such as ENGINEER_MODEL_NAME="gemma:7b-instruct-v1.1-fp16", and the same for the other agents. If none is set, fall back to the default. So if you run jemma --prompt "Trivia Game" --build-prototype --ollama dolphin-mistral:7b-v2.6-dpo-laser-fp16 with ENGINEER_MODEL_NAME set in .env to gemma, then the engineer uses gemma and all the other agents use dolphin-mistral.

This could potentially perform better than using only one local model, at the expense of time spent switching models, so it should be optional.
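A minimal sketch of the proposed fallback, assuming per-agent variables named like ENGINEER_MODEL_NAME (the variable naming scheme and function are illustrative, not jemma's actual code):

```python
import os

def model_for_agent(agent: str, default_model: str) -> str:
    """Return the model for an agent from .env-style variables,
    falling back to the model passed on the command line."""
    # e.g. ENGINEER_MODEL_NAME, TESTER_MODEL_NAME, ... (hypothetical names)
    return os.environ.get(f"{agent.upper()}_MODEL_NAME", default_model)

# ENGINEER_MODEL_NAME set => engineer uses gemma, others use the CLI model
os.environ["ENGINEER_MODEL_NAME"] = "gemma:7b-instruct-v1.1-fp16"
default = "dolphin-mistral:7b-v2.6-dpo-laser-fp16"
print(model_for_agent("engineer", default))  # gemma:7b-instruct-v1.1-fp16
print(model_for_agent("tester", default))    # dolphin-mistral:7b-v2.6-dpo-laser-fp16
```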

@tolitius
Owner

tolitius commented Apr 11, 2024

great idea!
was trying to figure out how to mix models better

I tried to generate requirements from Claude:

jemma --prompt "Trivia Game" --build-prototype --claude

and then write code with Ollama (codegemma, deepseek coder, etc.)

jemma --requirements requirements/learning-portal.2024-04-11.16-11-56-382.txt --build-prototype --ollama codegemma:7b-instruct-fp16

but so far I think local instruct / code models are trained to take in short instructions, vs. something like detailed requirements, and to generate short passages of code

I am sure it will change for the better in the very near future
and I am also looking at meta-prompting it better


as to your idea, I think a model should be assigned to an activity vs. a role
for example

a business owner can:

  • generate requirements from idea
  • generate user stories from idea
  • generate user stories from requirements
  • analyze a visual sketch (trying it right now) to generate requirements

an engineer can:

  • review a business story (language task)
  • implement a user story
  • refactor code
  • etc..

your idea still applies, but I think it makes sense to assign a model to each activity
and, if unassigned, use the top-level model:

[
 {"task": "idea-to-requirements",
  "model": "claude-3-haiku-20240307",
  "provider": "claude"},

 {"task": "review-user-story",
  "model": "mistral:7b-instruct-v0.2-fp16",
  "provider": "ollama"},

 {"task": "refactor-code",
  "model": "deepseek-coder:6.7b-instruct-fp16",
  "provider": "ollama"}
]
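A sketch of how that lookup could work, assuming the config above is loaded into a Python list (the function and variable names here are hypothetical):

```python
# per-activity assignments, mirroring the JSON config above
TASK_MODELS = [
    {"task": "idea-to-requirements",
     "model": "claude-3-haiku-20240307",
     "provider": "claude"},
    {"task": "review-user-story",
     "model": "mistral:7b-instruct-v0.2-fp16",
     "provider": "ollama"},
]

def resolve(task, default_model, default_provider):
    """Pick the model assigned to an activity; fall back to the top-level model."""
    for entry in TASK_MODELS:
        if entry["task"] == task:
            return entry["model"], entry["provider"]
    return default_model, default_provider

# assigned task -> its own model; unassigned task -> the top-level model
print(resolve("review-user-story", "claude-3-haiku-20240307", "claude"))
print(resolve("refactor-code", "claude-3-haiku-20240307", "claude"))
```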

@R-Dson
Author

R-Dson commented Apr 11, 2024

Yes, that seems like a reasonable approach.

As for Ollama and short instructions, you might need to add the num_ctx parameter to options to increase the context window. I believe the default is 2048.
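For reference, a sketch of an Ollama /api/generate request body that raises num_ctx via options (the prompt and the specific context sizes are illustrative):

```python
def ollama_payload(model: str, prompt: str, num_ctx: int = 8192) -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    # "options" overrides Modelfile parameters for a single request;
    # num_ctx raises the context window above the 2048 default
    return {"model": model,
            "prompt": prompt,
            "stream": False,
            "options": {"num_ctx": num_ctx}}

# POST this to http://localhost:11434/api/generate
payload = ollama_payload("codegemma:7b-instruct-fp16",
                         "Implement the user story below ...",
                         num_ctx=16384)
```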

@tolitius
Owner

yep, updated per docs

@R-Dson
Author

R-Dson commented Apr 12, 2024

Nice. You could also include a context length in this config, since you may want a 7b model to use a longer context than a 33b model, etc.

[
 {"task": "idea-to-requirements",
  "model": "claude-3-haiku-20240307",
  "provider": "claude"},

 {"task": "review-user-story",
  "model": "mistral:7b-instruct-v0.2-fp16",
  "provider": "ollama",
  "context": 16384},

 {"task": "refactor-code",
  "model": "deepseek-coder:33b-instruct-fp16",
  "provider": "ollama",
  "context": 4096}
]
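A sketch of mapping such an entry's optional "context" field onto Ollama's num_ctx option, falling back to the 2048 default when a task does not set one (the helper name is hypothetical):

```python
def ollama_options(entry: dict, default_ctx: int = 2048) -> dict:
    """Translate a task config entry's optional "context" into Ollama options."""
    return {"num_ctx": int(entry.get("context", default_ctx))}

# a task with an explicit context, and one without
review = {"task": "review-user-story",
          "model": "mistral:7b-instruct-v0.2-fp16",
          "provider": "ollama",
          "context": 16384}
print(ollama_options(review))                # {'num_ctx': 16384}
print(ollama_options({"task": "other"}))     # {'num_ctx': 2048}
```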
