feat: Add ComfyUI tool for Stable Diffusion #8160

Merged
merged 11 commits into from
Sep 18, 2024


QunBB
Contributor

@QunBB QunBB commented Sep 9, 2024

Checklist:

Important

Please review the checklist below before submitting your pull request.

  • Please open an issue before creating a PR or link to an existing issue
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Description

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request. If it fixes a bug or resolves a feature request, be sure to link to that issue. Close issue syntax: Fixes #<issue number>, see documentation for more details.

Fixes

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update, included: Dify Document
  • Improvement, including but not limited to code refactoring, performance optimization, and UI/UX improvement
  • Dependency upgrade

Testing Instructions

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A
  • Test B
img-1 img-2 img-3

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. 🔨 feat:tools Tools for agent, function call related stuff. labels Sep 9, 2024
Member

@crazywoola crazywoola left a comment


See comments

@QunBB
Contributor Author

QunBB commented Sep 10, 2024

@crazywoola Done. Check it again please.

Member

@crazywoola crazywoola left a comment


See the author

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 12, 2024
@crazywoola crazywoola self-requested a review September 12, 2024 06:37
Member

@crazywoola crazywoola left a comment


See comments

@dosubot dosubot bot removed the lgtm This PR has been approved by a maintainer label Sep 12, 2024
@WeepsDanky
Contributor

WeepsDanky commented Sep 12, 2024

Hi @QunBB. When I was testing this tool, an error showed up: Failed to get models, Please input model. My ComfyUI is running at http://127.0.0.1:8188/ with a blank canvas. How can I resolve this error?
My models are located in /models/checkpoints.
image

image

@QunBB
Contributor Author

QunBB commented Sep 12, 2024

@crazywoola @WeepsDanky The model name should also be exposed when verifying the tool's credentials. I modified some code after I had last verified the credentials, so I missed testing the change to validate_models. I will fix it later.
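
For readers who hit the same "Failed to get models" error, here is a minimal sketch of how a credential check could list the checkpoints a ComfyUI server reports. It assumes ComfyUI's `/object_info/CheckpointLoaderSimple` endpoint and its usual response shape. The names `extract_checkpoints` and `fetch_checkpoints` are hypothetical and are not the `validate_models` code from this PR.

```python
import json
from urllib.request import urlopen


def extract_checkpoints(object_info: dict) -> list[str]:
    """Pull checkpoint file names out of an /object_info response.

    Assumed shape: {"CheckpointLoaderSimple": {"input": {"required":
    {"ckpt_name": [[name, ...], ...]}}}}. Anything else yields [].
    """
    node = object_info.get("CheckpointLoaderSimple", {})
    required = node.get("input", {}).get("required", {})
    ckpt_spec = required.get("ckpt_name", [[]])
    # Only accept the list-of-names form; ignore unexpected layouts.
    if ckpt_spec and isinstance(ckpt_spec[0], list):
        return [name for name in ckpt_spec[0] if isinstance(name, str)]
    return []


def fetch_checkpoints(base_url: str, timeout: float = 5.0) -> list[str]:
    """Ask a running ComfyUI server which checkpoints it can load."""
    url = base_url.rstrip("/") + "/object_info/CheckpointLoaderSimple"
    with urlopen(url, timeout=timeout) as resp:
        return extract_checkpoints(json.load(resp))
```

Calling `fetch_checkpoints("http://127.0.0.1:8188")` against a local server should return the file names found under the checkpoints folder. An empty list would suggest that no checkpoint is visible to ComfyUI, which is one plausible cause of the error above.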

@QunBB
Contributor Author

QunBB commented Sep 12, 2024

@crazywoola @WeepsDanky I have fixed it, please check it again.


@QunBB
Contributor Author

QunBB commented Sep 12, 2024

@crazywoola I am having trouble with the Ruff check, though. Should I reformat the files?

@WeepsDanky
Contributor

@QunBB Great thanks, it is working now.

@WeepsDanky
Contributor

WeepsDanky commented Sep 13, 2024

@QunBB I noticed the tool currently can only use a pre-defined workflow txt2img.json. Can you please be more specific about the name and description in yaml? We need to make sure other users understand this tool can only use this workflow.

For example:
name: txt2img workflow,
description: a pre-defined comfyui workflow that can use one model and up to 3 loras to generate images. Does not support newer models such as stable diffusion 3 that requires a triple clip loader.


@QunBB
Contributor Author

QunBB commented Sep 13, 2024

@WeepsDanky Hi, I have changed the name and description in the YAML. In addition, I added support for SD3 and FLUX.
You could try them like the examples:

@QunBB QunBB requested a review from crazywoola September 14, 2024 02:14
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 18, 2024
@laipz8200 laipz8200 merged commit cf645c3 into langgenius:main Sep 18, 2024
6 checks passed
@wisepmlin

Mac Comflowy
Failed to get models, [Errno 111] Connection refused
Screenshot 2024-09-21 14 06 22

Scorpion1221 added a commit to yybht155/dify that referenced this pull request Sep 21, 2024
* commit '7f3282ec04d87cfb8fcff892e824c96094b92636': (105 commits)
  Update version to 0.8.3 in packaging and docker-compose files (langgenius#8590)
  chore: fix webpack dependencies order (langgenius#8542)
  ComfyUI tool use the new internal enumeration class "VariableKey" (langgenius#8533)
  Fix: update qwen model and model config (langgenius#8584)
  fix: fix qwen series model type (langgenius#8580)
  feat: add hunyuan-vision (langgenius#8529)
  chore: improve delimiter (langgenius#8552)
  add storage error log (langgenius#8556)
  feat: sync Qwen API with Aliyun Bailian (langgenius#8538)
  fix: thread_pool submit count in parallel workflow not releasing (langgenius#8549)
  fix: ci issues(missing duckduckgo-search==6.2.11, ruff lint issue) (langgenius#8543)
  feat: add format util unit and add pre-commit unit check (langgenius#8427)
  validate user permission before enter app detail page (langgenius#8527)
  refactor: rename task_type to task for jina embeddings v3 (langgenius#8488)
  chore: Deprecate gpt-3.5-turbo-0613 and gpt-3.5-turbo-16k-0613 models (langgenius#8500)
  feat: Add ComfyUI tool for Stable Diffusion (langgenius#8160)
  chore: update the .gitignore file to include opensearch,pgvector,and myscale (langgenius#8470)
  feat: Add base URL settings and secure_ascii options to the Brave search tool (langgenius#8463)
  feat: add flux dev of siliconflow image-gen tool (langgenius#8450)
  chore: workflow BRANCH, PARALLEL i18n (langgenius#8452)
  ...

# Conflicts:
#	api/core/file/file_obj.py
#	api/core/file/message_file_parser.py
#	api/core/helper/code_executor/code_executor.py
#	api/core/workflow/nodes/code/code_node.py
#	api/core/workflow/nodes/tool/tool_node.py
@QunBB
Contributor Author

QunBB commented Sep 23, 2024

@wisepmlin This seems to be a network issue. I installed Comflowy on a Mac to try it, and it worked for me.

@Dongnc1017

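The link https://docs.dify.ai/tutorials/tool-configuration/comfyui is not accessible.
1. I want to use the FLUX models. Are Flux Dev and Flux Schnell both supported? Is it enough to download them and put them in ComfyUI/models/unet/?
2. However, it also says that SD1.5, SDXL, SD3 and FLUX models that include the text encoder/CLIP are supported, but models that need a CLIP loader are not.
What does this sentence mean? For example, t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn.safetensors?
Could you provide a detailed tutorial?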

@QunBB
Contributor Author

QunBB commented Sep 25, 2024

@Dongnc1017 The comments above already cover this. You should download the models that include the text encoders, then put them into ComfyUI/models/checkpoints/, as in the official ComfyUI examples:

Flux Dev and Flux Schnell are both supported.
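
To illustrate the distinction between a checkpoint that bundles its text encoder and a model that needs a separate CLIP loader, here is a small sketch in ComfyUI's API workflow format. The workflow fragment, the checkpoint file name, and the helper `uses_bundled_clip` are hypothetical. It also assumes that the tool's predefined workflow relies on `CheckpointLoaderSimple`, which the thread implies but does not show.

```python
# Hypothetical API-format fragment: node id -> {"class_type", "inputs"}.
WORKFLOW = {
    "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "example_checkpoint.safetensors"},
    },
    "2": {
        "class_type": "CLIPTextEncode",
        # ["1", 1] means "output slot 1 of node 1", i.e. the CLIP model
        # that CheckpointLoaderSimple loads from the checkpoint itself.
        "inputs": {"text": "a photo of a cat", "clip": ["1", 1]},
    },
}

# Node classes that load text encoders from separate files.
SEPARATE_CLIP_LOADERS = {"CLIPLoader", "DualCLIPLoader", "TripleCLIPLoader"}


def uses_bundled_clip(workflow: dict) -> bool:
    """True if the workflow takes CLIP from the checkpoint, not a separate loader."""
    classes = {node["class_type"] for node in workflow.values()}
    has_checkpoint = "CheckpointLoaderSimple" in classes
    return has_checkpoint and not (classes & SEPARATE_CLIP_LOADERS)
```

Under this reading, a standalone encoder file such as t5xxl_fp16.safetensors would be loaded by one of the separate loader nodes, which is the case the tool's description says it does not cover.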

@laipz8200
Member

Hi @QunBB, would you be interested in updating the documentation for this tool? Or perhaps removing the inaccessible links from the configuration?

@QunBB
Contributor Author

QunBB commented Sep 25, 2024

@laipz8200 I'm interested in updating it when I'm free later. I will first replace the current link with the official ComfyUI website, and add the link back once I finish the documentation in Dify.

cuiks pushed a commit to cuiks/dify that referenced this pull request Sep 26, 2024
@hjlarry
Contributor

hjlarry commented Sep 27, 2024

Hi @QunBB, I think the current ComfyUI tool could be improved by reducing the user input and selection, and just giving the user a prompt_text input.

A ComfyUI workflow is similar to Dify's workflow and supports a variety of nodes. The current implementation seems to define specific steps, and the user can only select the parameters of those steps, which seems inflexible.

A more reasonable use case is this: the user edits the workflow in the ComfyUI, exports the workflow's JSON (which is similar to Dify's DSL), then pastes it into this tool to get the image in Dify. That way they combine the Dify and ComfyUI workflows.

This is the export API button:
image

This is a simple workflow that just changes an icon's style, but it has 40 nodes, and the current solution can't handle it:
image

What do you think about this?

@QunBB
Contributor Author

QunBB commented Sep 29, 2024

@hjlarry Sure, we could add it, and then it would support any image generation workflow in ComfyUI. But I don't think an LLM could correctly generate the prompt text through an agent, so it may only be usable in Dify's workflow.
By the way, I think the current tool could continue to be used by beginners or for simplicity, following the pattern of the Stable Diffusion tool. It can be used in both Dify's agent and workflow.

@hjlarry
Contributor

hjlarry commented Sep 29, 2024

@QunBB Sure, we can add a new tool and keep both of them.
I think the new tool could let the user configure the workflow's JSON and which node is the text node, so that the LLM only has to generate the text prompt and the tool can then be used in an agent app.
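
The idea above can be sketched as follows: take a workflow exported in ComfyUI's API format, overwrite the text of one configured node, and queue it on the server. This is only an illustration under stated assumptions. It assumes the text node stores its prompt under `inputs["text"]` (as `CLIPTextEncode` does) and that the server accepts a `POST /prompt` request with a `{"prompt": ...}` body. The helper names `set_prompt_text` and `queue_prompt` are hypothetical and not part of Dify.

```python
import copy
import json
from urllib.request import Request, urlopen


def set_prompt_text(workflow: dict, node_id: str, text: str) -> dict:
    """Return a copy of an API-format workflow with one node's text replaced."""
    if node_id not in workflow:
        raise KeyError(f"node {node_id!r} not found in workflow")
    patched = copy.deepcopy(workflow)  # leave the user's exported JSON untouched
    patched[node_id]["inputs"]["text"] = text
    return patched


def queue_prompt(base_url: str, workflow: dict, timeout: float = 10.0) -> dict:
    """Submit the workflow to a running ComfyUI server and return its JSON reply."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    request = Request(
        base_url.rstrip("/") + "/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With this split, the LLM only supplies the `text` argument, while the exported JSON and the text node's id are configured once by the user.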

@QunBB
Contributor Author

QunBB commented Sep 30, 2024

@hjlarry It is a good idea.

lau-td pushed a commit to heydevs-io/dify that referenced this pull request Oct 23, 2024
idonotknow pushed a commit to AceDataCloud/Dify that referenced this pull request Nov 16, 2024
Labels
🔨 feat:tools Tools for agent, function call related stuff. lgtm This PR has been approved by a maintainer size:XL This PR changes 500-999 lines, ignoring generated files.
7 participants