Firecrawl started throwing the error: 'Failed to invoke tool: unhashable type: 'dict'.' #8792

Closed
5 tasks done
ugyuji opened this issue Sep 26, 2024 · 8 comments
Labels
🐞 bug Something isn't working

Comments

@ugyuji
Contributor

ugyuji commented Sep 26, 2024

Self Checks

  • This is only for bug reports. If you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.8.3

Cloud or Self Hosted

Cloud, Self Hosted (Docker)

Steps to reproduce

  1. Create a new "Workflow"
  2. Add Firecrawl's "Crawl" tool
  3. Set a "Start URL" variable to "https://dify.ai/"
  4. Run the workflow

✔️ Expected Behavior

No response

❌ Actual Behavior

[Screenshot: 35b2061a9ead09d037198315f3bd573a]

dosubot bot commented Sep 26, 2024

Hey @ugyuji! I'm here to help you with any bugs, questions, or contributions you have for our project. Let's squash those bugs together! 😊

I found a similar issue that might help you resolve the error "Failed to invoke tool: unhashable type: 'dict'". The error occurs because the tool_parameters dictionary is being used in a context where a hashable type is required. In Python, dictionaries are not hashable and cannot be used as keys in other dictionaries or added to sets.

To resolve this issue, ensure that tool_parameters or any of its contents are not being used in a way that requires them to be hashable. For example, if tool_parameters is being used as a key in another dictionary, you should convert it to a hashable type, such as a tuple of key-value pairs.
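
As a minimal sketch of the Python behaviour described above (not Dify's actual code path): hashing a dict raises this TypeError, while a tuple of its key-value pairs can serve as a key. The helper names `hash_error` and `freeze` and the sample parameters are made up for illustration, and `freeze` assumes every value in the dict is itself hashable.

```python
# Sample parameters, invented for this illustration.
tool_parameters = {"url": "https://dify.ai/", "limit": 5}


def hash_error(value):
    """Return the TypeError message raised by hash(value), or None if hashing works."""
    try:
        hash(value)
    except TypeError as exc:
        return str(exc)
    return None


def freeze(params):
    """Turn a flat dict into a hashable tuple of (key, value) pairs, sorted by key.

    Assumes every value is hashable; a nested dict or list would still fail.
    """
    return tuple(sorted(params.items()))


# A dict cannot be hashed, so it cannot be a dict key or a set member.
error = hash_error(tool_parameters)
print(error)

# The frozen form can be used as a key, regardless of the original key order.
cache = {freeze(tool_parameters): "cached result"}
print(cache[freeze({"limit": 5, "url": "https://dify.ai/"})])
```

Sorting the pairs makes the key independent of insertion order, so two dicts with the same contents map to the same entry.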

Here is the relevant part of the code for reference:

class CrawlTool(BuiltinTool):
    def _invoke(self, user_id: str, tool_parameters: dict[str, Any]) -> ToolInvokeMessage:
        app = FirecrawlApp(
            api_key=self.runtime.credentials["firecrawl_api_key"], base_url=self.runtime.credentials["base_url"]
        )

        scrapeOptions = {}
        payload = {}

        wait_for_results = tool_parameters.get("wait_for_results", True)

        payload["excludePaths"] = get_array_params(tool_parameters, "excludePaths")
        payload["includePaths"] = get_array_params(tool_parameters, "includePaths")
        payload["maxDepth"] = tool_parameters.get("maxDepth")
        payload["ignoreSitemap"] = tool_parameters.get("ignoreSitemap", False)
        payload["limit"] = tool_parameters.get("limit", 5)
        payload["allowBackwardLinks"] = tool_parameters.get("allowBackwardLinks", False)
        payload["allowExternalLinks"] = tool_parameters.get("allowExternalLinks", False)
        payload["webhook"] = tool_parameters.get("webhook")

        scrapeOptions["formats"] = get_array_params(tool_parameters, "formats")
        scrapeOptions["headers"] = get_json_params(tool_parameters, "headers")
        scrapeOptions["includeTags"] = get_array_params(tool_parameters, "includeTags")
        scrapeOptions["excludeTags"] = get_array_params(tool_parameters, "excludeTags")
        scrapeOptions["onlyMainContent"] = tool_parameters.get("onlyMainContent", False)
        scrapeOptions["waitFor"] = tool_parameters.get("waitFor", 0)
        scrapeOptions = {k: v for k, v in scrapeOptions.items() if v not in (None, "")}
        payload["scrapeOptions"] = scrapeOptions or None

        payload = {k: v for k, v in payload.items() if v not in (None, "")}

        crawl_result = app.crawl_url(url=tool_parameters["url"], wait=wait_for_results, **payload)

        return self.create_json_message(crawl_result)

Ensure that tool_parameters is not being used in a way that requires it to be hashable, and the error should be resolved. For instance, the scrapeOptions and payload dictionaries are filtered to remove None or empty values before being used, which helps avoid such issues [1][2][3].
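
To make the filtering step concrete, here is a small sketch of what the two dict comprehensions in the snippet do to values like the ones in a workflow configuration. The helper name `drop_unset` and the sample values are assumptions for illustration. The thread does not establish whether this filtering is related to the reported error.

```python
def drop_unset(options):
    """Keep every entry whose value is neither None nor the empty string."""
    # Membership in a tuple is checked with ==, so values are never hashed here
    # and dict or list values pass through without error.
    return {k: v for k, v in options.items() if v not in (None, "")}


# All scrape options unset: the filtered dict is empty.
scrape_options = drop_unset({"formats": None, "headers": None, "waitFor": None})

# An empty dict is falsy, so `scrape_options or None` becomes None and the
# "scrapeOptions" key is then dropped from the payload as well.
payload = drop_unset(
    {
        "maxDepth": 2,
        "limit": 5,
        "ignoreSitemap": False,
        "webhook": None,
        "scrapeOptions": scrape_options or None,
    }
)
print(payload)

# A dict-valued option survives the filter unchanged.
with_headers = drop_unset({"headers": {"User-Agent": "example"}, "formats": ""})
print(with_headers)
```

Note that `False` and `0` are kept by this filter, because only `None` and `""` are treated as unset.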


@dosubot dosubot bot added the 🐞 bug Something isn't working label Sep 26, 2024
@hjlarry
Contributor

hjlarry commented Sep 26, 2024

I can't reproduce. Can you upload your workflow DSL?

@Halflifefa

Same problem here. The workflow I used is the official template <Interview Preparation (Conditional Parallel)>.
The API key was obtained from https://www.firecrawl.dev/app/api-keys

@hjlarry
Contributor

hjlarry commented Sep 26, 2024

I can now reproduce it in the cloud version,
but it works on my local checkout of the latest branch, so a PR may have already fixed this.

@ugyuji
Contributor Author

ugyuji commented Sep 26, 2024

Here's the DSL content:

app:
  description: ''
  icon: 🤖
  icon_background: '#FFEAD5'
  mode: workflow
  name: Test Workflow
  use_icon_as_answer_icon: false
kind: app
version: 0.1.2
workflow:
  conversation_variables: []
  environment_variables: []
  features:
    file_upload:
      image:
        enabled: false
        number_limits: 3
        transfer_methods:
        - local_file
        - remote_url
    opening_statement: ''
    retriever_resource:
      enabled: false
    sensitive_word_avoidance:
      enabled: false
    speech_to_text:
      enabled: false
    suggested_questions: []
    suggested_questions_after_answer:
      enabled: false
    text_to_speech:
      enabled: false
      language: ''
      voice: ''
  graph:
    edges:
    - data:
        isInIteration: false
        sourceType: start
        targetType: tool
      id: 1712630129285-source-1727334022276-target
      source: '1712630129285'
      sourceHandle: source
      target: '1727334022276'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: tool
        targetType: end
      id: 1727334022276-source-1713020453724-target
      source: '1727334022276'
      sourceHandle: source
      target: '1713020453724'
      targetHandle: target
      type: custom
      zIndex: 0
    nodes:
    - data:
        desc: ''
        selected: false
        title: Start
        type: start
        variables:
        - label: url
          max_length: 256
          options: []
          required: true
          type: text-input
          variable: url
      height: 90
      id: '1712630129285'
      position:
        x: 30
        y: 427
      positionAbsolute:
        x: 30
        y: 427
      selected: true
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        desc: ''
        outputs:
        - value_selector:
          - '1727334022276'
          - text
          variable: output
        selected: false
        title: End
        type: end
      height: 90
      id: '1713020453724'
      position:
        x: 638
        y: 427
      positionAbsolute:
        x: 638
        y: 427
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        desc: ''
        provider_id: firecrawl
        provider_name: firecrawl
        provider_type: builtin
        selected: false
        title: Crawl
        tool_configurations:
          allowBackwardLinks: 0
          allowExternalLinks: 0
          excludePaths: null
          excludeTags: null
          formats: null
          headers: null
          ignoreSitemap: 1
          includePaths: null
          includeTags: null
          limit: 5
          maxDepth: 2
          onlyMainContent: 0
          waitFor: null
          wait_for_results: 1
          webhook: null
        tool_label: Crawl
        tool_name: crawl
        tool_parameters:
          url:
            type: mixed
            value: '{{#1712630129285.url#}}'
        type: tool
      height: 454
      id: '1727334022276'
      position:
        x: 334
        y: 427
      positionAbsolute:
        x: 334
        y: 427
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    viewport:
      x: 307.80000000000007
      y: -8.799999999999955
      zoom: 0.7

The same error can be seen in both local and cloud environments.

@yusa-n

yusa-n commented Sep 27, 2024

I get the same error in the following setups:

  • Dify: Cloud, Self Hosted w/ latest branch.
  • FireCrawl: Cloud, Self Hosted (Docker) w/ latest branch.

My workaround for now is to use JinaReader instead.


@hjlarry
Contributor

hjlarry commented Sep 27, 2024

PR #8391 introduced this issue, and PR #8666 fixes it.
Please wait for the next release, or git pull the latest commit to get the fix.

@JustinWangJP

I also face the same problem. When will you publish a new version?
