Lightfuzz unhandled error with certain original_value characters coming from XML data #2182

Open
liquidsec opened this issue Jan 16, 2025 · 0 comments

 File "/home/zzzz/.cache/pypoetry/virtualenvs/bbot-N1wmp7PR-py3.12/lib/python3.12/site-packages/httpx/_urlparse.py", line 168, in urlparse
    raise InvalidURL(error)
httpx.InvalidURL: Invalid non-printable ASCII character in URL, '\n' at position 44.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zzzz/bbot/bbot/core/engine.py", line 417, in run_and_return
    result = await command_fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zzzz/bbot/bbot/core/helpers/web/engine.py", line 84, in request
    async with self._acatch(url, raise_error):
  File "/usr/lib/python3.12/contextlib.py", line 231, in __aexit__
    await self.gen.athrow(value)
  File "/home/zzzz/bbot/bbot/core/helpers/web/engine.py", line 211, in _acatch
    f"Invalid URL (possibly due to dangerous redirect) on request to : {url}: {truncate_string(e, 200)}"
                                                                               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zzzz/bbot/bbot/core/helpers/misc.py", line 836, in truncate_string
    if len(s) > n:
       ^^^^^^
TypeError: object of type 'InvalidURL' has no len()

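Two failures are visible above. First, httpx rejects the request URL because it contains a '\n'. Second, while handling that error, _acatch passes the InvalidURL exception object itself to truncate_string, which calls len() on it and raises the TypeError that goes unhandled. The newline appears to come from parsing XML content to harvest parameters from it. Problematic XML: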

<channel>
	<title>REDACTED</title>
	<atom:link href="https://REDACTED/feed/" rel="self" type="application/rss+xml" />
	<link>https://REDACTED</link>
	<description></description>
	<lastBuildDate>Mon, 06 Jan 2025 20:51:30 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://REDACTED/?v=6.6.2</generator>
</channel>
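
A minimal sketch of where the stray characters likely come from, assuming the parameter values are taken from element text. It uses the standard library's xml.etree.ElementTree, not BBOT's actual extraction code, and a reduced copy of the feed above. The xmlns:sy declaration is an assumption added so the fragment parses on its own; the original snippet does not show it.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the feed above. The xmlns:sy declaration is assumed here
# so that the "sy:" prefix is bound and the fragment parses standalone.
SAMPLE = (
    '<channel xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">\n'
    "\t<language>en-US</language>\n"
    "\t<sy:updatePeriod>\n"
    "\thourly\t</sy:updatePeriod>\n"
    "\t<sy:updateFrequency>\n"
    "\t1\t</sy:updateFrequency>\n"
    "</channel>"
)

root = ET.fromstring(SAMPLE)

# Map each child's local tag name to its raw text, exactly as the parser
# returns it (leading newline and tabs included).
raw_values = {child.tag.split("}")[-1]: child.text for child in root}

for name, value in raw_values.items():
    print(f"{name}: {value!r}")
```

The text of sy:updatePeriod and sy:updateFrequency keeps its leading newline and surrounding tabs, so a value used unmodified as original_value would carry a '\n' into the request URL.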

Possible solution:

Strip the whitespace between XML tags before processing, and/or remove special characters from the final original_value.
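
A rough sketch of both ideas, not the actual fix. clean_original_value and safe_truncate are hypothetical helper names, not BBOT functions. The traceback only shows the len() check inside truncate_string, so the truncation format below is assumed.

```python
import re

# ASCII control characters (includes \n, \r, \t) plus DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def clean_original_value(value):
    """Hypothetical helper: drop control characters and outer whitespace
    from a harvested parameter value before it is used in a URL."""
    if value is None:
        return ""
    return _CONTROL_CHARS.sub("", value).strip()


def safe_truncate(s, n):
    """Hypothetical helper: coerce to str first so that exception objects
    (such as httpx.InvalidURL) can be truncated without a TypeError."""
    s = str(s)
    if len(s) > n:
        return s[:n] + "..."
    return s


print(repr(clean_original_value("\n\thourly\t")))
print(safe_truncate(ValueError("x" * 300), 200))
```

Cleaning the value addresses the InvalidURL error at its source, while coercing to str would keep the error handler from raising a second exception if some other malformed URL gets through.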

liquidsec added the bug and lightfuzz labels on Jan 16, 2025
liquidsec self-assigned this on Jan 16, 2025