Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

part_limit=128 is not very generous #62

Closed
agroszer opened this issue Oct 23, 2024 · 3 comments
Closed

part_limit=128 is not very generous #62

agroszer opened this issue Oct 23, 2024 · 3 comments
Assignees
Labels
Feature Feature requests or other non-bugs

Comments

@agroszer
Copy link

Hi,

part_limit=128 is not very generous, you can easily get to more data/widgets to submit.
Sadly zope.publisher does strict=False so parsing errors silently succeed. See zopefoundation/zope.publisher#80

I'd propose to increase that number, and eventually refactor the limits to module global constants, like this:

BUFFER_SIZE=1024 * 64,
HEADER_LIMIt=8,
HEADERSIZE_LIMIT=1024 * 4 + 128,  # 4KB
PART_LIMIT=128,
PARTSIZE_LIMIT=2**64,  # practically unlimited
SPOOL_LIMIT=1024 * 64,  # Keep fields up to 64KB in memory
MEMORY_LIMIT=SPOOL_LIMIT * PART_LIMIT
DISK_LIMIT=2**64,  # practically unlimited

...

class MultipartParser(object):
    def __init__(
        self,
        stream,
        boundary,
        content_length=-1,
        charset="utf8",
        strict=False,
        buffer_size=BUFFER_SIZE,
        header_limit=HEADER_LIMIT,
        headersize_limit=HEADERSIZE_LIMIT
        part_limit=PART_LIMIT,
        partsize_limit=PARTSIZE_LIMIT,
        spool_limit=SPOOL_LIMIT,
        memory_limit=MEMORY_LIMIT,
        disk_limit=DISK_LIMIT,
        mem_limit=0,
        memfile_limit=0,

That would make life much easier by being able to override defaults by patching.

@defnull
Copy link
Owner

defnull commented Oct 23, 2024

I suspected that the lower default limit may bite someone, but since most of the other limits are per-segment, the number of segments must be limited to protect against malicious requests. 128 segments á 96KB is 12MB already, that's a lot per request. My reasoning was that 128 is very generous for 'normal' forms, and for use cases where you have even more fields, you may want to tweak the limits anyway. So, just increasing the defaults may not be the best idea.

The proposed solution with the module level constants also won't work, as python evaluates default values for functions only once. Moving the default-fallbacks into the function body so they are evaluated at runtime (e.g. foo(arg=None): if arg is None: arg = DEFAULT) obscures the method signature and makes it less self-explanatory. Non-constant module globals are also kind of an anti-pattern. It is quite useful to be able to change the limits depending on the endpoint. File uploads need large size limits. Huge auto-generated forms with lots of small fields need lower size limits but a higher part count limit. Mutable global defaults would be a step backwards.

But there definitely is room for improvement: It is not well documented that parse_form_data accepts all limits understood by MultipartParser. That could be better.

And, more importantly, there is a bug or design error that I already noticed, but did not fix yet because it breaks backwards compatibility: Fatal errors that result in wrong or incomplete data always trigger an exception, even in non-strict mode. Reaching limits is a fatal error. The application should be notified that the form could not be parsed completely and some data may be missing. That's the idea, but parse_form_data suppresses all errors in non-strict mode. My reasoning back then was probably to make this helper function as painless as possible, but there should at least be an ignore_errors toggle to control this behavior. And that toggle should be off by default in future releases. strict=False should try to support broken clients, but it should not suppress fatal errors.

Tasks:

  • Improve documentation on limits, both in readme and docstrings
  • Introduce a parse_form_data(ignore_errors=True) parameter that is on by default for now, but should change to off-by-default in future releases.

@defnull defnull self-assigned this Oct 23, 2024
@defnull defnull added the Feature Feature requests or other non-bugs label Oct 23, 2024
@defnull
Copy link
Owner

defnull commented Oct 23, 2024

The error handling improvements are now tracked in #63

@defnull
Copy link
Owner

defnull commented Nov 28, 2024

Documentation is now also greatly improved. I hope this is in a good state now to support developers in deciding for reasonable limits for their use case.

@defnull defnull closed this as completed Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Feature requests or other non-bugs
Projects
None yet
Development

No branches or pull requests

2 participants