PYTHON-1982 Update Invalid Document error message to include doc #1854

Merged 5 commits on Dec 3, 2024
5 changes: 4 additions & 1 deletion bson/__init__.py
@@ -1006,7 +1006,10 @@ def _dict_to_bson(
             elements.append(_name_value_to_bson(b"_id\x00", doc["_id"], check_keys, opts))
         for key, value in doc.items():
             if not top_level or key != "_id":
-                elements.append(_element_to_bson(key, value, check_keys, opts))
+                try:
+                    elements.append(_element_to_bson(key, value, check_keys, opts))
+                except InvalidDocument as err:
+                    raise InvalidDocument(f"Invalid document {doc} | {err}") from err
Member

I'm not sure this is the pattern we want. In the C version of this code we recursively call write_dict() for subdocuments, which means that if we hit an error in a nested field, the message will be something like:
Invalid document {"a": {"b": {"c": ...}}} | Invalid document {"b": {"c": ...}} | Invalid document {"c": ...} | ...

Is that the intended behavior?

Contributor Author

@navjots18 navjots18 Sep 16, 2024

@ShaneHarvey You're right. We can check the top_level param to prevent this from happening.
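
A minimal sketch of that guard, not the driver's actual code. It assumes a stand-in `InvalidDocument` class and two hypothetical helpers, `check_value` and `check_doc`, that recurse into subdocuments the way the encoder does. The document is prepended to the message only when `top_level` is true; nested calls re-raise the error unchanged.

```python
class InvalidDocument(Exception):
    """Stand-in for bson.errors.InvalidDocument so the sketch runs on its own."""


def check_value(value):
    """Hypothetical validator: accept str, int, and nested dicts only."""
    if isinstance(value, dict):
        # Subdocuments recurse with top_level=False.
        check_doc(value, top_level=False)
    elif not isinstance(value, (str, int)):
        raise InvalidDocument(f"cannot encode object: {value!r}")


def check_doc(doc, top_level=True):
    """Walk a document and prepend it to the error only at the top level."""
    try:
        for value in doc.values():
            check_value(value)
    except InvalidDocument as err:
        if top_level:
            raise InvalidDocument(f"Invalid document {doc} | {err}") from err
        raise  # nested level: pass the original error up untouched


def error_message(doc):
    """Return the error text for an invalid document, or None if it is valid."""
    try:
        check_doc(doc)
    except InvalidDocument as err:
        return str(err)
    return None
```

Calling `error_message({"a": {"b": {"c": 1.5}}})` yields a message with a single `Invalid document` prefix that shows the outermost document. Without the `if top_level` check, each nesting level would add its own prefix, which is the chained message described in the review comment.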

     except AttributeError:
         raise TypeError(f"encoder expected a mapping type but got: {doc!r}") from None

35 changes: 35 additions & 0 deletions bson/_cbsonmodule.c
@@ -1743,6 +1743,41 @@ int write_dict(PyObject* self, buffer_t buffer,
         while (PyDict_Next(dict, &pos, &key, &value)) {
             if (!decode_and_write_pair(self, buffer, key, value,
                                     check_keys, options, top_level)) {
+                if (PyErr_Occurred()) {
+                    PyObject *etype = NULL, *evalue = NULL, *etrace = NULL;
+                    PyErr_Fetch(&etype, &evalue, &etrace);
+                    PyObject *InvalidDocument = _error("InvalidDocument");
Member

There are a lot of missing error/NULL checks in this code. I'm not comfortable merging this for 4.9, so it will need to wait for the next release.

Contributor Author

@blink1073 @ShaneHarvey I based this on code already written in the same file. Could you point me to some existing code that shows which error/NULL checks I should add here?

Member

Hi @navjots18, apologies for the delay. I think all we're missing in this block is initializing the etype, evalue, and etrace to NULL as we do here. Otherwise I agree this matches the rest of the existing code block.

Contributor Author

@navjots18 navjots18 Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @blink1073, I've fixed these issues. Can you review again?


+                    if (top_level && InvalidDocument && PyErr_GivenExceptionMatches(etype, InvalidDocument)) {
+
+                        Py_DECREF(etype);
+                        etype = InvalidDocument;
+
+                        if (evalue) {
+                            PyObject *msg = PyObject_Str(evalue);
+                            Py_DECREF(evalue);
+
+                            if (msg) {
+                                // Prepend doc to the existing message
+                                PyObject *dict_str = PyObject_Str(dict);
+                                PyObject *new_msg = PyUnicode_FromFormat("Invalid document %s | %s", PyUnicode_AsUTF8(dict_str), PyUnicode_AsUTF8(msg));
+                                Py_DECREF(dict_str);
+
+                                if (new_msg) {
+                                    evalue = new_msg;
+                                }
+                                else {
+                                    evalue = msg;
+                                }
+                            }
+                        }
+                        PyErr_NormalizeException(&etype, &evalue, &etrace);
+                    }
+                    else {
+                        Py_DECREF(InvalidDocument);
+                    }
+                    PyErr_Restore(etype, evalue, etrace);
+                }
                 return 0;
             }
         }
1 change: 1 addition & 0 deletions doc/contributors.rst
@@ -102,3 +102,4 @@ The following is a list of people who have contributed to
- Ivan Lukyanchikov (ilukyanchikov)
- Terry Patterson
- Romain Morotti
- Navjot Singh (navjots18)
13 changes: 13 additions & 0 deletions test/test_bson.py
@@ -1099,6 +1099,19 @@ def __repr__(self):
         ):
             encode({"t": Wrapper(1)})
 
+    def test_doc_in_invalid_document_error_message(self):
+        class Wrapper:
+            def __init__(self, val):
+                self.val = val
+
+            def __repr__(self):
+                return repr(self.val)
+
+        self.assertEqual("1", repr(Wrapper(1)))
+        doc = {"t": Wrapper(1)}
+        with self.assertRaisesRegex(InvalidDocument, f"Invalid document {doc}"):
+            encode(doc)
+
 
 class TestCodecOptions(unittest.TestCase):
     def test_document_class(self):
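
A hedged side note on the new test: it interpolates the document's repr straight into the regex passed to `assertRaisesRegex`. That works for `{'t': 1}` because the braces are treated as literals there, but a repr containing characters such as `[` or `]` would change the pattern's meaning. The sketch below shows one way to match the prefix literally with `re.escape`. The `prefix_pattern` helper and the message text after `|` are assumptions made for illustration; only the `Invalid document <doc> | ...` format comes from the PR.

```python
import re


class Wrapper:
    """Same helper as in the test: repr() delegates to the wrapped value."""

    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return repr(self.val)


def prefix_pattern(doc):
    """Hypothetical helper: regex matching the 'Invalid document <doc>' prefix literally."""
    return re.escape(f"Invalid document {doc}")


doc = {"t": Wrapper(1)}
# Assumption: a stand-in message; the text after "|" is a placeholder.
message = f"Invalid document {doc} | <original error text>"
assert re.search(prefix_pattern(doc), message)

# A list value puts "[" and "]" in the repr. Unescaped, they form a character
# class, so the raw pattern no longer matches its own text.
list_doc = {"t": [Wrapper(1)]}
list_message = f"Invalid document {list_doc} | <original error text>"
assert re.search(f"Invalid document {list_doc}", list_message) is None
assert re.search(prefix_pattern(list_doc), list_message)
```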