Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: make JSON indentation configurable when merging COCO files #171

Merged

Conversation

philenius
Copy link
Contributor

Thank you for providing this great library. I had the issue that merging two COCO files (each 2.8 MB in size) resulted in a 19 MB large COCO JSON file (too large in my opinion). The huge difference in size was caused by the JSON indentation which is set to 2 spaces:

with open(output_file, "w") as f:
json.dump(output, f, indent=2)

Previously:

Merging two COCO files with fixed indentation of 2 spaces:

from pyodi.apps import coco
coco.coco_merge('coco_annotations1.json', 'coco_annotations2.json', 'output.json')

resulted in:

-rw-rw-r--  1 phil phil 2,8M Dez 13 18:59 coco_annotations1.json
-rw-rw-r--  1 phil phil 2,8M Dez 13 18:59 coco_annotations2.json
-rw-rw-r--  1 phil phil  19M Dez 13 19:01 output.json

With my changes:

Merging the same two COCO files with indentation disabled by setting json_indentation=None:

from pyodi.apps import coco
coco.coco_merge('coco_annotations1.json', 'coco_annotations2.json', 'output.json', json_indentation=None)

results in:

-rw-rw-r--  1 phil phil 2,8M Dez 13 18:59 coco_annotations1.json
-rw-rw-r--  1 phil phil 2,8M Dez 13 18:59 coco_annotations2.json
-rw-rw-r--  1 phil phil 5,6M Dez 13 19:01 output.json

@philenius philenius changed the title Feature/make json indentation configurable Feature: make JSON indentation configurable when merging COCO files Dec 13, 2021
Copy link
Collaborator

@daavoo daavoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

I would suggest adding a simple check (in https://github.com/Gradiant/pyodi/blob/master/tests/apps/test_coco_merge.py) ensuring that indent is passed to json.dump.

pyodi/apps/coco/coco_merge.py Outdated Show resolved Hide resolved
pyodi/apps/coco/coco_merge.py Outdated Show resolved Hide resolved
@philenius
Copy link
Contributor Author

@daavoo thank you for your feedback. I applied all of your suggestions.

As requested, I have added a unit test where I patch the call to json.dump() and assert that it was called with indent=None and indent=2. I hope I have understood you correctly.

@philenius philenius requested a review from daavoo December 13, 2021 20:54
@mmeendez8 mmeendez8 merged commit 3bfd8bb into Gradiant:master Dec 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants