anonymize_dataset fails if dataset contains RawDataElement #85
Hi @GuillaumeDehaene, please forgive my slow response; I now have a bit of time to work on this project. I understand the problem, but I'm wondering how it can happen. Do those RawDataElement come directly from DICOM files, or are you anonymizing a custom dataset? To fix this issue, instead of parsing the dataset twice to sanitize it, couldn't we just drop the tag if we detect that it is a RawDataElement? This could be done in
Hello @pchoisel, no worries. Thank you for maintaining this project. This came from a real DICOM dataset, but I'm not in contact with the person who generated it, so I don't have any information about which software produced it or what kind of data it holds. I guess we could also throw out any raw data, but that feels very rough, doesn't it? Anyway, I've made my case and I think those are the options. Thank you again for your work on this project.
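I definitely don't want to parse the dataset twice. That's mainly because people (at least us) use this software in automated processes that anonymize lots of DICOM files. I'm fine with not dropping the RawDataElement. Have you checked the function DataElement_from_raw (https://pydicom.github.io/pydicom/dev/reference/generated/pydicom.dataelem.DataElement_from_raw.html)? Maybe we could use it to handle the RawDataElement. If you want to keep those tags as RawDataElement, maybe you can re-transform them before writing the output file? What do you think?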
Hello
If you don't want to double-parse, then it's probably possible to rewrite
the parsing code to:
- check whether each element is raw
- replace raw elements with standard elements
- apply anonymization
I'll write it out. I'll try to have a draft done by the end of September,
hopefully.
Best
Guillaume
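
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: `RawElement`, `Element`, `from_raw` and `anonymize_single_pass` are hypothetical stand-ins (for pydicom's RawDataElement, DataElement and the DataElement_from_raw conversion mentioned in this thread), and the dataset is modelled as a plain dict so the sketch runs without pydicom.

```python
from typing import Callable, Dict, NamedTuple, Tuple, Union

Tag = Tuple[int, int]


class RawElement(NamedTuple):
    """Hypothetical stand-in for pydicom's RawDataElement (immutable)."""
    tag: Tag
    value: bytes


class Element:
    """Hypothetical stand-in for pydicom's DataElement (mutable)."""

    def __init__(self, tag: Tag, value: str) -> None:
        self.tag = tag
        self.value = value


def from_raw(raw: RawElement) -> Element:
    """Assumed conversion step; real code would call pydicom's converter."""
    return Element(raw.tag, raw.value.decode("ascii"))


Rule = Callable[[Element], None]


def anonymize_single_pass(
    dataset: Dict[Tag, Union[RawElement, Element]],
    rules: Dict[Tag, Rule],
) -> None:
    """Walk the dataset once: convert raw elements, then apply the rule."""
    for tag in list(dataset):
        element = dataset[tag]
        # Steps 1 and 2: detect a raw element and replace it in place.
        if isinstance(element, RawElement):
            element = from_raw(element)
            dataset[tag] = element
        # Step 3: apply the anonymization rule registered for this tag, if any.
        rule = rules.get(tag)
        if rule is not None:
            rule(element)


def blank(element: Element) -> None:
    """Example rule: empty the value with a plain assignment."""
    element.value = ""


dataset: Dict[Tag, Union[RawElement, Element]] = {
    (0x0010, 0x0010): RawElement((0x0010, 0x0010), b"DOE^JOHN"),
    (0x0008, 0x0060): Element((0x0008, 0x0060), "MR"),
}
anonymize_single_pass(dataset, {(0x0010, 0x0010): blank})
```

Because conversion and rule application happen in the same loop, each element is visited only once, which is the constraint raised earlier in the thread.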
On Mon, Aug 5, 2024 at 15:43, pchoisel ***@***.***> wrote:
Hi @GuillaumeDehaene <https://github.com/GuillaumeDehaene>,
I definitely don't want to parse the dataset twice. That's mainly because
people (at least us) use this software in automated processes that
anonymize lots of DICOM files.
I'm fine with not dropping the RawDataElement. Have you checked the
function DataElement_from_raw
<https://pydicom.github.io/pydicom/dev/reference/generated/pydicom.dataelem.DataElement_from_raw.html>
? Maybe we could use it to handle the RawDataElement. If you want to keep
those tags as RawDataElement, maybe you can re-transform them before
writing the output file ?
What do you think ?
That sounds perfect!
Hello,
Thank you for this great project.
While using it in our codebase, I have found the following issue.
anonymize_dataset fails if dataset contains RawDataElement
anonymize_dataset fails if the dataset contains RawDataElement. The assignment
element.value = new_value
fails. In my case, it's in replace_element_UID (https://github.com/KitwareMedical/dicom-anonymizer/blob/master/dicomanonymizer/simpledicomanonymizer.py#L96), but I believe that all other rules have the same issue.
Proposed solution
I would be happy to create a PR to fix this issue here too.
I see two possibilities:
I'm partial to solution 1:
- it's simple and easy to understand and review.
- it makes it easier for users to add custom rules, since they can assume that all elements are DataElement (never RawDataElement) and use the simpler element.value = new_value syntax.
- however, it also means that input datasets are walked through twice. I feel that this price is worth paying.
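
To make the trade-off concrete, here is a minimal sketch of the failure and of a sanitize-first pass. It assumes stand-in classes rather than pydicom itself: `RawElement` mimics RawDataElement (a named tuple in pydicom, so its fields cannot be reassigned), while `Element`, `sanitize` and `replace_uid` are hypothetical names used only for illustration.

```python
from typing import Dict, NamedTuple, Tuple, Union

Tag = Tuple[int, int]


class RawElement(NamedTuple):
    """Hypothetical stand-in for pydicom's RawDataElement (a named tuple)."""
    tag: Tag
    value: bytes


class Element:
    """Hypothetical stand-in for pydicom's DataElement (mutable)."""

    def __init__(self, tag: Tag, value: str) -> None:
        self.tag = tag
        self.value = value


# The failure: assigning to a field of a named tuple raises AttributeError.
raw = RawElement((0x0020, 0x000D), b"1.2.3")
try:
    raw.value = b"9.9.9"
    assignment_failed = False
except AttributeError:
    assignment_failed = True


def sanitize(dataset: Dict[Tag, Union[RawElement, Element]]) -> None:
    """First walk: swap every raw element for a mutable one."""
    for tag, element in list(dataset.items()):
        if isinstance(element, RawElement):
            # Assumed conversion; real code would use pydicom's converter.
            dataset[tag] = Element(tag, element.value.decode("ascii"))


def replace_uid(element: Element) -> None:
    """Example custom rule relying on the simple assignment syntax."""
    element.value = "2.25.0"  # placeholder value, not a real UID mapping


dataset: Dict[Tag, Union[RawElement, Element]] = {raw.tag: raw}
sanitize(dataset)                    # first walk over the dataset
for element in dataset.values():     # second walk: apply the rules
    replace_uid(element)
```

The rule stays a one-line assignment, at the cost of the extra walk performed by `sanitize`.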
Best
Guillaume