"View as HTML" page should be generated from censored binary documents #934

Closed
hsenag opened this issue May 10, 2013 · 2 comments
Labels
easier-admin (Make issues easier to resolve), f:redaction, improvement (Improves existing functionality: UI tweaks, refactoring, performance, etc.), x:volunteer

Comments

@hsenag
Collaborator

hsenag commented May 10, 2013

Currently, PDF attachments are first converted to HTML, and then the PDF and HTML versions are censored separately. Sometimes a PDF needs custom censor rules based on its internal representation. When that happens, a separate "plain text" rule is also needed to catch the HTML version, and it's easy to forget to add one.

There's also a small risk that future changes to the HTML extraction would defeat the censor rule - the closer to the original source the censoring is done, the better.
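
To illustrate the ordering being proposed, here is a minimal sketch in Python. It is not the project's actual code: `apply_censor_rules`, `to_html` and `view_as_html` are hypothetical names, and `to_html` is only a stand-in for whatever converter really produces the HTML view. The point is just that the rules run once, on the original bytes, and the HTML is derived from the already-censored result.

```python
import html
import re


def apply_censor_rules(data: bytes, rules: list[tuple[bytes, bytes]]) -> bytes:
    """Apply each (pattern, replacement) rule to the raw attachment bytes."""
    for pattern, replacement in rules:
        data = re.sub(pattern, replacement, data)
    return data


def to_html(data: bytes) -> str:
    """Hypothetical stand-in for the real binary-to-HTML conversion step."""
    text = data.decode("latin-1")  # latin-1 maps every byte, so this never raises
    return "<pre>" + html.escape(text) + "</pre>"


def view_as_html(original: bytes, rules: list[tuple[bytes, bytes]]) -> str:
    """Censor the binary first, then generate the HTML from the censored bytes."""
    censored = apply_censor_rules(original, rules)
    return to_html(censored)


# Assumed example rule: one binary rule covers both the download and the HTML view.
rules = [(rb"secret@example\.com", b"[redacted]")]
page = view_as_html(b"Contact secret@example.com & co", rules)
```

With this ordering there is no second "plain text" rule to remember, and a change to the conversion step cannot reintroduce text that the binary rule already removed.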

@HelenWDTK
Contributor

+1. I frequently have to create extra censor rules specifically to deal with character encoding in the HTML preview. The current approach also makes removing images from PDF files using censor rules much harder than it needs to be.

@gbp
Member

gbp commented Nov 3, 2023

Done in #7681

@gbp closed this as completed Nov 3, 2023