"View as HTML" page should be generated from censored binary documents #934

hsenag · 2013-05-10T06:40:06Z

Currently, PDF attachments are first converted to HTML, and then the PDF and HTML versions are censored separately. Sometimes PDFs need custom censor rules based on the internal representation of the PDF, so it's also necessary to add a "plain text" rule to catch the HTML version, and it's easy to forget to do this.

There's also a small risk that future changes to the HTML extraction would defeat the censor rule - the closer to the original source the censoring is done, the better.
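To illustrate the ordering problem, here is a minimal sketch and not the project's actual code. `CensorRule`, `to_html`, `html_from_uncensored` and `html_from_censored` are hypothetical names, and the "extraction" step only escapes HTML special characters, as a stand-in for whatever the real PDF-to-HTML conversion does to the text.

```ruby
# Hypothetical stand-in for a censor rule: replaces every match of a pattern.
CensorRule = Struct.new(:pattern, :replacement) do
  def apply(text)
    text.gsub(pattern, replacement)
  end
end

# Hypothetical stand-in for HTML extraction. It only escapes HTML special
# characters, which is already enough to change the text a rule sees.
HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

def to_html(source)
  "<p>#{source.gsub(/[&<>]/, HTML_ESCAPES)}</p>"
end

# Ordering described in the issue: convert first, then censor the HTML.
def html_from_uncensored(source, rules)
  rules.reduce(to_html(source)) { |html, rule| rule.apply(html) }
end

# Proposed ordering: censor the source, then generate the HTML from it.
def html_from_censored(source, rules)
  censored = rules.reduce(source) { |data, rule| rule.apply(data) }
  to_html(censored)
end

rule = CensorRule.new(/Smith & Jones/, '[REDACTED]')
source = 'Contact: Smith & Jones'

# The rule misses here because "&" has already become "&amp;".
puts html_from_uncensored(source, [rule])
# The rule applies here because it runs before extraction.
puts html_from_censored(source, [rule])
```

In this sketch a rule written against the source never needs a second "plain text" variant, because the HTML is only ever derived from already-censored data.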

HelenWDTK · 2023-06-05T07:12:43Z

+1. I'm frequently having to create extra censor rules specifically to deal with character encoding in the HTML preview. It also makes removing images from PDF files using censor rules much harder than it needs to be.

gbp · 2023-11-03T14:29:00Z

Done in #7681

garethrees added the easier-admin (Make issues easier to resolve), x:uk and x:volunteer labels and removed the x:uk label Nov 5, 2014

crowbot added the 0 - backlog label Sep 21, 2017

garethrees added the f:redaction and improvement (Improves existing functionality: UI tweaks, refactoring, performance, etc) labels May 29, 2018

garethrees mentioned this issue Sep 1, 2020

Sketch out the simplest thing we could do such that future changes in the code can't change the effect of existing censor rules #3903

Closed

garethrees removed the backlog label Feb 11, 2022

garethrees mentioned this issue May 17, 2022

FoiAttachment#body should apply masks internally #3723

Closed

gbp closed this as completed Nov 3, 2023
