Meeting Notes

Alexis Métaireau edited this page Oct 15, 2024 · 35 revisions

Monday - 2024-10-14

Alexis (@almet)

Done:

- Reviews for on host conversion + Pymupdf vendorizing
- Added tests for #193 (PR to come). I changed the approach midway, and the tests now mock subprocess.Popen
- Merged the github issue templates
- Debug session with @apyrgio, finding out why colima wasn't working.
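
The mocking approach mentioned above for #193 could look roughly like the sketch below. This is only an illustration, not the test from the upcoming PR: `install_image`, `run_with_fake_popen`, and the `podman load` command line are hypothetical stand-ins.

```python
import subprocess
from unittest import mock


def install_image(tarball: str) -> bool:
    """Hypothetical helper: load a container image, return True on success."""
    proc = subprocess.Popen(
        ["podman", "load", "-i", tarball],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    proc.communicate()
    return proc.returncode == 0


def run_with_fake_popen(returncode: int) -> bool:
    """Call install_image() while subprocess.Popen is replaced by a mock."""
    with mock.patch("subprocess.Popen") as fake_popen:
        fake_proc = fake_popen.return_value
        fake_proc.communicate.return_value = (b"", b"simulated error output")
        fake_proc.returncode = returncode
        return install_image("container.tar")
```

With this shape, a test can simulate both a successful and a failing installation without needing a container runtime on the CI machine.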

Todo:

- [ ] Finish #193 and propose a PR
- [ ] Continue on the design document and explorations around independent container updates
- [ ] Create the issue about using click + argparse
- [ ] Release the Ubuntu 24.10 deb

Alex (@apyrgio)

Done:

- Final review of #939 (#941 is blocked on the on-host conversion PR)
- A few more changes on #940 (vendor PyMuPDF) and #748 (on-host conversion)
- Debug session with Alexis about Colima
- Collected some highlights of Dangerzone for 2024

Todo:

- [x] Fix the xvfb issue in our tests
- [ ] Send a PR for our release instructions about RCs/betas
- [ ] Release a Fedora 41 RPM
- [ ] Add logger to dangerzone/__init__.py, and merge the PR for vendoring PyMuPDF
- [ ] Finalize some additions to the on-host conversion PR, based on Alexis' comments
- [ ] Check merge queue feature in GitHub (leftover from last week)
- [ ] Check internally if we can alleviate our Windows release pains by using SignPath and signing on a GitHub actions pipeline (leftover from last week)

Discussion points

  • Tooling: We currently use overlapping projects in different parts of the code (for instance click and argparse). Should we use the same tools everywhere? Can it be click and requests everywhere?

    • ... or the other way around?
    • Why do so in the first place? Uniform tooling, one way to do things. Better in terms of tracking dependencies. Some options are simpler.
    • What if the alternative is a standard library one? Stable API, could be a tad better now. Could work for simple tasks. No need for a package manager.
    • Side effect of using the same libraries for all scripts and code: Need for package manager for running the scripts, install dev deps, maybe add an extra step in our GitHub actions.
      • (@apyrgio) If we make our users enter a Python env to run our build scripts, what will happen when packaging a .deb / .rpm? What if the distro needs a Python package bundled with rpm/apt, and our Python environment does not include it? May be a far-fetched concern...
      • Should we replace click with argparse in our main code? Alexis uses click nowadays, Alex probably has as well recently, and click was Micah's original choice. So let's stick with it.
      • Next steps:
        • Create an issue for this (uniformization of utils)
        • Tag it as a "good first issue"
        • Check with a CI test that having a Poetry environment for our dev scripts works
        • Then, allocate some time to replace urllib with requests and argparse with click.
  • Tests: Our tests currently break when we set up xvfb.

  • Linux platforms support: Maybe update our release instructions to check not only if our supported distros have new versions out, but also if they have release candidates out as well, so that we can be prepared.

    • We had the following discussion a few months ago:
      • Track upcoming distros and open a ticket when there's a pre-release version of a supported distro, e.g., "Support Debian trixie"
      • Add row to CI testing matrix
      • Manual QA
      • Monitor: Fedora, Debian, Ubuntu release and prerelease dates
      • Ticket for supporting a new version would typically be closed after the Linux distribution's stable release, but decoupled from Dangerzone releases
      • To check -- can we automate alerting for new versions / vs. monthly reminders?

    • Next steps:
      • Monthly reminder for finding out upcoming releases for Fedora, Ubuntu, and Debian.
      • Update our release instructions to say that we should check about EOL distros, new distros, and RCs / betas.
      • If a release is up and coming (RC or beta), we should:
        • Add it in our QA and create packages for it.
        • If QA passes, release a .deb/.rpm package for it, unless there's a Dangerzone release about to come out soon.
      • Turn the above into an issue for 0.8.0 milestone.
  • Fedora release is out, add support for it :-)

    • Inform the user about it
    • Add QA for it, create a package from our 0.7.1 branch (and container image)
    • Do the same for Ubuntu 24.10, which came out
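
To illustrate the "standard library" side of the tooling discussion above, here is a minimal argparse sketch that runs without a package manager or virtual environment. The program name, the `document` argument, and the `--debug` flag are illustrative assumptions, not the project's actual CLI.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    """Build a small command-line parser using only the standard library."""
    parser = argparse.ArgumentParser(prog="example-dev-script")
    parser.add_argument("document", help="path of the document to process")
    parser.add_argument("--debug", action="store_true", help="enable debug logging")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    """Parse argv and configure logging according to --debug."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)
    return args
```

The notes settle on click for the main code; a sketch like this mainly shows what a dependency-free dev script would look like if that trade-off were ever revisited.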

Wednesday - 2024-10-09

Alexis (@almet)

Done:

- Added a PR adding a --debug flag to dangerzone-cli
- Progress on #193 (Container installation error). Started writing tests
- Did a bit of research around #745 (Research on Independent container updates) to better understand the landscape around image signing. Started drafting a design document.
- Reviewed Bump H2ORestart to version 0.6.6 #943
- Reviewed Preparations for the on-host conversion PR #932
- Updated #920 according to feedback (GH issue templates)

Todo:

- Continue research on #745
- Add tests to #193 and propose PR for review
- Review Perform on-host conversion for the pixels to PDF stage #748

Alex (@apyrgio)

Done:

- Merged the first PR that I sent in preparation for the on-host conversion feature (https://github.com/freedomofpress/dangerzone/pull/932)
- Sent the second PR towards adding the on-host conversion support feature (https://github.com/freedomofpress/dangerzone/pull/940). This time, it's about vendoring the latest PyMuPDF package in Dangerzone, when building our Debian packages.
- Merged an outstanding contributor PR that handled illegal filenames (https://github.com/freedomofpress/dangerzone/pull/942)
- Sent and merged a PR for an outdated H2ORestart plugin (the one we use so that South Koreans can convert .hwp files)
    - It seems that our CI has stopped failing since then.
- Started reviewing two PRs by Alexis (GH issue templates, --debug flag in dangerzone-cli)
- Tidied up and polished the on-host conversion PR (https://github.com/freedomofpress/dangerzone/pull/748), so that it's ready for review once more.

Todo:

- [x] Final review of the two outstanding PRs by Alexis (#941, #939)
- [ ] Reply and hopefully merge the PR for vendoring PyMuPDF.
- [ ] Check merge queue feature in GitHub (leftover from last week)
- [ ] Check internally if we can alleviate our Windows release pains by using SignPath and signing on a GitHub actions pipeline (leftover from last week)
- [x] Collect noteworthy stories for Dangerzone for our annual impact report

Discussion

  • About #745:
    • Alexis would like to check the container signing landscape first, before committing on handling the signing stuff on our own.
    • What we want seems pretty simple, so maybe we can reuse something that is out there.
  • Impact report!
    • We should list the highlights of 2024 for the Dangerzone project. The ones that would interest our backers should go to our impact report. The rest may be helpful for (see below)
    • Idea to also publish part of it as a blogpost on dangerzone.rocks
  • Possible metrics for on-host conversion PR
    • Container image size. 680MiB -> 490MiB (!!)
    • The application size. Currently it stays the same (maybe except for Linux)
    • OCR performance: how much faster is it for Windows and macOS users?
    • Test coverage: are more tests failing right now (large tests)?
    • Complexity: One less container, no more mounts and pixels as files, more inline with Qubes.
    • Not metrics, but:
      • We need to change our documentation
      • Make language packs in Linux into suggests (or recommends), which would help Tails as well.
  • Should we include macOS entitlements (https://github.com/freedomofpress/dangerzone/pull/639) once we merge the on-host conversion PR?
    • Let's check if they work, on top of the on-host conversion PR. If we can create a .dmg and install it on a macOS system, maybe it's worth merging it.

Monday - 2024-10-07

Alexis (@almet)

Done:

- [x] Shipped the move to github actions
- [x] Reviewed https://github.com/freedomofpress/dangerzone/pull/932
- [x] Merged https://github.com/freedomofpress/dangerzone/pull/926
- [x] Created an issue about nightly builds, publishing the artifacts is "just there".
- [x] Created a PR for the issue templates
- Drafted a working solution for https://github.com/freedomofpress/dangerzone/issues/193 (applied @apyrgio changes + some work on top), needs more testing and we should be good to go
- Had another look at why dz is not working on colima on OSX over the course of the weekend, slow progress there.

TODO:

- [ ] Add tests for the #193 fix
- [ ] Research / spec for independent container updates
    - Summarize what is the direction we want to follow, + technical proposal

Alex (@apyrgio):

- [x] Helped a bit with the GitHub actions migration PR (mainly to fix some Debian packages issues)
- Rebased the on-host conversion PR on top of the latest main branch
- TODO:
    - [x] Reply to Alexis' review comments on #932 and ideally merging it.
    - [x] Review GitHub issue templates PR.
    - [ ] Check merge queue feature in GitHub
    - [x] Send a PR that just vendors PyMuPDF
    - [ ] Check internally if we can alleviate our Windows release pains by using SignPath and signing on a GitHub actions pipeline.
    - [x] Allocate a slot to debug Colima issues. Not the highest priority, but something to have in mind.

Discussion

  • Colima: We have checked disabling AppArmor and there was no improvement.

Wednesday - 2024-10-02

Release of 0.7.1 is out. Let's focus on what's next on the 0.8.0 series.

Preparation for the team meeting tonight. Items to discuss:

- On host conversion: #625
- GHA migration: #674
- Container install failure: #193 <-- I believe we should do this one, basically adding debug logs
- GHI templates
- Should we support Podman Desktop?
    - In addition to Docker Desktop? One way is to move from one to the other. Should we deprecate the support somehow?
    - Would release notes be enough?
    - Should we display a notification to the users?
    - Maybe it's possible to bundle podman, that would be the ideal solution.

Discussion about using uv:

- might speed up the installation

TODOs:

Alexis (@almet):

- [ ] Let's ship this move to github actions!
- [ ] Review https://github.com/freedomofpress/dangerzone/pull/932
- [ ] Merge https://github.com/freedomofpress/dangerzone/pull/926
- [ ] Create an issue about nightly builds, publishing the artifacts is "just there".
- [ ] Create a PR for the issue templates
- [ ] Propose a solution for https://github.com/freedomofpress/dangerzone/issues/193

Alex (@apyrgio):

- Final review of the GitHub actions PR
    - Also, add some commits from 0.7.1 testing
- Fix Debian packaging on main (update the changelog to 0.7.1)
- Rebase on-host conversion PR based on the merged PRs
- Check merge queue feature in GitHub

For later:

- WIX migration (windows installer)

Monday - 2024-09-30

Alex:

  • Updated the gVisor design doc, to reflect the most recent developments.
  • Generated an SBOM for Dangerzone
  • Followed up on some user issues, and found out that Dangerzone is affected by some recent Docker updates
    • Started working on a hotfix branch.
  • Sent a separate PR to pave the way for the on-host conversion PR.
  • Reviewed GitHub actions PR
  • Reviewed a PR for migrating to Wix v5

Alexis:

  • Spent some time trying to advance the situation for supporting colima
  • Tried Podman Desktop on OSX, it works
  • Merged #906 - Wrong container runtime detection on Linux
  • Started updating the "migrate to GH actions" branch: https://github.com/freedomofpress/dangerzone/pull/907

We've seen new reports about issues with Docker piling up. What could we have done differently, and how did we find the issue?

Discussion about the current situation, where a few bugs require us to do a 0.7.1 release:

- What is the fix? Add two lines for the two IDs in the image-id.txt file
- What is the proper fix? We want to have a separate issue about tracking other ways to reference the image (maybe moving away from IDs and using signatures + specific labels we control?) https://github.com/freedomofpress/dangerzone/issues/745
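
The short-term fix described above amounts to accepting any image ID listed in image-id.txt. A minimal sketch of that check, assuming one ID per line; the function names are hypothetical and the real code may be organized differently.

```python
def parse_expected_ids(text: str) -> set[str]:
    """Return the non-empty, stripped lines of an image-id.txt-style file."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def is_expected_image(image_id: str, expected_ids: set[str]) -> bool:
    """Accept the image if its ID matches any of the recorded IDs."""
    return image_id.strip() in expected_ids
```

This keeps working when the same image is reported under two different IDs, which is the situation the 0.7.1 hotfix addresses, but it still ties the application to IDs, hence the follow-up in #745.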

Let's have a post mortem discussion after this 0.7.1 bugfix release.

Wednesday - 2024-09-18

Alex:

  • Re-reading the gVisor blog post, replying to comments by readers
  • Reviewed #906 (container runtime detection)
    • While reviewing it, I semi-implemented a solution to #193. I can send a PR once merged.
    • I'd definitely like to add some GUI tests though
  • Merged a small PR by a contributor (#916)
  • Started reviewing #907 (CircleCI to GitHub actions)
  • Sent a Debian bug report for PyMuPDF
  • TODO:
    • Help Windows user (#922)
    • Review #909
    • Check Erik's PR for the gVisor blog post
    • Update the gVisor design document with some extra changes (container_engine_t for starters)
    • Finish the review of #907
    • Rebase on-host conversion on top of #907

Alexis:

  • Continued working on the Github Actions migration.
  • Read user research on DZ
  • Included feedback on #906 (container runtime detection)
  • Helped figure out what was the problem for M1 mac users with Docker Desktop not working.

TODO:

- Merge #906 as soon as CI is back to green (after the brownout, this afternoon)
- Finish the work on CI migration to Github.
- #865 - Have a look at why Colima isn't working
- (Low priority) Check if it's possible to remove Java from our container image and use fonts instead

Discussion points:

  • Github CI migration / questions

Monday - 2024-09-16

  • When should we meet?

    • Monday: 10am :fr / 11am :greece

    • Wednesday: 10am :fr / 11am :greece

  • Quick point on the contributors / users asking for help

    • Say to them that we don't know what's going on and we'll add some debug information
  • Planning / What's next?

Alex:

  • Found CI issue with Alexis
  • Silenced the libexpat CVEs
  • Tested the on-host conversion PR with a vendored PyMuPDF, and it works
  • Checked if Etienne's DirectFS PR hurts our performance. We have a small slow-down, but other than that, we should be fine.
  • Worked on the diagrams for gVisor blog post and resolved some internal comments
  • TODO: Review container runtime detection PR
  • TODO: Fix the CI situation on on-host conversion PR.
  • TODO: Send the next round of the on-host conversion PR.
  • TODO: Publish the gVisor blog post

Alexis:

  • Quick follow-up on the libexpat CVE

  • Pair-debugged why the CI was failing on Fedora 40 and Debian Trixie, updated the runners as a result

  • Updated and merged issues:

    • #901 - Replace stdeb in favor of modern Debian packaging tools

    • #902 - Use PyMuPDF wheels for non-ARM architectures

    • #904 - Do not throw on malformed Desktop Entries on Linux

  • Updated #905 - runtime detection and display errors to the user

  • Started working on migrating to Github Actions for the CI

  • Proofread the gVisor x Dangerzone blogpost

TODO:

- Continue the work on CI migration to Github, trying out Github Container Registry
- #865 - Have a look at why Colima isn't working
- Read some research that was done about DZ usage
- (Low priority) Check if it's possible to remove Java from our container image and use fonts instead

Monday - 2024-09-09

  • Should we have a release out?

    • We probably don't have to issue a release right now, but we should prepare for the October 15th date.
    • Independent containers might be for the release after it.
  • We should be able to bring everything listed in the 0.8.0

  • Should we go ahead and include orbstack and colima in 0.8.0 : Yes.

  • About seccomp filters:

    • Most likely, we should follow Etienne's approach and just ship a seccomp filter that accommodates gVisor.
  • About Docker Desktop alternatives:

    • Let's not "bless" an alternative for the time being, but just fix the seccomp issue. If we do so, most likely the rest of the alternatives will work.
    • After that, we should look closer into alternatives, and ideally "bless" an open-source one.

Alexis:

  • TODO: (Pair) Work on the CI failures
  • TODO: Change the architecture to all on the pybuild branch
  • TODO: Once CI is green, merge the pending PRs
  • TODO: Reproduce the colima errors and find a solution for it
  • TODO: Move to CircleCI

Alex:

  • TODO: Pair with Alexis to find out the reason behind the CI failures
  • TODO: Explain and silence the libexpat CVEs
  • TODO: Test the on-host conversion PR with a vendored PyMuPDF
  • TODO: Check if Etienne's DirectFS PR hurts our performance
  • TODO: Inform internally about CVE direction and open discussion for independent container updates
  • TODO: Review container runtime detection PR

CircleCI issue debug

  1. It seems that Podman returns 126 error code at some point. Our Dangerzone module assumes that 126 is thrown by qrexec, and therefore returns a Qubes-related error. We need to fix this.
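
One way to avoid the misattribution described above is to interpret exit codes per isolation provider rather than globally. This is a sketch under assumptions: the provider names and messages are hypothetical, and only the clash on exit code 126 comes from the notes.

```python
QREXEC_ERROR_CODE = 126  # exit code that the notes say is attributed to qrexec


def describe_exit_code(code: int, provider: str) -> str:
    """Map an exit code to a message, taking the provider into account."""
    if code == 0:
        return "success"
    if code == QREXEC_ERROR_CODE and provider == "qubes":
        return "qrexec error"
    # For container providers, 126 is not reported as a Qubes problem.
    return f"{provider} exited with code {code}"
```

With this split, a 126 coming from Podman is reported as a container failure instead of a Qubes-related one.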

Wednesday - 2024-09-04

How should we handle the CVEs generally speaking? When to get other people involved?

The current way of doing things is that we should look at CRITICAL CVEs. Beforehand, we weren't looking at the other ones. AlexP is also looking at HIGH as well.

When do we involve people?

  • Create an issue on a private repo and ping people on the signal group.
    • Every issue in the private repo will become public at some point, once our initial assessment is done.
  • It's better to escalate more than less

What about medium CVEs that happen to affect us?

Alex: I think that it's possible that we encounter a poorly-graded CVE at some point. At the same time, this will probably happen with the same frequency as zero-days discovered by attackers. For example, I'm always worried that LibreOffice has a zero-day RCE that renders all the other CVEs moot. So, because we have little manpower, I think it's best to spend our energy on two things: updating the container image more frequently, and having a stronger sandbox. We have done the latter with gVisor, and we should work on the former ASAP.

Alexis: Would like more context on that. Not sure if we can trust the level of CVEs, given that I've seen medium CVEs with RCEs. One other issue is how distros handle CVEs. Alpine does not patch packages, but just builds the latest release by the upstream. Debian on the other hand, does provide patches for packages, so that's another factor to consider.

Proposal:

  • Existing CVEs < 2024:
    • At some point assess the old ones. A quick assessment is to ignore any CVE that's before 2024, and does not have a fix.
    • Else, we can have a list of old CVEs we want to assess per week, and have a final conclusion.
    • If we choose to ignore a CVE, we will add an explanation in grype.yaml.
  • Existing and new CVEs >= 2024:
    • We can ignore the Low severity ones (ask Giulio -- what are the hints we need to look at?)
    • Assess the rest of the CVEs, on a case-by-case basis. We believe that the rate is not fast enough, and we have some time to assess them.
    • We have a reminder for assessing CVEs every week.
  • CI:
    • We can decide in the future if we will lower the alert threshold of our security scans from Critical -> High. This will depend on the rate our container image accumulates High severity CVEs.
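
The triage rules in the proposal above can be sketched as a small decision function. This is an illustration only: the `Cve` type and field names are hypothetical, and the year is read from the CVE identifier, which is an approximation of "before 2024".

```python
from dataclasses import dataclass


@dataclass
class Cve:
    cve_id: str    # identifier of the form CVE-YYYY-NNNN
    severity: str  # "Low", "Medium", "High" or "Critical"
    has_fix: bool  # whether a fixed package version is available


def cve_year(cve_id: str) -> int:
    """Extract the year from an ID of the form CVE-YYYY-NNNN."""
    return int(cve_id.split("-")[1])


def triage(cve: Cve) -> str:
    """Return "ignore" or "assess", following the proposal in these notes."""
    if cve_year(cve.cve_id) < 2024:
        # Quick assessment for old CVEs: ignore those without a fix.
        return "assess" if cve.has_fix else "ignore"
    if cve.severity.lower() == "low":
        return "ignore"
    # Everything else is assessed on a case-by-case basis.
    return "assess"
```

Anything that ends up ignored would still need its explanation recorded in grype.yaml, as the proposal states.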

How do we keep track of what we did? I wasn't sure what was already reviewed, apart from what was in the grype ignore list.

This is fixed by writing down our assessment in our private repository.

Should we upgrade Java? Currently we're using openjdk 8 which is a bit old. (it's not directly related, but probably worth discussing)

Apparently there are no known blockers to upgrade java. At the same time, it seems it's needed only for two things:

  1. Rendering xls documents: https://github.com/freedomofpress/dangerzone/issues/315
    • Wait a second, this looks like a font issue. OpenJDK does install fonts-dejavu. Also, LibreOffice does not seem to require Java to function. Could it be that we need it just for the font dependency?
  2. The H2O extension, which is important for our South Korean users.

General discussion on container updates. How should we do this?

  • Discussion about how to deal with supporting different versions

Monday - 2024-09-02

Post-summer updates:

@almet:

  • Took a look at a Java CVE. Turns out (we think) it was not a critical CVE, but we need to find a way to do independent container updates soon.
  • Worked on finding an alternative to building Debian packages.
  • Sent a PR for using pre-built PyMuPDF libraries.
  • Cross-team chat on potential synergy between SD client and Dangerzone:
  • Worked on proper error messages for Dangerzone on Linux (s/Docker/Podman/)
  • Currently working on the CircleCI -> GitHub Actions migration
  • TODO: Fix the failing PRs on Fedora 40
  • TODO: Fix Debian trixie not building because podman is missing

@apyrgio:

Monday - 2024-07-08

We've had a chat where we discussed the following actions.

For the release, we're waiting for the host to be back online, and we will perform the updates afterwards. We want to rebuild our assets for the following reasons:

  1. Certifi (which bundles CA certificates) has issued a new version where they remove the GLOBALTRUST CA (see https://groups.google.com/a/mozilla.org/g/dev-security-policy/c/XpknYMPO8dI?pli=1 for more context)
  2. There is a CVE about SSL, which doesn't affect us directly, but we will take the opportunity to upgrade the container in order to build more trust.

About the release, the next course of actions for us will be:

  • @apyrgio will do the releases for Windows and MacOS intel.
  • @almet will do the releases for the different flavours of Linux.

We've also discussed more broadly about the security considerations of trusting external CAs.

@almet will create an issue about trusting external CAs, where more context will be provided.

Thursday - 2024-06-27

Alex:

  • Reviewed the changelog PR
  • Debugged a Docker-related issue on macOS, where gVisor could not execute due to an older seccomp policy.
    • Created an issue for the above (#846) and sent a PR.
  • Looked into the new PyMuPDF wheels for musl environments, and found out that they don't work in ARM architectures (see issue #850).
    • I have a draft PR for that (#851), which has to wait until PyMuPDF adds the necessary wheels for aarch64.
  • Currently performing QA on Qubes and Fedora 40

Alexis:

  • Started drafting a release on the Apple Silicon machine
  • Reviewed Alex_P PR about using custom seccomp profiles on some specific Docker Desktop versions (see above)
  • Bumped python to 3.12 for Windows and macOS build, finding some bumps in the road.

Tuesday - 2024-06-25

Alex:

  • Reviewed the Drag and Drop PR on Windows, Linux, and Qubes
  • Quick review of some PRs that Alexis sent for 0.7.0
  • Did some extra tests for the on-host conversion PR on some more Linux platforms
  • TODO: Review the updated changelog

Alexis:

  • Started preparing the release

Discussion:

- How should we automate/simplify the CHANGELOG generation? We should add a rule on the CI which asks for a CHANGELOG entry before merge.
- We'll tag the 0.7.0, and not have a 0.7.0-rc for now
- Next steps for 0.7.0:
    - QA for the different platforms. Let's divide them between us!
        - macOS intel (apyrgio)
        - fedora (apyrgio)
        - Qubes (apyrgio)
        - Windows (apyrgio / almet)
        - macOS arm (almet)
        - ubuntu (almet), using the qa.py script.
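
The CI rule asking for a CHANGELOG entry before merge could be as simple as the check below. It is a sketch, assuming the CI can provide the list of files changed by a PR; the function name and the `CHANGELOG.md` default are assumptions.

```python
def needs_changelog_entry(changed_files: list[str], changelog: str = "CHANGELOG.md") -> bool:
    """Return True when a change set touches files but not the changelog."""
    return bool(changed_files) and changelog not in changed_files
```

A CI job could fail when this returns True, possibly with an opt-out label for changes that do not warrant an entry.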

Thursday - 2024-06-20

Alex:

  • Reviewed Alexis' PRs about CI tests for our .deb/.rpm packages
  • Sent a PR for PySide6 that silences an error for Fedora 41
  • Battling with the various PyMuPDF versions in Ubuntu/Debian for the on-host conversion PR.
  • TODO: Review Drag and Drop PR

Alexis:

  • CI is now installing .deb and .rpm packages and doing a conversion on them
  • Found an issue about windows line endings when building the dz image, fixed it in a PR
  • TODO:
    • Prepare the next release
    • Review Pyside6 PR
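
For context on the Windows line endings issue mentioned above: files checked out with CRLF endings can break scripts once copied into a Linux image. The notes do not say how the PR fixed it; one possible approach, shown here purely as an illustration with a hypothetical helper, is to normalize the content before it is used.

```python
def normalize_line_endings(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF)."""
    return data.replace(b"\r\n", b"\n")
```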

Tuesday - 2024-06-18

Alex:

  • Some roadblocks right now: some Linux versions do not ship with a complete version of PyMuPDF, so some work was needed to make it work there
  • Ubuntu Jammy has two issues:
    • muPDF is not compiled with OCR support
    • It fails with a memory violation.
  • Seen Alexis' PRs

Alexis:

  • Took a stab at setting up the CI to install .deb files
  • Made the drag-n-drop feature pass the CI tests

Thursday - 2024-06-13

Alex:

  • We merged the gVisor PR!
  • I took a look at Alexis' suggestion for detecting outdated Docker Desktop versions
  • I rebased the on-host conversion PR over the latest main branch, and tried it out on Linux and macOS
  • TODO: Create a separate issue for experimenting with a Debian-based container image: what are the build time benefits, size improvements, and if it works with the latest changes (gVisor).
  • TODO: Review the roadmap in two weeks

Alexis:

  • Read the Drag-n-drop PR (#752) and rebased it on latest main branch
  • Prepared the work for tomorrow "sprint planning", by reading the issues that will probably go into it
  • Reviewed #834
  • Installed the new Framework machine

Discussion:

  • Issues for 0.7.0 milestone
  • Libreoffice dependency. Pandoc might be a good candidate.
  • Ideas about other ways to sandbox

0.7.0 Milestone

What we want to achieve with this release:

Since January, we're trying to decouple the container from the application.

  • The long term goal is to let Debian and Fedora handle packaging on their side. It's not currently possible because we ship with the container directly.
  • We also have UX-related stuff, like the drag-n-drop.
  • Improving the overall security with findings from the security audit (gVisor + Mac entitlements)
  • Leave CircleCI and move to GH Actions

Tuesday - 2024-06-11

Alex:

  • Polished the gVisor PR and retested across all of our platforms.
  • TODO: Didn't manage to rebase the on-host conversion PR, will do so today.
  • TODO: Take a look at #830
  • TODO: Merge the design doc PR (#815)

Alexis:

  • Started working on the Docker Desktop version verification on macOS and Windows machines.
  • TODO: (Today) Push Docker Desktop version checks
  • TODO: (Today) Have a look at the segfault PR #832
  • TODO: Read the changes to the gVisor PR
  • TODO: Review the gVisor docs #815
  • TODO: Review #625
  • TODO: Review #748
  • TODO: Map the space of Docker Desktop alternatives (Colima, WSL, etc. - probably with a focus on macOS)
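
The Docker Desktop version check mentioned above boils down to comparing the installed version against a minimum supported one. A minimal sketch, assuming plain dotted numeric versions; the function names are hypothetical and the notes do not state the actual minimum version.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple that compares numerically."""
    return tuple(int(part) for part in version.split("."))


def is_outdated(installed: str, minimum: str) -> bool:
    """Return True when the installed version is older than the minimum."""
    return parse_version(installed) < parse_version(minimum)
```

Comparing tuples of integers avoids the pitfall of string comparison, where "4.9.0" would sort after "4.30.0".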

Discussion:

  • Let's allocate a slot to put our 0.7.0 issues in a roadmap.

    • We are down to 8 PRs, and with gVisor merged, we'll be down to just 5.
    • Let's do it on Thursday morning
    • Plan is to:
      • Familiarize ourselves with our issues (can happen async)
      • Decide on which issues should be added/removed
      • Estimate time for each issue (velocity)
      • Allocate people to issues
      • Finally, draft a rough roadmap
      • Take into account that summer is approaching, make time for maintenance tasks that always pop up
  • Qubes laptop

    • Framework 13 should be good for installing Qubes, let's see.

Thursday - 2024-06-06

Alex:

  • Started working on a gVisor presentation, and realized that I had some knowledge gaps. Read a bit more on gVisor to cover most of them.
  • While working on the presentation, I realized that we can run gVisor fully rootless, and experimented with that.
  • Sync with Alexis, where we debugged various issues, had some performance-related discussions, and talked a bit more about the dev workflow.
  • TODO: Wrap up the gVisor presentation.
  • TODO: (tentative) Rebase on-host conversion PR

Alexis:

  • Landed "small changes": we now have fewer dead imports, less dead code, and fewer bare exceptions
  • Bumped the minimum python version to 3.9 in the pyproject.toml and to 3.8 on debian derivatives (as PySide6 is not packaged there). It allows packaging the main branch for Linux distributions
  • Pushed some changes in the way pytest fixtures are loaded, and fixed an issue in the tests (it was not using the right fixture), removing the need to run the tests in separate processes.
  • Synced w/ Alex on various issues, especially on setting up the dev environment properly, and debugging some podman issues.
  • Looked at ways to circumvent a podman issue which currently makes it impossible to run emulated environment when using "docker on docker" setups.
  • TODO: Have a look at the on-host conversion PR (#748)

Discussion:

  • Next steps for Alexis
    • 2 PRs that could be reviewed: on-host conversion PR (#625) and drag-and-drop functionality (#752)
    • 1 issue that can be tackled: outdated Docker version (#693)
    • Map the space of Docker Desktop alternatives (Colima, WSL, etc. - probably with a focus on macOS)

Tuesday - 2024-06-04

Alexis:

  • Working on the Python update, finding that PyMuPDF does not seem to be installed in the container image.

Alex:

  • Reviewed some open PRs by Alexis, and sent some PRs that fix CI issues.
  • Made some final changes to the gVisor design doc.

Discussion:

  • How to organize our work?
    • We could have 4 week sprints, starting on Thursdays
    • Retrospectives may not necessarily happen on a strict schedule, since we have several touchpoints within a week.
  • Why can't the Dangerzone container image find PyMuPDF?
    • That's because they have renamed fitz to pymupdf upstream, and we didn't copy the pymupdf path as well.
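
The fix described above is on the image side (copying the pymupdf path as well). As a complementary illustration only, code that must run against both old and new PyMuPDF releases could try the new module name first and fall back to the old one. The helper below is hypothetical and is not taken from the Dangerzone codebase.

```python
import importlib
from types import ModuleType


def import_first_available(names: list[str]) -> ModuleType:
    """Import and return the first module in names that is installed."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {names} could be imported")
```

Under this sketch, `import_first_available(["pymupdf", "fitz"])` would return whichever name the installed PyMuPDF version provides.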

Wednesday - 2024-05-29

Alex:

  • Lined up all the necessary PRs for fixing our segfault issues on Fedora
  • Sent a PR that updates our release/signing instructions with how to sign/verify the Dangerzone source.
  • Started working again on the gVisor PR, making sure it runs on every supported platform.
  • Paired with Alexis regarding the Python 3.9 bump

Alexis:

  • Found out my M1 mac will not be able to cut it for running a "docker in docker" setup
  • Setup a dev environment on my Linux machine
  • Checked I was able to install the fedora rpms and run a conversion on them
  • Debugging the current Python 3.9 bump PR

Monday - 2024-05-27

Alex:

Alexis:

  • Debugged (and fixed!) an issue with the GUI tests
  • Did a PR changing how the test fixtures are used
  • Did a PR removing dead code and imports
  • Started working on Python version bump
  • Continued reading design documents
  • Reviewed some PRs by AlexP: #590, #815
  • TODO: continue the work on python version bump
  • TODO: Setup GPG keys for [email protected] and use this as my identity for DZ
  • TODO: Setup podman to work with OSX and dev_scripts/env.py

Discussion:

  • How to handle merging dependabot PRs (really, any PR)
    • Continue merging the same way as we are doing thus far. Might be worth going down the merge queue.

Thursday - 2024-05-23

Alex:

  • Writing a design document for gVisor. Along with this PR, I'll add some design documents for update notifications, and for Dangerzone environments.
  • We need to fix an issue with our container image builds. They fail due to an updated Alpine Linux image.
  • Progress on the Tails front
  • TODO: Review the almet PR for various chores
  • TODO: Debug the gVisor PR

deeplow:

- TODO check comments on on-host conversion PR
- TODO review UX designs
- TODO continue drag-and-drop exploration

Alexis:

  • First week :-)
  • Reading a bunch of issues about packaging, and about PyMuPDF
  • Reading the current codebase
  • Currently working on a PR with minor changes on the codebase
  • Setup my development environment
  • TODO: Issue #780 on GitHub (PySide6 no longer supports Python 3.8)

Wednesday - 2024-03-27

Alex:

  • Updated the Dangerzone hiring exercise with some container-related questions.
  • Looked into CVE-2024-28757, which doesn't seem to affect us.
  • Merged a PR that fixes our failing nightly builds, due to a PyMuPDF regression (dangerzone#753)
  • Lots of UX discussions
  • Investigated a timeout issue, but still haven't found the culprit (https://github.com/freedomofpress/dangerzone/issues/749)
  • TODO: Make fixes on on-host conversion PR
  • TODO: Publish PySide6 6.6.3
  • TODO: Review drag-and-drop PR

Deeplow:

  • Open Drag and Drop PR
  • TODO On-host PR: Not including tessdata in our packages or userns=0
  • TODO Review new designs
  • TODO prune large tests and find document with multimedia

Wednesday - 2024-03-20

Alex:

  • TODO: F39 - Timeout
  • TODO: Follow up on on-host PR comments

deeplow:

  • TODO check comments on on-host conversion PR
  • TODO review UX designs
  • TODO continue drag-and-drop exploration

Discussions:

  • implementation choices for drag-and-drop exploration
  • UX feedback


Wednesday - 2024-03-13

Alex:

Deeplow:

Monday - 2024-03-04

Alex:

  • Looked into a contributor's PR for gVisor support
  • TODO: Publish PySide6 6.6.2
  • TODO: Fix shebang issues that break our RPM packages
  • TODO: Shape 0.7.0 release
  • TODO: Draft workshop proposal for TCIJ's Summer Conference

deeplow:

Discussion:

  • Dangerzone 0.6.0 failure on Fedora 38:
    • The reason we didn't catch it is because we don't normally test all Fedora templates for Qubes (Fedora 39 is not affected). We did make test run on Fedora 38, but that was using a dev script, and that script correctly uses the Tessdata prefix (https://github.com/freedomofpress/dangerzone/issues/704#issuecomment-1976114322)
    • We can have better tests for Qubes as we have for Windows: run all the tests but short circuit the isolation provider (kind of like dummy). In the long run, we need a Qubes CI
    • Release a Qubes package for Fedora 38 (add a -2 revision for Qubes) with a minor fix for that.

Thursday - 2024-02-29

Alex:

  • We released 0.6.0!

deeplow:

  • Test final artifact on macOS and generate hashes
  • Review/Approve pending PRs on apt-tools-prod and yum-tools-prod

Discussion:

  • what should the behavior be when a setting is deprecated? Answer: no need. When we have a setting that should be deprecated, we should add that logic then and appropriate tests.

Monday - 2024-02-26

Alex:

deeplow:

  • Large tests: Find out if we had any more failures than normal.
  • TODO: Once tested, approve pending PRs on apt-tools-prod and yum-tools-prod
  • TODO: Draft announcement for license change comms
  • TODO: find out how to create a release draft template on GitHub
  • WIP: move scenario 10 to CI #719
  • Test final artifact on macOS and generate hashes
  • Update links on Dangerzone's website and hashes for the artifacts (@deeplow)

Discussion:

  • UX directions for dangerzone. Feedback to be given by @deeplow to Superbloom
  • Release tasks:
    • Test release builds on various platforms (Windows, macOS (Intel / M1), Fedora 39)
      • apyrgio: Fedora 39, Windows
      • deeplow: Test on macOS
    • Approve pending PRs on apt-tools-prod and yum-tools-prod (@deeplow)
    • Write release announcement (license change, acknowledge contributors, Docker updates, Docker Desktop Windows bug, link to PyMuPDF integration issue) (@apyrgio)
    • Update links on Dangerzone's README.md (@apyrgio)
    • Update links on Dangerzone's website and hashes for the artifacts (@deeplow)
    • Propose an announcement for Mastodon and other social media (@apyrgio)
  • Post release tasks:

Wednesday - 2024-02-21

Alex:

  • Wrapped up QA on macOS
  • Merged the remaining PRs for 0.6.0
  • Helped debug a Qubes issue
  • TODO: Redo the QA on Windows
  • TODO: Rebuild artifacts on macOS Intel/M1
  • TODO: Tag 0.6.0, create draft release, upload build artifacts using our script

deeplow:

  • TODO: Large tests: Find out if we had any more failures than normal.
  • TODO: Go through release checklist
  • TODO: Draft announcement for license change comms
  • TODO: find out how to create a release draft template on GitHub
    • Remember to update the docker desktop (while we don't do in-app updates)
  • TODO: move scenario 10 to CI #719

Monday - 2024-02-19

Alex:

  • Completed QA on Windows
    • Found some issues with the latest docker release, but they mostly affect devs.
    • Sent two PRs to fix some issues found on QA (#716, #717)
  • TODO: Approve contributor's PR for docker image build
  • TODO: Sneak in fix for commit title check
  • TODO: Wrap up some fixes on my PRs (#716, #717)
  • TODO: Start QA on macOS (Intel CPU)
  • TODO: Check Qt for critical CVEs since December

deeplow:

Discussion:

  • move scenario 10 to CI #719
    • Let's split this in two:
      1. Change Scenario 10 to be macOS and windows only and also check if new container image is installed. This is because it's easier to run over the previous installed versions on these platforms.
      2. Make sure our settings test coverage checks whether some settings would be overwritten

Wednesday - 2024-02-14

Alex:

  • Informed Alpine Linux devs about CVE-2023-5841, which was detected by our security scanner.
    • Alpine updated the offending package, and now our security checks pass.
  • Built a patched conmon version for Ubuntu Jammy, open-sourced the repo (https://github.com/freedomofpress/maint-dangerzone-conmon/tree/ubuntu/jammy), and included the .deb in our APT repo
    • Also fixed a failing CI check in Ubuntu Jammy, which was affected by this.
  • Reviewed some small pre-release Dangerzone PRs (#711, #709, #707, #706)
  • TODO: Find out why the bad file extension is not reported

deeplow:

Discussion:

  • INSTALL.md#Qubes references Fedora 38. Should we update to Fedora 39?
    • People who install Qubes right now will still get Fedora 38
    • Let's align with the Qubes team and recommend Fedora 39 when 38 becomes EOL
    • have it set to 39 when it's the default in the latest ISO
    • we don't need to change anything in our install instructions
  • Is error reporting working properly (https://github.com/freedomofpress/dangerzone/issues/704#issuecomment-1943448055)?
  • We should semi-automate scenario 10. It is burdensome.
    • let's try to move scenario 10 to the QA. Tweak the build and install RPM. Install the previous RPM and then the new one and start Dangerzone. And add a similar thing on macOS and Windows (these are not too critical since scenario 10 is easier there as the older DZ version is already installed)

Monday - 2024-02-12

deeplow:

  • Bump poetry dependencies #701
  • Merging of #686 (OCR working on Qubes again)
  • fixing development for #700 (PyMuPDF using prints for debugging and messing up our pristine JSON stdout)
  • TODO review package maintenance PRs / list
  • TODO Bump poetry dependencies (again) in #701 as the last thing prior to QA
  • TODO: find some Qt documentation to send to UX people

Alex:

  • Small fixes on the streaming pages PR, which is now merged
  • Informed Ubuntu maintainers of conmon about the possibility to ship a patched conmon version in the next point release of Ubuntu Jammy (https://bugs.launchpad.net/ubuntu/+source/conmon/+bug/1997139)
  • TODO: Check out CI alerts
  • TODO: Build the conmon package on Ubuntu Jammy, in order to fix the CI issue

Discussion:

  • which approach should we use to tackle PyMuPDF (fitz_new) misbehaving (#700)?
    • a) via imports (see https://github.com/freedomofpress/dangerzone/issues/700#issuecomment-1934255506)
    • b) via pinning the PyMuPDF version to something before the fitz_new transition to fitz
    • decision: let's go with b)
  • Qt tour
    • we don't have enough working knowledge on this, and we are in release mode. So what we can do is find the Qt documentation on widgets and some more complex applications that use Qt to demo some example uses of Qt.
  • CI troubleshooting:
    • It seems that now that conmon is shipped through the Debian Bullseye repos, it's no longer present in oldstable-proposed-updates. This is the reason why Ubuntu Jammy fails. The solution is to build our own package for Jammy.

Wednesday - 2024-02-07

deeplow:

  • Squashed and merged the long stream pages PR (#627) (only 2 commits left)
  • Updated issue description in "Update container image independently #698" to add some more context
  • Spent a short time playing around with #700 (contextmanager used to suppress / redirect PyMuPDF's annoying prints to stderr)
  • TODO: Use fitz_old as the fitz module for the Dockerfile (to avoid referencing fitz_old in our code)
  • TODO bump poetry lock
  • TODO Create a QA issue and document the testing order
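
The idea from #700 of a context manager that moves stray prints off stdout can be sketched as follows. This is a minimal illustration, not the Dangerzone code: `prints_to_stderr` and `noisy_library_call` are hypothetical names, and the approach only catches Python-level writes to `sys.stdout`. Output written directly to file descriptor 1 by native code would need a different approach.

```python
import contextlib
import json
import sys


@contextlib.contextmanager
def prints_to_stderr():
    """Send anything printed to sys.stdout to sys.stderr while the block runs."""
    with contextlib.redirect_stdout(sys.stderr):
        yield


def noisy_library_call():
    # Hypothetical stand-in for a library call that prints debug text.
    print("debug: something chatty")
    return {"pages": 3}


def convert():
    # The chatty call runs with stdout redirected, so its prints land on stderr.
    with prints_to_stderr():
        result = noisy_library_call()
    # Only our own JSON reaches stdout.
    print(json.dumps(result))
    return result


if __name__ == "__main__":
    convert()
```

Pinning the library version, as discussed for this issue, avoids the problem at its source, while a wrapper like this only hides it.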

Alex:

  • Fixed a failing CI test in Fedora, due to a CI runner running out of space (#699)
  • Discussed internally on how we will package conmon, and specifically which version we'll choose for Ubuntu Jammy.
  • Preparations for the upcoming blog post on our security audit
  • Looked into dangerzone#700 and found out that PyMuPDF recently changed to a newer fitz implementation, that unfortunately writes to stdout.
  • TODO: Rewrite some commit messages in the streaming pages PR (and review existing ones)
    • Also add an Ubuntu Jammy note.
  • TODO: Weigh in on the Ubuntu bug for conmon and ask to provide the backported package from Bullseye, before Ubuntu's point release (https://launchpad.net/ubuntu/+milestone/ubuntu-22.04.4)
  • TODO: Follow up on the policy guide based on the discussion below
  • TODO: Build conmon for Ubuntu Jammy and send a PR to apt-tools-prod

Discussion:

  • Package support criteria
    • Definitely make the backport policy guide instruct the user to leave a "Backporting rationale(?)" document, that will include the following:
      • What is the original reason for considering backporting?
      • What were our alternatives at the time?
      • What are the main friction points for this backported package?
      • How long do we believe we need to backport this package?
    • Based on this "backporting rationale" document, we need to have some rudimentary next steps (how to fork the repo, where to link back to this rationale, set expectations for people who depend on this package)
  • How to silence PyMuPDF warnings?
    • May be better to stick to the fitz_old interface, which was replaced by the new fitz module on Jan 11th. It seems more battle-tested, and we can switch later to the new fitz module once it has seen some action.
  • Conmon versioning

Monday - 2024-02-05

  • Made a second full review of the streaming pages PR (dangerzone#627)
    • Added some fixups of my own as well.
  • Reviewed the PR for supporting extra file formats (dangerzone#697)
  • TODO: Add a check for a bad conmon version.

deeplow:

  • TODO wrap up #627

Wednesday - 2024-01-31

Alex:

  • Reviewed two small PRs by deeplow (#686, #684)
  • Added conmon from oldstable-proposed-updates in our CI for the streaming pages branch, and now tests pass.
  • Debugged and fixed a CI issue on Debian Trixie
  • Kept the ball rolling on the PySide6 packaging front
  • Wrote a draft policy on how FPF can backport packages
  • Implemented some review comments on the Fedora 39 PR
  • Helped with summarizing the audit findings in our upcoming blogpost for our security audit
  • TODO: Review the PyMuPDF extensions PR
  • TODO: Final review on the streaming pages branch
  • TODO: Have a final decision on how to handle conmon

deeplow:

  • Wrap up new file format support #697
  • Create remaining follow up issues for the security audit report and audit comms plan
  • Address feedback on various open PRs
  • TODO review draft policy on how FPF can backport packages
  • TODO review security findings in audit blog post
  • TODO update website with pdftoppm reference
  • TODO approve the Fedora 39

Monday - 2024-01-29

deeplow:

  • Removed timeouts (dangerzone#687)
  • found a bug where if we were to run conversions in parallel it would fail because both conversions would use the same proc attribute. Addressed it in PR #627
  • convert large pdf on page streaming PR to see in which page it fails; update /tmp filling up issue with results. Conclusion: it does in fact now only fail on the client https://github.com/freedomofpress/dangerzone/issues/574#issuecomment-1907912615
  • TODO: Updates supported formats (#660)
    • .desktop, macOS and Windows installers (related #646)
    • update in server mime-type detection
    • add test files for each format
  • TODO: review security assessment and create issues associated with findings

Alex:

  • TODO: Backporting policy for packages (conmon / PySide6)
    • And create a CI job with this package on the streaming pages PR that shows it passes our tests.
  • TODO: Take a final look at the security audit report
  • TODO: Review the Dangerzone logo for the Fedora PR (https://github.com/freedomofpress/dangerzone/pull/684)
  • TODO: Package conmon 2.0.26 for Ubuntu Jammy and Debian Bullseye
  • TODO: Final review on the streaming pages branch
  • TODO: Implement review comments on Fedora 39
  • TODO: Review OCR fix on Qubes https://github.com/freedomofpress/dangerzone/pull/686

Wednesday - 2024-01-24

Alex:

deeplow:

  • Reading up on conmon (cause of stream pages issue) and a sync call to help Alex troubleshoot a related situation
  • Merging contributor PR (fixing some capitalization)
  • Adding missing Dangerzone logo on Linux (#684)
  • Addressing comments on PR "Add support for Fedora 39" #680
  • TODO: convert large pdf on page streaming PR to see in which page it fails; update /tmp filling up issue with results. Crashing on the client-side is better
  • TODO remove timeouts, nonblocking IO and asyncio on the server side and explain rationale https://github.com/freedomofpress/dangerzone/pull/627#discussion_r1413877119
  • TODO address https://github.com/freedomofpress/dangerzone/issues/682 and being careful about not overriding the user's preference

Discussion:

  • Reasoning for removing timeouts:
    • we have asked Micah and the original reason was due to some commands that could hang. However now we no longer
    • timeouts were costing engineering time for little benefit. Costs include more complex code, dealing with non-blocking IO (particularly on Windows) https://github.com/freedomofpress/dangerzone/pull/627#discussion_r1413877119
    • timeouts as implemented today are inconsequential in the sense that they don't stop the job on the sandbox (#563) and for the user the timeouts are so long that users are likely to stop the conversion because it's hanging before the timeout
    • we could consider implementing timeouts in the future, through a GUI that detects when conversions are taking too long and allows the user to stop them.
  • Reasoning for removing asyncio:
    • We originally needed asyncio in order to get page info from pdftoppm in an async manner, and call callbacks on each page data.
    • We are no longer using pdftoppm, and the only thing that uses asyncio is the spawn of LibreOffice, which is not strictly necessary.
    • The asyncio code has performance implications, because it context-switches every time it needs to write to a pipe.
    • We can greatly simplify our sandbox code by removing asyncio altogether.
  • Timeout removal process:
    • remove timeouts, non-blocking IO, asyncio in server side (was needed for pdftoppm)
  • Fedora KDE people are starting to look at building PySide6: https://pagure.io/fedora-kde/SIG/issue/446
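
As a rough illustration of the asyncio removal discussed above, a blocking spawn can be as small as the sketch below. `run_command` is a hypothetical helper, not the actual sandbox code, and the Python interpreter stands in for the real converter binary.

```python
import subprocess
import sys


def run_command(args):
    """Run a command synchronously and return its stdout as bytes.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in for the real converter binary: the current Python interpreter.
    output = run_command([sys.executable, "-c", "print('converted')"])
    sys.stderr.write(output.decode())
```

With no event loop involved, there are no callbacks to schedule and no non-blocking pipe handling to maintain, which is the simplification the notes argue for.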

Monday - 2024-01-22

Alex:

Deeplow:

Discussion:

  • Hanging conversion:
    • We have tested removing the non-blocking read component, but the tests still failed.
    • Testing it locally fails due to #673

Wednesday - 2024-01-17

Deeplow:

  • Troubleshooting page streaming PR CI failing on ubuntu 20.04-based machines. The conversion would be extremely slow and time out
  • ran into issues running ubuntu 20.04 via dev_scripts/env.py. Didn't look much further but reported it https://github.com/freedomofpress/dangerzone/issues/673
  • troubleshooting PyMuPDF Stream Pages PR being too slow (and thus failing due to timeout). Tried git bisect but builds are extremely slow and our podman in podman wasn't working. So trying on a ubuntu 20.04 vm to troubleshoot it. Slow build times make this particularly hard to troubleshoot.
  • tested stream pages on 22.04 and it's hanging as well. This was documented in #443
  • Fleshed out a bit more the tails comms
  • TODO test in ubuntu 22.04 further test removing the -i from podman exec

Alex:

  • Almost polished the PR for PySide6
    • TODO: Introduce a CI job for building a Dangerzone RPM, and installing it in Fedora, while downloading the PySide6 RPM from packages.freedom.press.
    • TODO: Get notifications for PySide6 GitHub actions. deeplow: Ideally have a CI job that does the bump for us, and sends a PR.
    • TODO: Figure out why we can't build RPM packages in dev machines.
  • TODO: Check if timeouts are still necessary in Dangerzone
  • TODO: Review latest iteration of PyMuPDF
    • Help debug an issue with some Ubuntu builds
  • TODO: Check out latest iteration on Tails comms

Monday - 2024-01-15

Deeplow:

  • Working on the PyMuPDF page-streaming PR dangerzone#627
  • fixing most of the CI issues
  • making tests pass locally
  • sending back progress information via stderr (and removing leftover progress parsing)
  • rebase on top of main and address conflicts
  • Address high-vulnerabilities user issue
  • review security report
  • planning meeting
  • adapt dummy dangerzone converter (for testing in CI without nested virt.)
  • replied to multiple contributor's comments
  • TODO wrap up tail comment and reference https://github.com/freedomofpress/dangerzone/issues/669 as something very exploratory

Alex:

  • Proactively started mapping the problem space of hosting our container image separately from the application.
  • We have a WIP document with F_rancisco that we can share once it's more polished.
  • Looked into timeouts in Dangerzone; how to implement them on Windows (dangerzone#632) and if they are still necessary.

Discussion:

Thursday - 2024-01-11

Alex:

  • Assessed the impact of CVE-2023-7104. Verdict was that it doesn't impact Dangerzone, so I sent a PR to ignore this alert.
  • Tested Dangerzone and PySide6 on a Fedora 39 VM with GUI. It seems that we don't install any extra X11 package, which probably means that our PySide6 package is lean.
  • Informed our Fedora 39 beta tester about the way to test our candidate Dangerzone RPM.
  • Reviewed Tails comms
  • Reached out to PyMuPDF folks. Turns out that MuPDF has undergone fuzzing, and that the best way to stay up to date with security fixes is to follow their changelog.
  • Started working on on-host pixels to PDF conversion (dangerzone#625)
  • Looked into some CVEs that a user reported. Our understanding is that these CVEs are not critical enough to release a new version, nor do they seem to affect our software (dangerzone#666)
  • Debugged a Qubes-related issue with deeplow
  • TODO: Help with the non-blocking read on Windows

Monday - 2024-01-08

Alex:

  • Compared compression algorithms (Gzip vs LZMA) (dangerzone#663)
  • Continue with PyMuPDF review
    • Fixed a security scanning issue
  • Continued working on packaging PySide6
    • Added CI jobs for getting the latest PySide6 version and its wheel hashes, as well as building the packages nightly
    • Found some issues in the Provides section of the package, which I have mostly fixed.
    • Checked if we can build the package reproducibly, but turns out that's an open question for Fedora (https://fedoraproject.org/wiki/Reproducible_Builds)
  • Had a discussion with deeplow regarding licences
  • Sent a PR to freedomofpress/yum-tools-prod (https://github.com/freedomofpress/yum-tools-prod/pull/16), that will include a PySide6 RPM in Fedora 39.
  • TODO: Security assessment of new issue
  • TODO: Feedback on Tails comms
  • TODO: Merge the PySide6 RPM and reach out to our beta user
  • TODO: Reach out to PyMuPDF folks
  • TODO: Work on native host conversion

Deeplow:

  • changed Dangerzone license
  • rebased stream pages PR and continue work there
  • Progress not showing for 2nd stage conversion (Fixed)
  • removing untrusted progress parsing code
  • opened issue about build-image.py not failing fast https://github.com/freedomofpress/dangerzone/issues/664
  • Draft comms for Tails
  • Looking into PySide6 code that Alex made and played around with the build package
  • troubleshooting docker.io pull limitations (caused by our multi-stage build)
  • Working around docker pull limits
  • TODO continue PyMuPDF-stream-pages (replace update_progress() with sending to stderr)
  • TODO send tails comms after feedback

Discussion:

  • What should we do regarding the update_progress()? We want to have some way of debugging / troubleshooting. Currently this progress was passed onto the user, but now that this is inferred client-side, we no longer have detailed progress reports.

Wednesday - 2024-01-03

Alex:

  • Helped a Linux Mint user with a Python issue (dangerzone#661)
  • Progress on PySide6: got the package into shape, pinged the Fedora maintainers
  • TODO: Help with failing security scan check in PyMuPDF branch
  • TODO: Communicate with Artifex about our use of their code.
  • TODO: Create feeds for CVEs in PyMuPDF / MuPDF / LibreOffice / Tesseract
  • TODO: Migrate CircleCI jobs to GitHub actions

Deeplow:

  • continue addressing feedback in the PyMuPDF PR
  • TODO follow up on https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1440244245
  • TODO fix issues in stream pages
  • dummy -> unsafe conversion + add tesseract_ocr (if too big, fix the dummy and this will be done on a future PR)

Discussion:

  • How to go about license change (due to PyMuPDF inclusion)
  • client-only dummy
  • we need python3 magic as a test dependency
  • State of open PRs:
    • PyMuPDF branch: only security scans broken (alex will help there), and a minor lint issue.
    • We need to change the license as well.
    • We also need to close some issues (e.g., the one where the size balloons). Conclusion: We discussed this synchronously and agreed that we'd go with my suggested diff above plus adding a note about this discussion on the code. Additionally we'll test one document with and without the deflate images just as a sanity check to ensure that deflate_images in the to_bytes method does the same thing as in the pdf.save().
    • Streaming pages branch: tests are currently broken, we need to update our

Tuesday - 2023-12-20

Deeplow:

Alex:

Discussions:

  • security audit preliminary results
  • timings of compression modes in document https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1431728747: if we have OCR enabled compression

Monday - 2023-12-18

Alex:

  • Researching more into PyMuPDF
  • Reviewed most of the open PRs
  • TODO: Understand a bit more about the Tesseract security properties

Deeplow:

  • addressing remaining feedback on PyMuPDF

Wednesday - 2023-12-13

Alex:

  • Fixed missing Release file for Ubuntu Focal
  • Sent PR for running our Linux installation instructions nightly (#655)
  • Sent PR to apt-tools-prod for avoiding a similar bug as the missing Release files in the future (apt-tools-prod#12)
  • Created an issue for bug reporting (#656)
  • TODO: Review #649, #651, and #654
  • TODO: Continue on the PySide6
  • TODO: Check streaming implementation on the second stage of the conversion

deeplow:

  • review adding Qubes (beta) to the website
  • tackle issue of installing pip dependencies alongside system python deps (bypassing PEP 668)
  • follow up on and address security audit questions
  • TODO: look at pending PRs to unblock them
  • TODO: continue to address pymupdf feedback
  • TODO: rebase stream pages PR on top of PyMuPDF branch so @apyrgio can look into question 5)

Discussion:

  • PyMuPDF final questions:

    1. @apyrgio Open Q: How exactly does PyMuPDF perform Tesseract conversion? Does it bind into tesseract? Does it write to the file system? Does it call the shell under the hood?

    2. @deeplow Open Q: What is the reason behind additional failures in the PyMuPDF case?

       A: Timeouts and new size limits on documents

    3. @deeplow Open Q: Why do conversions of multiple small files take more time?

       TODO: comparison of document conversion timings; Take a look at problematic documents and see what's up.

    4. @apyrgio Open Q: What is the impact on the container image size?

       Can we remove the CJK fonts and ghostscript and all other libraries that are now unnecessary?

       Can we measure it against a different container image (e.g., Ubuntu / Debian), which have PyMuPDF in their repos?

    5. @apyrgio (Optional) Question: How does (py)mupdf use Ghostscript? (we had a recent CVE on that)

    6. @apyrgio Question: How can we improve the installation of PyMuPDF on the Alpine Linux container image (performance and security -wise)

Monday - 2023-12-11

Alex:

  • We did the release for 0.5.1
  • Release metric: 2 days if everything is in place to have the release out
    • When is it worth automating it?
  • TODO: Check download count of dangerzone-qubes package
  • TODO: Fix the Ubuntu Focal installation by creating a CI job that validates our installation instructions.
  • TODO: Create a GitHub issue for bug reporting
  • Good news on https://github.com/google/gvisor/issues/8205

deeplow:

  • TODO: Improving release instructions
  • TODO: Improve running the large doc tests

Discussion:

Wednesday - 2023-12-06

Alex:

  • Took a look at a user's error report (#631)
  • TODO: Create installation/user guides in dangerzone.rocks

Monday - 2023-12-04

Alex:

  • Made a first pass of the PyMuPDF PR (#622)
  • Started looking at the PR that builds on top of it (#627), which adds streaming support on containers.
  • TODO: Resume the review on the #627 PR
    • Check out what's the case with timeouts once we have the same code in Qubes/containers
  • TODO: Help on adding streaming support for the #627 PR
  • TODO: Fix failing Grype job

deeplow:

  • TODO: address feedback from PyMuPDF PR (#622) and #627

    • Regarding the DPI: let's put 72 everywhere as a constant, create a separate issue for checking if there's a noticeable size impact if we bump it to 150, link this issue to the compression issue.
  • TODO: finish comparison evaluation of PyMuPDF and reporting results on the related issue.

    • TODO: look at timeout situation in PyMuPDF branch

Discussion:

  • DPI
  • follow up from PyMuPDF
  • investigate the failure reasons on pymupdf

Thursday - 2023-11-30

Alex:

deeplow:

  • TODO finishing a slide deck on PyMuPDF comparison
  • TODO create issue to replace with PyMuPDF and license
  • TODO streaming implementation for 2nd container (pixels to PDF)

Discussion:

  • PyMuPDF: Overall, looks pretty good in various fronts (performance, code readability, PDF rendering)
    • What about security checks? Does Grype pick them up?
  • Python bindings for LibreOffice: There is a LibreOfficeKit project that offers Python bindings for document conversion: https://github.com/xrmx/pylokit/
  • We have missed notifications for CI errors in the main branch:
    • We must always check notifications from the main branch. The rest are not as critical
  • we are lacking an installation guide

Monday - 2023-11-27

Alex:

Thursday - 2023-11-23

Alex:

  • Created a PySide6 package using an RPM spec
    • TODO: Does it make sense to write in the Fedora thread for PySide2 that we have bundled PySide6 as an RPM?
    • TODO: Create a Git repo with instructions to build this package.
  • TODO: Create an issue about not being able to run Dangerzone as user ID != 1000 and link it to #620 and #443.
  • TODO: Is there a non-Python project that can do OCR, which can potentially have binaries for MacOS / Windows?

Tuesday - 2023-11-21

deeplow:

  • working on #443 page streaming support in containers (based on PyMuPDF)

Discussion:

  • Native second stage conversion:
    • If we cannot install an OCR binary in the user's host, we need to perform the conversion in the container.
    • For the time being, we will use a host directory as a buffer for the pages that the first stage conversion created (basically, what we are doing right now), and once it finishes, send them to the second stage container.

Alex:

  • Working on backporting PySide6
  • TODO: Prune milestones
  • TODO: Follow up on #620

Discussion:

  • What native Python bindings (PyMuPDF) mean for Dangerzone:
    • Improvements in first conversion stage (Document to Pixels):
      • Less bugs: No need to call subprocess and potentially parse stdout
      • Less traceability: PyMuPDF does not need to write pixels to disk, as pdftoppm does (TODO verify claim)
      • By removing external software dependencies it makes it possible to have the 2nd conversion stage run on the host. Not only does this offer potential speed-ups, but critically it opens up the pathway for dynamically downloading tesseract-OCR models as needed by the user. To implement it without this change we'd need to download them on the client and then send them to the container at runtime, while with PyMuPDF on the client we can download them and just run them all on the host.
    • We can potentially do second stage conversion (Pixels to PDF) in the host with native Python bindings, without the need for the second container. Benefits:
      • Performance: Both stages can happen in parallel
      • Code simplicity: Qubes and container isolation providers will be using the same code for the second stage of the conversion.
      • Less bugs: No need for buffer space (ENOSPC scenarios), or mounting files to containers (SELinux issues)
    • TODO:
      • Measure performance between previous code and current iteration:
        • Test scenario: We should benchmark both DZ on a few large files and lots of small one.
        • Platform: Prioritize benchmarking the container isolation provider.
      • Code changes: We should have an idea of how this change simplifies our codebase (e.g, in LoC?)
      • Visual Diffing: Does PyMuPDF change the way the PDFs are rendered?
      • Does PyMuPDF bring any security benefits for the first stage of the conversion?
      • Can we use the PyMuPDF calls in all of our supported distros (oh hai Ubuntu Focal)?
  • Backporting PySide6:
    • Thankfully, PySide6 ships the Qt libraries within the Python module, so we don't rely on system ones.
    • When backporting PySide6, how can we make sure that the code we have downloaded (Python wheel?) is the one that Qt people have shipped, without building it ourselves?

Wednesday - 2023-10-16

deeplow:

  • QA fedora 39 (still in beta)
  • build container images on macOS

Wednesday - 2023-10-04

Alex:

  • Sent and merged a fix to catch exceptions in the second stage of Qubes conversion (#568)
  • Sent and merged a fix to properly support "dark mode" in our user dialogs (#569)
  • Bumped our Poetry deps (#570)
  • Bumped our version to 0.5.0 (#571)
  • Reviewed the notarytool PR (#558)
  • Started QA for 0.5.0
    • We have stumbled on some Qubes issues for which we will open GitHub issues.

Monday - 2023-10-02

Deeplow:

  • rebased and merged "Handle errors in Qubes" (#546)
  • let 0.5.0 running large test in background over the weekend
  • TODO: commit large test results

Alex:

  • Helped with the merging of #550. Required a bit of Git plumbing.
  • Addressed some comments in #561 and merged it.
  • Re-opened #430 to highlight some issues that we haven't addressed.
  • Re-opened the HWP support PR, since we will go with alpine:latest instead of edge.

Discussion:

  • Release 0.5.0:
    • Revert the HWP support for Apple Silicon @apyrgio
    • Merge the altool PR once we have tested it during the release @apyrgio
    • Error handling:
      • Catch the OCR error on Qubes @apyrgio
      • Move some error handling scenarios to the stabilization effort (stable qubes integration) @deeplow
      • Change the out of RAM message with a more generic one: @deeplow
        • "Could not start a disposable qube for the file conversion. More information should have shown up on the top-right corner of your screen."

Wednesday - 2023-09-27

Alex:

  • Merged the PRs for switching to tessdata-fast and for adding installation instructions for Qubes (#548 and #543)
  • Sent a PR (#551) for detecting Qubes errors when we receive EOF, and merged it.
  • Found a nasty bug (#560) that could potentially lead to leaving the last page of a document out.
  • Found an issue with client-side timeouts in the current Qubes implementation (#557)
  • Sent a PR (#561) on fixing #560 and #557
  • Reviewed #554, #556, #546

Deeplow:

  • reviewed "Detect if we received EOF due to a command that failed" #551
  • follow up on PR "Better dark mode support" (#550)
  • open PRs for minor Qubes conversion issues (#554 and #556)
  • migrated macOS notarization process and open PR for it (#558)
  • TODO: Handle errors in Qubes #546
  • TODO: Merge the Dark Mode PR
  • TODO: review "Stream page data in real time" #561

Monday - 2023-09-25

Deeplow:

  • wrapped up Qubes error handling PR (#546)
  • look into contributor PR "Better Dark Mode Support" #550
  • TODO: update notary tools
  • TODO: finish reviewing #550 (dark mode PR)

Alex:

  • Merged 3 PRs
  • Worked on making every read function check the exit code when it receives EOF.
  • TODO: Review the error handling PR

Discussion:

Wednesday - 2023-09-20

Alex:

  • Sent a PR for slimming down our OCR models
  • Reviewed the WIP error handling PR
  • TODO: Send a fix for detecting exit code of process whenever we reach EOF in our read_* helpers.

Deeplow:

  • reviewed pending PRs:
    • OCR parameters passing (#544)
    • RPM packages from an RPM SPEC (#538)
    • installation instructions for Qubes (#543)
    • qubes: Add client-side timeouts (#547)

Discussion:

  • We're running into an issue where, if some error happens early in the conversion, then it won't detect that as the failure reason. Rather, it will detect the fact that it didn't receive the expected number of pages. So in that case we also need to check the exit code to understand the cause. @apyrgio will work on this.
  • We were thinking about having a server-side limit of 65,535 pages to detect early failures -- the max value an unsigned 16-bit int can have (2^(8 + 8) - 1). However, this approach is a bit useless because it'll do all the work server-side just to hit a 10K page limitation on the client. We concluded that it's best to have the same 10K limit in the server and the client.
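
The two points above (consulting the exit code when we hit EOF, and enforcing the same page limit on both sides) could look roughly like the sketch below. This is only an illustration: the 2-byte big-endian encoding of the page count, the `get_exit_code` callback, and the names `read_page_count`, `ConversionFailed` and `PageLimitExceeded` are assumptions, not the actual Dangerzone code.

```python
import io
import struct
from typing import BinaryIO, Callable

MAX_PAGES = 10_000  # the 10K limit discussed above, shared by server and client


class ConversionFailed(Exception):
    """The remote process ended before sending the data we expected."""

    def __init__(self, exit_code: int) -> None:
        super().__init__(f"conversion process exited with code {exit_code}")
        self.exit_code = exit_code


class PageLimitExceeded(Exception):
    """The reported page count is above MAX_PAGES."""


def read_page_count(stream: BinaryIO, get_exit_code: Callable[[], int]) -> int:
    """Read an unsigned 16-bit page count (assumed encoding: big-endian)."""
    raw = stream.read(2)
    if len(raw) < 2:
        # EOF: report the process exit code instead of "missing pages".
        raise ConversionFailed(get_exit_code())
    (pages,) = struct.unpack(">H", raw)
    if pages > MAX_PAGES:
        raise PageLimitExceeded(f"{pages} pages > {MAX_PAGES}")
    return pages


if __name__ == "__main__":
    # A well-formed stream that announces 42 pages.
    print(read_page_count(io.BytesIO(struct.pack(">H", 42)), lambda: 0))
```

In a real client, `get_exit_code` would probably wrap something like `process.wait()` on the conversion process, so the failure reason comes from the process itself rather than from the truncated stream.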

Monday - 2023-09-18

Alex:

  • Review dangerzone#537
  • Fixed some issues deeplow commented on (#538, #543, #544, #547)
  • Send a PR for client-side timeouts (dangerzone#547)
  • Wrote an issue about slimming down our language models (dangerzone#545)

Monday - 2023-09-11

Deeplow:

  • took a look at: "Add installation instructions for Qubes" (#543)
    • question: language #431 to update RPM packaging #543 doubts about the "alpine:edge" (#541 #542) https://wiki.alpinelinux.org/wiki/Edge. We run the risk of not having the package in time or forgetting that we even have ":latest"

Alex:

  • Updated the items for the 0.5.0 roadmap.
    • Removed the stretch goals that wouldn't cut it for this release, did my best to set due dates for each task.
  • Reviewed and merged dangerzone#451, which adds HWP support on MacOS.
  • Reviewed a PR for Qubes error handling (dangerzone#537)
  • Sent a PR with formal installation instructions for Qubes, as well as some updates on our Qubes RPM packaging PR.
  • Sent a PR that improves the way we pass OCR parameters during sanitization (dangerzone#544)
  • TODO: Update the PR with the installation instructions to make the Qubes RPM include every tesseract-langpack-* package.
  • TODO: Recap internally the main points of the RPM packaging story
  • TODO: Weigh in on the 1.0.0 vision
  • TODO: Check the discrepancy on the size of RPM language models vs the downloaded language models.

Discussion:

  • alpine:latest (#540)
    • deeplow agrees with the assessment for ":latest"
  • 1.0.0 vision
  • OCR languages
  • Let's take a look at the proposed roadmap, see the issues that each of us can work on parallel.
  • mention GL discussion on slack?

Wednesday - 2023-08-30

Alex:

  • Continue working on RPM packaging: #298, #431, #514
    • I have managed to follow pretty much the latest conventions regarding SPEC files, and I have created a fully working one, albeit hacky.
    • I have a more polished version that lacks only the following:
      • Includes our assets (/usr/share) in the final RPM
      • Run a post install script that fixes the stale .egg-info directories
      • Allow it to produce a -qubes.rpm
  • TODO: Verify that .dist-info is the latest recommendation by Python, and not .egg-info
  • TODO: Consider adding temporary directory for RPM builds in ~/.local/dangerzone-dev/rmp-build//...

Deeplow:

  • post on forum
  • continue work on Qubes error handling

Discussion:

  • implement client-server shared error codes with an increment of 128

Monday - 2023-08-28

Alex:

  • Taking a look at our bdist_rpm alternatives
  • Prioritized items for the 0.5.0 milestone
  • TODO: Clear up the 0.5.0 milestone from the stretch goals
  • TODO: Check how SecureDrop Workstation creates their RPM files and incorporate some of the logic in Dangerzone.
  • TODO: Review the Qubes alpha instructions PR

deeplow:

  • continue RPM packaging issue (#298)
  • sync 0.5.0 release scoping in prep for planning meeting
  • follow up on dangerzone.rocks not updating
  • discuss with user issue where Dangerzone wouldn't start (#514)
  • create issue about Dangerzone not showing on Gnome Software (#531)
  • drafting up post for Qubes Forum about DZ alpha (Marek's suggestion)
  • upgrading dev environment to Fedora 38
  • test and review Qubes alpha setup instructions (related to the forum post)
  • TODO: Publish Qubes forum post
  • TODO: error handling on Qubes and timeouts

Wednesday - 2023-08-23

deeplow:

  • investigate /tmp space shortage (see #518)
  • found upstream issue with pdftoppm https://github.com/freedomofpress/dangerzone/issues/524
  • merge simple and approved PRs (#510, #509, #508)
  • implement suggestions in PR 'Propagate "update check" prompt to UI checkbox' (#515)
  • final review and approval of: "Post-release fixes for MacOS issues" (#523)
  • Address feedback in large tests PR and merge it
  • Continue RPM packaging work (#298)

Alex:

  • Reviewed PR #386
  • Test uninstall situation in Debian
  • Created issues for page size problems and test_update_error
  • Milestone 0.5.0 discussion

Discussion:

  • SELinux violation (#517): We haven't managed to trigger an SELinux violation yet. Will try removing the :Z flag.
  • RPM packaging (#514): We've made progress there, but we don't have a package out yet. We need to pair on this problem to fix some remaining issues.
  • Announce Qubes Alpha integration in the Qubes forum:
    • Also start a GitHub discussion for this feature.
    • Give a list of issues that we will work on in the next few months.

Monday - 2023-08-21

Alex:

  • Reviewed PRs #515, #510, #509, #508
  • Added fixes to my PR (#523)
  • TODO: Review #386
  • Tested uninstalling Dangerzone on Debian and it does not leave stale folders behind.
    • Actually, I never tested with pycache folders. I guess I need to retest...
  • TODO: Check if Dangerzone originally used the tmp directory, and then moved to the config one.
  • TODO: Create an issue for fixing the page size problem.
  • TODO: Open an issue for test_update_error that fails randomly

Deeplow:

  • looking at packaging RPM
  • TODO create issue for dark mode on macOS (black)
  • TODO try to trigger an SELinux violation (:z argument)
  • TODO continue RPM packaging work

Discussion:

  • What is the size of a single page?

    • 1 A4 page - 72 DPI = 595 x 842 pixels
    • 1 A4 page - 150 DPI = 1240 x 1754 pixels

    We need to account for 3 color channels though (RGB), meaning that the final size is:

    • 1 A4 page - 72 DPI = 3 x 595 x 842 bytes = 1.43 MiB
    • 1 A4 page - 150 DPI = 3 x 1240 x 1754 bytes = 6.22 MiB

    If we want to fill 1 GiB of RAM, we need roughly 715 pages (72 DPI) or 165 pages (150 DPI). Note that at some point, we compress the end product, so there will be two files in the same tmpdir (the united ones, and the compressed ones).

    • What can we do here?
      1. Can we stream the pages from container 1 to container 2, and call the programs that "unite" them on the stdin or sth?
        • This would be optimal, but it requires an architecture that we don't have right now (2 containers speaking to each other).
      2. Can we compress each page that we receive from container 1 (e.g., RGB to PNG)?
        • The streaming pages feature is a hard requirement for this. Else, we'd need to use an inotify-like mechanism, which is typically not cross-platform.
        • PNG-compression improvements: 30x - 40x for document types, 2x for photos.
      3. Can we store the pages in a data dir? (reverting to the way it was before)
        • Not the best option, as this will leave traces of the file in the computer, especially if the original file existed in a tmp dir or an external device.
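
The page-size arithmetic above can be reproduced with a few lines of Python. The sketch assumes one byte per colour channel (8-bit RGB) and uses the pixel dimensions quoted in the notes; the function names are made up for illustration. Working from exact byte counts rather than the rounded MiB values, the 72 DPI case comes out at 715 pages per GiB.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def rgb_page_bytes(width_px: int, height_px: int, channels: int = 3) -> int:
    """Size of one uncompressed page, assuming one byte per channel."""
    return width_px * height_px * channels


def pages_to_fill(total_bytes: int, page_bytes: int) -> int:
    """Smallest number of pages whose combined size reaches total_bytes."""
    return math.ceil(total_bytes / page_bytes)


if __name__ == "__main__":
    for dpi, (width, height) in {72: (595, 842), 150: (1240, 1754)}.items():
        page = rgb_page_bytes(width, height)
        print(f"{dpi} DPI: {page / MIB:.2f} MiB/page, "
              f"{pages_to_fill(GIB, page)} pages per GiB")
    # 72 DPI: 1.43 MiB/page, 715 pages per GiB
    # 150 DPI: 6.22 MiB/page, 165 pages per GiB
```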
  • bdist_rpm deprecation:

    • RPM packaging commands can take a .toml as an argument.
      • RPM can now fetch dependencies from pyproject.toml. But our toml file has sections that are Poetry-specific ([tool.poetry.dependencies]), so RPM packaging commands cannot recognize them. Maybe there is a PEP for this.
      • We need to handle removing files from previous installations, even if the RPM that we produce is correct. We need a pre-installation script that will handle removing the stale .egg-info dirs, and it needs to work even if no such dirs are present. Also, it needs to exist for quite a lot of time in our codebase, because there may be users out there who forget to update for a while, and we need to cater to them as well.
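
The cleanup requirement for stale .egg-info dirs (remove them if present, succeed if not) can be illustrated as follows. A real RPM scriptlet would most likely be a small shell snippet in the SPEC file; this Python sketch only shows the idempotent behaviour we want. The function name, the `dangerzone-*.egg-info` glob pattern, and the idea of passing the site-packages directory explicitly are assumptions.

```python
import shutil
from pathlib import Path
from typing import List


def remove_stale_egg_info(site_packages: Path, package: str = "dangerzone") -> List[Path]:
    """Remove leftover <package>-*.egg-info entries; do nothing if none exist."""
    removed = []
    for path in sorted(site_packages.glob(f"{package}-*.egg-info")):
        if path.is_dir():
            shutil.rmtree(path)
        else:
            # Some old installations ship .egg-info as a single file.
            path.unlink()
        removed.append(path)
    return removed
```

Calling it a second time on the same directory returns an empty list and raises no error, which is the property the pre-installation step needs.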
  • I saw a screenshot in this issue (https://github.com/freedomofpress/dangerzone/pull/508) and I wondered, do we handle dark mode correctly? Reminder that we forced the font color to be black in https://github.com/freedomofpress/dangerzone/pull/487

  • I think the test test_update_error is flaky. I've seen it fail a couple of times with "signal not emitted within 5000ms". We should open an issue.

  • We need to trigger the SELinux bug without Dangerzone. Can we use chcon / restorecon for that?

    • The SELinux alert browser is probably not stock Fedora. So the user in #517 could have edited something else as well.

Wednesday - 2023-08-16

Deeplow:

Alex:

  • Started looking on some Fedora-related issues (#514, #517, #518)
  • Opened some issues/PR for problems that we found during release.

Monday - 2023-08-07

Deeplow:

  • reviewed and merged all pending PRs for 0.4.2 release
  • continued work on large-test-docs PR to see if we could leave it running over the weekend (wasn't possible, too much rebasing work left)
  • TODO: Create release artifacts for Windows
  • TODO: Create release artifacts for Intel Mac

Alex:

  • TODO: Create release artifacts for M1 Mac
    • Ping deeplow once it's free, so that he can run the large test
  • TODO: Create release artifacts for Ubuntu/Debian/Fedora

Discussion:

  • We missed CI testing on MacOS M1 platforms
  • We missed GUI testing on installed Debian packages (e.g., PySide2)
  • We should ensure that Poetry runs from the latest Python version.
  • We should ensure that the container.tar.gz is fresh and exists, when building the final Windows / MacOS artifacts.

Wednesday - 2023-08-02

Deeplow:

  • TODO QA on windows but first check if hwp works there
  • TODO QA on Ubuntu
  • TODO QA on Debian

Alex:

  • TODO QA on MacOS x86 / Apple Silicon
  • TODO QA on Fedora

Monday - 2023-07-31

Deeplow:

  • Look into flatpak situation
  • Merge 2 approved PRs
  • HWP: extra docs folder and base64 to avoid accidental opening
  • TODO: review "Improve the UX of the update check flow" #490
  • TODO debug issue making tests where CI ran out of space

Alex:

  • Sent a PR for various UX improvements (dangerzone#490)
  • Sent some small PRs for minor improvements (dangerzone#486,487)
  • Looked into our Flatpak situation a bit (dangerzone#45)
  • Ready to send a PR for sanitization logic
  • TODO: FIXUP https://github.com/freedomofpress/dangerzone/issues/489

Wednesday - 2023-07-26

Alex:

  • Merged the update notifications PR (dangerzone#466)
  • Merged the PR that bumps our Python deps and allows us to run more than one Qt CI test (dangerzone#483)
  • Merged a user contribution for a long-standing warning (dangerzone#481)
  • Debugged lots of CI failures in the meantime, that are Qt-related (e.g., dangerzone#480)
  • TODO: Sunset Ubuntu Kinetic
  • TODO: Fix fonts in Fedora
  • TODO: Research on the flatpak situation
  • TODO: Add tests for sanitization logic (2023-07-santize-progress-update)
  • TODO: Open a Qt issue for PySide6
  • TODO: Send a PR to the extrepo devs
  • TODO: Convert https://github.com/apyrgio/homebrew-reprepro to actual PR

deeplow:

  • rebase and fix CI issues with PR "Add "change document selection" button" #439
  • Taking another look at the HWP PR to see what's missing
  • Rebuild broken dev environment (something podman related got broken)
  • review PR "Bump python dependencies" #483
  • TODO Look into flatpak situation
  • TODO Merge 2 approved PRs
  • TODO HWP: extra docs folder and base64 to avoid accidental opening

Monday - 2023-07-24

Alex:

  • Wrap up update notifications PR
  • Add extra GUI tests for full coverage
  • Bump Python dependencies
  • Remaining fixmes for updater notification PR:
    • Detect Homebrew installations: we cannot detect that an application has been installed via Homebrew. Open an issue for this
    • What if users click on X? TODO: I need to open a PR for this.
  • TODO: Update the update notifications PR with proper fixes for the CI tests.
  • TODO: Send the PR for the rest of the GUI tests
  • TODO: Send the PR for the new Python dependencies

Deeplow:

Discussion:

  • Open threads for releasing 0.4.2:
    • Container logging:
      • Defer to after 0.4.2
      • Qubes support required major revisions
      • Reworked the training part of the testing
      • TODO: Factor out the string sanitization logic from this PR. Sanitize the strings in print_progress in our current code. Pass it optionally through repr().
    • Updater PR:
      • TODO: Add proper commit for fixing our CI tests (use -g instead of offscreen, remove pytest-wrapper)
    • Change document selection:
      • Rebase it on top of the main branch, once the updater PR gets merged. We expect that all tests will be green.
    • Use containers in Qubes:
      • Follow up on the review comments
    • HWP PR:
      • TODO: Remove the files without extensions
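
For the string sanitization TODO under "Container logging", a minimal sketch could look like this. It is not the code from the PR: the allow-list of printable ASCII characters, the `?` placeholder, and the function name `sanitize_untrusted` are assumptions chosen for illustration.

```python
import string

# Assumed allow-list: printable ASCII without control characters.
SAFE_CHARS = set(string.ascii_letters + string.digits + string.punctuation + " ")


def sanitize_untrusted(text: str, use_repr: bool = False, placeholder: str = "?") -> str:
    """Make attacker-controlled text safe to print in a terminal or log.

    With use_repr=True the text goes through repr(), which escapes control
    characters but keeps them recognizable for debugging. Otherwise, every
    character outside the allow-list is replaced by a placeholder.
    """
    if use_repr:
        return repr(text)
    return "".join(c if c in SAFE_CHARS else placeholder for c in text)


if __name__ == "__main__":
    progress = "Converting page 1/3\x1b[2J"  # contains a terminal escape
    print(sanitize_untrusted(progress))
    print(sanitize_untrusted(progress, use_repr=True))
```

Note that `repr()` leaves printable non-ASCII characters untouched; `ascii()` would be the stricter choice if that matters.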

Wednesday - 2023-07-19

Deeplow:

  • caught up with over-the-weekend contributors comments
  • continue work on large docs PR
  • TODO do minor QT-related TODOs on "DZ Update notification" PR (#466)
  • TODO feedback in other PRs
  • TODO update

Monday - 2023-07-17

Deeplow:

  • ask: @apyrgio to comment on GPL situation https://github.com/freedomofpress/dangerzone/pull/460#issuecomment-1637200125
  • review of a paper on PDF redaction, relevant for advising on the use of DZ for post-redaction mitigation. https://petsymposium.org/2023/files/papers/issue3/popets-2023-0069.pdf

  • address feedback in "DZ Update notification" PR (#466)
  • follow up on HWP office issue (#460)
  • open issue about adding CJK fonts
  • prep machines for QA (found some issues)
  • TODO: check HWP PR conflicts with the Qubes conversion (#460) and open issue for adding that
  • address feedback large docs test PR (#386)

Alex:

  • Reviewed some PRs for 0.4.2 (#467, #464, #450, #439, #386)
  • Still working on fixing some issues with the update notifications PR (#189), based on deeplow's review comments
  • Researched a bit what's the case (licensing-wise) with regards to H2Orestart (GPL) and Dangerzone (MIT)
  • Took a look at our Homebrew situation, due to a user report (#470). Haven't found something breaking so far, but we're talking with the user to understand what's breaking on their side.
  • Merged a PR for SIP instructions (#401)

Discussion:

  • Homebrew issues:
  • 0.4.2:
    • Updater notifications PR:
      • We need to store the error somewhere, instead of retrieving it from the settings. We can either store it in the QAction, or in the MainWindow.
      • The tests for the updater settings have lots of updater logic (updater fixture for instance) so maybe we can keep them within the test_updater.py module.
    • Enable container logging PR:
      • Maybe it makes sense, in the server side, to grab the output from the commands verbatim, and then do any processing in the client side
    • HWP support PR:
      • Summarize the licensing situation and mention it's not a blocker.
      • Check that in Qubes, missing HWP support is not a big issue (doc conversion will fail)

Monday - 2023-07-10

Deeplow:

  • Found issue leading to recursion in QT tests https://github.com/freedomofpress/dangerzone/actions/runs/5457217733/jobs/9930987271?pr=466
  • reviewed https://github.com/freedomofpress/dangerzone/pull/466
  • continued work on Hancom Office PR
  • checking windows situation for QA (opened PR)
  • TODO finish setting up manual QA system for windows

Discussion:

  • TODO book UX meeting
  • QA and release: bump python and python deps for mac and windows (add to release procedure)

Monday - 2023-07-03

Deeplow:

  • review and propose a patch to hancom PR
  • TODO: look at https://wiki.documentfoundation.org/Documentation/DevGuide/Extensions

Alex:

  • Finalizing initial work on update notifications PR

Wednesday - 2023-06-28

Deeplow

  • Review scope of UXD work
  • Open PRs for 2 new platforms (Debian Trixie and Ubuntu Lunar)
  • Start work on falling back to using containers on Qubes by default (we don't want any Qubes users expecting it to use containers to be surprised) #451
  • Catch up on contributor's PR for supporting Hancom Office files (see #460)
  • check if containers isolation works in Qubes

Alex:

  • Reviewed PRs for Debian Trixie and Ubuntu Lunar
  • Reviewed and merged PR for reducing container image size by a first-time contributor (thanks!) (see #459)
  • Started reviewing a PR for Hancom file support (see #460)
  • Started working on the update notifications issue (see #189)
  • TODO: Check Hancom issue with Libreoffice

Discussion:

  • hancom office files support
    • Security consideration: Do not pre-load this extension for every filetype.
    • Size consideration: Fonts take ~90 MiBs of space, probably bitmap, might be worth checking ttf
    • (optional) Security consideration: Have an "experimental" flag for Hancom files.

Monday - 2023-06-26

deeplow:

  • rebase progress reports PR (#450)
  • finishing up rebasing large docs test on top so it works on Qubes (plus many improvements)
  • TODO wrap up bulk doc test (#386)
  • TODO (?) improve GUI test coverage (make QA easier)

Alex:

  • We merged alpha Qubes integration!
  • Prepared an internal presentation for the above item.
  • Added a plan for update notifications (dangerzone#189)
  • Started working on a plan for the 0.4.2 milestone.

Discussion:

  • Leftovers for Qubes integration presentation
  • Set dates and work items for 0.4.2
  • UX for notification updates:
    • No need to terminate the updater thread (if it's running in the background) when the user unclicks "Check for updates"
    • Threshold for update checks can be 12 hours
    • We can add a setting icon (hamburger/gear) in the top right corner, that can be always visible.
      • We will follow the Tor Browser notification model.
      • This icon will have a notification bubble in case of a new update or error.
      • Clicking on this icon will open a dropdown menu.
      • This menu will have the following items:
        • "Check for updates" with a slider (on/off)
        • "Update available" / "Update error", accompanied by a green/red notification bubble.
          • Clicking on this menu entry will open a pop-up with info.
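
The 12-hour threshold for update checks boils down to a small predicate. The sketch below is hypothetical: the function name, the `enabled` / `last_check` parameters (which could come from the settings file), and the use of Unix timestamps are all assumptions.

```python
import time
from typing import Optional

UPDATE_CHECK_COOLDOWN_SECS = 12 * 60 * 60  # 12 hours, as discussed above


def should_check_for_updates(
    enabled: bool, last_check: Optional[float], now: Optional[float] = None
) -> bool:
    """Decide whether to contact the update server.

    enabled:    state of the "Check for updates" toggle
    last_check: Unix timestamp of the last check, or None if never checked
    now:        current Unix timestamp (injectable for tests)
    """
    if not enabled:
        return False
    if last_check is None:
        return True
    if now is None:
        now = time.time()
    return now - last_check >= UPDATE_CHECK_COOLDOWN_SECS
```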

Wednesday - 2023-06-21

deeplow:

  • TODO record dangerzone-qubes demo video
  • TODO give feedback on presentation
  • TODO continue bulk doc test (based on Qubes PoC PR)

Alex:

  • Wrapped up the final branch for the Qubes PoC
    • Mainly made the Git history more sensible.
  • TODO: Work on the Qubes presentation
  • TODO: Propose plan for 0.4.2 or 0.5.0

Discussion:

  • capturing all output for debug purposes
  • send all relevant data via stdout (pixels)
  • send all debug info via stderr. Currently captured only in development. In the future, add a Qubes-like --pass-io option to the CLI that shows the sanitized attacker-controlled stderr, so we can understand why a certain document failed. We'll cap this at a reasonable amount.
  • in production, for the moment, we'll be using exit codes to identify the step in which it failed
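
A rough sketch of this split (data on stdout, capped debug output on stderr, exit code as the failure indicator) using only the standard library. The 4096-byte cap, the `debug` flag, and the function name `run_conversion` are assumptions for illustration, not the real implementation.

```python
import subprocess
import sys
from typing import List, Optional, Tuple

MAX_DEBUG_BYTES = 4096  # hypothetical cap for attacker-controlled stderr


def run_conversion(cmd: List[str], debug: bool = False) -> Tuple[bytes, Optional[str], int]:
    """Run a conversion command and return (stdout data, debug log, exit code).

    stdout carries the binary payload (pixels). stderr is only captured in
    debug mode, truncated to MAX_DEBUG_BYTES and decoded defensively.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE if debug else subprocess.DEVNULL,
    )
    debug_log = None
    if debug:
        debug_log = proc.stderr[:MAX_DEBUG_BYTES].decode("ascii", errors="replace")
    return proc.stdout, debug_log, proc.returncode


if __name__ == "__main__":
    child = "import sys; sys.stdout.buffer.write(b'\\x00\\x01'); sys.exit(3)"
    data, log, code = run_conversion([sys.executable, "-c", child], debug=True)
    print(data, repr(log), code)
```

`subprocess.run()` still buffers the whole stderr in memory before truncating it, so a production version would need to cap the read itself. The debug text should also be sanitized before it is shown to the user.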

Monday - 2023-06-19

Deeplow:

  • investigate the tails situation following private user question
  • Finish addressing Qubes PoC PR feedback
  • TODO: Work on progress reports (#429)

Alex:

  • Continued working on Qubes PoC.
  • TODO: Finalize the Qubes PoC branch
  • TODO: Scope the update notification feature (#189)
  • TODO: Create an issue for using containers by default on Qubes
  • TODO: Add support for Ubuntu Lunar (23.04) and for Debian Trixie (13)

Wednesday - 2023-06-14

Alex:

  • Reviewed the Qubes PoC
  • Reviewed the large tests PR

Deeplow:

  • address review comments for Qubes PoC

Monday - 2023-06-12

deeplow:

  • opened PR for Qubes (#437)
  • working on "add change docs selection button" issue (#428)
  • migrated user stories to github. Hopefully, this can be made public soon.
  • TODO: finish "change docs" PR
  • TODO: investigate updating

Alex:

  • Sent review note for Qubes PR
  • TODO: Reply to deeplow's comments on Qubes OS
  • TODO: Quick review comments for huge doc tests

Wednesday - 2023-06-07

Alex:

  • Updated the code for the server-side conversion of Qubes
  • Fixed a CI race for Debian Bullseye
  • Ignored a CVE in our security scanner that does not apply to us

deeplow:

  • review "ci: Fix CI races in Debian Bullseye tests" #435
  • organize some notes on admin API
  • work on Add "Change selection" button near "x document(s) selected" dangerzone#428
  • TODO: open PR for qubes-poc and review each other

Monday - 2023-06-05

Alex:

  • Presented Dangerzone at Dataharvest to ~20 people. Went pretty smooth and got interesting feedback.
  • Finished the review of the Qubes integration PoC branch. Will share the review comments soon.
  • TODO: Check out comments on asyncio in the server-side wrapper.
  • TODO: Send comments for Qubes integration PoC

deeplow:

  • reviewed Qubes Admin API
  • Add "Change selection" button near "x document(s) selected" dangerzone#428
  • TODO: Send PR for selection, once Alex comments on asyncio

Wednesday - 2023-05-31

deeplow:

  • Looking into Qubes Admin API for dangerzone distribution methods
  • review https://github.com/freedomofpress/dangerzone.rocks/pull/16
  • move Qubes design doc to the github wiki
  • add user story for security slider
  • Open Dangerzone-Qubes issues:
    • Progress reporting
    • Timeouts
    • Exception handling + hardening
    • packaging (OCR + libreoffice + ...)
  • TODO: document / post summarizing Qubes API findings and limitations
  • TODO: start taking a look at the UX issues
  • TODO: talk about potential need to split container.convert() into 2 stages (for file preview)

Alex:

  • Fixed a minor Qubes issue wrt Python's bindings for libmagic.
  • TODO: Finalize the Dataharvest presentation
  • TODO: Review the Qubes integration POC branch

Monday - 2023-05-29

Deeplow:

  • polish most of the dangerzone Qubes integration (still not done yet)
  • work on user stories and creation of multiple issues following that work

Alex:

  • Worked on the presentation for Dataharvest 2023
  • Merged the PRs that fix our CI tests
  • Weighed in on some GitHub issues for post-IJF user stories
  • Debugged an issue with py3-magic in Fedora environments.
  • TODO: Work on the server-side integration for Qubes.
    • Fix the Python magic issue
    • Merge dz.Convert with the dangerzone wrapper code.
    • Make sure that asyncio works on the server-side conversion.

Discussion:

  • large doc tests on Qubes (https://github.com/freedomofpress/dangerzone/pull/386/)
    • we have to somehow change the logic to accommodate the fact that we stream pages and there's no space for sending json data in between
    • we can probably just send the data at the end of the conversion as json. The client ignores this if not in debug mode

Wednesday - 2023-05-24

Deeplow:

  • Review pending pull requests
  • continue work on Qubes PoC
  • Go through user research once again and map it to user stories
  • Idea: restricted web service - having a Dangerzone web service without file upload (only select doc from URL). This somewhat mitigates the risk of having people upload sensitive documents https://github.com/freedomofpress/dangerzone/issues/110#issuecomment-1560534886

Monday - 2023-05-22

deeplow:

  • investigate OCR language issues that blocked CI and opened PR #418
  • merge contributor's typo patch #416
  • reviewed open PRs from Alex
  • finish first stage of Qubes PoC
  • drafting installation instructions / packaging stuff for Qubes PoC

Alex:

  • Proposed a fix for some Tesseract issues.
  • Fixed a racy test for Debian Bullseye

Discussion:

  • Qubes PoC next stages
  • How do we homogenize the code
    • container/dangerzone.py will be split in two:
      • dangerzone/conversion/pdf_to_pixels.py:convert
      • dangerzone/conversion/pixels_to_pdf.py:convert
    • Move Dockerfile to the root directory.
      • Make our container image build scripts use the dangerzone/ dir as the build context.
  • Move the design document to the wiki
  • Tesseract:
    • Add a test for checking if the trained data match our language list
    • Consider adding a safeguard for making languages not selectable if the trained data do not exist.
  • Debian Bullseye: Polish the PR.
  • Create user stories as GitHub issues
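
The Tesseract check mentioned above (trained data vs. our language list) could be sketched like this. Tesseract stores one `<code>.traineddata` file per language; everything else here is hypothetical, including the function name and the shape of the language mapping (display name to Tesseract code).

```python
from pathlib import Path
from typing import Dict, Set


def missing_trained_data(languages: Dict[str, str], tessdata_dir: Path) -> Set[str]:
    """Return the language codes that have no <code>.traineddata file."""
    available = {path.stem for path in tessdata_dir.glob("*.traineddata")}
    return set(languages.values()) - available
```

A test would assert that the returned set is empty for the shipped image. The same function could drive the proposed safeguard, by making any language in the returned set not selectable.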

Wednesday - 2023-05-17

Alex:

  • First draft of the Qubes integration design document
  • Created some issues for the Qubes integration
  • Merged Fedora 38 support
  • Update security scanning PR
  • Sent a PR for a CSS issue on dangerzone.rocks

Deeplow:

  • development of Qubes integration

Discussion:

  • Binary protocol: should we use asyncio? If we go full asyncio, we may need to greenify Qt as well. It may be worth making only the conversion async, and the rest of the code sync. Python has the ability to run async functions on a specific thread, and wait for it.
  • qrexec: Can we use the Python API instead of the executable?
  • we'll need the user to create a disposable template for dangerzone. Otherwise they'll use the fedora-37-dvm, which is networked. We can check the networking of the dispVM on the first start just as a sanity check to make sure the user didn't shoot themselves in the foot
  • ultimately we should have streaming integration with Docker as it would make the user IDs easier to handle, which has caused us problems in the past. This won't be a concern for now.
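
On the asyncio point: keeping the rest of the code synchronous while running only the conversion as a coroutine is possible with `asyncio.run_coroutine_threadsafe()`, which submits a coroutine to an event loop running in another thread and returns a future that the caller can block on. A minimal sketch, where `AsyncRunner` and `convert_page` are made-up names standing in for the real conversion code:

```python
import asyncio
import threading


class AsyncRunner:
    """Runs coroutines on a dedicated event-loop thread for sync callers."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout=None):
        # Submit the coroutine to the loop thread and block until it finishes.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def convert_page(page_number: int) -> str:
    """Stand-in for an async conversion step."""
    await asyncio.sleep(0.01)
    return f"page {page_number} converted"


if __name__ == "__main__":
    runner = AsyncRunner()
    print(runner.run(convert_page(1)))
    runner.close()
```

With this shape, the GUI thread never has to run an event loop of its own, so Qt would not need to become asyncio-aware.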

Monday - 2023-05-15

Deeplow:

  • prune branches

Alex:

  • Ran Dangerzone PoC on Qubes
  • Followed up on some PRs

Discussion:

  • Qubes OS:
    • Alpha: Provide a rudimentary integration with Qubes OS. E.g., may not have GUI, or other important (but not security-critical) features, but the core functionality is there.
      • assumption: main qube and disp qube are based on the same template
      • have a qubes isolation provider
      • set the proper isolation provider in Qubes (containers in most platforms, dispVMs in Qubes)
      • figure out if licensing affects the use of Qubes APIs
      • no code in dom0, install can include instructions for dom0
    • Beta: Improve the integration with QubesOS. E.g., add GUI support and other important features, make the installation easier.
    • Stable: Finalize the integration with QubesOS. There should be feature parity in all platforms, and installation on QubesOS must be as easy as possible.
    • Integration Testing
    • SecureDrop integration
      • assumption: main qube and disp qube are based on the same template

Wednesday - 2023-05-10

Alex:

  • Set up Qubes for experimentation
  • TODO: Follow-up on IJF findings
  • TODO: Run Dangerzone PoC on Qubes

Deeplow:

  • review security scanning PR (https://github.com/freedomofpress/dangerzone/pull/405)
  • Open discussion around OCR
  • Rethink the need to have status messages (in fact we don't), which will make the Qubes integration easier
  • reading up on UX user stories in prep for discussion
  • prep Q meeting for DZ update
  • present Qubes (minimal) PoC
  • idea about potential signal service

Monday - 2023-05-08

Alex:

  • Sent the PR for the security scanning of our images (#405)
  • Sent the PR for building our Debian package and installing it in every Debian-based flavor (#406)
  • Sent a PR in yum-tools-prod to fix sig checking for RPM packages.
  • Remaining:
    • Debian Bullseye CI
    • Write down notes from the 0.4.1 release.
    • Clean up stale branches.
    • Review open PRs
  • Dataharvest: Pitch Dangerzone!

Discussion:

  • Juggling priorities:
    • 0.4.1 milestone: We should close it once we review the PR for the large tests. Actually this issue might be a collection of other issues (0.4.0 - 0.4.1 comparison, testing Dangerzone against large dataset, storing performance results per release). We'll make this clearer once we review the PR. For now, we will move this to the 0.5.0 milestone
    • IJF results analysis and User Stories development
      • We need to develop user stories based on user interviews.
      • Figure out lessons learned, convert those to User Stories
      • From User Stories, we will have actionable items.
    • Qubes integration Proof of Concept
      • alex: Install QubesOS and have a quick tour (end of week goal)
      • Current Qubes PoC: write down instructions, somehow make it mergeable.
      • Future steps: Create issues for missing functionality, add those in 0.5.0
    • Open PRs
    • TBD: 0.5.0 priorities
    • Dataharvest: Share presentation, pitch tool.

Deeplow:

  • curate IJF data
  • coordinating IJF insights analysis for this week

Wednesday - 2023-05-03

Deeplow:

  • GPG signing key in Mastodon account
  • IJF interviews analysis
  • sync meeting about IJF interviews
  • write a list of branches to protect

Alex:

  • Continued on the security scanning front.

Discussion:

  • https://github.com/freedomofpress/dangerzone/issues/323
    • branches to protect
      • 397-f38 :: PR #399
      • 334-large-test :: PR #386
      • 221-user-ns :: PR #248
      • wip-gvisor ::
      • qubes-build-inst ::
      • release-0.4.1-grype ::
      • 334-large-test-gh-actions ::
      • qubes-integration-poc ::
    • for Alex to check:
      • qa-windows
      • ci-debs
      • ci-wheel-validation
      • 334-large-test-ssh
      • 352-0.4.0-large-tests

Wednesday - 2023-04-26

Alex:

  • We released 0.4.1!
  • TODO: Send a PR for Grype code scanning and close #222 (release-0.4.1-grype)
  • TODO: Send a PR for CI of Debian-based debs (ci-debs)
  • TODO: Fix Debian Bullseye CI (https://github.com/freedomofpress/dangerzone/issues/388)
  • TODO: Clean up stale branches
  • TODO: Write down configuration notes for MacOS (homebrew fork of reprepro, multi-user Homebrew/Python, run Docker without sudo)
  • TODO: Prepare for release retro
  • TODO: Handle architecture options for Debian packages (https://github.com/freedomofpress/dangerzone/issues/394)
  • TODO: Nuke the PackageCloud repo after #323 is resolved
  • TODO: See improvements for yum-tools-prod CI checks (sig checking for unsigned packages, repo configuration for new distros)

deeplow:

  • add fingerprint to website (https://github.com/freedomofpress/dangerzone.rocks/pull/12)
  • make protocol for rotating keys, listing places where keys need to be updated (mastodon, website)
  • investigate potential integration complexity of Dangerzone into OCCRP's Aleph
  • TODO: write a list of branches to protect
  • TODO: Add GPG signing key in Mastodon account
  • TODO: performance increase toot: You should expect this last Dangerzone release (0.4.1) to be faster overall on larger documents, especially on apple silicon chips (M1, M2), which now runs natively.
  • TODO: resolve https://github.com/freedomofpress/dangerzone/issues/323

Monday - 2023-04-03

Deeplow:

  • release procedure preparation

Alex:

  • QA and release preparation
  • Check some Windows performance issues between GUI and CLI

Wednesday - 2023-03-29

Deeplow:

  • follow up on libreoffice mime type issues
  • started QA on RC2
  • TODO: open issue about making libreoffice security level to max
  • TODO: open issue about pipefail (see discussion)
  • TODO: update macOS signing key ID in code and same for Windows

Alex:

  • Fix Poetry CI issue
  • Merged the Changelog PR for the Keep a Changelog format
  • Fixed incomplete MIME type support for some files.
  • Looked into running NoMachine securely on Linux
  • Started the QA on Windows and Ubuntu 22.10
    • I should have started it in Fedora 37 :-|
  • TODO: Update Debian PR with the new FPF keys
  • TODO: Send a PR for Fedora instructions.
  • TODO: Check out GitHub deployment keys for apt-tools-prod / yum-tools-prod

Discussion:

  • QA issues:
    • set -o pipefail for bash scripts (see podman save | grep situation).
    • Dangerzone shows an empty image when image load fails -> create an issue for that.
    • Bump version to 0.4.1-rc2 in pyproject.toml and share/version.txt
  • In the future, setup LibreOffice security level to Very High.
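
The pipefail point above can be reproduced with a small sketch. It assumes bash is available; `false` stands in for a failing `podman save` and `cat` for the downstream command, so neither is the actual script from the QA run.

```python
import subprocess


def run_pipeline(script: str, pipefail: bool) -> int:
    """Run a bash pipeline and return its exit status."""
    prefix = "set -o pipefail; " if pipefail else ""
    return subprocess.run(["bash", "-c", prefix + script]).returncode


# Stand-in pipeline: the first command fails, the last one succeeds.
script = "false | cat"

status_default = run_pipeline(script, pipefail=False)
status_pipefail = run_pipeline(script, pipefail=True)

print("without pipefail:", status_default)
print("with pipefail:   ", status_pipefail)
```

Without pipefail, bash reports only the status of the last command in the pipeline, so the failure of the first command goes unnoticed.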

Monday - 2023-03-27

Deeplow:

  • review 0.4.1 changelog update PR
  • brief look at mime type issue and propose solution (dangerzone#369)
  • internal demo of Qubes integration PoC (Dangerzone-Qubes-trusted-PDF-hybrid)
  • final work on large tests (local testing only)
  • short dive into how the Wikimedia software handles PDF security - create GH wiki page about "similar projects"
  • TODO: look into the mime types
  • TODO: Commence the QA (again!)

Alex:

  • Got nerdsniped with Tailscale, debugged some issue for a little while
  • TODO: Fix Poetry / CI issues
  • TODO: Send a quick fix for MIME type support.
  • TODO: Commence the QA (again!)

Discussion:

  • QA 0.4.1:
    • Alex: Windows, MacOS M1, Fedora 37, Build .deb
    • Deeplow: MacOS Intel, Ubuntu 22.10, Build 2 .rpms
  • Release candidates:
    • Remove PackageCloud logic for tags.
    • Create a release-0.4.1 branch
    • Merge any PRs there that must go in 0.4.1
    • We can merge PRs on the main branch if we don't want them in 0.4.1
    • Tag the first commit of release-0.4.1 as v0.4.1-rc1
    • Do the QA.
    • Once we tag the final commit as v0.4.1, then we merge release-0.4.1 to the main branch.

Thursday - 2023-03-23

Agenda:

  • release check-in
    • Creds for infrastructure have been disseminated
    • poetry update has broken a package - Alex will open issue
    • OpenSUSE 502 errors
    • Onionshare/Dangerzone installation issue only affects users running old binaries: https://github.com/freedomofpress/dangerzone/issues/153#issuecomment-1479354403
    • Discussion with Micah to come regarding packagecloud / potentially breaking migration to packages.freedom.press which would require manual user intervention
    • Reconvene on Tuesday next week for another release check-in (look out for meeting invite)
  • early discussion about Qubes PoC and some post-0.4.1 planning on that
    • Action: Ro to share some experiences and findings with Alex, deeplow & Allie (optional)
  • (if we have time) a bit of light brainstorming about future DZ capabilities.

Wednesday - 2023-03-22

Alex:

  • Configured a new Dangerzone environment on a Windows VM
  • Worked a bit on dangerzone#153.
    • Found out that it only affects 32-bit versions of OnionShare, so that's a different story.
  • TODO: Resolve dangerzone#153
  • TODO: Fix the MIME type issue

Deeplow:

  • Create github action for SSH debugging github actions via onion service
  • merge languages removal PR
  • provide feedback on "Dangerzone; beyond tools" doc, which evaluates the feasibility of expanding Dangerzone
  • debugging an SSH connection issue with GitHub Actions
  • TODO: research if there is a way to mark repo as deprecated
  • TODO: do a PR for running the bulk tests locally (without the GH actions part)

Discussion:

  • 0.4.1 announcement on PackageCloud:
    • Took a look at differential updates, Debian/Fedora don't seem to support it.
    • Maybe research if there's a way to deprecate an APT/YUM repo.
    • We will handle the removal of the PackageCloud in the installation instructions for packages.freedom.press, as a note for people coming from Dangerzone 0.4.0.
    • Alert box on Dangerzone start. No post-install script, because we want to prompt the user about updating repos
    • Check if we can do "automated" button which will prompt users for their sudo pass and update the repos
    • Our 0.4.1 release will be both on PackageCloud (last time) and on packages.freedom.press.
  • Large test update
    • Maybe reprioritize it after the 0.4.1 work, just for local development.
  • Onionshare issue

Monday - 2023-03-20

Deeplow:

  • created https://github.com/deeplow/action-ssh-onion-service to ssh into github actions runners
  • found a strategy to have Dangerzone run on GitHub Actions, but it feels hacky and creates other problems. Needs more research (probably Alex can tackle this one better).
  • TODO: ping Erik about posting on mastodon about how Dangerzone could have protected against google pixel redaction failures
  • TODO: warn packagecloud users
    • change repo is possible?
    • if not possible

Alex:

  • Basically merged the vast majority of the open PRs.
  • Did some research on how Dangerzone can sanitize other multimedia types
  • Helped a bit with benchmarking and GitHub actions stuff.
  • TODO: Send a fix for the MIME type issue
  • TODO: Open an issue for running Dangerzone as a user other than 1000.

Discussions:

  • Github actions runner. Creating this is not straightforward due to some key differences:
    • The CI user is called runner and has uid=1001 instead of uid=1000. Our env.py (which creates a container to run Dangerzone on) isn't prepared for that.
    • Because of this, podman in the env fails to run due to insufficient permissions (running as user but ~/.local/share/containers is owned by docker)
    • They also have another user called runneradmin
    • Let's deprioritize this task?

Wednesday - 2023-03-15

Alex:

  • Worked on finding the root cause for the faster OCR performance on 0.4.1, compared to 0.4.0.
    • Trigger: During our benchmarks, we could reproducibly demonstrate that the OCR step was significantly faster on 0.4.1. The problem here is that we hadn't changed anything in this particular step for 0.4.1.
    • Test subject: https://workstation.securedrop.org/en/stable/SecureDropWorkstation.pdf
    • Times for converting the above document:
      • 0.4.0 - No OCR : 52s
      • 0.4.0 - With OCR: 250s
      • 0.4.0 - Just OCR: 198s
      • 0.4.1 - No OCR: 20s
      • 0.4.1 - With OCR: 177s
      • 0.4.1 - Just OCR: 157s
    • Hypotheses:
      1. The image size (RGB file) is different between 0.4.0 and 0.4.1, due to pdftoppm. This hypothesis is wrong, since the image width and height are the same (1275x1651) in both versions. Given that RGB is a raw image format (i.e., with no compression), the image size should be the same.
      2. Another command (probably gm) is the one to blame for this difference. This hypothesis is wrong. I verified by timing the runtime of each command in the container that tesseract is responsible for the 198s vs 157s difference.
      3. OCR does not work / do a good job on 0.4.1. This hypothesis is wrong. The image quality of the resulting PDF and the character detection seem to work the same in 0.4.0 and 0.4.1, at least on some random samples I chose.
      4. The RGB file produced by pdftoppm is more "readable". This hypothesis may be correct. While the RGB files have the same size, I saw that the PNG files, as produced by the RGB files, differ in size. In general, the 0.4.1 PNGs were slightly bigger in size, which could mean that the original files contain more info and are thus harder to compress.
      5. In 0.4.1 we also installed poppler-data, a package with additional fonts. It could be that tesseract-ocr was trained on these sets and thus it scans them faster.
    • Conclusion: Whatever the underlying reason for this performance speed-up, my understanding is that it seems to be benign and it shouldn't worry us.
  • Discovered an issue with our MIME types. We don't handle application/zip and application/octet-stream, meaning that we ignore some perfectly valid files (dangerzone#369).
  • TODO: Create a spreadsheet with our benchmark results.
  • TODO: Send a fix for the MIME type issue
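
For reference, the "Just OCR" figures in the timing list above are the difference between the "With OCR" and "No OCR" runs. A short sketch of that arithmetic, using only the numbers from the notes (the dictionary layout is just for illustration):

```python
# Conversion times in seconds for the SecureDrop Workstation PDF, from the notes.
timings = {
    "0.4.0": {"no_ocr": 52, "with_ocr": 250},
    "0.4.1": {"no_ocr": 20, "with_ocr": 177},
}

# Time attributable to the OCR step alone.
ocr_only = {
    version: t["with_ocr"] - t["no_ocr"] for version, t in timings.items()
}

# How much faster the OCR step itself is on 0.4.1.
ocr_speedup = ocr_only["0.4.0"] / ocr_only["0.4.1"]

for version, seconds in ocr_only.items():
    print(f"{version} - Just OCR: {seconds}s")
print(f"OCR step speed-up: x{ocr_speedup:.2f}")
```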

Deeplow:

  • continue CI 200 doc short test
    • run into issues with git-lfs in CI "bad credentials" appears to be a bug in git-lfs
    • move on test on github CI
    • running into issues with "permission denied" when creating files in home folder of CI

Discussion:

  • Insights from the benchmarks:
    • No regressions for now
    • The OCR speed up seems to be benign for now
    • 1.5x speed up for small files and 4x speed up for files with lots of pages (OCR enabled)
    • 4% fewer failures due to PDFtk
    • both versions fail on ~380 documents due to unsupported format (dangerzone#369)
    • analysis limitation: we didn't do visual diffing on PDFs to see if there were rendering issues (apart from the occasional manual inspection)
    • analysis limitation: We ran the tests on bare metal, so timeouts will be rarer than in Windows/MacOS VMs.

Monday - 2023-03-13

Deeplow:

  • merge: Fix "Choose..." button not opening dir selection dialog on Qt6 (#361)
  • final PR review and approval of poetry PEP 668 compliance (#353) (latest poetry CI breakage)
  • review all other pending PRs
  • revisit 0.4.0 large tests (and fix issue causing wrong result)
  • propose changes to changelog (dangerzone#366)
  • Investigate tesseract library not reporting issues on 0.4.1 but reporting them on 0.4.0: turns out 0.4.1 reporting was not being applied to second container whereas in 0.4.0 branch it was
  • Adapt website to have two mac downloads (platform-specific) - dangerzone.rocks#10
  • create PR for large_doc_set + CI for running daily on subset

Alex:

  • Merged most of the pending PRs
  • Sent a PR for our changelogs, and a PR for fixing a minor issue when building a container image.
  • Started looking on performance comparison between 0.4.0 and 0.4.1.
  • TODO: Create a spreadsheet with the results
  • TODO: Find the root cause for the tesseract delay on 0.4.0.
  • TODO: Understand why a specific document fails to convert.

Thursday 2023-03-09

Agenda:

  • Agree on must-do remaining tasks prior to release
  • Identify any stretch goals if we have time due to release infrastructure readiness delays
    • Use the stretch goal label for these issues

Action items:

Notes:

  • update on speed difference from 0.4.0

    • seeing inconsistent results around performance between 0.4.0 and 0.4.1

    • how we are splitting pdfs into multiple images might have something to do with it

    • without ocr, 0.4.1 is faster

    • with ocr, 0.4.1 is even faster

    0.4.0 on [X platform/hardware] with [Y features] on [Z dataset]: performance / error rate

    0.4.1 on [X platform/hardware] with [Y features] on [Z dataset]: performance / error rate

    • deeplow's run:

      • 0.4.0 on Fedora 37 with OCR disabled on 200 docs: 823s

      • 0.4.1 on Fedora 37 with OCR disabled on 200 docs: 595s (x1.4 speedup)

      • 0.4.0 on Fedora 37 with OCR enabled (eng) on 200 docs: 3098s

      • 0.4.1 on Fedora 37 with OCR enabled (eng) on 200 docs: 1281s (x2.4 speedup)

    • Alex's run:

      • 0.4.0 on Ubuntu Focal with OCR disabled on 200 docs: 759s

      • 0.4.1 on Ubuntu Focal with OCR disabled on 200 docs: 478s (x1.6 speedup)

      • 0.4.0 on Ubuntu Focal with OCR enabled (eng) on 200 docs: (took several hours, lots of timeouts; CPU thermal throttling is suspected)

      • 0.4.1 on Ubuntu Focal with OCR enabled (eng) on 200 docs: 1218s

    • All tests, except where noted, fail on ~50 docs, with "The document format is not supported" on both versions.
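
The speed-up factors quoted in the runs above follow from dividing the 0.4.0 time by the 0.4.1 time. A minimal sketch, with the run labels chosen here for readability (Alex's OCR-enabled run is left out because the 0.4.0 side has no usable time):

```python
# (0.4.0 seconds, 0.4.1 seconds) for each 200-doc run, from the notes.
runs = {
    "Fedora 37, OCR disabled": (823, 595),
    "Fedora 37, OCR enabled (eng)": (3098, 1281),
    "Ubuntu Focal, OCR disabled": (759, 478),
}

# Speed-up of 0.4.1 over 0.4.0, rounded to one decimal as in the notes.
speedups = {name: round(old / new, 1) for name, (old, new) in runs.items()}

for name, factor in speedups.items():
    print(f"{name}: x{factor} speedup")
```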

Monday - 2023-03-06

Deeplow:

  • Fix the Choose folder bug on MacOS.
  • Wrap up the Fedora 37 QA
  • "investigate why dangerzone description on windows is still wrong" - in the end it was correct, but just not displayed by default (opened dangerzone#359)
  • run large tests on 0.4.0 (dangerzone#352) - tests were running over the weekend and still haven't finished
  • TODO: re-review dangerzone#353
  • TODO: The tesseract library
  • TODO: adapt website to have two mac downloads (platform-specific)
  • TODO: prepare some release notes to toot about substantial facts (when doing the changelog Alex will give some user-noticeable changes)

Alex:

  • Finished QA testing on Ubuntu Kinetic
  • Found an issue in our Debian packages, built from Ubuntu Focal.
    • Basically the entrypoint is different across OS, due to the setuptools version.
    • I've tested with a .deb produced by Debian Bookworm, and it seems to work cleanly across platforms
  • I tried to create a GitHub actions job that tests the above:
    • Create a dev environment for Debian Bookworm, produce a .deb, create an end-user environment in all platforms, install the .deb there.
    • I tried to do so using caching, but it seems that the cache gets busted when building more than one Docker image. We don't exceed the 10GiB limit, so this is very weird.
    • TODO: Create a GitHub actions job regardless, even with no caching involved, to help cementing our hypothesis that .debs produced by Debian Bookworm can be used in all of our Debian-based distros.
  • Sent an update for dangerzone#353, since the new Poetry installation methods failed on Ubuntu Focal and Debian Bullseye.
  • Sent a PR for bumping our timeouts (dangerzone#363)
  • TODO: (leftover): Send some fixes for the release instructions.
  • TODO: Review dangerzone#{356,361}
  • TODO: Send a PR for Fedora instructions, same as we did for the Debian one.
  • TODO: Finalize the Debian instructions PR
  • TODO: Create a lint for changelogs

Discussion:

  • The tesseract library no longer outputs a message for calculating the size of the page.
    • Why does it do that though? Have we changed something? We need to investigate.

Tuesday - 2023-02-28

Alex:

  • Finished QA testing on MacOS M1
  • Sent a PR for our recent CI breakage due to PEP 668 (dangerzone#351)
  • Started working on QA for Ubuntu Kinetic.
  • TODO: Wrap up the Ubuntu Kinetic QA
  • TODO: Send some fixes on the release instructions
  • TODO: Produce a .deb that works on all Debian-based environments.
  • TODO: Reproduce some MacOS findings during QA.
  • TODO: Send a PR for the timeout issues

Deeplow:

  • improve CI MSI building check (dangerzone#347)
  • 11K docs test - summarize test results (it took 20h45m ~ 8s/document)
  • run QA on windows and fedora 37
  • TODO: Fix the Choose folder bug on MacOS. (deeplow)
  • TODO: Wrap up the Fedora 37 QA (deeplow)
  • TODO: investigate why dangerzone description on windows is still wrong
  • TODO: dangerzone#352

Discussion:

  • Timeouts: We have seen that the .ppt file may timeout in our tests.
    • Let's have a minimum timeout of 1 minute, and multiply the proportional timeout by x3
  • General todos:
    • Changelog / Version bump
    • Providing instructions for packages.freedom.press (Fedora)
    • Update the website to provide different links for MacOS architectures.
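
One possible reading of the timeout decision above, as a sketch. The function name and constants are hypothetical, and applying the 1-minute floor after the x3 multiplication is an assumption; the notes do not say in which order the two rules combine.

```python
MIN_TIMEOUT_SECONDS = 60  # "minimum timeout of 1 minute"
TIMEOUT_MULTIPLIER = 3    # "multiply the proportional timeout by x3"


def conversion_timeout(proportional_timeout: float) -> float:
    """Return the timeout to apply, given the size-proportional estimate.

    Assumption: the floor is applied after the multiplication.
    """
    return max(MIN_TIMEOUT_SECONDS, TIMEOUT_MULTIPLIER * proportional_timeout)


# A small document still gets a full minute.
print(conversion_timeout(5))
# A larger one gets three times its proportional estimate.
print(conversion_timeout(30))
```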

Monday - 2023-02-20

Deeplow:

  • feature-comparison with PDF Redact Tools (deprecated) add issue about documenting this
  • fix issue with macOS container generation PR
  • (final) review of "improve directory handling" PR (dangerzone#336)
  • basically done with work on large doc test
    • make way of rebuilding the documents library and updating it
    • run test over several thousand docs and analyse results for timeouts
  • merged arm64 docker image PR (dangerzone#337)
  • looking a bit more into International Journalism Conference about possible place to do user research
  • Write some initial notes about Dangerzone - SecureDrop integration in Qubes
  • TODO: review pending PRs

Alex:

  • Sent a PR for build issues for Ubuntu Focal
  • Sent a PR for a PySide2 issue... in Ubuntu Focal
  • Sent a PR for replacing references to First Look Media in the code.
  • Merged some PRs that I sent (debian packaging PR for apt-tools-prod, temp dirs PR)
  • TODO: Test packages.freedom.press locally and send a PR with instructions for user and dev
  • TODO: By the might of Zeus, I must install Windows
  • TODO: Send a PR for yum-tools-prod

Wednesday - 2023-02-15

Alex:

  • Wrapped up Debian packaging PR (dangerzone#322)
  • Sent a PR with the resulting .debs to the apt-tools-prod repo
  • Stumbled into some .deb build issues on Ubuntu Focal.
    • TODO: Send a PR for these issues.
  • Replied to open questions on several open PRs
  • TODO: Send fixes to my open PRs

Deeplow:

  • almost done with the testing

Monday - 2023-02-13

Deeplow:

  • Finished reviewing of proportional timeouts PR (dangerzone#332)
  • large test set - analysis of results sorting through common error types

Alex:

  • Sent a PR for our temp files issue (dangerzone#336)
    • Also fixed a long standing issue wrt "permission denied" errors (dangerzone#335)
  • TODO: Wrap up Debian packaging PR, based on our team meeting.
    • Do we need a .deb for each Debian distro?
  • TODO: Open PR on the LFS repo, that will provide 0.4.0 .debs (currently for testing)
  • TODO: Reply to comments on my PRs.

Discussion:

  • Qubes meeting preparation:
    • Introduce Dangerzone:
      • Based on TrustedPDF, which they can wrap their heads around. Where do we diverge though?
        • OCR
        • Batch conversion (with throttling)
        • 2-step process: second step also happens in container for practical reasons.
        • Multi-platform
        • Multiple filetype support.
        • Timeouts
        • Progress reports
        • Compression
        • (not ideal) we print the attacker-controlled message to the UI.
    • Question time: Given the above, if one were to piggy-back on the TrustedPDF architecture (dom1 <-> dom1 communication), would there be an issue? Would we be able to get progress reports? Would we be able to pass parameters for the conversion process?
  • Large dataset:
    • We need to log the command output within the container (either failed or success cases). The reason is that some commands may complete successfully, but they may throw errors in their stderr, to indicate that the document is corrupted (think LibreOffice - Java issue).
    • To do so, the container will prepend the logs with a special prefix, and the host will be able to handle it.
    • To store these logs in files, the host must be in dangerzone dev mode, and the user needs to pass an environment variable to also enable command logging.
    • If we are in a dangerzone dev mode -> we print those logs to stderr.
    • If we also are instructed to write the logs in a file -> we also write these logs to a file.
    • If we are not in a dangerzone dev mode -> log an invalid json error
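
A rough sketch of how the host side of the logging rules above could look. Everything named here is hypothetical (the prefix string, the function, the error text), and it assumes that the container otherwise sends one JSON object per line, which is why a prefixed line outside dev mode ends up reported as invalid JSON.

```python
import io
import json
from typing import Optional, TextIO

# Hypothetical marker that the container prepends to command output.
LOG_PREFIX = "DZ-CMD-LOG: "


def handle_container_line(
    line: str,
    dev_mode: bool,
    err: TextIO,
    log_file: Optional[TextIO] = None,
) -> Optional[dict]:
    """Handle one line received from the container."""
    if line.startswith(LOG_PREFIX):
        if not dev_mode:
            # Not in dev mode: the line is not expected, report it as invalid JSON.
            print("invalid JSON from container", file=err)
            return None
        message = line[len(LOG_PREFIX):]
        print(message, file=err)  # dev mode: print the log to stderr
        if log_file is not None:
            print(message, file=log_file)  # also store it, if instructed to
        return None
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        print("invalid JSON from container", file=err)
        return None


# Usage with in-memory streams standing in for stderr and the log file.
stderr_buf, file_buf = io.StringIO(), io.StringIO()
progress = handle_container_line('{"percentage": 50}', True, stderr_buf, file_buf)
handle_container_line(LOG_PREFIX + "soffice: warning", True, stderr_buf, file_buf)
```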

Wednesday - 2023-02-08

Alex:

  • Sent a PR for our CI issues.
    • Merged the PR as well, and now our CI is green again.
  • Reviewed the PRs that deeplow sent.
  • Pinged Maeve for the Debian PR.
    • Will reply to my comments soon.
  • Drafted a status report about the 0.4.1 milestone.

Deeplow:

  • preview DZ sanitization proposal (engineering#11)
  • continue working on large test set

Discussion:

  • large test set:
    • always run with --ocr
    • store expected output and result (pass/fail)
    • if the expected result diverges from the expected
    • consider increasing parallel number documents
    • large test set discovery: we'll have it as a submodule with a pytest

Action points:

Monday - 2023-02-06

Alex:

  • Sent a PR for proportional timeouts.
    • Contains a commit that allows users to tweak the timeout value. Can be dropped if we don't like it.
  • Looked into the building PySide2 from the Debian repos.
    • Not fun, we may have to create wheels and bundle them with our project.
    • At this point, one has to ask if it's better using PySide6 wheels for development, and PySide2 packages only for user environments.

Deeplow:

  • look into why certain deps are failing
  • look into adding PySide2 as dependency from source
  • briefly taking a look at the PR for fixing timeouts (didn't test it yet)
  • looking into testing a large document set and logging outputs to discover timeouts
  • fix various things found while converting those documents
  • made Tailscale work in Qubes (installed only on appVM)

Discussion:

Action Points:

  • Deeplow: finish reviewing of proportional timeouts PR (dangerzone#332)
  • Deeplow: Continue work on large document tests
  • Deeplow: open PRs for fixing issues during document set conversion tests
  • Deeplow: add PR for running daily some document conversion tests
  • Deeplow: go through my PRs and merge if approved and they pass the CI (dangerzone#332)
  • Alex: Fix CI (PySide6)
    • Work on top of the 2023-01-lint branch.
  • Alex: Review the PRs that deeplow has sent
  • Alex: Work on temp dirs issue, and also create a temp dir for input files (so that we fix the various permissions denied issues)
  • Alex: Ping Maeve for the Debian PR.

Wednesday - 2023-02-01

Alex:

  • Sent a PR that fixes pdfinfo and OpenJDK issues.
    • With this PR, the conversions should start working again.
  • Working on introducing proportional timeouts.

Deeplow:

  • Updated branch based on changes by @AlexP in adding pyside6 support for mac/windows
  • Checking container code for non-doc dependent timeouts (they should depend on the doc's size)
  • side-tracked with investigating how some inter-vm stuff works (could be interesting from SD-DZ communication)
  • exit with code 1 when one doc failed to convert (dangerzone#329)
  • review dependencies fix (dangerzone #328)
  • investigate the multiple failures in the CI

Discussion:

  • Fixing our multiple CI versions
    • convert-tests: unpin Poetry 1.2.2, and use --no-ansi flag.
    • PySide2 on bookworm and fedora37 (where there is python 3.11) - get from debian sources to it actually supports - HACK use a git repo to get salsa.debian.org PySide2 version

Action Points:

  • Alex: Press pause on timeout PR, review the infra PR
  • Deeplow: Write a failing test for timeout and look for large pool of documents

Monday - 2023-01-30

Alex:

  • Fixed a regression in our Fedora 37 package (dangerzone#156)
  • Continued the work on dangerzone#296
    • We can switch to types-PySide2 instead of PySide2-stubs, and sidestep the linting errors on MacOS M1.
  • Opened dangerzone#320 to track the PySide6/Mypy problem
  • Started working on dangerzone#315
  • Opened dangerzone#321 to track our visual testing needs for our CI tests.
  • Leftovers: Send PR for OpenJDK, LibreOffice conversion warnings.

Deeplow:

  • Reviewed hotfix branch that fixed #307 (UI did not start in fedora 37)
  • Merge #302 (add isolation providers)
  • Create issues for unintentionally introduced / other discovered bugs from #305
    • #315 removal of java dependency led to broken .xls files
    • #316 some normal container output is being printed back to the host.
    • tesseract (OCR scanning) is guessing image's DPIs even though we provide it via --dpi (will need investigation), but seems minor.
  • Final test and closing PR for showing exceptions raised during a document conversion (#313)
  • open PR for suppressing container output when it was not meant to be parsed (#326, which fixes #316)

Discussion:

  • Enforce updating changelog when closing an issue.
    • Preferably add a lint that checks if a commit closes an issue without updating the Changelog (duh).
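
A minimal sketch of what such a lint could check, given a commit message and the list of files the commit touches. The function name and the `CHANGELOG.md` path are assumptions for illustration; the keyword list follows GitHub's issue-closing keywords.

```python
import re
from typing import Iterable

# GitHub closing keywords followed by an issue reference, e.g. "Fixes #315".
CLOSES_ISSUE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#\d+", re.IGNORECASE
)

# Assumed location of the changelog in the repository.
CHANGELOG_PATH = "CHANGELOG.md"


def violates_changelog_rule(commit_message: str, changed_files: Iterable[str]) -> bool:
    """Return True if the commit closes an issue but leaves the changelog untouched."""
    closes_issue = CLOSES_ISSUE.search(commit_message) is not None
    updates_changelog = CHANGELOG_PATH in set(changed_files)
    return closes_issue and not updates_changelog


print(violates_changelog_rule("Fixes #315", ["dangerzone/cli.py"]))
print(violates_changelog_rule("Fixes #315", ["dangerzone/cli.py", "CHANGELOG.md"]))
```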

Action points:

  • Alex: Create an issue for converting a 1000 pages PDF, and adding it to our 0.4.1 milestone.
  • Alex: https://github.com/freedomofpress/dangerzone/issues/317
  • Deeplow: (test-driven) make pdfunite step dependent on num pages
  • Deeplow: https://github.com/freedomofpress/dangerzone/issues/318
  • Deeplow: Container: Fails to calculate number of pages #325
  • Alex: Update #296 with the latest state of the PR, and what deeplow should do.
  • Alex: Create signed per-distro packages in apt-tools-prod (w/ test key)
    • Current state is that we have signed packages, but not with a unique name per distro.
  • (from meeting) go through docs and see if there was any other missing characters due to java or anything else

Wednesday - 2023-01-25

Alex:

  • Looked into dangerzone#294
    • Qt devs still haven't published a PySide2 version that supports Python 3.11
    • TODO: Open an issue for that on their issue tracker.
  • Looked into our Fedora 37 installation issue:
    • Root cause: we deployed a Fedora 35 package to our Fedora 37 repo, which installs Dangerzone for Python 3.10
    • Suggestion: Create a new package (0.4.0-2) solely for Fedora 37, push it in a hotfix-0.4 branch and tag it as v0.4.0-2

Deeplow:

  • debugging CI issues on windows
  • merge CI failure (re-fix) (dangerzone#312)
  • add timeout proportional to # of pages in to PDF->Images code and merge (dangerzone#232)
  • debugging multiple issues introduced by dangerzone#306 after merging (didn't find the root cause yet)

Discussion:

  • How to find bugs in the conversion process. Currently our conversion process suppresses error messages from the commands it runs (e.g. libreoffice, tesseract).
    • We need to get the output of these commands in development mode
  • Security Idea: move conversion error strings to host, container sends numeric ID and page number.
  • Visual document diffing tests for catching changes (no, Dangerzone does not convert deterministically -- hashes are not possible)
  • Fedora 37 package issue:
    • There is a hotfix-0.4 branch that is ready to tag as v0.4.0-2.
    • Only Fedora 37 packages will be deployed.
  • General packaging improvements:
    • The following are suggestions only.
    • Ideally do not bundle Dangerzone with the .deb/.rpm package.
    • Let the builder build those packages, add them in the Git LFS repo, sign them and send the PR.
    • Let the maintainer verify these packages out-of-band (compare with source files), and resign the packages, and accept the PR.
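
The security idea from this discussion (error strings live on the host, the container only sends a numeric ID and a page number) could look roughly like the sketch below. The IDs, messages and function name are invented for illustration and are not taken from the Dangerzone code.

```python
# Hypothetical host-side table: the container never supplies the text itself.
ERROR_MESSAGES = {
    1: "The document format is not supported",
    2: "Conversion timed out",
}


def describe_error(error_id: int, page: int) -> str:
    """Build the message shown in the UI from host-controlled strings only."""
    if not isinstance(error_id, int) or not isinstance(page, int):
        # Reject anything that is not a plain number, e.g. attacker-supplied text.
        raise TypeError("error_id and page must be integers")
    message = ERROR_MESSAGES.get(error_id, "Unknown conversion error")
    return f"{message} (page {page})"


print(describe_error(2, 7))
print(describe_error(99, 1))
```

With this shape, nothing the container sends is ever rendered as text in the UI; an unknown ID falls back to a generic host-side message.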

Action items:

  • deeplow: Check the hotfix-0.4 branch for Fedora https://github.com/freedomofpress/dangerzone/commits/hotfix-0.4
  • deeplow: merge #302, then #303
  • deeplow: address review comment in #313
  • deeplow: After #313 is merged, open PR for exiting with non-zero if conversion failed (cli + gui)
  • deeplow: report issue where tesseract guesses DPI even though it's passed as an argument.
  • deeplow: (stretch) investigate error reporting in GUI: if unexpected conversion message it will stop showing progress information and just freezes. But in reality the document is still converting
  • Alex: Fix the Fedora 37 packaging issue.
  • Alex: Take over dangerzone#296.
  • Alex: Add JDK again as dependency and test affected file conversion dangerzone#315
  • Alex: Check why/if __pycache__ files exist in RPM packages.
  • Alex: open issue for discussed visual output diffing
  • Alex: Check out if LibreOffice has a way to report conversion warnings.

Monday - 2023-01-23

Deeplow:

  • reorganize isolation provider PR to have isolation providers as separate files
  • investigate a bit PyMuPDF as dependency alternative for processing PDF files. (part of dangerzone#305)
  • Make PDF conversion faster (dangerzone#305)
  • Meeting w/ infra team for updates regarding the readiness for the next release

Alex:

  • Fixed Poetry issue in our CI builds.
  • Worked on packages.freedom.press
    • Tested it out on a container.
    • Wrote down the current workflows for devs/users.
    • We need to improve these workflows, especially as development requirements are ~2.5hours of build time and 45GiB of space.
  • Made a review pass of all the open PRs
  • Left out: Fedora 37 instructions, setup Thinkpad

Discussion:

  • What to do about the lint on macOS? (dangerzone#296)
    • create an issue that we can't have mypy linting on ARM macs and pyside6
    • change make lint to say: "on m1 macs it can't run; see issue #???"
  • packages.freedom.press workflow:
    • It's not practical to have the package building take so much time and space.
    • Might be worth having a runner and verifying the built packages locally.
    • Else, we will have to bite the bullet and do it ourselves.

Action items:

  • deeplow: simple fix for (discussion on macOS linting)

Wednesday - 2023-01-18

Alex:

  • Reviewed all open PRs except for dangerzone#310
  • I haven't tested the dangerzone#305 PR, because there are things that contend for disk space.
    • I think I will have disk space today.
  • Stumbled on an issue regarding apt-tools-prod: there's a hash mismatch type of error that I'm not sure how to deal with.
    • I've started a discussion with Maeve on that subject.
    • I'll work on updating our installation instructions for packages.freedom.press on Debian, so that we can have them ready once the apt-tools-prod issue is resolved.
    • Fun fact: At least in our 0.4.0 release (haven't checked yet on main), the Debian debs have the exact same hash. Same applies to the Ubuntu debs, but they have a different hash from the Debian one. Not sure how we can use this fact, as it can change from release to release.

Deeplow:

  • investigating why exceptions in conversion process were not raised (opened issue at dangerzone#309)
  • retest pyside6 support in remote mac (dangerzone#296)
  • address feedback in isolation provider abstraction (dangerzone#302)

Action items:

  • Alex:
    • Review dangerzone#310
    • Test dangerzone#305
    • Second pass of the rest of the PRs.
    • Recheck the state of dangerzone#294 (Fedora 37 build environment)
    • Create PR for packages.freedom.press instructions
    • (stretch) Start setting up Thinkpad for Windows testing
  • Deeplow:
    • TODO: continue addressing PR feedback
    • TODO: What's the state with RPM updates.

Monday - 2023-01-16

Alex:

  • Reviewed dangerzone#{295,296,297,301,302}
  • Will merge today dangerzone#289
  • TODO: Review dangerzone#30{3,4,5}
  • TODO: Push the latest 0.4.0

Deeplow:

  • TODO: Fix the mypy lint errors on the PR for PySide6
  • TODO: Finish addressing open PR comments
  • TODO: Tuesday - check how the RPM package publishing status is
  • TODO: (stretch) try to reproduce again dangerzone#153 (installing w/ onionshare)

Wednesday - 2023-01-11

Deeplow:

  • Submitted PR to dangerzone.rocks to add security note in about.html (dangerzone.rocks#8)
  • Merged Qubes build instructions (dangerzone#284)
  • Finished removal of unused dependencies in container image dangerzone#305 (PDFtk in particular)
  • Continue work on seccomp stuff. Reading up the paper behind the "confine" tool and playing around with it

Alex:

  • Addressed all the review comments on dangerzone#289.
  • Added some extra environments for testing (Fedora 35, Debian Bookworm, Ubuntu Jammy).

Discussion:

  • using "confine" approach for seccomp policy
  • considering gvisor - doesn't need as many syscalls

Monday - 2023-01-09

Deeplow:

  • merge dangerzone.rocks#7
  • Addressed comments in dangerzone#288 - ready for merge
  • dangerzone#284 is ready to merge -- additional changes will follow in another PR (as they were extra)
  • dangerzone#248 - Implement Linux User Namespaces support - reviewed
  • discussion with @eaon about SecureDrop integration details
  • TODO: submit PR to dangerzone.rocks to add security note in about.html
  • TODO: re-evaluate dependencies dangerzone#232

Alex:

  • Made almost all of the changes we discussed on dangerzone#289.
    • We have some deps that were missing, and some system libraries that
  • Testing those changes locally and on our CI.

Wednesday - 2023-01-04

Alex:

  • Going through the review comments on dangerzone#289

Deeplow:

  • Moving on with dangerzone#229, where we have in the works a dummy driver for Dangerzone conversions
    • This dummy driver will not do any actual conversion, which means that it does not test a class of problems. It will merely test that pre-conversion / post-conversion works.
  • TODO: outstanding tasks from last meeting
  • dangerzone#229: add windows / macOS to CI and run the cli tests
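
A minimal sketch of what such a dummy driver could look like, assuming an abstract provider interface. The class and method names here (`IsolationProvider`, `DummyProvider`, `convert`) are illustrative assumptions, not Dangerzone's actual API. The dummy writes a placeholder file, so only the pre-conversion and post-conversion steps are exercised:

```python
from abc import ABC, abstractmethod
from pathlib import Path


class IsolationProvider(ABC):
    """Hypothetical interface describing how a document gets converted."""

    def convert(self, input_path: Path, output_path: Path) -> bool:
        # Pre-conversion: a check shared by every provider.
        if not input_path.is_file():
            return False
        self._convert(input_path, output_path)
        # Post-conversion: confirm that an output file was produced.
        return output_path.is_file()

    @abstractmethod
    def _convert(self, input_path: Path, output_path: Path) -> None:
        """Provider-specific conversion step."""


class DummyProvider(IsolationProvider):
    """Does no real conversion; it only writes a placeholder output file."""

    def _convert(self, input_path: Path, output_path: Path) -> None:
        output_path.write_bytes(b"%PDF-1.4\n% placeholder, not a real conversion\n")
```

Because the dummy never starts a container, a test suite built on it could run on platforms where containers are hard to set up in CI, at the cost of not covering conversion bugs.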

Monday - 2023-01-02

Deeplow:

  • Reviewing QA semi-automation (dangerzone#289).
    • detect when platform build instructions changed (so qa.py and BUILD.md don't get out of sync)
  • Reinstalled windows system and re-built testing environment
    • played a bit with installing dangerzone and onionshare at the same time to reproduce (unsuccessfully) dangerzone#153
  • Show dangerzone version in CLI and GUI (dangerzone#295)
  • Address PR feedback in DZ website (dangerzone.rocks#7)
  • Make fix for "Open With" dialog on Windows showing the DZ description instead of the app name (dangerzone#297)
  • Address comments about a potential DZ flatpak version
  • Add PySide6 support to MacOS and Windows (dangerzone#296)
  • Played around with bundling pyside6 with .debs (will be needed, see dangerzone#211)
    • In particular, investigate bundling PySide6 .whl with an RPM package
  • start work on encapsulating the isolation provider (containers) as a path towards dangerzone#229

Alex:

Action points:

  • deeplow: update issue dangerzone#229 and continue working on it
  • Alex: Address feedback on dangerzone#289.
  • Alex: Review dangerzone#{295, 296, 297, 300}
  • deeplow: merge dangerzone.rocks#7
  • deeplow: address comments in dangerzone#288
  • deeplow: ping ro about dangerzone#284

Wednesday - 2022-12-14

Deeplow:

  • Make fix for instability issue dangerzone#217 (with PR dangerzone#288) and make a script to identify podman version where the root cause issue was fixed (it wasn't in the release notes)
  • Quick investigation on the state of Pyside6 (Qt6) availability in distros (it doesn't look promising) - dangerzone#211
  • TODO: review https://github.com/freedomofpress/dangerzone/pull/289

Alex:

  • Created an issue for Qt testing.
  • Created an issue for using packages.freedom.press for APT packages.
  • Created an issue for failing CI tests
  • Created a milestone for 0.4.1. We should discuss priorities for them and divide tasks.

Discussion:

Monday - 2022-12-12

Alex:

  • Trying to wrap up the QA PR, currently ~1K LOC. Will contain scripts for building Linux environments where Dangerzone can run, a script that follows our QA steps, and run CI tests for Fedora and other flavors on CircleCI.
  • TODO: Meeting notes that I need to update.
  • TODO: Open some GitHub issues for the next release.
  • TODO: Create a PR for linting unused imports.

Deeplow:

Discussion:

  • Fixing dangerzone#217 - if podman version <4.0.0 then we run tests sequentially
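
As a rough sketch, the version gate could look like the following. The function names are hypothetical, and the only assumption about Podman is that `podman --version` prints a line such as `podman version 4.3.1`:

```python
import re
import subprocess
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first X.Y.Z version number found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def should_run_sequentially(version_output: str, fixed_in=(4, 0, 0)) -> bool:
    """True when the reported Podman version predates the fix."""
    return parse_version(version_output) < fixed_in


def podman_version_output() -> str:
    """Ask the local Podman for its version string (requires Podman)."""
    result = subprocess.run(
        ["podman", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Comparing integer tuples avoids the string-comparison pitfall where "10.0.0" sorts before "4.0.0".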

Wednesday - 2022-12-07

Alex:

  • I'll take a look at dangerzone#284. Looks good, and I might add some things as well.
  • Working on the QA PR, I'll try to also add it in the CircleCI configuration.

Deeplow:

  • Resumed on research about thumbnails and background reads of the file on MacOS.
    • Even if you disable thumbnail previews, downloading the file from Safari passes through the Indexer / thumbnail preview pipeline.
    • Will report the full findings
    • Maybe it makes sense to include our findings in the section on how one can still get hacked, even with Dangerzone.

Monday - 2022-12-05

We have a release!

Alex:

  • Reviewed the README PR
  • Retrospective on release and consideration on which issues are release papercuts that can be improved

Deeplow:

  • Sent a PR for updating Homebrew (homebrew-cask#136918)
  • Sent a PR for updating the screenshots in the README dangerzone#282

Action points:

  • Alex: Send a PR for dangerzone.rocks with the new screenshots.
  • Alex: Send PR for Spin Linux environments...
  • Alex: Check CI tests on MacOS / Windows
  • Deeplow: Work on Bug: cannot install with Onionshare #153
  • Deeplow: Evaluate whether "Disable previews/thumbnails" (#65) is effective, and after that look into "tests fail non-deterministically".

Discussion:

Candidate Issues for Releases

Small issues that can help subsequent releases:

  • Documentation: Disable previews/thumbnails #65
    • May take more time, since it's a complex issue.
  • Bug: cannot install with Onionshare #153
  • Support building on M1 macs #177
  • Migrate to Qt6 before Qt5 end-of-life #211
  • Test packages.freedom.press workflow #220
  • tests fail non-deterministically (Error: error retrieving size of image) #217
  • Get the Dangerzone version from the GUI and CLI #219
  • Automated Testing in Windows & Mac #229
  • "Open with" on Windows shows Dangerzone Description instead of "Dangerzone" #283
  • Spin Linux environments for various distros/versions
    • Document how to setup X11/Wayland forwarding within a container.

Issues for a 0.5.0 release:

  • Defense in Depth: ...
  • User research: ...

Monday - 2022-11-28

Deeplow:

  • started QA on Fedora, found out that it ships Python 3.11.
  • worked on MacOS QA. Posted update here

Alex:

  • Found out that Ubuntu 22.10 "Kinetic Kudu" was released a month ago - should be supported. Sent a PR for that
  • tested on Debian buster (will still be supported for a year from now) - doesn't have podman, but the instructions are the same as for Ubuntu 20.04 - some more libs are required (buster backport, libseccomp)

Discussion

  • dangerzone#269 - change the default app window on MacOS
  • Python3.10 isn't installed on Fedora 37. We need to check if Dangerzone works with Python 3.11, and bump the dependency on Poetry.

Action points:

  • deeplow: address dangerzone#269 and other MacOS related issues
  • deeplow: Check Python 3.11 in his Fedora environment
  • Alex: Complete QA process on Windows and ubuntu 22.04 / 22.10
  • Alex: send PR for QA script
  • Alex: Review deeplow's PRs
  • Alex: Check out Dangerzone on a real Kinetic Kudu

Wednesday - 2022-11-23

Alex:

  • Proposed an implementation for skipping slow steps in GUI.
  • Worked on publishing a Debian package on apt-tools-prod.
  • Reviewed dangerzone#247.

Deeplow:

  • finished work on dangerzone#255 (opt to move untrusted files)
  • give more feedback on QA process for Dangerzone

Discussion:

  • timeouts: for this release
    • double timeout to 2m
    • for this release "disable parallel conversions" by setting the max threads to 1
  • Version migration in QA tests:
    • "Install the previous version of Dangerzone, and tick some non-default settings"
    • "Install the new version of Dangerzone system-wide, and ensure those settings exist"
    • Do grep -f share/image-id.txt <(podman images)

Action Points:

  • Deeplow: open PR for showing number of selected docs while in settings
  • Deeplow: open PR for disabling parallel conversions
  • Alex: Review dangerzone#255
  • Alex: open PR for doubling timeout
  • Alex: Address comments for QA PR

Monday - 2022-11-21

Deeplow:

  • reviewed QA proposal PR
  • Opened issue for missing documentation for the whole project - and created branch with code implementation (will open PR soon)

Alex:

  • Sent a QA proposal for release testing.
  • found way to open documents
  • Finished the review of dangerzone#247
  • Started looking at linting unused imports.

Discussion:

  • The infra may not be there before the feature freeze. We need to address this somehow, while still contributing to main.
  • Timeout increase - why are there timeouts? (investigate)
    • Per our discussion, it seems that it would be best if we have a soft timeout; warn the user that the conversion takes too long, and ask them if they want to skip OCR.
  • Things we want to have ready by the feature freeze:
    • Better timeouts (discussed above)
    • Option to move untrusted files into subdirectory after conversion (dangerzone#251)
    • QA tests (dangerzone#246)
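
To make the "soft timeout" idea concrete, here is a minimal sketch. The function name and the `keep_waiting` callback are assumptions for illustration; in a GUI, the callback would be the dialog that warns the user and asks whether to continue or skip OCR:

```python
import subprocess
from typing import Callable, List


def run_with_soft_timeout(
    cmd: List[str], soft_timeout: float, keep_waiting: Callable[[], bool]
) -> int:
    """Run cmd; whenever soft_timeout elapses, ask instead of failing outright."""
    proc = subprocess.Popen(cmd)
    while True:
        try:
            # Returns the exit code as soon as the process finishes.
            return proc.wait(timeout=soft_timeout)
        except subprocess.TimeoutExpired:
            if not keep_waiting():
                # The user gave up: stop the process and report the timeout.
                proc.kill()
                proc.wait()
                raise
```

The caller can catch `subprocess.TimeoutExpired` and, for example, retry the conversion without OCR.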

Action Points:

  • Alex: check out the final changes for dangerzone#247.
  • Alex: investigate timeout issues and make a PR for not limiting timeout (see discussion)
  • Alex: post thread about our "soft timeout" ideas and ask Micah for ack
  • deeplow: merge dangerzone#247
  • deeplow: open PR for dangerzone#251
  • deeplow take a look at the linux-namespace support dangerzone#248

Wednesday - 2022-11-16

Alex:

  • Dived into seccomp and gVisor.

Deeplow:

  • Changing the box for the output filename to allow arbitrarily naming for single file conversions turned out to be more difficult.
  • Almost done with the fixes in dangerzone#247.
  • Started to work on moving the unsafe files on a separate directory, kind of like Qubes TrustedPDF does.

Action points:

  • Alex: Send a PR for bumping the timeout.
  • Deeplow: Open issue for missing documentation for the whole project.

Monday - 2022-11-14

Alex:

  • Final review of dangerzone#209
  • First round of comments for dangerzone#247
  • Created a draft PR for Linux User Namespaces.

Deeplow:

  • final review of dangerzone#241 (ubuntu focal support)
  • address feedback in dangerzone#209 (multi-doc cli support)
  • investigate way to reliably detect seccomp policy violation (dangerzone#225)

Discussion:

  • Seccomp policy violation detection: have a way to say to the user that opening the document failed in the sandbox -- discourage the document opening. -> A way to contact the dangerzone team to solve the situation
  • maybe a static analysis tool to check which syscalls certain document types lead to. This could make us more sure about
  • for 0.4.0 we may not be able to give users a foolproof way to detect a seccomp violation -> open issue for this
  • UX feedback on multi doc conversion:
    • Selecting a directory would be great, but let's not have it in this PR.
    • Selecting files from different directories requires two steps (one to add files, and one to add more). Let's add this in a next release.
    • Seeing the files takes real estate from the settings. Having two tabs for settings will require quite a bit of work. This is better suited for a next release.
    • Let's use the full path of the directory where we will save the converted files
    • In general, UX-wise, having safe files with -safe.pdf may not help users, because they may erroneously choose the unsafe (and shorter) file name. In Qubes TrustedPDF, the original files are moved to an "untrusted files" directory. We could have a checkbox in the settings for that.
    • We could have a mode where we allow the user to change the output filename, if they convert a single file.
    • Once the conversion ends, let's show the original filename.
    • We have a bug where OCR gets applied the next time we run Dangerzone.
    • Showing full context for a single conversion (logs, tracebacks) is something we should consider when we discuss the UX side of things, but not right now.

Action points:

  • Alex: Jump on board the seccomp PR.
  • Alex: Document how we should do QA on Dangerzone.
  • Alex: Write down UX comments that we want to consider in the future on dangerzone#117.
  • Alex: Review dangerzone#210.
  • Alex: Create PR for linting unused imports.
  • Deeplow: Do some fixes on dangerzone#247 based on Alex's review comments.
  • Deeplow: Ask for feedback on the idea of moving untrusted files to subdirectory

Wednesday - 2022-11-02

Alex:

  • Sent PR for Ubuntu Focal support.
  • Minor PRs: Changelog, Poetry.lock files, Poetry instructions.
  • Done with the review of dangerzone#216
    • Tried out asyncio support successfully.

Deeplow:

  • address review comments for dangerzone#216
    • figure out alternative strategy for avoiding wildcard injection vulns by the user mistakenly running $ dangerzone * on a maliciously crafted document set.
    • rebase branches based on it
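
For context on the wildcard issue: the shell expands `*` into file names before the program starts, so a file named like an option is parsed as an option. One possible check, sketched below, flags raw arguments that look like options but also exist as files. This is an illustration under stated assumptions (the function name and `--some-flag` are hypothetical), not necessarily the strategy that was adopted:

```python
import os
from typing import Callable, List, Sequence


def option_like_files(
    argv: Sequence[str], exists: Callable[[str], bool] = os.path.exists
) -> List[str]:
    """Return arguments that start with '-' and also name an existing file."""
    return [arg for arg in argv if arg.startswith("-") and exists(arg)]


if __name__ == "__main__":
    # Simulated directory listing: one file is named like a (hypothetical) flag.
    files_on_disk = {"--some-flag", "report.pdf"}
    expanded = ["--some-flag", "report.pdf"]  # what `*` would expand to
    print(option_like_files(expanded, exists=files_on_disk.__contains__))
```

A caller could refuse to continue when the returned list is non-empty and suggest running with `./*` instead, which prefixes each name with `./` so that it can no longer be mistaken for an option.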

Discussion:

  • Regarding the asyncio support, let's merge this PR first, and then see if we can add it in the GUI PR.

Action points:

  • Alex: Start looking at AppArmor

Monday - 2022-10-31

Deeplow:

  • Fixed and merged dangerzone#208
  • Merged Debian 12 and Fedora 37 support (dangerzone#230 / dangerzone#233)
  • Rebased PRs 2/3 and 3/3 regarding multi doc support
  • Tested out Ubuntu 22.04 on our CI runners, but encountered issues.
  • Checked out the Alpine status wrt reproducible builds.
  • Started working on the seccomp issue.

Alex:

  • merged the windows unittest PR
  • acked the PRs for Fedora, Debian & the refactor container
  • reviewed the part 2 PR for the multi-doc support (#201)
  • opened a few small fixes PRs
  • added a couple of things to the internal wiki

Action points:

  • deeplow: Create an issue about handling error messages from the dangerzone container, and the security/UX implications that this has.
  • Alex: 2nd pass of the dangerzone#201
  • Alex: update ubuntu focal install (& test on focal) instructions

Wednesday - 2022-10-26

  • Estimate time for each issue in prep for mgmt meeting

Deeplow:

  • deeplow: merge commit for dangerzone#161
  • deeplow: (re-review) dangerzone#235
  • deeplow: (re-review) dangerzone#208

Alex:

  • Merged the dangerzone#235 PR

Estimations

Sizes: XS, S, M, L, in Fibonacci-like increments: XS = 1 hour, S = 4 hours (1 day), M = 8 hours (2 days), L = 16 hours (4 days).

  • #205: (GUI part of #77)
  • #206: Dev: M, Review: XS -> ~2 days
  • #188: Scoping: M -> ~2 days
  • #77: Dev: L, Review: M -> 6 days (d: 2 weeks more realistically)
  • #204: (GUI part of #77)
  • #207: Dev: S, Review: S -> ~2 days (d: 1 hour)
  • #209: Dev: S, Review: M -> ~3 days (d: 1 week more realistically) <- + #205 = multi-doc support (CLI)
  • #220: (part of untracked)
  • #233: Dev: -, Review: XS -> 1 hour
  • #230: Dev: -, Review: XS -> 1 hour
  • #225: Dev: L, Review: M -> 6 days (d: 4 days)
  • #227: Dev: M, Review: S -> 3 days
  • #228: Dev: M, Review: S -> ~3 days
  • #224: Dev (only removing sudo & ro fs): S, Review: XS -> ~1 day
  • #232: Dev (only replace pdftk deps): S, Review: XS: -> 1 day
  • #157: (part of #228)

Total time: 35 days * 4 hours / 45 hours per week = 3.1 weeks = ~17th of November

If we remove:

  • #188 (Reproducible builds): -2 days
  • #77 (GUI): -9 days (-10 days +1 day for backporting #205 & #204) <- keep that
  • #207 (Prevent running DZ): -2 days
  • #224 (Prevent root): -1 day
  • #232 (replace pdftk): -1 day -> maybe bump to 0.5.0
  • #224 (remove sudo & ro fs): -1 day

Remaining: 20 days * 4 hours / 45 hours per week = 1.8 weeks = ~8th of November

Plus GUI: 29 * 4 hours / 45 hours per week = 2.6 weeks = ~14th of November
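
The three figures above use the same formula: estimated days, times 4 hours per day, divided by 45 hours per week. A quick way to recheck them (the function name is ours):

```python
def weeks_needed(days: float, hours_per_day: float = 4, hours_per_week: float = 45) -> float:
    """Convert an estimate in working days into weeks, as done in the notes."""
    return days * hours_per_day / hours_per_week


for label, days in [("Total", 35), ("Remaining", 20), ("Plus GUI", 29)]:
    print(f"{label}: {weeks_needed(days):.1f} weeks")
```

This prints 3.1, 1.8 and 2.6 weeks respectively, matching the numbers in the notes.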

Untracked (not exhaustive list):

  • Provisioning/accessing Mac Minis,
  • Signing on Windows and MacOS
  • Changes to the website and GitHub
  • Build and upload Linux packages, MacOS and Windows binaries
  • Perform QA on all platforms.
  • Homebrew bump release hash & number.

Monday - 2022-10-24

deeplow:

  • Created a template for Dangerzone presentations.
  • Tested submitting a package to packages.freedom.press.
  • Checked out the Windows unit tests and wrapped up the review.

Alex:

  • Sent PR for Windows unit tests dangerzone#235
  • Able to GPG-sign as [email protected].
  • CVE assessments for Dangerzone.

Discussion:

  • the timeout is failing sometimes. How do we solve this?
    • idea: make timeouts proportional to a benchmark ran on first run (might not be good because CPU bursts exist)
    • idea: more lenient timeout (disadvantage: the user can't have an estimate)
    • better approach: add watchers on subprocess commands
  • Probably the extra dependency from GitLab can be removed, if we side-step the "splitting the pdf into pages" step.
    • We could also split the PDF using tools from the official repos, such as Poppler.

Action points:

  • Alex: Add/Suggest content for Dangerzone presentation.
  • deeplow: merge commit for dangerzone#161
  • deeplow: (re-review) dangerzone#235
  • deeplow: Start with Seccomp issue.
  • Alex: Merge the Debian/Fedora PRs.

Wednesday - 2022-10-19

Alex:

  • Elaborate on defense in depth subtasks
  • went through the Windows unittests and will add that in a PR

Deeplow:

  • Finished a big portion of the multi-document support.
  • Timeout part remaining, will require merging a PR by a contributor.
  • Improve thread handling: stopping threads is not straightforward.

Discussion:

  • updating dangerzone - how to detect new updates (separate issue)
  • separating the container from the download is important
  • supporting ubuntu focal:
    • podman is not available in the 20.04 repos and would require documenting how to add external dependencies.
    • Add note for ubuntu focal in the installation wiki and add instructions.

Action points:

  • deeplow: create issue for minimizing software in container
  • deeplow: Look into stopping threads in multi-document
  • deeplow: comment on security in depth issues
  • deeplow: final review of python 3.10 bump
  • Alex: do PR for window tests
  • Alex: give a look at #167 and ACK it.
  • Alex: update ubuntu focal install (& test on focal) instructions

Monday - 2022-10-17

Alex:

  • Started the umbrella issue for defense in depth.
  • Found some failing tests on the main branch on Windows.
  • Python 3.9 is not easy to install for developers on Linux (deadsnakes repo) and Windows (official installer harder to find).

Deeplow:

  • Fixed concurrency issues with conversion thread limit.

Action points:

  • Alex: Create subtasks for defense in depth, get ACK from rest of people.
  • Alex: Fix the failing tests on Windows.
  • Alex: Probably don't drop the commit that drops the Alert class because we'll need it
  • deeplow: finish multi-document styling (align progress bars), add alert box for when exiting while conversion is in progress, increase conversion timeout when multiple docs are being converted.
  • Alex: Update the instructions for developing Dangerzone on Linux (add poetry reference)

Wednesday - 2022-10-12

Alex:

  • Shared an update regarding Mac.
  • Very close to merging #208.
  • Didn't manage to work on the security and reproducible builds issue.

Deeplow:

  • Deeplow: multi-document support in GUI: limit threads & add progress icons
    • QThreads implementation is leading to a race condition; Alex suggests we might need to use GDB. Probably inspect the core dump

Action points:

  • Test #208 on Windows as well.
  • (leftover) Alex: Begin the scoping discussion for reproducible builds.
  • (leftover) Alex: List the security vectors we need to treat first.
  • deeplow: review Alex's work on #208 and force-push
  • deeplow: fix Concurrency issues with conversion thread limit
  • deeplow: finish multi-document styling (aligned progress bars)

Monday - 2022-10-10

deeplow:

  • addressed feedback in dangerzone#208
  • css & logic to settings in multi-document support (safe extension -safe.pdf and its customization)

Alex:

  • Several meetings with Sec / UX people.
  • Finished the first pass of the review of #208
  • Took a look at #157

Action points:

  • Alex: List the security vectors we need to treat first.
  • Alex: Share info for the Mac situation.
  • Alex: Edit, test, and merge #208
  • Alex: Begin the scoping discussion for reproducible builds.
  • Deeplow: finish multi-document support in GUI (limit threads & finish styling)

Discussions:

  • reproducible builds:
    • dependencies: poetry.lock
    • for system packages: have a debian snapshot
    • we might want to tackle debian first as this is where we / the SecureDrop team has the most experience

Wednesday - 2022-10-05

deeplow:

  • implementing ModelView approach for document handling. Ideally we'd have frontends (branch: 77-muti-doc-gui)
  • blocker: difficulty in Qt in regards to using dangerzone (see discussion)

Alex:

  • Tested PR #208 on Linux
  • I need to create a build environment on Windows as well.
  • Send rest of the comments on deeplow's #208 PR.
  • Document my forays around security / nested virtualization on a wiki.
  • See how Qubes treats the GitHub commit signatures.

Discussion:

  • ModelView blockers
    • performing the same sequence of calls from the GUI and the CLI is not necessary, since presentation differs. What should be the same is the underlying logic for a conversion of a document.
    • in the GUIs we get the dynamic list of documents and convert them individually. A QThread would call
    • if the Model/View separation is complicated at the moment, we can do it in the future if we hit scalability issues (1000+).

Action points:

  • deeplow: multi-document UI polish & limit number of parallel conversion
  • PR review strategy: avoid rewriting PR history when the review has started. Preferably append commits, but if some history needs to be rewritten or commit messages changed, then do it in a new branch called [feature-branch]-N, where N is the iteration number. After OK from the original in-house contributor, force push into [feature-branch].

Monday - 2022-10-03

deeplow:

  • Continue multi-document GUI support

Alex:

Action points:

  • Meet with UX person to go over the Dangerzone UX journey.
  • Deprecate the multi-window functionality, and convert new documents in the existing window. This should affect only MacOS environments, but we expect users to be able to open a second Dangerzone instance, if they want conversion with different parameters.
  • Alex: Share the research on the container security subject, schedule a meeting with Alex M.

Wednesday - 2022-09-28

Alex:

  • Set up a Windows VM with nested virtualization
  • Set up a MacOS environment in KVM (Docker Desktop pending)

deeplow:

  • Finished the CLI bulk document conversion, for the most part

Action points:

  • deeplow: continue implementing GUI bulk document conversion
  • Alex: Today focus on MacOS on QEMU/KVM and make that work. Worst case scenario, help with the PRs of deeplow.

Monday - 2022-09-26

Alex:

  • Wrap up Windows / MacOS platforms by middle of this week.
  • Goal: Manage to merge a PR by end of this week.
  • Goal: Start reading on the literature for securing containers in-depth (tied to the milestone item).

deeplow:

Discussion:

  • for the seccomp/apparmor/selinux scoping discussion (or more general container security) we should schedule that when Alex is done with the dev env. setup on all target platforms
  • brief discussion on build reproducibility
  • parallel conversion should probably use asyncio instead of threading, since it is most likely IO-bound. But that is a larger refactor which we can do in the future.
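
For reference, a small sketch of the asyncio idea: each conversion is a subprocess that the event loop waits on, and a semaphore caps how many run at once. The placeholder command and the function names are assumptions; a real conversion would start the container instead:

```python
import asyncio
import sys
from typing import List


async def convert(doc: str, limit: asyncio.Semaphore) -> int:
    """Run one placeholder conversion, respecting the concurrency limit."""
    async with limit:
        # Placeholder command standing in for the real conversion process.
        proc = await asyncio.create_subprocess_exec(
            sys.executable, "-c", "import time; time.sleep(0.1)", doc
        )
        return await proc.wait()


async def convert_all(docs: List[str], max_parallel: int = 2) -> List[int]:
    """Convert all docs concurrently, at most max_parallel at a time."""
    limit = asyncio.Semaphore(max_parallel)
    return await asyncio.gather(*(convert(doc, limit) for doc in docs))


if __name__ == "__main__":
    print(asyncio.run(convert_all(["a.pdf", "b.pdf", "c.pdf"])))
```

Since the work happens in child processes, the Python side mostly waits on IO, which is the case where a single event loop can replace a thread pool.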

Action points:

  • next Monday: schedule container security overview (scoping #182)
  • discuss which code parts each of us should be responsible for