Provide NixOS module option to enable the paperless exporter. #242084

Merged
merged 1 commit into from
Jan 9, 2025

@ctheune ctheune commented Jul 7, 2023

Description of changes

Integrate the paperless document exporter as a backup feature into the module.

Also fixes a configuration (quoting) issue.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jul 7, 2023
@leona-ya leona-ya removed their assignment Jul 7, 2023
@leona-ya leona-ya self-requested a review July 7, 2023 15:43
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Jul 7, 2023
@mweinelt mweinelt requested review from lukegb, gador and erikarvstedt July 7, 2023 20:53

@erikarvstedt erikarvstedt left a comment

Here are fixups for your first commit, which should definitely be merged.
Maybe add a dedicated PR because these changes are orthogonal to backups and entirely uncontroversial.

erikarvstedt commented Jul 7, 2023

The term backup is misleading because this only exports the documents but not the database (containing all the metadata).
Is a doc-only export generally useful for users?

Also, the export command used by the backup service creates redundant copies of each doc in multiple formats, which is also not suitable for backups.

ctheune commented Jul 9, 2023

> The term backup is misleading because this only exports the documents but not the database (containing all the metadata). Is a doc-only export generally useful for users?

Well, it is what the manual recommends as the official backup strategy. It also seems comprehensive with regard to metadata: I can see all the table content like tags, correspondents, users, even saved filters and UI settings ...

> Also, the export command used by the backup service creates redundant copies of each doc in multiple formats, which is also not suitable for backups.

That is configurable using the command parameters that I've made adjustable.

ctheune commented Jul 9, 2023

> Here are fixups for your first commit, which should definitely be merged.
> Maybe add a dedicated PR because these changes are orthogonal to backups and entirely uncontroversial.

Thanks for those. I'll double check (and I guess I should add a test) that the quoting works both for systemd and the manage command now.

@erikarvstedt

> It also seems comprehensive with the metadata. I can see all the table content like tags, correspondents, users, even saved filters and UI settings ...

Ah right, the metadata is exported to manifest.json.

> That is configurable using the command parameters that I've made adjustable.

These should be fixed so that no redundant content is exported by default.

I'm still not convinced that paperless needs a dedicated backup service. It can be backed up generically like many other database-based NixOS services: either by (1) snapshotting and transferring the whole of /var/lib or by (2) using services.postgresqlBackup and just rsyncing /var/lib/paperless/.
Note that, like (2), the document exporter method used in this PR is not atomic/consistent. Quote from the manual: "Before making backups, make sure that paperless is not running."
Let's delegate the decision to other paperless users. @mweinelt, @Flakebi, @lukegb, @leona-ya, what do you think?
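
A minimal sketch of the database half of approach (2), for illustration only. It assumes paperless uses a local PostgreSQL database named `paperless` (the name is an assumption, not taken from this PR) and leaves out copying /var/lib/paperless/ itself, e.g. with rsync or a snapshot tool:

```nix
{ ... }:
{
  # Assumption: the paperless database lives in the local PostgreSQL
  # instance under the name "paperless"; adjust to the actual setup.
  services.postgresqlBackup = {
    enable = true;
    databases = [ "paperless" ];
    # systemd calendar expression controlling when the dump runs.
    startAt = "daily";
  };
}
```

As noted above, the dump and the file copy are taken at different points in time, so the result is only consistent if paperless is stopped while both run.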

@erikarvstedt

> Thanks for those. I'll double check (and I guess I should add a test) that the quoting works both for systemd and the manage command now.

There's no quoting involved when setting up the systemd env, so this has always worked correctly.
As for the manage script, escapeShellArg is guaranteed to work. It's a common pattern in NixOS:

"${name}=${escapeShellArg value}"
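
As a rough, self-contained sketch of that pattern: the expression below turns an attribute set of environment variables into shell `export` lines. The variable values are made-up examples (one deliberately contains spaces and a quote), and this is not the code the module actually uses:

```nix
{ lib }:
let
  # Hypothetical example values, chosen to contain shell-special characters.
  env = {
    PAPERLESS_OCR_LANGUAGE = "deu+eng";
    PAPERLESS_FILENAME_FORMAT = "{created_year}/{title} (it's a test)";
  };
in
# One `export NAME=<escaped value>` line per variable. escapeShellArg takes
# care of quoting, so spaces and single quotes in values are safe.
lib.concatStringsSep "\n" (
  lib.mapAttrsToList (name: value: "export ${name}=${lib.escapeShellArg value}") env
)
```

It can be evaluated by importing the file and passing in `lib` from nixpkgs.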

mweinelt commented Jul 9, 2023

> (2) using services.postgresqlBackup and just rsyncing /var/lib/paperless/.

That is what we're doing, but with ZFS snapshots of the data dir.

leona-ya commented Jul 9, 2023

I think there are two different use-cases for backups:

  1. Restore to paperless, after a host failure
  2. Clean backup independent from the software

1 can probably just be handled by backing up postgresql + the paperless state dir.
2 is also an idea that I like very much. It respects the value of the documents (in the metadata, ASNs are relevant, for example).


Even if this feature were available, I would probably still just make a backup in the style of 1. But I can also understand why people (probably @ctheune) may think differently.
If we want to include this, I would definitely rename the option from backup to documentExporter. I'm not 100% sure how to decide in the conflict between 'don't overcomplicate the module' (is this change something a user should do in their own config) and 'allow users to easily use the features they want for a DMS'.

@erikarvstedt

The export service just boils down to this simple snippet, which we could simply add to the NixOS manual:

{ config, ... }:
let
  paperless = config.services.paperless;
in
{
  systemd.services.paperless-export = {
    startAt = "daily";
    serviceConfig = {
      User = paperless.user;
      # <export_dir> is a placeholder for the target directory.
      ExecStart = ''
        ${paperless.dataDir}/paperless-manage document_exporter <export_dir> --no-progress-bar --no-color --compare-checksums --delete
      '';
    };
  };
}

@erikarvstedt

@Atemu, maybe let's first decide if or in what form we want to include this.

ctheune commented Jul 12, 2023

> Here are fixups for your first commit, which should definitely be merged.
> Maybe add a dedicated PR because these changes are orthogonal to backups and entirely uncontroversial.

Thanks. I've created #243084 for the orthogonal changes.

ctheune commented Jul 12, 2023

Alright. I'm happy to follow pretty much most of the suggestions, but as the PR itself is still under question, I'll postpone that.

Having to take paperless offline was something I overlooked and I guess the exporter could arrange for that.

Whether to include it, I see the following parts that need discussion:

  1. Not using the name "backup" – sure, I can perfectly live with calling it documentExporter as @leona-ya suggested.
  2. I'm running it with the built-in sqlite thing and I don't have snapshots available as a simple solution in my environment, so the document exporter seems the most attractive route to me, but I'd prefer it to be part of the tested feature set. Apparently it did not run out of the box due to the quoting issue, and I'd love it if we could avoid regressions here that everyone has to fix themselves.
  3. Complication of the module ... well ... not sure what the metric here is in value per complication ;) ... as a user I would have loved this being already in there as keeping my data safe is kind of the job of a DMS and we see similar complications in other database modules.

@ctheune ctheune force-pushed the paperless-backup-master branch from 211f996 to e7ee089 Compare January 4, 2025 09:44
@ctheune ctheune force-pushed the paperless-backup-master branch from e7ee089 to ed36619 Compare January 4, 2025 15:48
@wegank wegank added the 12.approvals: 1 This PR was reviewed and approved by one reputable person label Jan 4, 2025

@Atemu Atemu left a comment

Migrated my config over to this and it was a breeze.

Atemu/nixos-config@c34c9e9

@ctheune ctheune force-pushed the paperless-backup-master branch from ed36619 to 916c9e3 Compare January 5, 2025 06:33
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Jan 5, 2025
@wegank wegank removed the 12.approvals: 1 This PR was reviewed and approved by one reputable person label Jan 5, 2025
@wegank wegank added the 12.approvals: 2 This PR was reviewed and approved by two reputable people label Jan 5, 2025

@erikarvstedt erikarvstedt left a comment

The module and the test are broken. Fixups.
Fetch with:

git fetch https://github.com/erikarvstedt/nixpkgs paperless-backup-master-ea-4
git log ..FETCH_HEAD

Further remarks. Medium prio, so ignore if you want to get this merged fast:
I agree with @Atemu that pre/postScript is redundant. There are many ways to achieve this without extra options, most notably:

systemd.services.paperless-exporter = {
  preStart = "...";
  postStart = "...";
};

Also, like @Atemu mentioned before, needless log output is annoying and doesn't fit the style of other NixOS services.
The exporter prints Running pre/post script even when no pre/post script is run. We should remove this kind of output entirely.

When using pre/postStart = "...";, systemd always shows which stage is currently running and the log shows the offending stage on failures. So diagnosability is fine.

ctheune commented Jan 6, 2025

Ugh, dang. Apparently I didn't run the tests after my last changes - sorry, that was bad form on my side. Although I did run the test when twiddling the v.default stuff and that did fulfill the test ... if it did that by accident, then I'll need to revisit that test. I'll pick up your changes and double check that.

I'll remove the pre/post options.

[Edit]: Note that I'm personally still on the fence with the pre/post stuff. I guess I dislike splitting up those units of execution in an IMO clobbered way over multiple scripts. I know we rely endlessly on systemd, but it still feels clunky, especially if you want to run things manually for development/debugging.

[Edit]: I'm still removing the pre/post script stuff, but please note that it doesn't provide the diagnosability things as I mentioned: it does show the stage when it breaks. It doesn't show the stage if it's silent and gets stuck, so there's also no info how long steps take ... but I guess that's something everyone can put in their pre/post scripts if they want to see this. (Except being able to see the moment when the actual export starts because the unit start log entry will reflect when the unit is started, not when the main process starts)
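
A small sketch of what such user-side pre/post scripts might look like, using the unit name from the review above. The log messages are invented for illustration, and whether postStart runs after the export has finished depends on the unit's type (for Type=oneshot units, ExecStartPost runs once the main process has exited):

```nix
{ ... }:
{
  systemd.services.paperless-exporter = {
    # Runs before the export command; marks in the journal when the
    # pre stage completed and the actual export is about to begin.
    preStart = ''
      echo "paperless export: pre stage done at $(date --iso-8601=seconds)"
    '';
    # Runs after the main command has been started (or, for oneshot
    # units, after it has exited).
    postStart = ''
      echo "paperless export: post stage reached at $(date --iso-8601=seconds)"
    '';
  };
}
```

Because NixOS merges module definitions, this can live in a user's own configuration next to the module's definition of the unit.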

Paperless includes a document exporter that can be used for e.g.
backups.

This change extends the module to provide a way to enable and configure
a timer, export settings, pre- and post-processing
scripts (e.g. to ship the backup somewhere else, clean up, ...).

It works out of the box when just enabling it but can be customized.

Includes suitable tests.
@ctheune ctheune force-pushed the paperless-backup-master branch from 916c9e3 to 865ab91 Compare January 6, 2025 07:26

@erikarvstedt erikarvstedt left a comment

LGTM. Thanks for your patience!

> I did run the test when twiddling the v.default stuff and that did fulfill the test

Weird, v.default will always error when evaluating the module, regardless of what config options are used.

@wegank wegank added 12.approvals: 3+ This PR was reviewed and approved by three or more reputable people and removed 12.approvals: 2 This PR was reviewed and approved by two reputable people labels Jan 6, 2025

@leona-ya leona-ya left a comment

Thanks again! LGTM

And sorry for not testing correctly @erikarvstedt

@leona-ya leona-ya merged commit 6355c63 into NixOS:master Jan 9, 2025
26 of 29 checks passed
@ctheune ctheune deleted the paperless-backup-master branch January 10, 2025 10:00
@@ -82,7 +82,7 @@ let
};
in
{
-  meta.maintainers = with lib.maintainers; [ leona SuperSandro2000 erikarvstedt ];
+  meta.maintainers = with lib.maintainers; [ leona SuperSandro2000 erikarvstedt atemu theuni ];

So many maintainers will probably just diffuse responsibility, and in the end no one feels really responsible.

SuperSandro2000 added a commit to SuperSandro2000/nixpkgs that referenced this pull request Jan 13, 2025
@SuperSandro2000
Copy link
Member

#373472

SuperSandro2000 added a commit to SuperSandro2000/nixpkgs that referenced this pull request Jan 13, 2025
SuperSandro2000 added a commit to SuperSandro2000/nixpkgs that referenced this pull request Jan 15, 2025
SuperSandro2000 added a commit to SuperSandro2000/nixpkgs that referenced this pull request Jan 15, 2025
…fault value into config to aid future problems
SuperSandro2000 added a commit to SuperSandro2000/nixpkgs that referenced this pull request Jan 22, 2025
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 12.approvals: 3+ This PR was reviewed and approved by three or more reputable people

8 participants