docs: Add small how/why and user tutorial #102

Merged
merged 1 commit into from
May 21, 2024
16 changes: 16 additions & 0 deletions README.md
@@ -8,6 +8,22 @@ input files.
- For the old version [switch to the master branch](https://github.com/adfinis/pyaptly/tree/master)
- Main branch builds contain [alpha packages](https://github.com/adfinis/pyaptly/actions/runs/8147002919), see Artifacts

# Why & How

[Aptly](https://www.aptly.info/) is a great tool for creating Debian repositories.
But as soon as you need to maintain repositories for different [environments](https://en.wikipedia.org/wiki/Deployment_environment), it gets complicated fast.

This is where Pyaptly comes in.
First of all, a single `config.toml` can be used to define `mirrors`, `snapshots` and `publishes` instead of using command line arguments.
The definition includes exactly how the entities are created and updated.

Secondly, aptly isn't really designed for retention policies. Updating a `snapshot` loses the information about the previous state.
That makes it hard to roll back to a previous state if required.
Pyaptly solves this problem by using fixed timestamps in snapshot names.
That behaviour also makes it possible to define a fixed update interval. For example, you can say "only update this snapshot once a week".

[Follow the Tutorial](./docs/TUTORIAL.md)

## Example commands

Initialize a new aptly server.
170 changes: 170 additions & 0 deletions docs/TUTORIAL.md
@@ -0,0 +1,170 @@
> Note: This tutorial assumes basic knowledge of [Aptly](https://www.aptly.info/).

Pyaptly is capable of managing mirrors, snapshots and publishes.
Each of those is handled completely separately, so it's possible to manage only a subset with pyaptly.
But for the purpose of this tutorial we assume a clean [install of Aptly](https://www.aptly.info/download/) with no content yet.

TODO: Note to jump to the relevant chapter if only a subset should be managed by pyaptly.

# Installation

TODO (once packages are available)

# Aptly Mirror

Pyaptly can create and update mirrors. Since mirrors are not a very complicated construct, pyaptly adds no extra logic beyond what aptly itself provides.
Configuring a mirror with pyaptly is pretty much the same as writing a command for aptly, except that it's declarative.
Let's take the following `aptly` commands as an example of creating an aptly mirror:

```bash
gpg --yes --no-default-keyring --keyring trustedkeys.gpg --keyserver keyserver.ubuntu.com --recv-keys EE727D4449467F0E
aptly mirror create aptly "http://repo.aptly.info/" nightly main
```

After adding the gpg key to our keyring, we add the official `aptly` repository as a mirror. The equivalent pyaptly configuration looks like this:

```toml
[mirror.aptly]
archive = "http://repo.aptly.info/"
gpg-keys = [ "EE727D4449467F0E" ]
keyserver = "keyserver.ubuntu.com"
components = "main"
distribution = "nightly"
```

With this configuration the mirror can be created with the following command line:
```bash
pyaptly mirror ./config.toml create
```

As you can see, we more or less just put the command line arguments into the configuration file.
Pyaptly also takes care of downloading the gpg key if it isn't available yet. If you don't want pyaptly to fetch the gpg key, just omit the corresponding variables (`gpg-keys` and `keyserver` in the example above).

> For a list of all configuration options of a mirror, check out [the reference](TODO: Reference link).

## Updating mirrors

We can also tell pyaptly to update all defined mirrors:
```bash
pyaptly mirror ./config.toml update
```

This is exactly the same as `aptly mirror update aptly` with the above config.
But it will update all defined mirrors if more than one is defined, making it a bit more convenient than using `aptly` directly.

# Snapshots

## Basic snapshots

Pyaptly has some extra features for snapshots, but let's start by creating a very simple snapshot first.

```toml
[snapshot."aptly"]
mirror = "aptly"
```
And create the snapshot:
```shell-session
$ pyaptly snapshot ./config.toml create
$ aptly snapshot list
List of snapshots:
* [aptly]: Snapshot from mirror [aptly]: http://repo.aptly.info/ nightly
```

The equivalent aptly command would be:
```bash
aptly snapshot create aptly from mirror aptly
```

This snapshot can now be updated with pyaptly:
```shell-session
$ pyaptly snapshot ./config.toml update
$ aptly snapshot list
List of snapshots:
* [aptly]: Snapshot from mirror [aptly]: http://repo.aptly.info/ nightly
* [aptly-rotated-20240102T1315Z]: Snapshot from mirror [aptly]: http://repo.aptly.info/ nightly
```
As you can see, `pyaptly` first "rotates" the snapshot by renaming it and appending a timestamp suffix. Afterwards, it creates a new snapshot `aptly`, which is now up to date.
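
The rotation can be pictured with a short Python sketch. The name format is inferred from the listing above (`aptly-rotated-20240102T1315Z`) and not taken from pyaptly's source; the helpers `rotated_name` and `update_commands` are hypothetical.

```python
from datetime import datetime, timezone


def rotated_name(snapshot, now):
    """Name the old snapshot gets before a fresh one is created.

    Assumption: the suffix is the UTC time formatted as YYYYMMDDTHHMMZ,
    as seen in the `aptly snapshot list` output above.
    """
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%MZ")
    return f"{snapshot}-rotated-{stamp}"


def update_commands(snapshot, mirror, now):
    """Rename the current snapshot, then recreate it from the mirror."""
    return [
        ["aptly", "snapshot", "rename", snapshot, rotated_name(snapshot, now)],
        ["aptly", "snapshot", "create", snapshot, "from", "mirror", mirror],
    ]


now = datetime(2024, 1, 2, 13, 15, tzinfo=timezone.utc)
for command in update_commands("aptly", "aptly", now):
    print(" ".join(command))
```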

> Similar to mirrors, pyaptly allows a variety of configuration options for snapshots. Check out [the reference](TODO: Link to reference).

## Snapshots with retention

Snapshots with retention are a bit more complicated than simple snapshots.
The retention time is either 1 day or 1 week. Other types of retention are currently not implemented.
Another specialty is that the retention is always the "maximum allowed" retention.
Let's use a daily snapshot as an example:

```toml
[snapshot."aptly-%T"]
mirror = "aptly"

[snapshot."aptly-%T".timestamp]
time = "00:00"
# Uncomment for weekly retention starting on monday
#repeat-weekly = "mon"
```

Now let's pretend today is January 2 2024 and we don't have a snapshot yet. This is what happens:

```shell-session
$ pyaptly snapshot config.toml create
$ aptly snapshot list -raw # list snapshot names
aptly-20240102T0000Z
$ aptly snapshot show aptly-20240102T0000Z
Name: aptly-20240102T0000Z
Created At: 2024-01-02 13:55:41 UTC
Description: Snapshot from mirror [aptly]: http://repo.aptly.info/ nightly
Number of packages: 173
Sources:
aptly [repo]
$
```

You will notice that the timestamp in the name differs from the timestamp after `Created At`.
The idea here is simple: we want to create one new snapshot per *day*.
If it is already past midnight (our defined `time` of `00:00`), pyaptly creates a snapshot and "backdates" it to this time. If a snapshot with this timestamp already exists, it does nothing.
It's crucial to understand that we don't want to create a new snapshot "24 hours later than the previous one". We truly want one in each 24h window.
This matches the typical use case of regular maintenance windows much better.
For example, if Company A patches their servers every day at 20:00, it might make sense to set `time = "19:00"` in the config and run a cronjob at 19:05 to create a new snapshot.
At the same time, it's much easier to implement in `pyaptly` this way. We can just generate the name a new snapshot would get, check whether this snapshot exists and, if it does, do nothing.
This means that if we rerun the same command `pyaptly snapshot config.toml create` 5 minutes later, it will do nothing, because the snapshot already exists.
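
The "generate the name, check if it exists" logic can be sketched in a few lines of Python. This is an illustration under assumptions, not pyaptly's implementation: the function names are hypothetical, all times are treated as UTC, and the `%T` format is inferred from the snapshot names shown above.

```python
from datetime import datetime, time, timedelta, timezone


def backdated_timestamp(now, at):
    """Most recent moment at or before `now` whose clock time is `at`."""
    candidate = datetime.combine(now.date(), at, tzinfo=timezone.utc)
    if candidate > now:
        # Today's slot has not been reached yet, so use yesterday's.
        candidate -= timedelta(days=1)
    return candidate


def snapshot_name(template, stamp):
    """Fill the %T placeholder (format assumed from the listing above)."""
    return template.replace("%T", stamp.strftime("%Y%m%dT%H%MZ"))


def create_if_missing(template, now, at, existing):
    """Return the name to create, or None if that snapshot already exists."""
    name = snapshot_name(template, backdated_timestamp(now, at))
    return None if name in existing else name


now = datetime(2024, 1, 2, 13, 55, 41, tzinfo=timezone.utc)
existing = set()

first = create_if_missing("aptly-%T", now, time(0, 0), existing)
existing.add(first)
# Five minutes later the same name is generated, so nothing is created.
second = create_if_missing("aptly-%T", now + timedelta(minutes=5), time(0, 0), existing)
print(first, second)
```

Because the name depends only on the current time window, running the command repeatedly within the same window is harmless.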

It's also important to understand that `pyaptly snapshot config.toml update` will do nothing, as these snapshots with retention are considered read-only.

If we were to patch our systems only once a week, we would uncomment the line `repeat-weekly = "mon"`. This way, our snapshot would be backdated a full day to Monday, January 1 2024, and named `aptly-20240101T0000Z`.
This means that pyaptly would only create a new snapshot once a week, no matter how often the command is run.
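
The weekly case can be sketched the same way. Again, this is a hedged illustration rather than pyaptly's code: `weekly_timestamp` is a hypothetical helper, times are treated as UTC, and the weekday abbreviations are assumed to follow the `"mon"` style used in the config above.

```python
from datetime import datetime, time, timedelta, timezone

# Assumed abbreviations; index matches datetime.weekday() (0 = Monday).
WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def weekly_timestamp(now, at, weekday):
    """Most recent `weekday` at clock time `at` that is not after `now`."""
    candidate = datetime.combine(now.date(), at, tzinfo=timezone.utc)
    # Step back to the requested weekday within the current week.
    days_back = (candidate.weekday() - WEEKDAYS.index(weekday)) % 7
    candidate -= timedelta(days=days_back)
    if candidate > now:
        # Same weekday, but the clock time has not been reached yet.
        candidate -= timedelta(days=7)
    return candidate


def snapshot_name(template, stamp):
    """Fill the %T placeholder (format assumed from the listing above)."""
    return template.replace("%T", stamp.strftime("%Y%m%dT%H%MZ"))


# January 2 2024 is a Tuesday, so the snapshot is backdated to Monday.
now = datetime(2024, 1, 2, 13, 55, 41, tzinfo=timezone.utc)
print(snapshot_name("aptly-%T", weekly_timestamp(now, time(0, 0), "mon")))
```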
@rhizoome rhizoome Apr 5, 2024

background

What I am saying here adds nothing new to your text. It is just the view from the application instead of the user. Feel free to incorporate anything you think helps:

  • pyaptly reads the current-state of aptly
  • it will create all commands needed to create the should-state based on the current time
    • commands that create something that already exists in the current-state will be discarded (example)
  • it will resolve the dependencies into a command-tree
    • while resolving it will again check the state to see if the thing already exists
  • it will then execute the commands
    • if one command in the tree fails, dependent commands will not and cannot be executed
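
The flow described in the list above could be sketched roughly as follows. This is only an illustration of the idea (discard what exists, order by dependency, skip dependents of a failed command), not the actual pyaptly code; `run_commands` and the status strings are made up for this example.

```python
from graphlib import TopologicalSorter


def run_commands(commands, existing):
    """Run commands in dependency order.

    `commands` maps a name to (set_of_dependencies, callable returning bool).
    `existing` holds names that are already present in the current state.
    """
    order = TopologicalSorter({name: deps for name, (deps, _) in commands.items()})
    results = {}
    for name in order.static_order():
        deps, action = commands[name]
        if name in existing:
            # Already part of the current state: nothing to do.
            results[name] = "exists"
        elif any(results[dep] in ("failed", "skipped") for dep in deps):
            # A prerequisite did not succeed, so this one cannot run.
            results[name] = "skipped"
        else:
            results[name] = "done" if action() else "failed"
    return results


commands = {
    "mirror": (set(), lambda: True),
    "snapshot": ({"mirror"}, lambda: False),  # simulate a failing command
    "publish": ({"snapshot"}, lambda: True),
}
print(run_commands(commands, existing={"mirror"}))
```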

this also ties into an issue where we are currently using pyaptly wrong (I do not know the details of how we are using it, I am inferring from questions I got). also take this with a grain of salt, I haven't been using pyaptly for a long time, maybe something slipped my memory.

  • there should only be one pyaptly cronjob: Calling a script doing all operations. Since it won't redo anything this is safe and efficient
    • the job has to be called in the smallest interval needed (week if there is only a weekly policy, daily if there is a daily policy)
    • this prevents snapshots from not being available when publish is called (I assume we currently have a snapshot-cronjob and a publish-cronjob)
  • if there is overlap in the cronjobs, the second job should probably not run (this is a feature we could add to pyaptly: some kind of "lock" check), currently this has to be scripted
  • if there is chronic overload of pyaptly (because of the missing cleanup features), pyaptly should run in a loop with a sleep statement (to prevent overheating if pyaptly fails fast)

Of course, only the point about creating one script is of interest for this documentation. I just added the other points because they came to my mind.
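
A single cron script with a scripted lock might look like the sketch below. It assumes a Unix system (for `fcntl`) and that mirrors, snapshots and publishes all live in one `./config.toml`; the function `run_locked` and the lock path are hypothetical, and the pyaptly command lines are modelled on the ones in the tutorial.

```python
import fcntl
import subprocess

# Command lines modelled on the tutorial; adjust them to the real setup.
STEPS = [
    ["pyaptly", "mirror", "./config.toml", "update"],
    ["pyaptly", "snapshot", "./config.toml", "create"],
    ["pyaptly", "publish", "./config.toml", "create"],
]


def run_locked(lock_path, steps, run=subprocess.run):
    """Run all steps unless another instance already holds the lock.

    Returns True if the steps were run, False if the run was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # the previous run is still active, skip this one
        for step in steps:
            # check=True stops at the first failing step.
            run(step, check=True)
        return True
```

The cron entry would then call something like `run_locked("/tmp/pyaptly.lock", STEPS)` at the smallest interval needed. The lock is released when the file is closed, including when the process dies.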


# Publish

Pyaptly publishes also come with some extra sugar building on the features of the snapshots. But let's start with a simple publish again:
```bash
aptly publish snapshot aptly aptly
```

This could be achieved with the following toml file in pyaptly:
```toml
[publish]
[[publish.aptly]]
distribution = "nightly"
components = "main"
#automatic-update = true
[[publish.aptly.snapshots]]
name = "aptly"
```

First we define a publish called `aptly`. Then - as pyaptly currently can't figure that out itself - we specify the distribution and components.
The last line says which snapshot we want to use.

Afterwards we run:
```shell-session
$ pyaptly publish pyaptly/publish.toml create -n aptly
$ aptly publish show nightly aptly
Prefix: aptly
Distribution: nightly
Architectures: amd64
Sources:
main: aptly [snapshot]
$
```
@rhizoome rhizoome Apr 5, 2024

  • maybe you could cover %T for referring to snapshots and archive-on-update. See
  • or do the same comparison to the aptly commands as you did above. (you could make use of --info to see what commands are generated; if you use dry-run, you could change the config without having to revert stuff all the time)
  • there is also the merge and the timestamp feature. See

digging into these features

  • the timestamp basically means n-th oldest timestamp. it is called back_ref in the code
  • merge is a snapshot feature, but a user might think of it in publishing, because only there it makes sense. See
  • archive-on-update seems to be meant to document when exactly a publish was changed, as it might have implications on policy or bugs discovered (before/after a change in publish). See


It's important that we specify `-n aptly` here. If we want to publish it every time we run the `pyaptly publish` command, we need to uncomment the line `automatic-update = true`.