feat: Support docker-like references #6

Merged 5 commits on Dec 16, 2024
73 changes: 69 additions & 4 deletions README.md
@@ -38,7 +38,7 @@ powershell -c "irm https://github.com/fossas/circe/releases/latest/download/circ
> [!TIP]
> Check the help output for more details.

## extract
## subcommand: extract

Extracts the contents of the image to disk.

@@ -50,7 +50,7 @@ Extracts the contents of the image to disk.
#
# Arguments:
# <image>
# The image to extract.
# The image to extract. See image reference below for more details.
# <target>
# The directory to which the image is extracted.
#
@@ -84,7 +84,7 @@ Extracts the contents of the image to disk.
circe extract docker.io/contribsys/faktory:latest ./faktory --layers squash --platform linux/amd64
```

## list
## subcommand: list

Lists the contents of an image.

@@ -96,7 +96,7 @@ Lists the contents of an image.
#
# Arguments:
# <image>
# The image to list.
# The image to list. See image reference below for more details.
#
# Options for `circe list`:
# --platform
@@ -109,6 +109,71 @@ Lists the contents of an image.
circe list docker.io/contribsys/faktory:latest
```

## image reference

The recommended way to reference an image is by its fully qualified reference, e.g.:

```shell
circe list docker.io/contribsys/faktory:latest
circe list docker.io/library/ubuntu:14.04
circe list some-host.dev/some-namespace/some-project/some-image:latest
circe list some-host.dev/some-namespace/some-project/some-image@sha256:123abc
```

However, for convenience, you can specify a "partial image reference" in a few different ways:

```shell
# namespace + name + tag; infers to docker.io/contribsys/faktory:latest
circe list contribsys/faktory:latest

# namespace + name + digest; infers to docker.io/contribsys/faktory@sha256:123abc
circe list contribsys/faktory@sha256:123abc

# namespace + name; infers to docker.io/contribsys/faktory:latest
circe list contribsys/faktory

# name + tag; infers to docker.io/library/ubuntu:latest
circe list ubuntu:latest

# name + digest; infers to docker.io/library/ubuntu@sha256:123abc
circe list ubuntu@sha256:123abc

# name; infers to docker.io/library/ubuntu:latest
circe list ubuntu
```

By default, `circe` fills in `docker.io` for the registry and `library` for the namespace.
However, you can customize the registry and namespace by setting the `OCI_BASE` and `OCI_NAMESPACE` environment variables:

```shell
# Specify the registry and/or namespace:
export OCI_BASE=some-host.dev
export OCI_NAMESPACE=some-namespace

# namespace + name + tag; infers to some-host.dev/contribsys/faktory:latest
circe list contribsys/faktory:latest

# namespace + name + digest; infers to some-host.dev/contribsys/faktory@sha256:123abc
circe list contribsys/faktory@sha256:123abc

# namespace + name; infers to some-host.dev/contribsys/faktory:latest
circe list contribsys/faktory

# name + tag; infers to some-host.dev/some-namespace/ubuntu:latest
circe list ubuntu:latest

# name + digest; infers to some-host.dev/some-namespace/ubuntu@sha256:123abc
circe list ubuntu@sha256:123abc

# name; infers to some-host.dev/some-namespace/ubuntu:latest
circe list ubuntu
```
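
The inference rules shown in the examples above can be sketched in Rust as follows. This is a minimal illustration rather than `circe`'s actual implementation: `expand` is a hypothetical helper, and the registry and namespace are passed as plain arguments instead of being read from the environment.

```rust
/// Hypothetical helper: expands a possibly partial image reference into a
/// fully qualified one. `base` and `namespace` stand in for the configured
/// registry and namespace.
fn expand(input: &str, base: &str, namespace: &str) -> String {
    let parts: Vec<&str> = input.split('/').collect();
    let (host, namespace, name) = match parts.as_slice() {
        // `{name}` alone: fill in both the registry and the namespace.
        [name] => (base.to_string(), namespace.to_string(), name.to_string()),
        // `{base}/{name}`: the registry is given, so fill in the namespace.
        [host, name] if *host == base => {
            (host.to_string(), namespace.to_string(), name.to_string())
        }
        // `{namespace}/{name}`: fill in the registry only.
        [ns, name] => (base.to_string(), ns.to_string(), name.to_string()),
        // Three or more segments: treated as already fully qualified.
        [host, ns, rest @ ..] => (host.to_string(), ns.to_string(), rest.join("/")),
        // `split` always yields at least one segment; kept for exhaustiveness.
        [] => (base.to_string(), namespace.to_string(), String::new()),
    };

    // Append the default tag when neither a tag nor a digest is present.
    let name = if name.contains('@') || name.contains(':') {
        name
    } else {
        format!("{name}:latest")
    };
    format!("{host}/{namespace}/{name}")
}

fn main() {
    for input in ["ubuntu", "contribsys/faktory", "ubuntu@sha256:123abc"] {
        println!("{input} -> {}", expand(input, "docker.io", "library"));
    }
}
```

Passing `some-host.dev` and `some-namespace` as the last two arguments reproduces the overridden examples in the same way.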

**The overall recommendation is to use fully qualified references.**
The ability to override `OCI_BASE` and `OCI_NAMESPACE` is intended to simplify setup for CI/CD pipelines
that need to extract multiple images from a custom host and/or namespace, without requiring scripts
that concatenate them into fully qualified references.

## platform selection

You can customize the platform used by `circe` by passing `--platform`.
34 changes: 26 additions & 8 deletions bin/src/extract.rs
@@ -79,8 +79,26 @@ pub struct Options {
#[derive(Debug, Args)]
pub struct Target {
/// Image reference being extracted (e.g. docker.io/library/ubuntu:latest)
#[arg(value_parser = Reference::from_str)]
pub image: Reference,
///
/// If a fully specified reference is not provided,
/// `circe` attempts to resolve the image with the prefix
/// `docker.io/library`.
///
/// The reference may optionally provide a digest, for example
/// `docker.io/library/ubuntu@sha256:1234567890`.
///
/// Finally, the reference may optionally provide a tag, for example
/// `docker.io/library/ubuntu:latest` or `docker.io/library/ubuntu:24.04`.
/// If no digest or tag is provided, the tag "latest" is used.
///
/// Put all that together and you get the following examples:
/// - `ubuntu` is resolved as `docker.io/library/ubuntu:latest`
/// - `ubuntu:24.04` is resolved as `docker.io/library/ubuntu:24.04`
/// - `docker.io/library/ubuntu` is resolved as `docker.io/library/ubuntu:latest`
/// - `docker.io/library/ubuntu@sha256:1234567890` is resolved as `docker.io/library/ubuntu@sha256:1234567890`
/// - `docker.io/library/ubuntu:24.04` is resolved as `docker.io/library/ubuntu:24.04`
#[arg(verbatim_doc_comment)]
pub image: String,

@csasarak csasarak Dec 16, 2024

LGTM.

[Optional, probably more relevant for the CLI and super minor] It might be worth making the "base" configurable via env var. If I'm a company who has a list of image names it may be simpler for me to set an env var specifying the base rather than have to program my CI jobs to concatenate it together with the base name. It also means that I can create a CI job template that sets the var and that makes calls to this do the right thing rather than making my engineers have to remember to use the FQN.

^^ I'm a little dubious that this is super useful, but I think it's worth explicitly rejecting.

Member Author

that's an incredible idea! made that quick change.

I was thinking about this more over lunch, and it would be similar to also have CLI options that can do that.

Member Author

IMO this is behavior that we probably don't want in flags; it's not really intended to be used most of the time, it's just an escape hatch.

Makes sense, thanks!


/// Platform to extract (e.g. linux/amd64)
///
@@ -131,20 +149,20 @@ pub enum Mode {
pub async fn main(opts: Options) -> Result<()> {
info!("extracting image");

let auth = match (opts.target.username, opts.target.password) {
(Some(username), Some(password)) => Authentication::basic(username, password),
_ => Authentication::default(),
};

let reference = Reference::from_str(&opts.target.image)?;
let layer_globs = Filters::parse_glob(opts.layer_glob.into_iter().flatten())?;
let file_globs = Filters::parse_glob(opts.file_glob.into_iter().flatten())?;
let layer_regexes = Filters::parse_regex(opts.layer_regex.into_iter().flatten())?;
let file_regexes = Filters::parse_regex(opts.file_regex.into_iter().flatten())?;
let auth = match (opts.target.username, opts.target.password) {
(Some(username), Some(password)) => Authentication::basic(username, password),
_ => Authentication::default(),
};

let output = canonicalize_output_dir(&opts.output_dir, opts.overwrite)?;
let registry = Registry::builder()
.maybe_platform(opts.target.platform)
.reference(opts.target.image)
.reference(reference)
.auth(auth)
.layer_filters(layer_globs + layer_regexes)
.file_filters(file_globs + file_regexes)
8 changes: 5 additions & 3 deletions bin/src/list.rs
@@ -1,9 +1,9 @@
use circe_lib::{registry::Registry, Authentication};
use circe_lib::{registry::Registry, Authentication, Reference};
use clap::Parser;
use color_eyre::eyre::{Context, Result};
use derive_more::Debug;
use pluralizer::pluralize;
use std::collections::HashMap;
use std::{collections::HashMap, str::FromStr};
use tracing::{debug, info};

use crate::extract::Target;
@@ -19,13 +19,15 @@ pub struct Options {
pub async fn main(opts: Options) -> Result<()> {
info!("listing image");

let reference = Reference::from_str(&opts.target.image)?;
let auth = match (opts.target.username, opts.target.password) {
(Some(username), Some(password)) => Authentication::basic(username, password),
_ => Authentication::default(),
};

let registry = Registry::builder()
.maybe_platform(opts.target.platform)
.reference(opts.target.image)
.reference(reference)
.auth(auth)
.build()
.await
2 changes: 1 addition & 1 deletion bin/src/main.rs
@@ -40,7 +40,7 @@ async fn main() -> Result<()> {
.with_deferred_spans(true)
.with_bracketed_fields(true)
.with_span_retrace(true)
.with_targets(true),
.with_targets(false),
)
.with(
tracing_subscriber::EnvFilter::builder()
138 changes: 94 additions & 44 deletions lib/src/lib.rs
@@ -2,7 +2,7 @@

use bon::Builder;
use color_eyre::{
eyre::{self, bail, eyre, Context},
eyre::{self, bail, ensure, eyre, Context},
Result, Section, SectionExt,
};
use derive_more::derive::{Debug, Display, From};
@@ -11,12 +11,36 @@ use itertools::Itertools;
use std::{borrow::Cow, ops::Add, str::FromStr};
use strum::{AsRefStr, EnumIter, IntoEnumIterator};
use tap::{Pipe, Tap};
use tracing::debug;
use tracing::{debug, warn};

mod ext;
pub mod registry;
pub mod transform;

/// Users can set this environment variable to specify the OCI base.
/// If not set, the default is [`OCI_DEFAULT_BASE`].
pub const OCI_BASE_VAR: &str = "OCI_BASE";

/// Users can set this environment variable to specify the OCI namespace.
/// If not set, the default is [`OCI_DEFAULT_NAMESPACE`].
pub const OCI_NAMESPACE_VAR: &str = "OCI_NAMESPACE";

/// The default OCI base.
pub const OCI_DEFAULT_BASE: &str = "docker.io";

/// The default OCI namespace.
pub const OCI_DEFAULT_NAMESPACE: &str = "library";

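/// Returns the OCI base: the value of the [`OCI_BASE_VAR`] environment variable,
/// or [`OCI_DEFAULT_BASE`] if it is not set or is not valid Unicode.
pub fn oci_base() -> String {
std::env::var(OCI_BASE_VAR).unwrap_or_else(|_| OCI_DEFAULT_BASE.to_string())
}

/// Returns the OCI namespace: the value of the [`OCI_NAMESPACE_VAR`] environment variable,
/// or [`OCI_DEFAULT_NAMESPACE`] if it is not set or is not valid Unicode.
pub fn oci_namespace() -> String {
std::env::var(OCI_NAMESPACE_VAR).unwrap_or_else(|_| OCI_DEFAULT_NAMESPACE.to_string())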
}
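
The fallback used by `oci_base` and `oci_namespace` could be exercised in tests without touching the process environment by injecting the lookup. A minimal sketch, assuming a hypothetical helper `resolve` that is not part of this change; the variable name `OCI_BASE` follows the README and is passed in as an argument:

```rust
/// Hypothetical helper: returns the value `lookup` finds for `var`,
/// or `default` when the variable is not set.
fn resolve(lookup: impl Fn(&str) -> Option<String>, var: &str, default: &str) -> String {
    lookup(var).unwrap_or_else(|| default.to_string())
}

fn main() {
    // In real use, the lookup consults the process environment.
    // The variable name follows the README; treat it as an assumption.
    let base = resolve(|key| std::env::var(key).ok(), "OCI_BASE", "docker.io");
    println!("base = {base}");
}
```

A test can then pass a closure that returns a fixed value, which avoids mutating global state shared between test threads.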

/// Authentication method for a registry.
#[derive(Debug, Clone, Default, Display)]
pub enum Authentication {
@@ -388,28 +412,6 @@ impl Version {
}

/// A parsed container image reference.
///
/// ```
/// # use circe_lib::{Reference, Version};
/// # use std::str::FromStr;
/// // Default to latest tag
/// let reference = Reference::from_str("docker.io/library/ubuntu").expect("parse reference");
/// assert_eq!(reference.host, "docker.io");
/// assert_eq!(reference.repository, "library/ubuntu");
/// assert_eq!(reference.version, Version::tag("latest"));
///
/// // Parse a tag
/// let reference = Reference::from_str("docker.io/library/ubuntu:other").expect("parse reference");
/// assert_eq!(reference.host, "docker.io");
/// assert_eq!(reference.repository, "library/ubuntu");
/// assert_eq!(reference.version, Version::tag("other"));
///
/// // Parse a digest
/// let reference = Reference::from_str("docker.io/library/ubuntu@sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4").expect("parse reference");
/// assert_eq!(reference.host, "docker.io");
/// assert_eq!(reference.repository, "library/ubuntu");
/// assert_eq!(reference.version.to_string(), "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4");
/// ```
#[derive(Debug, Clone, PartialEq, Eq, Builder)]
pub struct Reference {
/// Registry host (e.g. "docker.io", "ghcr.io")
@@ -450,32 +452,80 @@ impl FromStr for Reference {
type Err = eyre::Error;

fn from_str(s: &str) -> Result<Self, Self::Err> {
let input_section = || s.to_string().header("Input:");
let (host, remainder) = s.split_once('/').ok_or_else(|| {
eyre!("invalid reference: missing host separator '/'").with_section(input_section)
})?;
// Returns an owned string so that we can support multiple name segments.
fn parse_name(name: &str) -> Result<(String, Version)> {
if let Some((name, digest)) = name.split_once('@') {
let digest = Digest::from_str(digest).context("parse digest")?;
Ok((name.to_string(), Version::Digest(digest)))
} else if let Some((name, tag)) = name.split_once(':') {
Ok((name.to_string(), Version::Tag(tag.to_string())))
} else {
Ok((name.to_string(), Version::latest()))
}
}

// Docker supports `docker pull ubuntu` and `docker pull library/ubuntu`,
// both of which are parsed as `docker.io/library/ubuntu`.
// The below recreates this behavior.
let base = oci_base();
let namespace = oci_namespace();
let parts = s.split('/').collect::<Vec<_>>();
let (host, namespace, name, version) = match parts.as_slice() {
// For docker compatibility, `{name}` is parsed as `{base}/{namespace}/{name}`.
[name] => {
let (name, version) = parse_name(name)?;
warn!("expanding '{name}' to '{base}/{namespace}/{name}'; fully specify the reference to avoid this behavior");
(base, namespace, name, version)
}

// Find either ':' for tag or '@' for digest.
// Check for '@' first since digest identifiers also contain ':'.
let (repository, version) = if let Some((repo, digest)) = remainder.split_once('@') {
let digest = Digest::from_str(digest).context("parse digest")?;
(repo, Version::Digest(digest))
} else if let Some((repo, tag)) = remainder.split_once(':') {
(repo, Version::Tag(tag.to_string()))
} else {
(remainder, Version::latest())
// Two segments may mean "{namespace}/{name}" or may mean "{base}/{name}".
// This is a special case for docker compatibility.
[host, name] if *host == base => {
let (name, version) = parse_name(name)?;
warn!("expanding '{host}/{name}' to '{base}/{namespace}/{name}'; fully specify the reference to avoid this behavior");
(host.to_string(), namespace, name, version)
}
[namespace, name] => {
let (name, version) = parse_name(name)?;
warn!("expanding '{namespace}/{name}' to '{base}/{namespace}/{name}'; fully specify the reference to avoid this behavior");
(base, namespace.to_string(), name, version)
}

// Some names have multiple segments, e.g. `docker.io/library/ubuntu/foo`.
// We can't handle multi-segment names in other branches since they conflict with the various shorthands,
// but handle them here since they're not ambiguous.
[host, namespace, name @ ..] => {
let name = name.join("/");
let (name, version) = parse_name(&name)?;
(host.to_string(), namespace.to_string(), name, version)
}
_ => {
return eyre!("invalid reference format: {s}")
.with_section(|| {
[
"Provide either a fully qualified OCI reference, or a short form.",
"Short forms are in the format `{name}` or `{namespace}/{name}`.",
"If you provide a short form, the default registry is `docker.io`.",
]
.join("\n")
.header("Help:")
})
.with_section(|| {
["docker.io/library/ubuntu", "library/ubuntu", "ubuntu"]
.join("\n")
.header("Examples:")
})
.pipe(Err)
}
};

if host.is_empty() {
return Err(eyre!("host cannot be empty").with_section(input_section));
}
if repository.is_empty() {
return Err(eyre!("repository cannot be empty").with_section(input_section));
}
ensure!(!host.is_empty(), "host cannot be empty: {s}");
ensure!(!namespace.is_empty(), "namespace cannot be empty: {s}");
ensure!(!name.is_empty(), "name cannot be empty: {s}");

Ok(Reference {
host: host.to_string(),
repository: repository.to_string(),
repository: format!("{namespace}/{name}"),
version,
})
}
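
The order of checks in `parse_name` matters: a digest such as `sha256:123abc` itself contains a `:`, so splitting on `:` first would cut the reference in the wrong place. A minimal standalone sketch of that ordering, using a simplified `Version` enum that stores the digest as a plain string (an assumption for illustration; the change above parses it into a `Digest`):

```rust
/// Simplified stand-in for the library's `Version` type (illustration only).
#[derive(Debug, PartialEq)]
enum Version {
    Tag(String),
    Digest(String),
}

/// Splits `name[:tag]` or `name[@digest]` into the name and its version.
/// `@` is checked before `:` because digests contain a `:` themselves.
fn split_version(name: &str) -> (String, Version) {
    if let Some((name, digest)) = name.split_once('@') {
        (name.to_string(), Version::Digest(digest.to_string()))
    } else if let Some((name, tag)) = name.split_once(':') {
        (name.to_string(), Version::Tag(tag.to_string()))
    } else {
        // No tag or digest: default to the `latest` tag.
        (name.to_string(), Version::Tag("latest".to_string()))
    }
}

fn main() {
    for input in ["ubuntu", "ubuntu:24.04", "ubuntu@sha256:123abc"] {
        println!("{input} -> {:?}", split_version(input));
    }
}
```

Checking `:` first would split `ubuntu@sha256:123abc` into `ubuntu@sha256` and `123abc`, which is why the digest branch comes first.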