Kickstart the schemas revamp (WIP) #26

Draft
wants to merge 19 commits into
base: main
from
9 changes: 9 additions & 0 deletions conda_models/_base.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,20 @@
from pydantic import BaseModel, Extra
from pydantic.main import ModelMetaclass


class ExtrasForbiddenModel(BaseModel):
    class Config:
        extra = Extra.forbid


class AllOptional(ModelMetaclass):
    def __new__(mcls, name, bases, namespaces, **kwargs):
        cls = super().__new__(mcls, name, bases, namespaces, **kwargs)
        for field in cls.__fields__.values():
            field.required = False
        return cls


def export_to_json(model, path):
    with open(path, "w") as f:
        f.write(model.json(indent=2))
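
To make the effect of the `AllOptional` metaclass concrete, here is a minimal pydantic-free sketch of the same idea. It is an analogue, not the PR's code: `AllOptionalSketch`, `Record` and `PartialRecord` are hypothetical names, and "optional" is modeled as "defaults to `None`" instead of pydantic's `field.required = False`.

```python
import typing


class AllOptionalSketch(type):
    """Plain-Python stand-in: every annotated field without a default gets None."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        # get_type_hints() also collects annotations inherited from base classes.
        for field_name in typing.get_type_hints(cls):
            if not hasattr(cls, field_name):
                setattr(cls, field_name, None)
        return cls


class Record:
    # Hypothetical record with two "required" fields (no defaults).
    name: str
    version: str


class PartialRecord(Record, metaclass=AllOptionalSketch):
    # Same fields as Record, but each one now falls back to None.
    pass
```

The base class is left untouched, so the strict and the all-optional variants can coexist. That is the property the diff relies on when it derives `AllOptionalRepodataRecord` from `PackageRecord`.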
2 changes: 1 addition & 1 deletion conda_models/condarc.py
Expand Up @@ -721,4 +721,4 @@ class Condarc(ExtrasForbiddenModel):
higher this number, the longer the generation of the unsat hint will
take. Defaults to 3.
""",
)
)
1 change: 1 addition & 0 deletions conda_models/match_spec.py
Expand Up @@ -80,6 +80,7 @@ class MatchSpec(ExtrasForbiddenModel):
"""
TODO: In theory, any PackageRecord (scalar) keys should be admitted here.
"""

name: Optional[Union[PackageNameStr, Literal["*"]]] = None
"The name of the package"
version: Optional[VersionSpecStr] = None
23 changes: 18 additions & 5 deletions conda_models/repodata.py
Expand Up @@ -3,9 +3,15 @@
"""
from pydantic import AnyUrl, Field

from ._base import ExtrasForbiddenModel
from ._base import AllOptional, ExtrasForbiddenModel
from .package_record import PackageRecord
from .types import NonEmptyStr, Subdir
from .types import (
    CondaPackageFileNameStr,
    NonEmptyStr,
    PackageFileNameStr,
    Subdir,
    TarBz2PackageFileNameStr,
)


class RepodataRecord(PackageRecord):
Expand All @@ -26,6 +32,10 @@ class RepodataRecord(PackageRecord):
"""


class AllOptionalRepodataRecord(PackageRecord, metaclass=AllOptional):
    pass


class ChannelInfo(ExtrasForbiddenModel):
    """ """

Expand All @@ -37,11 +47,14 @@ class Repodata(ExtrasForbiddenModel):

    info: ChannelInfo
    "Information about the repodata"
    packages: dict
    packages: dict[TarBz2PackageFileNameStr, AllOptionalRepodataRecord]
    "The .tar.bz2 packages in the repodata"
    packages_conda: dict = Field(..., alias="packages.conda")
    packages_conda: dict[CondaPackageFileNameStr, AllOptionalRepodataRecord] = Field(
        ...,
        alias="packages.conda",
    )
    "The .conda packages in the repodata"
    removed: set[str]
    removed: set[PackageFileNameStr]
    "The packages that have been removed from the repodata"
    version: int = Field(..., alias="repodata_version")
    "The version of the repodata"
21 changes: 21 additions & 0 deletions conda_models/repodata_patch.py
@@ -1,3 +1,24 @@
"""
WIP
"""
from pydantic import Field

from ._base import ExtrasForbiddenModel
from .repodata import AllOptionalRepodataRecord
from .types import CondaPackageFileNameStr, PackageFileNameStr, TarBz2PackageFileNameStr


class RepodataPatchInstructions(ExtrasForbiddenModel):
    packages: dict[TarBz2PackageFileNameStr, AllOptionalRepodataRecord]
    "The .tar.bz2 packages in the repodata that will be patched"
    packages_conda: dict[CondaPackageFileNameStr, AllOptionalRepodataRecord] = Field(

Note that it doesn't make sense to patch every field of a RepodataRecord. Changing the name of a package doesn't make sense, and changing the hash might break cache implementations. In Rattler we explicitly limit the fields that can be patched for that reason. We also didn't encounter fields not covered by our implementation. Might be worth considering!
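
A prescriptive check along these lines could be sketched as follows. This is only an illustration: `PATCHABLE_FIELDS` and `rejected_fields` are hypothetical names, and the allowlist is a guess for demonstration purposes, not Rattler's actual list of patchable fields.

```python
# Illustrative allowlist only; NOT Rattler's actual list of patchable fields.
PATCHABLE_FIELDS = frozenset(
    {"depends", "constrains", "license", "license_family", "track_features", "features"}
)


def rejected_fields(record_patch):
    """Return the keys of a per-package patch that fall outside the allowlist."""
    return sorted(set(record_patch) - PATCHABLE_FIELDS)


# A patch that only touches license metadata passes the check.
ok_patch = {"license_family": "GPL"}
# A patch that renames the package or rewrites its hash is flagged.
bad_patch = {"name": "other-name", "sha256": "0" * 64, "depends": []}

ok_result = rejected_fields(ok_patch)
bad_result = rejected_fields(bad_patch)
```

Reporting the offending keys, instead of silently dropping them, keeps the check usable in code review, which is where the thread says such cases are currently caught.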

Author

Agreed that it should be prevented and it's A Bad Idea (tm).

So while the file format seems to accept everything, there are some guidelines in place for "reasonable" use and it looks like this is mostly addressed in code reviews, but sometimes we have exceptions. 🤷

I've been thinking whether we want to get descriptive or prescriptive in this PR, and I have mixed feelings about it. For scalars, I've been trying to be more prescriptive, but with structs... just reflecting what (I think) is in use these days. I am not saying I won't change my mind about it, just writing down what the approach has been so far :D

Thanks for the feedback!

Also, are we sure we have a packages.conda field? Maybe the readme is outdated, and the seed dict is too?

> I've been thinking whether we want to get descriptive or prescriptive in this PR, and I have mixed feelings about it. For scalars, I've been trying to be more prescriptive, but with structs... just reflecting what (I think) is in use these days. I am not saying I won't change my mind about it, just writing down what the approach has been so far :D

I understand where you're coming from. With rattler we definitely chose the prescriptive approach, because I feel like there is way too much "legacy" code and supporting all those exceptions is simply harder. But the same logic does not necessarily hold for this project.

The packages.conda field is definitely used by the conda-forge-repodata-patches package. The readme does indeed seem to be outdated. Here is a small excerpt from the latest linux-64 patches:

 "packages.conda": {
    "4ti2-1.6.9-hbc9de56_1.conda": {
      "license_family": "GPL"
    },
    "acctaudem-1.0.1-h69042ef_3.conda": {
      "license_family": "GPL"
    },
    "actions-runner-2.299.1-he0ac6c6_0.conda": {
      "license_family": "MIT"
    },
    "actions-runner-2.300.0-h0cdce71_0.conda": {
      "license_family": "MIT"
    },
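
To show how an excerpt like this would act on a repodata document, here is a simplified sketch. It assumes each per-package entry is merged key by key into the matching record and that `remove` entries are moved into `removed`. The function name and the sample record values are hypothetical, and the actual indexing tools may differ in details.

```python
import copy

SECTIONS = ("packages", "packages.conda")


def apply_patch_instructions(repodata, instructions):
    """Return a patched copy of `repodata`; the input is left untouched."""
    patched = copy.deepcopy(repodata)
    for section in SECTIONS:
        records = patched.get(section, {})
        for filename, changes in instructions.get(section, {}).items():
            # Only patch records that exist in this repodata.
            if filename in records:
                records[filename].update(changes)
    for filename in instructions.get("remove", []):
        for section in SECTIONS:
            if patched.get(section, {}).pop(filename, None) is not None:
                patched.setdefault("removed", []).append(filename)
    return patched


# Hypothetical repodata record; only the file name comes from the excerpt.
repodata = {
    "packages": {},
    "packages.conda": {
        "4ti2-1.6.9-hbc9de56_1.conda": {"name": "4ti2", "license_family": "OTHER"}
    },
    "removed": [],
}
instructions = {
    "packages.conda": {"4ti2-1.6.9-hbc9de56_1.conda": {"license_family": "GPL"}},
    "remove": [],
}
patched = apply_patch_instructions(repodata, instructions)
```

Because every patch entry is a partial record, a schema for patch instructions needs all record fields to be optional, which is what `AllOptionalRepodataRecord` provides in the diff.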

@morremeyer morremeyer Sep 27, 2023

Looking at the current state, I think we should be descriptive in the initial implementation: The first step in this effort is to capture what is used currently so that people and projects can refer to the schema to ensure that their software is implemented correctly.

In further steps, we could get more and more prescriptive.

This brings me to a second thing that we'll need to have: versioning. As with everything, at some point, the schema will change.

To enable implementations to reference "the schema" they're using, we'll need to version the schemas. If we don't, we effectively can't change them without breaking something (be that software or user trust).

Edit: Is the repodata_version the version of the schema?
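
One way a consumer could act on a schema version is to refuse documents it does not know how to read. The sketch below uses the `patch_instructions_version` field from the diff. The set of supported versions and the function name are assumptions made for illustration, and the sketch does not answer whether `repodata_version` plays the same role.

```python
# Assumption: this consumer only understands version 1 of the patch format.
SUPPORTED_PATCH_INSTRUCTIONS_VERSIONS = frozenset({1})


def require_supported_version(instructions):
    """Return the declared version, or raise if this consumer cannot handle it."""
    version = instructions.get("patch_instructions_version")
    if version not in SUPPORTED_PATCH_INSTRUCTIONS_VERSIONS:
        raise ValueError(f"unsupported patch_instructions_version: {version!r}")
    return version


declared = require_supported_version({"patch_instructions_version": 1})
```

Failing early on an unknown version is what lets the schema change later without silently breaking older readers.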

        ...,
        alias="packages.conda",
    )
    "The .conda packages in the repodata that will be patched"
    remove: set[PackageFileNameStr]
    "The packages (conda or tar.bz2) that should be removed from the index"
    revoke: set[str]
    "Unclear"
    patch_instructions_version: int
    "Version of the patch instructions schema"
8 changes: 8 additions & 0 deletions conda_models/types.py
Expand Up @@ -83,6 +83,14 @@ class Platform(str, Enum):
    min_length=1,
    regex=rf"({package_name_regex})-({version_regex})-({build_string_regex})\.(conda|tar\.bz2)",
)
TarBz2PackageFileNameStr = constr(
    min_length=1,
    regex=rf"({package_name_regex})-({version_regex})-({build_string_regex})\.tar\.bz2",
)
CondaPackageFileNameStr = constr(
    min_length=1,
    regex=rf"({package_name_regex})-({version_regex})-({build_string_regex})\.conda",
)
NameVersionBuildMatchSpecStr = constr(
    min_length=1,
    regex=rf"({package_name_regex})\s+("
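
The difference between the three file-name types can be sketched with the standard `re` module. The component patterns below are hypothetical, simplified stand-ins, because the PR's `package_name_regex`, `version_regex` and `build_string_regex` are not shown in this hunk. The sketch also uses `fullmatch`, which may be stricter than how the constrained types apply their pattern.

```python
import re

# Hypothetical, simplified stand-ins for the PR's component regexes.
NAME = r"[a-z0-9_.\-]+"
VERSION = r"[A-Za-z0-9_.!+]+"
BUILD = r"[A-Za-z0-9_+]+"
STEM = rf"({NAME})-({VERSION})-({BUILD})"

ANY_PACKAGE = re.compile(rf"{STEM}\.(conda|tar\.bz2)")
TAR_BZ2_PACKAGE = re.compile(rf"{STEM}\.tar\.bz2")
CONDA_PACKAGE = re.compile(rf"{STEM}\.conda")

conda_name = "4ti2-1.6.9-hbc9de56_1.conda"  # file name from the patch excerpt
tar_name = "4ti2-1.6.9-hbc9de56_1.tar.bz2"  # hypothetical .tar.bz2 counterpart

# The generic pattern accepts both extensions; the specific ones accept one each.
both_ok = bool(ANY_PACKAGE.fullmatch(conda_name)) and bool(ANY_PACKAGE.fullmatch(tar_name))
conda_parts = CONDA_PACKAGE.fullmatch(conda_name).groups()
cross_match = TAR_BZ2_PACKAGE.fullmatch(conda_name)
```

Splitting the generic type in two lets `packages` and `packages.conda` each reject keys that carry the wrong extension, while `removed` and `remove` keep accepting either one.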