feat: add checks to determine if repo and commit came from provenance (#704)

This PR adds two new checks that succeed if the repository URL or commit of the analysis target match those that can be extracted from the provenance, respectively. If the repository or provenance do not exist, or do not contain the needed information, or are not identical, these checks will fail.

Signed-off-by: Ben Selwyn-Smith <[email protected]>
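
As a rough illustration of the pass/fail rule described in the commit message, the sketch below treats each check as a strict equality test that fails whenever either side is missing. The function name `matches_provenance` and the sample values are hypothetical and are not part of Macaron; the actual checks operate on an `AnalyzeContext` rather than bare strings.

```python
from __future__ import annotations


def matches_provenance(target_value: str | None, provenance_value: str | None) -> bool:
    """Return True only if both values are present and identical (illustrative helper)."""
    if not target_value or not provenance_value:
        # Missing repository/commit, or nothing could be extracted from provenance.
        return False
    return target_value == provenance_value


# Hypothetical values, for illustration only.
repo_ok = matches_provenance("https://github.com/org/name", "https://github.com/org/name")
commit_ok = matches_provenance("abc123", None)
print(repo_ok, commit_ok)  # True False
```

Whether the real checks normalize URLs or digests before comparing is not shown in this commit excerpt, so the sketch assumes plain string equality.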
benmss authored May 15, 2024
1 parent 018f0be commit 9c44445
Showing 41 changed files with 3,394 additions and 1,495 deletions.
6 changes: 6 additions & 0 deletions docs/source/index.rst
@@ -82,6 +82,12 @@ the requirements that are currently supported by Macaron.
* - 3
- **Provenance expectation** - Check if the provenance meets an expectation.
- The user can provide an expectation for the provenance as a CUE policy, which will be compared against the SLSA provenance.
* - 3
- **Provenance derived repo** - Check if the analysis target's repository matches the repository in the provenance.
- If there is no provenance, this check will fail.
* - 3
- **Provenance derived commit** - Check if the analysis target's commit matches the commit in the provenance.
- If there is no commit, this check will fail.

----------------------
How does Macaron work?
16 changes: 16 additions & 0 deletions docs/source/pages/developers_guide/apidoc/macaron.repo_finder.rst
@@ -17,6 +17,22 @@ macaron.repo\_finder.commit\_finder module
:undoc-members:
:show-inheritance:

macaron.repo\_finder.provenance\_extractor module
-------------------------------------------------

.. automodule:: macaron.repo_finder.provenance_extractor
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.provenance\_finder module
----------------------------------------------

.. automodule:: macaron.repo_finder.provenance_finder
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.repo\_finder module
----------------------------------------

8 changes: 8 additions & 0 deletions docs/source/pages/developers_guide/apidoc/macaron.rst
@@ -42,6 +42,14 @@ macaron.errors module
:undoc-members:
:show-inheritance:

macaron.json\_tools module
--------------------------

.. automodule:: macaron.json_tools
:members:
:undoc-members:
:show-inheritance:

macaron.util module
-------------------

@@ -65,6 +65,14 @@ macaron.slsa\_analyzer.checks.provenance\_available\_check module
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.checks.provenance\_commit\_check module
--------------------------------------------------------------

.. automodule:: macaron.slsa_analyzer.checks.provenance_commit_check
:members:
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.checks.provenance\_l3\_check module
----------------------------------------------------------

@@ -81,6 +89,14 @@ macaron.slsa\_analyzer.checks.provenance\_l3\_content\_check module
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.checks.provenance\_repo\_check module
------------------------------------------------------------

.. automodule:: macaron.slsa_analyzer.checks.provenance_repo_check
:members:
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.checks.provenance\_witness\_l1\_check module
-------------------------------------------------------------------

@@ -48,3 +48,11 @@ macaron.slsa\_analyzer.git\_service.gitlab module
:members:
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.git\_service.local\_repo\_git\_service module
--------------------------------------------------------------------

.. automodule:: macaron.slsa_analyzer.git_service.local_repo_git_service
:members:
:undoc-members:
:show-inheritance:
6 changes: 6 additions & 0 deletions src/macaron/slsa_analyzer/analyze_context.py
@@ -46,6 +46,10 @@ class ChecksOutputs(TypedDict):
"""The package registries for the target software component."""
provenance: InTotoPayload | None
"""The provenance payload for the target software component."""
provenance_repo_url: str | None
"""The repository URL extracted from provenance, if applicable."""
provenance_commit_digest: str | None
"""The commit digest extracted from provenance, if applicable."""


class AnalyzeContext:
@@ -97,6 +101,8 @@ def __init__(
is_inferred_prov=True,
expectation=None,
provenance=None,
provenance_repo_url=None,
provenance_commit_digest=None,
)

@property
58 changes: 31 additions & 27 deletions src/macaron/slsa_analyzer/analyzer.py
@@ -317,10 +317,22 @@ def run_single(
# Try to find the provenance file for the parsed PURL.
provenance_payload = ProvenanceFinder().find_provenance(parsed_purl)

# Try to extract the repository URL and commit digest from the Provenance, if it exists.
provenance_repo_url = provenance_commit_digest = None
if provenance_payload:
try:
provenance_repo_url, provenance_commit_digest = extract_repo_and_commit_from_provenance(
provenance_payload
)
except ProvenanceError as error:
logger.debug("Failed to extract repo or commit from provenance: %s", error)

# Create the analysis target.
available_domains = [git_service.hostname for git_service in GIT_SERVICES if git_service.hostname]
try:
analysis_target = Analyzer.to_analysis_target(config, available_domains, parsed_purl, provenance_payload)
analysis_target = Analyzer.to_analysis_target(
config, available_domains, parsed_purl, provenance_repo_url, provenance_commit_digest
)
except InvalidAnalysisTargetError as error:
return Record(
record_id=repo_id,
@@ -330,7 +342,6 @@ def run_single(
)

# Create the component.
component = None
try:
component = self.add_component(
analysis,
@@ -368,6 +379,8 @@ def run_single(
analyze_ctx.dynamic_data["provenance"] = provenance_payload
if provenance_payload:
analyze_ctx.dynamic_data["is_inferred_prov"] = False
analyze_ctx.dynamic_data["provenance_repo_url"] = provenance_repo_url
analyze_ctx.dynamic_data["provenance_commit_digest"] = provenance_commit_digest
analyze_ctx.check_results = self.perform_checks(analyze_ctx)

return Record(
@@ -506,6 +519,8 @@ def add_component(
The target of this analysis.
existing_records : dict[str, Record] | None
The mapping of existing records that the analysis has run successfully.
provenance_payload: InTotoPayload | None
The provenance intoto payload for the analyzed software component.
Returns
-------
@@ -621,7 +636,8 @@ def to_analysis_target(
config: Configuration,
available_domains: list[str],
parsed_purl: PackageURL | None,
provenance_payload: InTotoPayload | None = None,
provenance_repo_url: str | None = None,
provenance_commit_digest: str | None = None,
) -> AnalysisTarget:
"""Resolve the details of a software component from user input.
@@ -634,8 +650,10 @@ def to_analysis_target(
of the corresponding software component.
parsed_purl: PackageURL | None
The PURL to use for the analysis target, or None if one has not been provided.
provenance_payload : InToToPayload | None
The provenance in-toto payload for the software component.
provenance_repo_url: str | None
The repository URL extracted from provenance, or None if not found or no provenance.
provenance_commit_digest: str | None
The commit extracted from provenance, or None if not found or no provenance.
Returns
-------
@@ -662,24 +680,17 @@ def to_analysis_target(
# Note that we can't always extract the repository path from any provided PURL.
converted_repo_path = None
repo: str | None = None
digest: str | None = None
# parsed_purl cannot be None here, but mypy cannot detect that without some extra help.
if parsed_purl is not None:
if provenance_payload:
# Try to find repository and commit via provenance.
try:
repo, digest = extract_repo_and_commit_from_provenance(provenance_payload)
except ProvenanceError as error:
logger.debug("Failed to extract repo or commit from provenance: %s", error)

if provenance_repo_url or provenance_commit_digest:
return Analyzer.AnalysisTarget(
parsed_purl=parsed_purl,
repo_path=repo or "",
repo_path=provenance_repo_url or "",
branch="",
digest=digest or "",
digest=provenance_commit_digest or "",
)

# As there is no provenance, use the Repo Finder to find the repo.
# As there is no repo or commit from provenance, use the Repo Finder to find the repo.
converted_repo_path = repo_finder.to_repo_path(parsed_purl, available_domains)
if converted_repo_path is None:
# Try to find repo from PURL
@@ -693,8 +704,8 @@ def to_analysis_target(
)

case (_, _) | (None, _):
# 1. If only the repository path is provided, we will use the user-provided repository path to create the
# ``Repository`` instance. Note that if this case happen, the software component will be initialized
# 1. If only the repository path is provided, we will use the user-provided repository path to create
# the ``Repository`` instance. Note that if this case happens, the software component will be initialized
# with the PURL generated from the ``Repository`` instance (i.e. as a PURL pointing to a git repository
# at a specific commit). For example: ``pkg:github.com/org/name@<commit_digest>``.
# 2. If both the PURL and the repository are provided, we will use the user-provided repository path to
@@ -710,22 +721,15 @@ def to_analysis_target(
digest=input_digest,
)

prov_digest = None
if provenance_payload:
try:
_, prov_digest = extract_repo_and_commit_from_provenance(provenance_payload)
except ProvenanceError as error:
logger.debug("Failed to extract commit from provenance: %s", error)

return Analyzer.AnalysisTarget(
parsed_purl=parsed_purl,
repo_path=repo_path_input,
branch=input_branch,
digest=prov_digest or "",
digest=provenance_commit_digest or "",
)

case _:
# Even though this case is unecessary, it is still put here because mypy cannot type-narrow tuples
# Even though this case is unnecessary, it is still put here because mypy cannot type-narrow tuples
# correctly (see https://github.com/python/mypy/pull/16905, which was fixed, but not released).
raise InvalidAnalysisTargetError(
"Cannot determine the analysis target: PURL and repository path are missing."
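
The precedence that the `to_analysis_target` hunks above appear to implement for the PURL-only case can be sketched as follows. This is a simplified stand-in under stated assumptions: `Target`, `resolve_target`, and the `find_repo` callback are hypothetical names, and the real method also handles branches, user-supplied repository paths, and error cases that are omitted here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Target:
    """Hypothetical, reduced version of an analysis target."""

    repo_path: str
    digest: str


def resolve_target(
    provenance_repo_url: str | None,
    provenance_commit_digest: str | None,
    find_repo: Callable[[], str | None],
) -> Target:
    # Prefer anything extracted from provenance; a missing half becomes "".
    if provenance_repo_url or provenance_commit_digest:
        return Target(provenance_repo_url or "", provenance_commit_digest or "")
    # Nothing usable from provenance: fall back to a lookup
    # (a stand-in for the Repo Finder in the diff above).
    return Target(find_repo() or "", "")


# Hypothetical inputs, for illustration only.
from_provenance = resolve_target("https://github.com/org/name", None, lambda: "unused")
from_finder = resolve_target(None, None, lambda: "https://github.com/org/other")
```

Extracting the repository URL and commit once in `run_single` and passing the two strings down means the same values can later be stored in `dynamic_data` for the new checks, instead of re-parsing the provenance payload in several places.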
88 changes: 88 additions & 0 deletions src/macaron/slsa_analyzer/checks/provenance_commit_check.py
@@ -0,0 +1,88 @@
# Copyright (c) 2024 - 2024, Oracle and/or its affiliates. All rights reserved.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

"""This module adds a check that determines whether the commit came from provenance."""
import logging

from sqlalchemy import ForeignKey, String
from sqlalchemy.orm import Mapped, mapped_column

from macaron.database.table_definitions import CheckFacts
from macaron.slsa_analyzer.analyze_context import AnalyzeContext
from macaron.slsa_analyzer.checks.base_check import BaseCheck
from macaron.slsa_analyzer.checks.check_result import CheckResultData, CheckResultType, Confidence, JustificationType
from macaron.slsa_analyzer.registry import registry
from macaron.slsa_analyzer.slsa_req import ReqName

logger: logging.Logger = logging.getLogger(__name__)


class ProvenanceDerivedCommitFacts(CheckFacts):
    """The ORM mapping for justifications in the commit from provenance check."""

    __tablename__ = "_provenance_derived_commit_check"

    #: The primary key.
    id: Mapped[int] = mapped_column(ForeignKey("_check_facts.id"), primary_key=True)  # noqa: A003

    #: The state of the commit.
    commit_info: Mapped[str] = mapped_column(String, nullable=True, info={"justification": JustificationType.TEXT})

    __mapper_args__ = {
        "polymorphic_identity": __tablename__,
    }


class ProvenanceDerivedCommitCheck(BaseCheck):
    """This check compares the commit extracted from the provenance to the commit in the context."""

    def __init__(self) -> None:
        """Initialize instance."""
        check_id = "mcn_provenance_derived_commit_1"
        description = "Check whether the commit came from provenance."
        depends_on: list[tuple[str, CheckResultType]] = []
        eval_reqs = [ReqName.EXPECTATION]
        super().__init__(
            check_id=check_id,
            description=description,
            depends_on=depends_on,
            eval_reqs=eval_reqs,
            result_on_skip=CheckResultType.FAILED,
        )

    def run_check(self, ctx: AnalyzeContext) -> CheckResultData:
        """Implement the check in this method.

        Parameters
        ----------
        ctx : AnalyzeContext
            The object containing processed data for the target repo.

        Returns
        -------
        CheckResultData
            The result of the check.
        """
        # Only attempt the comparison if a commit digest was extracted from provenance.
        if ctx.dynamic_data["provenance_commit_digest"]:
            # Without a repository there is no commit to compare against.
            if not ctx.component.repository:
                return CheckResultData(
                    result_tables=[],
                    result_type=CheckResultType.FAILED,
                )

            current_commit = ctx.component.repository.commit_sha

            if current_commit == ctx.dynamic_data["provenance_commit_digest"]:
                return CheckResultData(
                    result_tables=[
                        ProvenanceDerivedCommitFacts(
                            commit_info="The commit digest was found from provenance.", confidence=Confidence.HIGH
                        )
                    ],
                    result_type=CheckResultType.PASSED,
                )

        # No digest in provenance, or the digests differ.
        return CheckResultData(result_tables=[], result_type=CheckResultType.FAILED)


registry.register(ProvenanceDerivedCommitCheck())
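
To make the decision paths of `run_check` easier to follow, here is a minimal sketch that mirrors its control flow with lightweight stubs instead of Macaron's classes. `commit_check_outcome` and `make_ctx` are hypothetical helpers, and the string results stand in for `CheckResultType.PASSED` and `CheckResultType.FAILED`.

```python
from __future__ import annotations

from types import SimpleNamespace


def commit_check_outcome(ctx) -> str:
    """Mirror the branching of run_check using plain strings as results."""
    digest = ctx.dynamic_data["provenance_commit_digest"]
    if digest:
        if not ctx.component.repository:
            return "FAILED"  # digest known, but no repository to compare with
        if ctx.component.repository.commit_sha == digest:
            return "PASSED"  # analysed commit equals the provenance commit
    return "FAILED"  # no digest in provenance, or a mismatch


def make_ctx(digest: str | None, commit_sha: str | None) -> SimpleNamespace:
    """Build a stub context; a commit_sha of None means 'no repository'."""
    repository = SimpleNamespace(commit_sha=commit_sha) if commit_sha else None
    return SimpleNamespace(
        dynamic_data={"provenance_commit_digest": digest},
        component=SimpleNamespace(repository=repository),
    )


# Hypothetical digests, for illustration only.
outcomes = {
    "match": commit_check_outcome(make_ctx("abc123", "abc123")),
    "mismatch": commit_check_outcome(make_ctx("abc123", "def456")),
    "no_repository": commit_check_outcome(make_ctx("abc123", None)),
    "no_provenance_digest": commit_check_outcome(make_ctx(None, "abc123")),
}
```

Only the first case passes; every other combination falls through to a failed result, which matches the behaviour described in the commit message.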