AI Platform Engineering & MLOps · Part XVI of 34

SBOM and signing for ML

Extending supply-chain security from container images to model artifacts — SLSA levels, Cosign keyless signing, and CycloneDX ML-BOM, with an end-to-end example from registry promotion to admission gate.


[Figure: the model pipeline as five stages (Training, Registry, Sign, Admission, Serving), with the trust boundaries marked.]

Container images have a mature supply-chain story. Build provenance, cryptographic signatures, software bills of materials (SBOMs), and admission-time verification are well understood, widely tooled, and increasingly mandated by regulation. Model artifacts — the SafeTensors shards, ONNX graphs, and fine-tuned checkpoints that flow from a training job into a serving runtime — are the unsolved half of the same problem.

A model artifact that travels from a training cluster through a model registry and into a serving deployment crosses at least three trust boundaries — each one a place where an unverified or tampered artifact can enter. This article maps the container supply-chain toolkit (SLSA, Sigstore/Cosign, SBOM standards) onto the model lifecycle, names what each tool enforces, works through a concrete end-to-end example, and closes with the gap that remains: the data side of provenance.

If you are coming from the previous article on governance, lineage, and model cards, this article picks up at the moment of registry promotion — when a model transitions from experimental to production-eligible — and focuses on the cryptographic machinery that makes that transition auditable and enforceable.

SLSA levels mapped to the model lifecycle

SLSA (Supply-chain Levels for Software Artifacts) is an end-to-end framework, maintained under the OpenSSF, that defines graded provenance guarantees for software build pipelines [1]. SLSA v1.0 (released April 2023) organises the build track into three levels:

L1 — Provenance exists. The build emits a provenance document recording its inputs, environment, and output digest. The document may be incomplete and is not required to be signed.
L2 — Signed provenance from a hosted platform. The build runs on a hosted service (e.g. a CI platform), which itself generates and digitally signs the provenance. Because the control plane signs — not the tenant — forging provenance requires compromising the build platform itself. This is the non-repudiation threshold.
L3 — Hardened, isolated build environment. The build runner is isolated to prevent cross-build contamination, and signing keys are cryptographically inaccessible to user-defined processes. Provenance is unforgeable even under a compromised build step.

The framework was designed for software packages, but the mapping to a model lifecycle is direct: treat the training run as the build, and the model artifact as the package.

SLSA levels for model artifacts

L1 equivalent: The training job records dataset version, code commit SHA, hyperparameters, and the output artifact digest as run metadata in the model registry. Provenance exists; anyone can read it; no signature.
L2 equivalent: The training platform signs the provenance attestation under its own workload identity (keyless, via OIDC). Registry promotion to a production-eligible stage is gated on the presence of a valid signature. Unsigned models cannot be promoted.
L3 equivalent: Training runs in an isolated job runner with a non-forgeable workload identity. Provenance cannot be backdated or replayed from a different run. The signing key material is held by the training platform, not visible to the training code.
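The mapping above can be sketched as a small classifier plus a promotion gate. This is a minimal illustration, not part of SLSA itself: the `ProvenanceRecord` fields and the function names are hypothetical, and a real registry would derive these flags from verified attestations rather than trust booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical summary of what the registry knows about one model version."""
    has_provenance: bool      # dataset version, commit SHA, and digest are recorded
    signed_by_platform: bool  # attestation is signed under the platform identity
    isolated_runner: bool     # training code cannot reach the signing key material


def slsa_equivalent_level(rec: ProvenanceRecord) -> int:
    """Return the highest level (0 to 3) whose requirements are all met."""
    if not rec.has_provenance:
        return 0
    if not rec.signed_by_platform:
        return 1
    if not rec.isolated_runner:
        return 2
    return 3


def can_promote(rec: ProvenanceRecord, required_level: int = 2) -> bool:
    """Promotion gate: refuse anything below the required level."""
    return slsa_equivalent_level(rec) >= required_level
```

Each level is cumulative in this sketch, so an isolated runner without a platform signature still counts as L1 only.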

For most teams, L2 is the realistic near-term target. It closes a commonly exploited gap, an artifact whose claimed provenance cannot be verified, without requiring the hardened-runner infrastructure of L3. The XZ Utils backdoor (March 2024) shows what that gap looks like in practice: the malicious build script shipped in the release tarballs but not in the public repository, and nothing cryptographically tied the distributed artifact to a verifiable build of the reviewed source.

The signing primitive: Cosign and keyless signing

Cosign is the primary signing and verification tool in the Sigstore project [2]. It signs OCI images and arbitrary blobs (binaries, SBOMs, model manifest files), stores signatures alongside the artifact in an OCI registry, and supports in-toto attestations — including SLSA provenance predicates. The Sigstore ecosystem has three interdependent components:

Fulcio (Certificate Authority): Issues short-lived signing certificates (approximately 10 minutes) after verifying the requester’s identity via OpenID Connect (OIDC). Because certificates are short-lived, there is no certificate revocation list (CRL) to manage — the certificate expires before it can be meaningfully abused.
Rekor (Transparency Log): An immutable, append-only ledger that records every signing event with its artifact digest, signature, and signing certificate. Each entry has a trusted timestamp and Merkle-tree inclusion proof. Rekor v2 (GA October 2025) uses Trillian-Tessera as its backend.
Cosign: The CLI and library that orchestrates the signing transaction: authenticates via OIDC, requests a certificate from Fulcio, signs the artifact, submits the signing event to Rekor, and stores the resulting bundle (certificate + signature + log entry) in the OCI registry or as a sidecar file.
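Because the Fulcio certificate has usually expired by the time anyone verifies, a verifier cannot compare it against the current clock. It checks instead that the timestamp recorded in the transparency log falls inside the certificate's validity window. A minimal sketch of that one check, with a hypothetical function name and plain datetimes standing in for parsed certificate fields and log entries:

```python
from datetime import datetime, timedelta, timezone


def signed_within_validity(not_before: datetime,
                           not_after: datetime,
                           log_time: datetime) -> bool:
    """True if the logged signing time lies inside the certificate window."""
    return not_before <= log_time <= not_after


# Illustrative values: a certificate valid for ten minutes.
issued_at = datetime(2026, 6, 9, 8, 0, tzinfo=timezone.utc)
expires_at = issued_at + timedelta(minutes=10)

in_window = signed_within_validity(
    issued_at, expires_at, issued_at + timedelta(minutes=3))
too_late = signed_within_validity(
    issued_at, expires_at, issued_at + timedelta(minutes=15))
```

A real verifier also validates the certificate chain, the signature, and the log inclusion proof; this shows only why the trusted timestamp matters.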

Keyless signing is the preferred mode for CI pipelines. The CI workload authenticates with its own OIDC token (a Kubernetes service account, a GitHub Actions workflow identity, or equivalent), and the certificate issued by Fulcio is bound to that identity. At verification time, the policy declares the expected issuer and subject — the verifier is checking not just that a signature exists, but that the artifact was signed by this specific pipeline identity.

For environments without public-internet egress, keyless signing requires either a self-hosted Sigstore stack (the sigstore/scaffolding project provides Helm charts for this) or fallback to key-based signing with a key stored in a managed secrets service. The self-hosted path is worth the operational investment in regulated or air-gapped environments; the key-based path reintroduces key-rotation obligations.

End-to-end example: sign at promotion, verify at admission

The following flow covers a model from training completion through registry promotion to a serving cluster, with signing, attestation storage, and admission-gate verification at each step.

Step 1 — Training run emits provenance (SLSA L1)

The training job records its inputs as run metadata in the model registry: dataset version URI, code commit SHA, hyperparameters, and the SHA-256 digest of the output artifact. This satisfies L1 — provenance exists — but the document is unsigned and trust-level is low.

register_model.py

# Illustrative: adapt to your registry client and schema
import os

# `registry` stands for your model-registry client object; it is not defined here.
registry.log_run(
    name="sentiment-classifier",
    params={
        "dataset_version": "s3://datasets/sentiment/v14",
        "base_model": "bert-base-uncased",
        "git_sha": os.environ["GIT_SHA"],  # raises KeyError if CI did not set it
    },
    metrics={"eval/f1": 0.912, "eval/accuracy": 0.924},
    artifact_digest="sha256:4a7f8b...",
)
# State at this point: registered, unsigned, experimental
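The digest passed as artifact_digest has to be computed from the bytes the job actually wrote. A minimal sketch, assuming a single-file artifact; the function name is hypothetical, and the file is read in chunks so that multi-gigabyte weights do not need to fit in memory:

```python
import hashlib


def artifact_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return it in 'sha256:<hex>' form."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```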

Step 2 — Sign the artifact at promotion (SLSA L2)

Signing belongs at promotion, not at registration. Any experiment can be registered; only a model that has passed the curation and evaluation gate is worth signing. The signature is the machine-readable expression of that approval.

promote.sh

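#!/usr/bin/env bash
# Illustrative: run by the CI promotion job after curation gate approval
set -euo pipefail

IMAGE="registry.internal/models/sentiment-classifier:v2.1"
TOKEN="$(cat /var/run/secrets/ci-oidc-token)"

# Sign the OCI artifact (if model weights are OCI-wrapped).
# Prefer a digest reference (name@sha256:...) over a tag where your tooling allows it.
cosign sign --yes \
  --identity-token="$TOKEN" \
  "$IMAGE"

# Attach a SLSA v1 provenance attestation, matching the predicate type checked at admission.
# Cosign v2 names this type slsaprovenance1; confirm with `cosign attest --help`.
cosign attest --yes \
  --identity-token="$TOKEN" \
  --predicate slsa-provenance.json \
  --type slsaprovenance1 \
  "$IMAGE"

# Update registry state: mark as staging. These are calls on the Python registry
# client from register_model.py, not shell commands; run them from the promotion job:
#   registry.transition_stage("sentiment-classifier", "v2.1", stage="staging")
#   registry.set_tag("sentiment-classifier", "v2.1", "cosign.signed_at", "2026-06-09T08:00:00Z")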

The Cosign bundle — containing the certificate, signature, and Rekor log entry — is stored in the OCI registry alongside the artifact. For non-OCI-wrapped artifacts (plain object-store blobs), Cosign can sign the artifact manifest file as a local blob and store the bundle as a sidecar in the same object-store prefix. The key constraint: the bundle must be stored in a write-once or immutable location. If the artifact can be overwritten, the bundle can be overwritten with it.
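For a multi-shard artifact in object storage, the manifest that gets signed can be as simple as a sorted list of file paths and digests. The sketch below is one possible layout, not a Cosign format: the function names and the JSON schema are hypothetical, and the output is what a blob-signing step (for example cosign sign-blob) would then sign.

```python
import hashlib
import json
from pathlib import Path


def _sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream one file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(artifact_dir: Path) -> str:
    """List every file under artifact_dir with its digest, in a stable order."""
    files = sorted(p for p in artifact_dir.rglob("*") if p.is_file())
    entries = [
        {
            "path": p.relative_to(artifact_dir).as_posix(),
            "sha256": _sha256_file(p),
            "size": p.stat().st_size,
        }
        for p in files
    ]
    # Sorted keys and sorted paths keep the manifest byte-identical across
    # runs, so its own digest is reproducible.
    return json.dumps({"files": entries}, indent=2, sort_keys=True)
```

Signing one manifest rather than each shard means a single bundle covers the whole artifact, and changing any shard changes the manifest digest.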

Step 3 — Verify at admission (enforcement gate)

Verification belongs in the serving cluster, at admission. A Kubernetes admission controller (Kyverno, or an alternative such as OPA/Gatekeeper) intercepts every Pod creation request after authentication and authorization but before persistence. The policy below uses Kyverno's verifyImages rule in a ClusterPolicy to verify the Cosign signature and attestation against the configured identity and issuer; newer Kyverno releases offer a dedicated ImageValidatingPolicy type for the same checks [4].

verify-model-policy.yaml

# Illustrative Kyverno ClusterPolicy using a verifyImages rule
# Adapt issuer, subject, and registry pattern to your environment
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-model-serving-image
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-cosign-signature-and-provenance
      match:
        any:
          - resources:
              kinds: [Pod]
              selector:
                matchExpressions:
                  - key: serving.platform/inferenceservice
                    operator: Exists
      verifyImages:
        - imageReferences:
            - "registry.internal/models/*"
          attestors:
            - entries:
                - keyless:
                    issuer: "https://ci.internal/oidc"
                    subject: "training-platform@ci.internal"
                    rekor:
                      url: "https://rekor.internal"
          attestations:
            - type: "https://slsa.dev/provenance/v1"
              attestors:
                - entries:
                    - keyless:
                        issuer: "https://ci.internal/oidc"
                        subject: "training-platform@ci.internal"

If the signature check fails — wrong identity, missing signature, tampered artifact — the admission webhook returns a rejection and the pod is never scheduled. The previously verified model version continues serving. The gate adds seconds to the rollout path; a compromised artifact costs hours of incident response.

Pin both the OIDC issuer and the subject in your admission policy. A policy that specifies only that a signature exists — without constraining whose identity signed — verifies the signature format but not the signer. A policy with a wildcard subject accepts signatures from any Sigstore identity, which defeats the L2 non-repudiation guarantee.
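The difference between a pinned and a wildcard subject is easy to see in a simplified model of the identity check. This is a sketch of the idea only, not Kyverno's matching logic: the class names are hypothetical, the glob matching is an assumption, and scratch-notebook@ci.internal is an invented identity used to represent an unrelated workload.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class SignerIdentity:
    """Identity bound into the signing certificate."""
    issuer: str
    subject: str


@dataclass(frozen=True)
class IdentityPolicy:
    """Expected issuer plus a subject pattern (glob syntax in this sketch)."""
    issuer: str
    subject_pattern: str

    def accepts(self, signer: SignerIdentity) -> bool:
        return (signer.issuer == self.issuer
                and fnmatchcase(signer.subject, self.subject_pattern))


ISSUER = "https://ci.internal/oidc"
pinned = IdentityPolicy(ISSUER, "training-platform@ci.internal")
wildcard = IdentityPolicy(ISSUER, "*")

trainer = SignerIdentity(ISSUER, "training-platform@ci.internal")
other = SignerIdentity(ISSUER, "scratch-notebook@ci.internal")  # hypothetical

pinned_results = (pinned.accepts(trainer), pinned.accepts(other))
wildcard_results = (wildcard.accepts(trainer), wildcard.accepts(other))
```

Under the pinned policy only the training platform passes; under the wildcard policy every workload that can obtain a token from the issuer passes, with a cryptographically valid signature in both cases.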

[Interactive component: Sign-and-Verify Pipeline. It steps the artifact (sha256:4a7f8b…) through seven stages: 1 Train, 2 Register (L1), 3 Curate, 4 Sign (L2), 5 Store, 6 Verify, 7 Serve. A toggle overwrites the artifact after signing so that the admission gate can be seen rejecting the rollout.]

Step 1 of 7, training run completes: the training job writes the model artifact (e.g. SafeTensors shards) to storage and computes its SHA-256 digest. Everything downstream refers to the artifact by this digest.
Tampering overwrites the artifact after the bundle is stored — the signature still verifies, but it verifies the original digest, not the artifact the cluster is about to pull.
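The tampering case comes down to a digest comparison: the signature covers a digest, so the consumer must recompute the digest of what it actually pulled and compare. A minimal sketch with a hypothetical function name and in-memory bytes standing in for the pulled artifact:

```python
import hashlib
import hmac


def matches_signed_digest(artifact_bytes: bytes, signed_digest: str) -> bool:
    """Compare the pulled bytes against the digest that the signature covers."""
    algorithm, _, expected_hex = signed_digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual_hex = hashlib.sha256(artifact_bytes).hexdigest()
    return hmac.compare_digest(actual_hex, expected_hex)


original = b"weights for v2.1"
signed = "sha256:" + hashlib.sha256(original).hexdigest()
overwritten = b"weights for v2.1, modified after signing"

original_ok = matches_signed_digest(original, signed)
tampered_ok = matches_signed_digest(overwritten, signed)
```

This is also why deployments that reference an artifact by digest rather than by a mutable tag are harder to subvert: the reference itself names the bytes that were signed.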

SBOM for model artifacts: what the standards cover

Executive Order 14028 (May 2021, Section 10(j)) defined an SBOM as a “formal record containing the details and supply chain relationships of various components used in building software” and directed federal agencies to require SBOMs from software vendors [3]. NIST SP 800-218 (SSDF) builds on this by calling for provenance data, including component dependency graphs, to be collected, maintained, and shared for each software release.

Two SBOM standards have added ML-specific support:

CycloneDX 1.5+ (ML-BOM): Released June 2023 [5], CycloneDX introduced “machine-learning-model” as a first-class component type. An ML-BOM document records the model type, training datasets (with version and hash), serving framework version, quantisation applied, and any merged adapters. CycloneDX 1.6 (April 2024) and 1.7 (October 2025) extend the schema further.
SPDX 3.0 (AI profile): Released April 2024, SPDX 3.0 added AI profiles with fields for dataset relationships, model provenance, and usage constraints. The AI profile lets existing SPDX toolchains cover model artifacts.
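To make the CycloneDX side concrete, the sketch below builds a very small ML-BOM as a Python dictionary. It is an assumption-laden illustration rather than a validated document: the function name is hypothetical, the model and dataset names are taken from the running example, and the field names should be checked against the schema version you target before use.

```python
import hashlib
import json


def minimal_mlbom(model_name: str, model_version: str, weights: bytes,
                  dataset_name: str, dataset_version: str) -> dict:
    """Build a small CycloneDX-style BOM with one model and one dataset."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "machine-learning-model",
                "bom-ref": "model",
                "name": model_name,
                "version": model_version,
                "hashes": [{
                    "alg": "SHA-256",
                    "content": hashlib.sha256(weights).hexdigest(),
                }],
                # The model card links the model to its training data by reference.
                "modelCard": {
                    "modelParameters": {"datasets": [{"ref": "training-data"}]},
                },
            },
            {
                "type": "data",
                "bom-ref": "training-data",
                "name": dataset_name,
                "version": dataset_version,
            },
        ],
    }


bom = minimal_mlbom("sentiment-classifier", "v2.1", b"placeholder weights",
                    "sentiment", "v14")
bom_json = json.dumps(bom, indent=2)
```

The placeholder bytes stand in for the real weights; in practice the hash would come from the same digest recorded at registration.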

In practice, the near-term approach for most teams is layered: attach a SLSA provenance attestation via Cosign to cover build provenance, and emit a CycloneDX or SPDX SBOM from a tool such as Syft for the serving image’s dependency tree. Adopt the ML-BOM extension for the model artifact itself as tooling stabilises and your compliance posture requires it.

The glue between signing and SBOM is the in-toto attestation framework [6]. in-toto is a CNCF graduated project that defines the cryptographic envelope (Statement → Predicate) used by both SLSA provenance documents and SBOM attestations. When you run cosign attest, you are creating a signed in-toto Statement whose Predicate is either the SLSA provenance JSON or the SBOM document.
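The Statement and Predicate layering is easier to follow with the structure written out. The sketch below assembles an unsigned in-toto Statement carrying a SLSA v1 provenance predicate for a training run. The function name, the buildType URI, and the parameter names inside externalParameters are hypothetical. Note that cosign attest takes only the predicate file and builds the Statement wrapper itself, so this shows what ends up inside the signed envelope rather than a file to pass to Cosign.

```python
import json


def provenance_statement(name: str, sha256_hex: str, dataset_uri: str,
                         git_sha: str, builder_id: str) -> dict:
    """Assemble an unsigned in-toto Statement with a SLSA v1 predicate."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # The subject binds the statement to one artifact by digest.
        "subject": [{"name": name, "digest": {"sha256": sha256_hex}}],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                # Hypothetical identifier for "a training job on this platform".
                "buildType": "https://ci.internal/training-job/v1",
                "externalParameters": {
                    "dataset_version": dataset_uri,
                    "git_sha": git_sha,
                },
                "resolvedDependencies": [{"uri": dataset_uri}],
            },
            "runDetails": {"builder": {"id": builder_id}},
        },
    }


statement = provenance_statement(
    name="sentiment-classifier",
    sha256_hex="0" * 64,  # placeholder digest, not a real artifact
    dataset_uri="s3://datasets/sentiment/v14",
    git_sha="0" * 40,     # placeholder commit
    builder_id="https://ci.internal/training-platform",  # hypothetical
)
statement_json = json.dumps(statement, indent=2)
```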

[Interactive component: BOM Coverage Explorer. It compares, facet by facet, what a classic container SBOM captures against what an ML-BOM / AI-BOM captures, and where both still fall short.]

OS packages & system libraries: the home turf of the container SBOM. A tool such as Syft walks the image layers and inventories every OS package and system library, ready for CVE matching. An ML-BOM does not re-describe the serving image; it describes the model component itself.

The gap that remains: training-data provenance

Signing and provenance establish where an artifact came from and that it has not been altered in transit. They do not establish that the model is safe, unbiased, or free of a poisoned training set. Supply-chain integrity is necessary but not sufficient: pair it with the evaluation and governance gates described in the preceding article.

The more specific gap is the data side of the SBOM. The SLSA provenance attestation records a training dataset URI and hash at a point in time — it does not describe the dataset’s contents, lineage, consent status, or licence terms. CycloneDX ML-BOM allows a dataset component to be declared, but the standard does not yet define how to verify that the declared dataset URI was the actual dataset used, or how to express data-lineage across preprocessing stages. SPDX 3.0’s AI profile adds dataset relationship fields but faces the same verification gap.

Teams today can close the gap partially by: recording the dataset version and hash in the provenance attestation; storing a separate dataset manifest in the feature or data platform with its own integrity check; and referencing both from the model card. The linkage is human-auditable but not yet machine-enforceable in the way that artifact signing is.
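Until a standard covers this, the cross-references can at least be checked by a script in CI. The sketch below assumes three plain dictionaries with hypothetical keys (dataset_version and dataset_sha256 in the provenance, uri and sha256 in the dataset manifest, dataset_manifest_uri in the model card); none of these key names come from a standard.

```python
def _differs(left, right) -> bool:
    """Treat a missing value as a mismatch, not as agreement."""
    return left is None or right is None or left != right


def dataset_linkage_problems(provenance: dict, dataset_manifest: dict,
                             model_card: dict) -> list:
    """Return human-readable problems; an empty list means the links agree."""
    problems = []
    if _differs(provenance.get("dataset_version"), dataset_manifest.get("uri")):
        problems.append("provenance and dataset manifest disagree on the dataset version")
    if _differs(provenance.get("dataset_sha256"), dataset_manifest.get("sha256")):
        problems.append("dataset hash in provenance does not match the manifest")
    if _differs(model_card.get("dataset_manifest_uri"), dataset_manifest.get("uri")):
        problems.append("model card does not reference this dataset manifest")
    return problems
```

A check like this confirms only that the three records agree with each other. It cannot prove that the declared dataset is the one the training job read, which is the verification gap described above.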

Common pitfalls

Signing at registration instead of promotion. Any experiment can be registered; a signature on an experimental model carries no governance weight. Sign at the curation gate transition, not before.
Storing the bundle in a mutable location. If the artifact can be overwritten, the bundle can be overwritten alongside it. Store bundles in a write-once prefix, use OCI push-immutability, or rely on Rekor as the canonical bundle source.
Admission policy with a wildcard subject. A policy that accepts any Sigstore identity verifies signature existence but not signer identity. Pin the issuer and subject to the specific training workload identity.
Skipping verification for fast rollouts. Set the admission policy to Enforce mode, not Audit. An Audit-mode policy produces findings but does not block the rollout.
Confusing the SBOM with the governance record. An SBOM is a dependency inventory; it is not a model card, eval record, or risk assessment. Use the SBOM to answer what libraries the serving image depends on; use the model card and eval gate to decide whether the model belongs in production.

References

[1] SLSA — Security levels specification v1.0. OpenSSF / slsa.dev, 2023. slsa.dev/spec/v1.0/levels
[2] Sigstore — Cosign signing overview. Sigstore project documentation. docs.sigstore.dev/cosign/signing/overview
[3] NIST — EO 14028: Software Security in Supply Chains (SBOM). National Institute of Standards and Technology, 2021. nist.gov/itl/executive-order-14028
[4] Kyverno — ImageValidatingPolicy documentation. Kyverno project documentation. kyverno.io/docs/policy-types/image-validating-policy
[5] CycloneDX — Machine Learning Bill of Materials (ML-BOM). OWASP CycloneDX project, 2023. cyclonedx.org/capabilities/mlbom
[6] SLSA blog — in-toto and SLSA. OpenSSF / slsa.dev, 2023. slsa.dev/blog/2023/05/in-toto-and-slsa
[7] Sigstore — Security model and trust root. Sigstore project documentation. docs.sigstore.dev/about/security
[8] OpenSSF — Scaling Up Supply Chain Security: Implementing Sigstore (2024). OpenSSF blog. openssf.org/blog/2024/02/16
[9] Analyzing Challenges in Deployment of the SLSA Framework for Software Supply Chain Security. arXiv:2409.05014, 2024. arxiv.org/pdf/2409.05014
