To trace software back to the source and define the moving parts in a complex supply chain, provenance needs to be there from the very beginning. It’s the verifiable information about software artifacts describing where, when and how something was produced. For higher SLSA levels and more resilient integrity guarantees, provenance requirements are stricter and need a deeper, more technical understanding of the predicate.
This document defines the following predicate type within the in-toto attestation framework:
"predicateType": "https://slsa.dev/provenance/v0.2"
Important: Always use the above string for predicateType rather than what is in the URL bar. The predicateType URI will always resolve to the latest minor version of this specification. See parsing rules for more information.
Purpose
Describe how an artifact or set of artifacts was produced.
This predicate is the recommended way to satisfy the SLSA provenance requirements.
Model
Provenance is an attestation that some entity (builder) produced one or more software artifacts (the subject of an in-toto attestation Statement) by executing some invocation, using some other artifacts as input (materials). The invocation in turn runs the buildConfig, which is a record of what was executed. The builder is trusted to have faithfully recorded the provenance; there is no option but to trust the builder. However, the builder may have performed this operation at the request of some external, possibly untrusted entity. These untrusted parameters are captured in the invocation’s parameters and some of the materials. Finally, the build may have depended on various environmental parameters (environment) that are needed for reproducing the build but that are not under external control.
See Example for a concrete example.
Schema
{
  // Standard attestation fields:
  "_type": "https://in-toto.io/Statement/v0.1",
  "subject": [{ ... }],
  // Predicate:
  "predicateType": "https://slsa.dev/provenance/v0.2",
  "predicate": {
    "builder": {
      "id": "<URI>"
    },
    "buildType": "<URI>",
    "invocation": {
      "configSource": {
        "uri": "<URI>",
        "digest": { /* DigestSet */ },
        "entryPoint": "<STRING>"
      },
      "parameters": { /* object */ },
      "environment": { /* object */ }
    },
    "buildConfig": { /* object */ },
    "metadata": {
      "buildInvocationId": "<STRING>",
      "buildStartedOn": "<TIMESTAMP>",
      "buildFinishedOn": "<TIMESTAMP>",
      "completeness": {
        "parameters": true/false,
        "environment": true/false,
        "materials": true/false
      },
      "reproducible": true/false
    },
    "materials": [
      {
        "uri": "<URI>",
        "digest": { /* DigestSet */ }
      }
    ]
  }
}
Parsing rules
This predicate follows the in-toto attestation parsing rules. Summary:
- Consumers MUST ignore unrecognized fields.
- The predicateType URI includes the major version number and will always change whenever there is a backwards incompatible change.
- Minor version changes are always backwards compatible and “monotonic.” Such changes do not update the predicateType.
- Producers MAY add extension fields using field names that are URIs.
- Optional fields MAY be unset or null, and should be treated equivalently. Both are equivalent to empty for object or array values.
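The parsing rules above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name parse_predicate and the shape of the returned dictionary are assumptions made for this example, and signature verification of the enclosing envelope is out of scope here.

```python
import json

PREDICATE_TYPE = "https://slsa.dev/provenance/v0.2"


def parse_predicate(statement_json: str) -> dict:
    """Return selected predicate fields of a v0.2 statement (hypothetical helper)."""
    statement = json.loads(statement_json)
    # The URI carries the major version, so anything else is a different format.
    if statement.get("predicateType") != PREDICATE_TYPE:
        raise ValueError("unsupported predicateType")
    # Unset and null are equivalent; for objects and arrays both also mean "empty".
    predicate = statement.get("predicate") or {}
    invocation = predicate.get("invocation") or {}
    return {
        "builder": predicate.get("builder") or {},
        "buildType": predicate.get("buildType"),
        "parameters": invocation.get("parameters") or {},
        "materials": predicate.get("materials") or [],
        # Unrecognized fields are never read, which is how they end up ignored.
    }
```

Reading only the known fields, instead of validating against a closed schema, is what lets a consumer keep working when a later minor version or a producer extension adds fields.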
Fields
NOTE: This section describes the fields within predicate. For a description of the other top-level fields, such as subject, see Statement.
builder
object, required
Identifies the entity that executed the invocation, which is trusted to have correctly performed the operation and populated this provenance.
The identity MUST reflect the trust base that consumers care about. How detailed to be is a judgement call. For example, GitHub Actions supports both GitHub-hosted runners and self-hosted runners. The GitHub-hosted runner might be a single identity because it’s all GitHub from the consumer’s perspective. Meanwhile, each self-hosted runner might have its own identity because not all runners are trusted by all consumers.
Consumers MUST accept only specific (signer, builder) pairs. For example, “GitHub” can sign provenance for the “GitHub Actions” builder, and “Google” can sign provenance for the “Google Cloud Build” builder, but “GitHub” cannot sign for the “Google Cloud Build” builder.
Design rationale: The builder is distinct from the signer because one signer may generate attestations for more than one builder, as in the GitHub Actions example above. The field is required, even if it is implicit from the signer, to aid readability and debugging. It is an object to allow additional fields in the future, in case one URI is not sufficient.
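The (signer, builder) pair rule can be sketched as an allowlist lookup. The signer names, the TRUSTED_PAIRS table, and the is_trusted function are hypothetical; the builder URIs are the demonstration values used elsewhere in this document. The signer identity is assumed to have been established already by verifying the attestation signature.

```python
# Hypothetical allowlist of (signer, builder.id) pairs a consumer accepts.
TRUSTED_PAIRS = {
    ("GitHub", "https://github.com/Attestations/GitHubHostedActions@v1"),
    ("Google", "https://cloudbuild.googleapis.com/GoogleHostedWorker@v1"),
}


def is_trusted(signer: str, predicate: dict) -> bool:
    """Accept the provenance only if this signer may speak for this builder."""
    builder_id = (predicate.get("builder") or {}).get("id")
    return (signer, builder_id) in TRUSTED_PAIRS
```

Matching on the pair, and not on the signer or the builder alone, is what prevents “GitHub” from signing for the “Google Cloud Build” builder.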
builder.id
string (TypeURI), required
URI indicating the builder’s identity.
buildType
string (TypeURI), required
URI indicating what type of build was performed. It determines the meaning of invocation, buildConfig and materials.
invocation
object, optional
Identifies the event that kicked off the build. When combined with materials, this SHOULD fully describe the build, such that re-running this invocation results in bit-for-bit identical output (if the build is reproducible).
MAY be unset/null if unknown, but this is DISCOURAGED.
invocation.configSource
object, optional
Describes where the config file that kicked off the build came from. This is effectively a pointer to the source where buildConfig came from.
invocation.configSource.uri
string (ResourceURI), optional
URI indicating the identity of the source of the config.
invocation.configSource.digest
object (DigestSet), optional
Collection of cryptographic digests for the contents of the artifact specified by invocation.configSource.uri.
invocation.configSource.entryPoint
string, optional
String identifying the entry point into the build. This is often a path to a configuration file and/or a target label within that file. The syntax and meaning are defined by buildType. For example, if the buildType were “make”, then this would reference the directory in which to run make as well as which target to use.
Consumers SHOULD accept only specific invocation.configSource.entryPoint values. For example, a policy might only allow the “release” entry point but not the “debug” entry point.
MAY be omitted if the buildType specifies a default value.
Design rationale: The entryPoint is distinct from parameters to make it easier to write secure policies without having to parse parameters.
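A consumer-side entry point policy might look like the following sketch. The ALLOWED_ENTRY_POINTS set and the entry_point_allowed function are hypothetical names for this example. The sketch rejects an omitted entry point; a policy for a buildType that defines a default would substitute that default first.

```python
# Hypothetical policy: only the "release" entry point is acceptable.
ALLOWED_ENTRY_POINTS = {"release"}


def entry_point_allowed(predicate: dict) -> bool:
    """Check the entry point without having to parse invocation.parameters."""
    invocation = predicate.get("invocation") or {}
    config_source = invocation.get("configSource") or {}
    return config_source.get("entryPoint") in ALLOWED_ENTRY_POINTS
```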
invocation.parameters
object, optional
Collection of all external inputs that influenced the build on top of invocation.configSource. For example, if the invocation type were “make”, then this might be the flags passed to make aside from the target, which is captured in invocation.configSource.entryPoint.
Consumers SHOULD accept only “safe” invocation.parameters. The simplest and safest way to achieve this is to disallow any parameters altogether.
This is an arbitrary JSON object with a schema defined by buildType.
This is considered to be incomplete unless metadata.completeness.parameters is true.
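The “disallow any parameters” policy can be sketched as below. The function name parameters_safe is hypothetical. Requiring the completeness claim as well is a design choice of this sketch: an empty parameters object that is not claimed complete could still hide external inputs.

```python
def parameters_safe(predicate: dict) -> bool:
    """Accept only builds with no external parameters, claimed complete."""
    invocation = predicate.get("invocation") or {}
    metadata = predicate.get("metadata") or {}
    completeness = metadata.get("completeness") or {}
    # Unset, null, and {} are all treated as "no parameters".
    no_parameters = not invocation.get("parameters")
    claimed_complete = completeness.get("parameters") is True
    return no_parameters and claimed_complete
```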
invocation.environment
object, optional
Any other builder-controlled inputs necessary for correctly evaluating the build. Usually only needed for reproducing the build but not evaluated as part of policy.
This SHOULD be minimized to only include things that are part of the public API, that cannot be recomputed from other values in the provenance, and that actually affect the evaluation of the build. For example, this might include variables that are referenced in the workflow definition, but it SHOULD NOT include a dump of all environment variables or include things like the hostname (assuming hostname is not part of the public API).
This is an arbitrary JSON object with a schema defined by buildType.
This is considered to be incomplete unless metadata.completeness.environment is true.
metadata
object, optional
Other properties of the build.
metadata.buildInvocationId
string, optional
Identifies this particular build invocation, which can be useful for finding associated logs or other ad-hoc analysis. The exact meaning and format is defined by builder.id; by default it is treated as opaque and case-sensitive. The value SHOULD be globally unique.
metadata.buildStartedOn
string (Timestamp), optional
The timestamp of when the build started.
metadata.buildFinishedOn
string (Timestamp), optional
The timestamp of when the build completed.
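As an illustration, the two timestamps can be turned into a build duration. This sketch assumes the values are RFC 3339 strings such as "2022-06-17T00:47:27+03:00", which this section does not itself state; parse_timestamp and build_duration are hypothetical helper names.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 style timestamp (assumed format) into a datetime."""
    # Older Python versions reject a trailing "Z" in fromisoformat, so normalize it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def build_duration(metadata: dict) -> Optional[timedelta]:
    """Return finish minus start, or None if either optional field is unset."""
    started = metadata.get("buildStartedOn")
    finished = metadata.get("buildFinishedOn")
    if not started or not finished:
        return None
    return parse_timestamp(finished) - parse_timestamp(started)
```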
metadata.completeness
object, optional
Indicates that the builder claims certain fields in this message to be complete.
metadata.completeness.parameters
boolean, optional
If true, the builder claims that invocation.parameters is complete, meaning that all external inputs are properly captured in invocation.parameters.
metadata.completeness.environment
boolean, optional
If true, the builder claims that invocation.environment is complete.
metadata.completeness.materials
boolean, optional
If true, the builder claims that materials is complete, usually through some controls to prevent network access. Sometimes called “hermetic”.
metadata.reproducible
boolean, optional
If true, the builder claims that running invocation on materials will produce bit-for-bit identical output.
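Because every claim in metadata is optional, a consumer has to decide what an absent claim means. The sketch below, with the hypothetical helper name completeness_claims, treats anything other than an explicit true as "not claimed", which follows from the fields being described as incomplete unless the matching flag is true.

```python
def completeness_claims(predicate: dict) -> dict:
    """Collect the builder's claims, defaulting every absent claim to False."""
    metadata = predicate.get("metadata") or {}
    completeness = metadata.get("completeness") or {}
    return {
        "parameters": completeness.get("parameters") is True,
        "environment": completeness.get("environment") is True,
        "materials": completeness.get("materials") is True,
        "reproducible": metadata.get("reproducible") is True,
    }
```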
buildConfig
object, optional
Lists the steps in the build. If invocation.configSource is not available, buildConfig can be used to verify information about the build.
This is an arbitrary JSON object with a schema defined by buildType.
materials
array of objects, optional
The collection of artifacts that influenced the build including sources, dependencies, build tools, base images, and so on.
This is considered to be incomplete unless metadata.completeness.materials is true.
materials[*].uri
string (ResourceURI), optional
The method by which this artifact was referenced during the build.
TODO: Should we differentiate between the “referenced” URI and the “resolved” URI, e.g. “latest” vs “3.4.1”?
TODO: Should we wrap this in a locator object to allow for extensibility, in case we add other types of URIs or other non-URI locators?
materials[*].digest
object (DigestSet), optional
Collection of cryptographic digests for the contents of this artifact.
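Checking a fetched artifact against a digest field might look like the sketch below. It assumes a DigestSet is a JSON object mapping an algorithm name such as "sha256" to a hex-encoded digest. The SUPPORTED set and the function name matches_digest_set are hypothetical, and requiring every supported digest to match is a conservative choice of this sketch, not a rule from this document.

```python
import hashlib

# Algorithms this hypothetical consumer is willing to rely on.
SUPPORTED = {"sha256", "sha512"}


def matches_digest_set(content: bytes, digest_set: dict) -> bool:
    """True if content matches every supported digest, and at least one was checked."""
    checked = 0
    for algorithm, expected in (digest_set or {}).items():
        if algorithm not in SUPPORTED:
            continue  # Skip algorithms this consumer does not rely on.
        actual = hashlib.new(algorithm, content).hexdigest()
        if actual != str(expected).lower():
            return False
        checked += 1
    return checked > 0
```

Returning False when no supported algorithm is present avoids accepting an artifact on the strength of a digest the consumer never verified.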
Example
WARNING: This is just for demonstration purposes.
Suppose the builder downloaded example-1.2.3.tar.gz, extracted it, and ran make -C src foo CFLAGS=-O3, resulting in a file with hash "5678...". Then the provenance might look like this:
{
  "_type": "https://in-toto.io/Statement/v0.1",
  // Output file; name is "_" to indicate "not important".
  "subject": [{"name": "_", "digest": {"sha256": "5678..."}}],
  "predicateType": "https://slsa.dev/provenance/v0.2",
  "predicate": {
    "buildType": "https://example.com/Makefile",
    "builder": { "id": "mailto:person@example.com" },
    "invocation": {
      "configSource": {
        "uri": "https://example.com/example-1.2.3.tar.gz",
        "digest": {"sha256": "1234..."},
        "entryPoint": "src:foo" // target "foo" in directory "src"
      },
      "parameters": {"CFLAGS": "-O3"} // extra args to `make`
    },
    "materials": [{
      "uri": "https://example.com/example-1.2.3.tar.gz",
      "digest": {"sha256": "1234..."}
    }]
  }
}
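The example above carries comments, so it is not strict JSON. One way to produce the same statement as strict JSON is to build it as a Python dictionary and serialize it, as in this sketch. The function name example_statement is hypothetical, and the elided digests are kept exactly as written above.

```python
import json


def example_statement() -> dict:
    """Build the Makefile example as a plain dictionary (digests left elided)."""
    source = {
        "uri": "https://example.com/example-1.2.3.tar.gz",
        "digest": {"sha256": "1234..."},
    }
    return {
        "_type": "https://in-toto.io/Statement/v0.1",
        "subject": [{"name": "_", "digest": {"sha256": "5678..."}}],
        "predicateType": "https://slsa.dev/provenance/v0.2",
        "predicate": {
            "buildType": "https://example.com/Makefile",
            "builder": {"id": "mailto:person@example.com"},
            "invocation": {
                # The config source is the same tarball listed in materials.
                "configSource": {**source, "entryPoint": "src:foo"},
                "parameters": {"CFLAGS": "-O3"},
            },
            "materials": [source],
        },
    }


if __name__ == "__main__":
    print(json.dumps(example_statement(), indent=2))
```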
More examples
GitHub Actions
WARNING: This is only for demonstration purposes. The GitHub Actions team has not yet reviewed or approved this design, and it is not yet implemented. Details are subject to change!
If GitHub is the one to generate provenance, and the runner is GitHub-hosted, then the builder would be as follows:
"builder": {
  "id": "https://github.com/Attestations/GitHubHostedActions@v1"
}
Self-hosted runner: Not yet supported. We need to figure out a URI scheme that represents what system hosted the runner, or perhaps add additional properties in builder.
GitHub Actions Workflow
"buildType": "https://github.com/Attestations/GitHubActionsWorkflow@v1",
"invocation": {
  "configSource": {
    "entryPoint": "build.yaml:build",
    // The git repo that contains the build.yaml referenced in the entrypoint.
    "uri": "git+https://github.com/foo/bar.git",
    // The resolved git commit hash reflecting the version of the repo used
    // for this build.
    "digest": {"sha1": "abc..."}
  },
  // The only possible user-defined parameters that can affect the build are the
  // "inputs" to a workflow_dispatch event. This is unset/null for all other
  // events.
  "parameters": {
    "inputs": { ... }
  },
  // Other variables that are required to reproduce the build and that cannot be
  // recomputed using existing information. (Documentation would explain how to
  // recompute the rest of the fields.)
  "environment": {
    // The architecture of the runner.
    "arch": "amd64",
    // Environment variables. These are always set because it is not possible
    // to know whether they were referenced or not.
    "env": {
      "GITHUB_RUN_ID": "1234",
      "GITHUB_RUN_NUMBER": "5678",
      "GITHUB_EVENT_NAME": "push"
    },
    // The context values that were referenced in the workflow definition.
    // Secrets are set to the empty string.
    "context": {
      "github": {
        "run_id": "abcd1234"
      },
      "runner": {
        "os": "Linux",
        "temp": "/tmp/tmp.iizj8l0XhS"
      }
    }
  }
},
"materials": [{
  // The git repo that contains the build.yaml referenced above.
  "uri": "git+https://github.com/foo/bar.git",
  // The resolved git commit hash reflecting the version of the repo used
  // for this build.
  "digest": {"sha1": "abc..."}
}]
GitLab CI
The GitLab CI team has implemented an artifact attestation capability in their GitLab Runner 15.1 release.
If GitLab is the one to generate provenance, and the runner is GitLab-hosted or self-hosted, then the builder would be as follows:
"builder": {
  "id": "https://gitlab.com/foo/bar/-/runners/12345678"
}
GitLab CI Job
"buildType": "https://gitlab.com/gitlab-org/gitlab-runner/-/blob/v15.1.0/PROVENANCE.md",
"invocation": {
  "configSource": {
    // The git repo that contains the GitLab CI job referenced in the entrypoint.
    "uri": "https://gitlab.com/foo/bar",
    // The resolved git commit hash reflecting the version of the repo used
    // for this build.
    "digest": {
      "sha256": "abc..."
    },
    // The name of the CI job that triggered the build.
    "entryPoint": "build"
  },
  // Other variables that are required to reproduce the build and that cannot be
  // recomputed using existing information. (Documentation would explain how to
  // recompute the rest of the fields.)
  "environment": {
    // Name of the GitLab runner
    "name": "hosted-gitlab-runner",
    // The runner executor
    "executor": "kubernetes",
    // The architecture on which the CI job is run
    "architecture": "amd64"
  },
  // Collection of all external inputs (CI variables) related to the job
  "parameters": {
    "CI_PIPELINE_ID": "",
    "CI_PIPELINE_URL": "",
    // All other CI variable names are listed here. Values are always represented as empty strings to avoid leaking secrets.
  }
},
"metadata": {
  "buildStartedOn": "2022-06-17T00:47:27+03:00",
  "buildFinishedOn": "2022-06-17T00:47:28+03:00",
  "completeness": {
    "parameters": true,
    "environment": true,
    "materials": false
  },
  "reproducible": false
}
Google Cloud Build
WARNING: This is only for demonstration purposes. The Google Cloud Build team has not yet reviewed or approved this design, and it is not yet implemented. Details are subject to change!
If Google is the one to generate provenance, and the worker is Google-hosted, then the builder would be as follows:
"builder": {
  "id": "https://cloudbuild.googleapis.com/GoogleHostedWorker@v1"
}
Custom worker: Not yet supported. We need to figure out a URI scheme that represents what system hosted the worker, or perhaps add additional properties in builder.
Cloud Build config-as-code
Here entryPoint references the filename from the CloudBuild BuildTrigger.
"buildType": "https://cloudbuild.googleapis.com/CloudBuildYaml@v1",
"invocation": {
  // ... in the git repo described by `materials[0]` ...
  "configSource": {
    "entryPoint": "path/to/cloudbuild.yaml",
    // The git repo that contains the cloudbuild.yaml referenced above.
    "uri": "git+https://source.developers.google.com/p/foo/r/bar",
    // The resolved git commit hash reflecting the version of the repo used
    // for this build.
    "digest": {"sha1": "abc..."}
  },
  // The only possible user-defined parameters that can affect a BuildTrigger
  // are the substitutions in the BuildTrigger.
  "parameters": {
    "substitutions": {...}
  }
},
"buildConfig": {
  // Each step in the recipe corresponds to a step in the cloudbuild.yaml.
  // The format of this is determined by `buildType`.
  "steps": [
    {
      "image": "pkg:docker/make@sha256:244fd47e07d1004f0aed9c",
      "arguments": ["build"]
    }
  ]
},
"materials": [{
  // The git repo that contains the cloudbuild.yaml referenced above.
  "uri": "git+https://source.developers.google.com/p/foo/r/bar",
  // The resolved git commit hash reflecting the version of the repo used
  // for this build.
  "digest": {"sha1": "abc..."}
}]
Cloud Build RPC
Here we list the steps defined in a trigger or over RPC:
"buildType": "https://cloudbuild.googleapis.com/CloudBuildSteps@v1",
"invocation": {
  // Build steps were provided as an argument. No `configSource`.
  "parameters": {
    // The substitutions in the build trigger.
    "substitutions": {...}
    // TODO: Any other arguments?
  }
},
"buildConfig": {
  // The steps that were performed. (Format TBD.)
  "steps": [...]
}
Explicitly run commands
WARNING: This is just a proof-of-concept. It is not yet standardized.
Execution of arbitrary commands:
"buildType": "https://example.com/ManuallyRunCommands@v1",
// There was no entry point, and the commands were run in an ad-hoc fashion.
// There is no `configSource`.
"invocation": null,
"buildConfig": {
  // The list of commands that were executed.
  "commands": [
    "tar xvf foo-1.2.3.tar.gz",
    "cd foo-1.2.3",
    "./configure --enable-some-feature",
    "make foo.zip"
  ],
  // Indicates how to parse the strings in `commands`.
  "shell": "bash"
}
Migrating from 0.1
To migrate from version 0.1 (old):
{
  "builder": old.builder, // (unchanged)
  "buildType": old.recipe.type,
  "invocation": {
    "configSource": {
      "uri": old.materials[old.recipe.definedInMaterial].uri,
      "digest": old.materials[old.recipe.definedInMaterial].digest,
      "entryPoint": old.recipe.entryPoint
    },
    "parameters": old.recipe.arguments,
    "environment": old.recipe.environment // (unchanged)
  },
  "buildConfig": null, // no equivalent in 0.1
  "metadata": {
    "buildInvocationId": old.metadata.buildInvocationId, // (unchanged)
    "buildStartedOn": old.metadata.buildStartedOn, // (unchanged)
    "buildFinishedOn": old.metadata.buildFinishedOn, // (unchanged)
    "completeness": {
      "parameters": old.metadata.completeness.arguments,
      "environment": old.metadata.completeness.environment, // (unchanged)
      "materials": old.metadata.completeness.materials // (unchanged)
    },
    "reproducible": old.metadata.reproducible // (unchanged)
  },
  "materials": old.materials // optionally removing the configSource
}
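The mapping above can be written as a runnable function. This is a sketch under stated assumptions: the name migrate_v01_to_v02 is hypothetical, recipe.definedInMaterial is taken to be an integer index into materials (as the indexing in the mapping implies), and missing optional fields become null instead of raising an error.

```python
def migrate_v01_to_v02(old: dict) -> dict:
    """Convert a 0.1 predicate (as a dict) to the 0.2 layout."""
    recipe = old.get("recipe") or {}
    metadata = old.get("metadata") or {}
    completeness = metadata.get("completeness") or {}
    materials = old.get("materials") or []

    # Resolve the material that held the recipe; fall back to empty if absent.
    index = recipe.get("definedInMaterial")
    in_range = isinstance(index, int) and 0 <= index < len(materials)
    source = materials[index] if in_range else {}

    return {
        "builder": old.get("builder"),
        "buildType": recipe.get("type"),
        "invocation": {
            "configSource": {
                "uri": source.get("uri"),
                "digest": source.get("digest"),
                "entryPoint": recipe.get("entryPoint"),
            },
            "parameters": recipe.get("arguments"),
            "environment": recipe.get("environment"),
        },
        "buildConfig": None,  # No equivalent in 0.1.
        "metadata": {
            "buildInvocationId": metadata.get("buildInvocationId"),
            "buildStartedOn": metadata.get("buildStartedOn"),
            "buildFinishedOn": metadata.get("buildFinishedOn"),
            "completeness": {
                "parameters": completeness.get("arguments"),
                "environment": completeness.get("environment"),
                "materials": completeness.get("materials"),
            },
            "reproducible": metadata.get("reproducible"),
        },
        # Kept whole; removing the configSource entry is optional.
        "materials": materials,
    }
```

Emitting null for absent fields relies on the parsing rule that unset and null are treated equivalently.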
Change history
- 0.2: Refactored to aid clarity and added buildConfig. The model is unchanged.
  - Replaced definedInMaterial and entryPoint with configSource.
  - Renamed recipe to invocation.
  - Moved invocation.type to top-level buildType.
  - Renamed arguments to parameters.
  - Added buildConfig, which can be used as an alternative to configSource to validate the configuration.
- Renamed to “slsa.dev/provenance”.
- 0.1.1: Added metadata.buildInvocationId.
- 0.1: Initial version, named “in-toto.io/Provenance”