To trace software back to the source and define the moving parts in a complex supply chain, provenance needs to be there from the very beginning. It’s the verifiable information about software artifacts describing where, when and how something was produced. For higher SLSA levels and more resilient integrity guarantees, provenance requirements are stricter and need a deeper, more technical understanding of the predicate.
This document defines the following predicate type within the in-toto attestation framework:
"predicateType": "https://slsa.dev/provenance/v1-rc1"
Important: Always use the above string for predicateType rather than what is in the URL bar. The predicateType URI will always resolve to the latest minor version of this specification. See parsing rules for more information.
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.
Purpose
Describe how an artifact or set of artifacts was produced so that:
- Consumers of the provenance can verify that the artifact was built according to expectations.
- Others can rebuild the artifact, if desired.
This predicate is the RECOMMENDED way to satisfy the SLSA v1.0 provenance requirements.
Model
Provenance is an attestation that the builder produced the subject software artifacts through execution of the buildDefinition.
The model is as follows:
- Each build runs as an independent process on a multi-tenant platform. The builder.id identifies this platform, representing the transitive closure of all entities that are trusted to faithfully run the build and record the provenance. (Note: The same model can be used for platform-less or single-tenant build systems.)
- The build process is defined by a parameterized template, identified by buildType. This encapsulates the process that ran, regardless of what system ran it. Often the build type is specific to the build platform because most build platforms have their own unique interfaces.
- All top-level, independent inputs are captured by the parameters to the template. There are two types of parameters:
  - externalParameters: the external interface to the build. In SLSA, these values are untrusted; they MUST be included in the provenance and MUST be verified downstream.
  - systemParameters: set internally by the platform. In SLSA, these values are trusted because the platform is trusted; they are OPTIONAL and need not be verified downstream. They MAY be included to enable reproducible builds, debugging, or incident response.
- All artifacts fetched during initialization or execution of the build process are considered dependencies, including those referenced directly by parameters. The resolvedDependencies captures these dependencies, if known. For example, a build that takes a git repository URI as a parameter might record the specific git commit that the URI resolved to as a dependency.
- During execution, the build process might communicate with the build platform’s control plane and/or build caches. This communication is not captured directly in the provenance, but is instead implied by builder.id and subject to SLSA Requirements. Such communication SHOULD NOT influence the definition of the build; if it does, it SHOULD go in resolvedDependencies instead.
- Finally, the build process outputs one or more artifacts, identified by subject.
For concrete examples, see index of build types.
Parsing rules
This predicate follows the in-toto attestation parsing rules. Summary:
- Consumers MUST ignore unrecognized fields unless otherwise noted.
- The predicateType URI includes the major version number and will always change whenever there is a backwards incompatible change.
- Minor version changes are always backwards compatible and “monotonic.” Such changes do not update the predicateType.
- Producers MAY add extension fields using field names that are URIs.
- Unset, null, and empty field values MUST be interpreted equivalently.
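The equivalence of unset, null, and empty values can be sketched as a normalization pass that a consumer might run before comparing two predicates. This is only an illustration: the helper names `is_unset` and `normalize` are hypothetical, and treating "empty" as the empty string, list, or object is an assumption rather than something this document spells out.

```python
from typing import Any


def is_unset(value: Any) -> bool:
    """Assumption: "empty" means None, "", [], or {} (not 0 or False)."""
    return value is None or value == "" or value == [] or value == {}


def normalize(value: Any) -> Any:
    """Recursively drop object fields whose values are unset, null, or empty."""
    if isinstance(value, dict):
        cleaned = {key: normalize(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not is_unset(item)}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    return value


# Two spellings of the same predicate fragment compare equal after normalization.
a = normalize({"builder": {"id": "https://builder.example", "version": None}})
b = normalize({"builder": {"id": "https://builder.example", "version": {}}})
```

After normalization, `a == b` holds, so a field that is absent, null, or empty does not produce a spurious difference.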
Schema
NOTE: This section describes the fields within predicate. For a description of the other top-level fields, such as subject, see Statement.
{
    // Standard attestation fields:
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [...],
    // Predicate:
    "predicateType": "https://slsa.dev/provenance/v1-rc1",
    "predicate": {
        "buildDefinition": {
            "buildType": string,
            "externalParameters": object,
            "systemParameters": object,
            "resolvedDependencies": [ ...#ArtifactReference ],
        },
        "runDetails": {
            "builder": {
                "id": string,
                "version": { [string]: string },
                "builderDependencies": [ ...#ArtifactReference ],
            },
            "metadata": {
                "invocationId": string,
                "startedOn": #Timestamp,
                "finishedOn": #Timestamp,
            },
            "byproducts": [ ...#ArtifactReference ],
        }
    }
}
#ArtifactReference: {
    "uri": string,
    "digest": {
        "sha256": string,
        "sha512": string,
        "sha1": string,
        // TODO: list the other standard algorithms
        [string]: string,
    },
    "localName": string,
    "downloadLocation": string,
    "mediaType": string,
}
#Timestamp: string // <YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z
Protocol buffer schema
Link: provenance.proto
syntax = "proto3";
package slsa.v1;
import "google/protobuf/struct.proto";
import "google/protobuf/timestamp.proto";
// NOTE: While this file uses snake_case as per the Protocol Buffers Style
// Guide, the provenance is always serialized using JSON with lowerCamelCase.
// Protobuf tooling performs this case conversion automatically.
message Provenance {
    BuildDefinition build_definition = 1;
    RunDetails run_details = 2;
}
message BuildDefinition {
    string build_type = 1;
    google.protobuf.Struct external_parameters = 2;
    google.protobuf.Struct system_parameters = 3;
    repeated ArtifactReference resolved_dependencies = 4;
}
message ArtifactReference {
    string uri = 1;
    map<string, string> digest = 2;
    string local_name = 3;
    string download_location = 4;
    string media_type = 5;
}
message RunDetails {
    Builder builder = 1;
    BuildMetadata metadata = 2;
    repeated ArtifactReference byproducts = 3;
}
message Builder {
    string id = 1;
    map<string, string> version = 2;
    repeated ArtifactReference builder_dependencies = 3;
}
message BuildMetadata {
    string invocation_id = 1;
    google.protobuf.Timestamp started_on = 2;
    google.protobuf.Timestamp finished_on = 3;
}
Provenance
REQUIRED for SLSA Build L1: buildDefinition, runDetails
Field | Type | Description |
---|---|---|
buildDefinition | BuildDefinition | The input to the build. The accuracy and completeness are implied by |
runDetails | RunDetails | Details specific to this particular execution of the build. |
BuildDefinition
REQUIRED for SLSA Build L1: buildType, externalParameters
Field | Type | Description |
---|---|---|
buildType
| string (TypeURI) |
Identifies the template for how to perform the build and interpret the parameters and dependencies. The URI SHOULD resolve to a human-readable specification that includes: overall
description of the build type; schema for |
externalParameters
| object |
The parameters that are under external control, such as those set by a user or tenant of the build system. They MUST be complete at SLSA Build L3, meaning that there is no additional mechanism for an external party to influence the build. (At lower SLSA Build levels, the completeness MAY be best effort.) The build system SHOULD be designed to minimize the size and complexity of
Verifiers SHOULD reject unrecognized or unexpected fields within
|
systemParameters
| object |
The parameters that are under the control of the entity represented by
NOTE: This field is named |
resolvedDependencies
| array (ArtifactReference) |
Unordered collection of artifacts needed at build time. Completeness is best effort, at least through SLSA Build L3. For example, if the build script fetches and executes “example.com/foo.sh”, which in turn fetches “example.com/bar.tar.gz”, then both “foo.sh” and “bar.tar.gz” SHOULD be listed here. |
The BuildDefinition describes all of the inputs to the build. It SHOULD contain all the information necessary and sufficient to initialize the build and begin execution.
The externalParameters and systemParameters are the top-level inputs to the template, meaning inputs not derived from another input. Each is an arbitrary JSON object, though it is RECOMMENDED to keep the structure simple with string values to aid verification. The same field name SHOULD NOT be used for both externalParameters and systemParameters.
The parameters SHOULD only contain the actual values passed in through the interface to the build system. Metadata about those parameter values, particularly digests of artifacts referenced by those parameters, SHOULD instead go in resolvedDependencies. The documentation for buildType SHOULD explain how to convert from a parameter to the dependency uri. For example:
"externalParameters": {
    "repository": "https://github.com/octocat/hello-world",
    "ref": "refs/heads/main"
},
"resolvedDependencies": [{
    "uri": "git+https://github.com/octocat/hello-world@refs/heads/main",
    "digest": {"sha1": "7fd1a60b01f91b314f59955a4e4d4e80d8edf11d"}
}]
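As an illustration of the kind of conversion rule a buildType's documentation might define, the following sketch derives the dependency uri from the two parameters shown above. The rule itself (`git+<repository>@<ref>`) is inferred from this one example, and the function names are hypothetical; a real buildType would document its own mapping.

```python
def git_dependency_uri(external_parameters: dict) -> str:
    """Hypothetical rule: combine "repository" and "ref" into a git+ URI."""
    repository = external_parameters["repository"]
    ref = external_parameters["ref"]
    return f"git+{repository}@{ref}"


def resolved_git_dependency(external_parameters: dict, commit_sha1: str) -> dict:
    """Build the resolvedDependencies entry for the commit the ref resolved to."""
    return {
        "uri": git_dependency_uri(external_parameters),
        "digest": {"sha1": commit_sha1},
    }


params = {
    "repository": "https://github.com/octocat/hello-world",
    "ref": "refs/heads/main",
}
dependency = resolved_git_dependency(
    params, "7fd1a60b01f91b314f59955a4e4d4e80d8edf11d"
)
```

A verifier that knows the same rule can recompute the uri from externalParameters and look up the matching entry in resolvedDependencies to find the commit digest.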
Guidelines:
- Maximize the amount of information that is implicit from the meaning of buildType. In particular, any value that is boilerplate and the same for every build SHOULD be implicit.
- Reduce parameters by moving configuration to input artifacts whenever possible. For example, instead of passing in compiler flags via an external parameter that has to be verified separately, require the flags to live next to the source code or build configuration so that verifying the latter automatically verifies the compiler flags.
- In some cases, additional external parameters might exist that do not impact the behavior of the build, such as a deadline or priority. These extra parameters SHOULD be excluded from the provenance after careful analysis that they indeed pose no security impact.
- If possible, architect the build system to use this definition as its sole top-level input, in order to guarantee that the information is sufficient to run the build.
- When build configuration is evaluated client-side before being sent to the server, such as transforming version-controlled YAML into ephemeral JSON, some solution is needed to make verification practical. Consumers need a way to know what configuration is expected and the usual way to do that is to map it back to version control, but that is not possible if the server cannot verify the configuration’s origins. Possible solutions:
  - (RECOMMENDED) Rearchitect the build service to read configuration directly from version control, recording the server-verified URI in externalParameters and the digest in resolvedDependencies.
  - Record the digest in the provenance[1] and use a separate provenance attestation to link that digest back to version control. In this solution, the client-side evaluation is considered a separate “build” that SHOULD be independently secured using SLSA, though securing it can be difficult since it usually runs on an untrusted workstation.
- The purpose of resolvedDependencies is to facilitate recursive analysis of the software supply chain. Where practical, it is valuable to record the URI and digest of artifacts that, if compromised, could impact the build. At SLSA Build L3, completeness is considered “best effort”.
⚠ RFC: We are particularly looking for feedback on this schema from potential implementers. Does this model map cleanly to existing build systems? Is it natural to identify and express the external parameters? Is anything confusing or ambiguous?
ArtifactReference
REQUIRED: at least one of uri or digest
Field | Type | Description |
---|---|---|
uri | string (URI) | URI describing where this artifact came from. When possible, this SHOULD be a universal and stable identifier, such as a source location or Package URL (purl). |
digest | DigestSet | One or more cryptographic digests of the contents of this artifact. |
localName | string | The name for this artifact local to the build. |
downloadLocation | string (URI) | URI identifying the location that this artifact was downloaded from, if different and not derivable from |
mediaType | string (MediaType) | Media type (aka MIME type) that this artifact was interpreted as. |
Example:
{
    "uri": "pkg:pypi/pyyaml@6.0",
    "digest": {"sha256": "5f0689d54944564971f2811f9788218bfafb21aa20f532e6490004377dfa648f"},
    "localName": "PyYAML-6.0.tar.gz",
    "downloadLocation": "https://files.pythonhosted.org/packages/36/2b/61d51a2c4f25ef062ae3f74576b01638bebad5e045f747ff12643df63844/PyYAML-6.0.tar.gz",
    "mediaType": "application/gzip"
}
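The "at least one of uri or digest" requirement can be checked mechanically. The sketch below is one possible validator; the function name is hypothetical, and the expectation that digest values are lowercase hex strings is an assumption about the DigestSet type, not something stated in this section.

```python
import re
from typing import List

_LOWER_HEX = re.compile(r"^[0-9a-f]+$")


def validate_artifact_reference(ref: dict) -> List[str]:
    """Return a list of problems found; an empty list means the reference passes."""
    problems = []
    # Unset, null, and empty values are treated alike, so "" and {} count as missing.
    uri = ref.get("uri") or None
    digest = ref.get("digest") or None
    if uri is None and digest is None:
        problems.append("at least one of uri or digest is required")
    for algorithm, value in (digest or {}).items():
        # Assumption: digest values are lowercase hex-encoded strings.
        if not isinstance(value, str) or not _LOWER_HEX.match(value):
            problems.append(f"digest[{algorithm!r}] is not a lowercase hex string")
    return problems


ok = validate_artifact_reference({"uri": "pkg:pypi/pyyaml@6.0"})
missing = validate_artifact_reference({"localName": "PyYAML-6.0.tar.gz"})
```

Here `ok` is empty, while `missing` reports the absent uri and digest.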
⚠ RFC: Do we need all these fields? Is this adding too much complexity?
RunDetails
REQUIRED for SLSA Build L1: builder
Field | Type | Description |
---|---|---|
builder | Builder | Identifies the entity that executed the invocation, which is trusted to have correctly performed the operation and populated this provenance. |
metadata | BuildMetadata | Metadata about this particular execution of the build. |
byproducts | array (ArtifactReference) | Additional artifacts generated during the build that are not considered the “output” of the build but that might be needed during debugging or incident response. For example, this might reference logs generated during the build and/or a digest of the fully evaluated build configuration. In most cases, this SHOULD NOT contain all intermediate files generated during the build. Instead, this SHOULD only contain files that are likely to be useful later and that cannot be easily reproduced. |
Builder
REQUIRED for SLSA Build L1: id
Field | Type | Description |
---|---|---|
id | string (TypeURI) | URI indicating the transitive closure of the trusted builder. This is intended to be the sole determiner of the SLSA Build level. If a build platform has multiple modes of operations that have differing security attributes or SLSA Build levels, each mode MUST have a different |
version | map (string→string) | Version numbers of components of the builder. |
builderDependencies | array (ArtifactReference) | Dependencies used by the orchestrator that are not run within the workload and that do not affect the build, but might affect the provenance generation or security guarantees. |
The builder represents the transitive closure of all the entities that are, by necessity, trusted to faithfully run the build and record the provenance.
The id MUST reflect the trust base that consumers care about. How detailed to be is a judgement call. For example, GitHub Actions supports both GitHub-hosted runners and self-hosted runners. The GitHub-hosted runner might be a single identity because it’s all GitHub from the consumer’s perspective. Meanwhile, each self-hosted runner might have its own identity because not all runners are trusted by all consumers.
Consumers MUST accept only specific signer-builder pairs. For example, “GitHub” can sign provenance for the “GitHub Actions” builder, and “Google” can sign provenance for the “Google Cloud Build” builder, but “GitHub” cannot sign for the “Google Cloud Build” builder.
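A minimal sketch of this rule is an allowlist of (signer, builder) pairs. The names below are the informal labels from the example above, used as placeholders; in practice the signer would be a verified key or certificate identity and the builder would be the builder.id URI.

```python
# Hypothetical allowlist; each entry is a (signer identity, builder) pair.
ACCEPTED_PAIRS = {
    ("GitHub", "GitHub Actions"),
    ("Google", "Google Cloud Build"),
}


def is_accepted(signer: str, builder: str) -> bool:
    """Accept provenance only if this exact signer-builder pair is allowlisted."""
    return (signer, builder) in ACCEPTED_PAIRS
```

Checking the pair, rather than the signer and builder independently, is what prevents a signer that is trusted for one builder from vouching for another.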
Design rationale: The builder is distinct from the signer in order to support the case where one signer generates attestations for more than one builder, as in the GitHub Actions example above. The field is REQUIRED, even if it is implicit from the signer, to aid readability and debugging. It is an object to allow additional fields in the future, in case one URI is not sufficient.
⚠ RFC: Do we need more explicit guidance on how to choose a URI?
⚠ RFC: Would it be preferable to allow builders to set arbitrary properties, rather than calling out version and builderDependencies? We don’t expect verifiers to use any of them, so maybe that’s the simpler approach? Or have a properties that is an arbitrary object? (#319)
⚠ RFC: Do we want/need to identify the tenant of the build system, separately from the build system itself? If so, a single id that combines both (e.g. https://builder.example/tenants/company1.example/project1) or two separate fields (e.g. {"id": "https://builder.example", "tenant": "https://company1.example/project1"})? What would the use case be for this? How does verification work?
BuildMetadata
REQUIRED: (none)
Field | Type | Description |
---|---|---|
invocationId | string | Identifies this particular build invocation, which can be useful for finding associated logs or other ad-hoc analysis. The exact meaning and format is defined by |
startedOn | string (Timestamp) | The timestamp of when the build started. |
finishedOn | string (Timestamp) | The timestamp of when the build completed. |
Verification
Verification of provenance encompasses the following steps.
Step 1: Check SLSA Build level
First, check the SLSA Build level by comparing the artifact to its provenance and the provenance to a preconfigured root of trust. The goal is to ensure that the provenance actually applies to the artifact in question and to assess the trustworthiness of the provenance. This mitigates some or all of threats “D”, “F”, “G”, and “H”, depending on SLSA Build level and where verification happens.
Up front:
- Configure the verifier’s roots of trust, meaning the recognized builder identities and the maximum SLSA Build level each builder is trusted up to. Different verifiers might use different roots of trust, but usually a verifier uses the same roots of trust for all packages. This is likely in the form of a map from (builder public key identity, builder.id) to (SLSA Build level).
Example root of trust configuration
The following snippet shows conceptually how a verifier’s roots of trust might be configured using made-up syntax.
"slsaRootsOfTrust": [
    // A builder trusted at SLSA Build L3, using a fixed public key.
    {
        "publicKey": "HKJEwI...",
        "builderId": "https://somebuilder.example.com/slsa/l3",
        "slsaBuildLevel": 3
    },
    // A different builder that claims to be SLSA Build L3,
    // but this verifier only trusts it to L2.
    {
        "publicKey": "tLykq9...",
        "builderId": "https://differentbuilder.example.com/slsa/l3",
        "slsaBuildLevel": 2
    },
    // A builder that uses Sigstore for authentication, without a builder.id
    {
        "sigstore": {
            "root": "global", // identifies fulcio/rekor roots
            "subjectAlternativeNamePattern": "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v*.*.*"
        },
        "builderId": "", // empty for this particular builder
        "slsaBuildLevel": 3,
    }
    ...
],
Given an artifact and its provenance:
- Verify the envelope’s signature using the roots of trust, resulting in a list of recognized public keys (or equivalent).
- Verify that the statement’s subject matches the digest of the artifact in question.
- Verify that the predicateType is https://slsa.dev/provenance/v1-rc1.
- Look up the SLSA Build Level in the roots of trust, using the recognized public keys and the builder.id, defaulting to SLSA Build L1.
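The steps above can be sketched as follows. This is a simplified illustration under several assumptions: envelope signature verification is assumed to have happened already and to have produced `recognized_keys`; only sha256 subject digests are compared; and the roots of trust are modelled as a plain dictionary keyed by (key identity, builder.id). The function name `check_build_level` is hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Tuple

PREDICATE_TYPE = "https://slsa.dev/provenance/v1-rc1"


def check_build_level(
    statement: dict,
    artifact: bytes,
    recognized_keys: Iterable[str],
    roots_of_trust: Dict[Tuple[str, str], int],
) -> int:
    """Return the SLSA Build level, or raise ValueError if the provenance does not apply."""
    # The statement's subject must match the digest of the artifact in question.
    digest = hashlib.sha256(artifact).hexdigest()
    subjects = statement.get("subject") or []
    if not any((s.get("digest") or {}).get("sha256") == digest for s in subjects):
        raise ValueError("no subject matches the artifact digest")
    # The predicateType must be the one defined by this document.
    if statement.get("predicateType") != PREDICATE_TYPE:
        raise ValueError("unexpected predicateType")
    # Look up the level by (key, builder.id), defaulting to SLSA Build L1.
    builder = ((statement.get("predicate") or {}).get("runDetails") or {}).get("builder") or {}
    builder_id = builder.get("id") or ""
    levels = [
        roots_of_trust[(key, builder_id)]
        for key in recognized_keys
        if (key, builder_id) in roots_of_trust
    ]
    return max(levels, default=1)
```

Taking the maximum over all recognized keys is a design choice for this sketch; a verifier with stricter policy might instead require a single unambiguous match.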
Resulting threat mitigation:
- Threat “D”: SLSA Build L3 requires protection against compromise of the build process and provenance generation by an external adversary, such as persistence between builds or theft of the provenance signing key. In other words, SLSA Build L3 establishes that the provenance is accurate and trustworthy, assuming you trust the build platform.
  - IMPORTANT: SLSA Build L3 does not cover compromise of the build platform itself, such as by a malicious insider. Instead, verifiers SHOULD carefully consider which build platforms are added to the roots of trust. For advice on establishing trust in build platforms, see Verifying build systems.
- Threat “F”: SLSA Build L2 covers tampering of the artifact or provenance after the build. This is accomplished by verifying the subject and signature in the steps above.
- Threat “G”: Verification by the consumer or otherwise outside of the package registry covers compromise of the registry itself. (Verifying within the registry at publication time is also valuable, but does not cover Threat “G” or “H”.)
- Threat “H”: Verification by the consumer covers compromise of the package in transit. (Many ecosystems also address this threat using package signatures or checksums.)
  - NOTE: SLSA does not cover adversaries tricking a consumer into using an unintended package, such as through typosquatting.
Step 2: Check expectations
Next, check that the package’s provenance meets expectations for that package in order to mitigate threat “C”.
In our threat model, the adversary has the ability to invoke a build and to publish to the registry but not to write to the source repository, nor do they have insider access to any trusted systems. Expectations MUST be sufficient to detect or prevent this adversary from injecting unofficial behavior into the package.
Example threats in this category include building from an unofficial fork or abusing a build parameter to modify the build. Usually expectations identify the canonical source repository (which is the entry in externalParameters) and any other security-relevant external parameters.
The expectations SHOULD cover the following:
What | Why |
---|---|
Builder identity from Step 1 | To prevent an adversary from building the correct code on an unintended system |
buildType | To ensure that externalParameters are interpreted as intended |
externalParameters | To prevent an adversary from injecting unofficial behavior |
Verifiers SHOULD reject unrecognized fields in externalParameters to err on the side of caution. It is acceptable to allow a parameter to have a range of values (possibly any value) if it is known that any value in the range is safe. Implementations need not special-case the buildType if JSON comparisons are sufficient.
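One way to express these checks is sketched below. The shape of the `expectations` object, the `ANY_VALUE` marker, and the function name are all hypothetical; the sketch only illustrates comparing the builder identity, buildType, and externalParameters while rejecting unrecognized parameters.

```python
from typing import List

# Hypothetical marker: the parameter is recognized and any value is considered safe.
ANY_VALUE = None


def check_expectations(predicate: dict, expectations: dict) -> List[str]:
    """Return a list of mismatches; an empty list means expectations are met."""
    problems = []
    definition = predicate.get("buildDefinition") or {}
    builder = (predicate.get("runDetails") or {}).get("builder") or {}

    if builder.get("id") not in expectations["builderIds"]:
        problems.append("unexpected builder.id")
    if definition.get("buildType") != expectations["buildType"]:
        problems.append("unexpected buildType")

    allowed = expectations["externalParameters"]
    actual = definition.get("externalParameters") or {}
    # Err on the side of caution: reject parameters the policy does not know.
    for name in actual:
        if name not in allowed:
            problems.append(f"unrecognized external parameter: {name}")
    # Each recognized parameter must take one of its allowed values.
    for name, allowed_values in allowed.items():
        if allowed_values is ANY_VALUE:
            continue
        if actual.get(name) not in allowed_values:
            problems.append(f"unexpected value for external parameter: {name}")
    return problems


expectations = {
    "builderIds": ["https://builder.example/l3"],
    "buildType": "https://example.com/build/v1",
    "externalParameters": {
        "repository": ["https://github.com/octocat/hello-world"],
        "ref": ANY_VALUE,
    },
}
```

Allowed values are given as lists so that plain JSON comparison is enough, in line with the remark that the buildType need not be special-cased.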
Possible models for implementing expectation setting in package ecosystems (not exhaustive):
- Trust on first use: Accept the first version of the package as-is. On each version update, compare the old provenance to the new provenance and alert on any differences. This can be augmented by having rules about what changes are benign, such as a parameter known to be safe or a heuristic about safe git refs.
- Explicit policy: Package producer defines the expectations for the package and distributes it to the verifier; the verifier uses these expectations after verifying their authenticity. In this model, there MUST be some protection against an adversary unilaterally modifying the policy. For example, this might involve two-party control over policy modifications, or having consumers accept each policy change (another form of trust on first use).
- Immutable policy: Expectations for a package cannot change. In this model, the package name is immutably bound to a source repository and all other expectations are defined in the source repository. This is how Go works, for example, since the package name is the source repository location.
TIP: Difficulty in setting meaningful expectations for externalParameters can be a sign that the buildType’s level of abstraction is too low. For example, externalParameters that record a list of commands to run is likely impractical to verify because the commands change on every build. Instead, consider a buildType that defines the list of commands in a configuration file in a source repository, then put only the source repository in externalParameters. Such a design is easier to verify because the source repository is constant across builds.
Step 3: Check dependencies (recursively)
Finally, recursively check the resolvedDependencies as available and to the extent desired. This mitigates threat “E”. While SLSA v1.0 does not have any requirements on the completeness or verification of resolvedDependencies, one might wish to go beyond SLSA’s minimum requirements in order to protect against threats further up the supply chain.
One possible approach is to recursively verify each entry in resolvedDependencies. A Verification Summary Attestation (VSA) can make this process more efficient by recording the result of prior verifications. A trimming heuristic or exception mechanism will almost always be necessary because there will always be some transitive dependencies that are SLSA Build L0. (For example, consider the compiler’s compiler’s compiler’s … compiler.) If resolvedDependencies is incomplete, this can be done on a best-effort basis.
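A rough sketch of such a recursive check, with a depth limit as the trimming heuristic, is shown below. Everything here is hypothetical: `get_dependencies` stands in for fetching a dependency's provenance and reading its resolvedDependencies, and `is_verified` stands in for running Steps 1 and 2 (or consulting a VSA) for a single artifact. Artifacts are identified by opaque strings such as digests.

```python
from typing import Callable, Dict, Iterable, Set


def unverified_dependencies(
    root: str,
    get_dependencies: Callable[[str], Iterable[str]],
    is_verified: Callable[[str], bool],
    max_depth: int = 3,
) -> Set[str]:
    """Walk dependencies up to max_depth and collect those that fail verification."""
    shallowest: Dict[str, int] = {}  # shallowest depth at which each artifact was visited
    failures: Set[str] = set()

    def visit(artifact: str, depth: int) -> None:
        # Trim the walk at max_depth, and skip artifacts already explored
        # at this depth or shallower (this also breaks cycles).
        if depth > max_depth or shallowest.get(artifact, max_depth + 1) <= depth:
            return
        shallowest[artifact] = depth
        if not is_verified(artifact):
            failures.add(artifact)
        for dependency in get_dependencies(artifact):
            visit(dependency, depth + 1)

    for dependency in get_dependencies(root):
        visit(dependency, 1)
    return failures
```

For example, with a graph where `app` depends on `a` and `b`, and `a` depends on an unverifiable `c`, a walk with the default depth reports `c`, while `max_depth=1` trims it away.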
Index of build types
The following is a partial index of build type definitions. Each contains a complete example predicate.
TODO: Before marking the spec stable, add at least 1-2 other build types to validate that the design is general enough to apply to other builders.
Migrating from 0.2
To migrate from version 0.2 (old), use the following pseudocode. The meaning of each field is unchanged unless otherwise noted.
{
    "buildDefinition": {
        // The `buildType` MUST be updated for v1.0 to describe how to
        // interpret `inputArtifacts`.
        "buildType": /* updated version of */ old.buildType,
        "externalParameters":
            old.invocation.parameters + {
                // It is RECOMMENDED to rename "entryPoint" to something more
                // descriptive.
                "entryPoint": old.invocation.configSource.entryPoint,
                // It is OPTIONAL to rename "source" to something more descriptive,
                // especially if "source" is ambiguous or confusing.
                "source": old.invocation.configSource.uri,
            },
        "systemParameters": old.invocation.environment,
        "resolvedDependencies":
            old.materials + [
                {
                    "uri": old.invocation.configSource.uri,
                    "digest": old.invocation.configSource.digest,
                }
            ]
    },
    "runDetails": {
        "builder": {
            "id": old.builder.id,
            "version": null,  // not in v0.2
            "builderDependencies": null,  // not in v0.2
        },
        "metadata": {
            "invocationId": old.metadata.buildInvocationId,
            "startedOn": old.metadata.buildStartedOn,
            "finishedOn": old.metadata.buildFinishedOn,
        },
        "byproducts": null,  // not in v0.2
    },
}
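The pseudocode above could be written as runnable Python along the following lines. This is a sketch, not a reference implementation: the function name is hypothetical, the caller is assumed to supply the updated buildType, and the fields that do not exist in v0.2 are simply left unset rather than written as null.

```python
def migrate_v02_to_v1(old: dict, new_build_type: str) -> dict:
    """Convert a v0.2 provenance predicate into the v1 predicate layout."""
    invocation = old.get("invocation") or {}
    config_source = invocation.get("configSource") or {}
    metadata = old.get("metadata") or {}

    # externalParameters = old parameters plus the two configSource-derived entries.
    external_parameters = dict(invocation.get("parameters") or {})
    external_parameters["entryPoint"] = config_source.get("entryPoint")
    external_parameters["source"] = config_source.get("uri")

    # resolvedDependencies = old materials plus the config source itself.
    resolved_dependencies = list(old.get("materials") or [])
    resolved_dependencies.append(
        {"uri": config_source.get("uri"), "digest": config_source.get("digest")}
    )

    return {
        "buildDefinition": {
            "buildType": new_build_type,
            "externalParameters": external_parameters,
            "systemParameters": invocation.get("environment"),
            "resolvedDependencies": resolved_dependencies,
        },
        "runDetails": {
            # version, builderDependencies, and byproducts are not in v0.2.
            "builder": {"id": (old.get("builder") or {}).get("id")},
            "metadata": {
                "invocationId": metadata.get("buildInvocationId"),
                "startedOn": metadata.get("buildStartedOn"),
                "finishedOn": metadata.get("buildFinishedOn"),
            },
        },
    }
```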
The following fields from v0.2 are no longer present in v1.0:
- entryPoint: Use externalParameters[<name>] instead.
- buildConfig: No longer inlined into the provenance. Instead, either:
  - If the configuration is a top-level input, record its digest in externalParameters["config"].
  - Else if there is a known use case for knowing the exact resolved build configuration, record its digest in byproducts. An example use case might be someone who wishes to parse the configuration to look for bad patterns, such as curl | bash.
  - Else omit it.
- metadata.completeness: Now implicit from builder.id.
- metadata.reproducible: Now implicit from builder.id.
Change history
v1.0 RC1
Major refactor to reduce misinterpretation, including a minor change in model.
- Significantly expanded all documentation.
- Altered the model slightly to better align with real-world build systems, align with reproducible builds, and make verification easier.
- Grouped fields into buildDefinition vs runDetails.
- Renamed, with slight changes in semantics:
  - parameters -> externalParameters
  - environment -> systemParameters
  - materials -> resolvedDependencies
- Removed:
  - configSource: No longer special-cased. Now represented as externalParameters + resolvedDependencies.
  - buildConfig: No longer inlined into the provenance. Can be replaced with a reference in externalParameters or byproducts, depending on the semantics, or omitted if not needed.
  - completeness and reproducible: Now implied by builder.id.
- Added:
  - localName, downloadLocation, and mediaType
  - builder.version
  - byproducts
v0.2
Refactored to aid clarity and added buildConfig. The model is unchanged.
- Replaced definedInMaterial and entryPoint with configSource.
- Renamed recipe to invocation.
- Moved invocation.type to top-level buildType.
- Renamed arguments to parameters.
- Added buildConfig, which can be used as an alternative to configSource to validate the configuration.
rename: slsa.dev/provenance
Renamed to “slsa.dev/provenance”.
v0.1.1
- Added metadata.buildInvocationId.
v0.1
Initial version, named “in-toto.io/Provenance”
[1] The externalParameters SHOULD reflect reality. If clients send the evaluated configuration object directly to the build server, record the digest directly in externalParameters. If clients upload the configuration object to a temporary storage location and send that location to the build server, record the location in externalParameters as a URI and record the uri and digest in resolvedDependencies.