Add Archivista Storage Backend#1316

Merged
tekton-robot merged 1 commit into tektoncd:main from testifysec:archivista-storage2
May 1, 2025

Conversation

@mikhailswift
Contributor

@mikhailswift mikhailswift commented Feb 27, 2025

Changes

Adds the ability to store signed TaskRun and PipelineRun results in Archivista. Archivista currently only supports payloads that are wrapped in a DSSE envelope, so any signatures that are not DSSE envelopes will not be stored.
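As a rough illustration of the DSSE-only constraint described above, the following Go sketch checks whether a signature blob has the shape of a DSSE envelope: JSON with `payloadType`, `payload`, and at least one `signatures` entry. The function name `looksLikeDSSE` and the sample values are hypothetical and are not taken from this PR's code; the actual backend may detect envelopes differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dsseEnvelope mirrors the top-level fields of a DSSE envelope.
type dsseEnvelope struct {
	PayloadType string `json:"payloadType"`
	Payload     string `json:"payload"`
	Signatures  []struct {
		KeyID string `json:"keyid"`
		Sig   string `json:"sig"`
	} `json:"signatures"`
}

// looksLikeDSSE is a hypothetical heuristic: it reports whether raw parses
// as JSON with a payload type, a payload, and at least one non-empty signature.
func looksLikeDSSE(raw []byte) bool {
	var env dsseEnvelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return false
	}
	if env.PayloadType == "" || env.Payload == "" || len(env.Signatures) == 0 {
		return false
	}
	for _, s := range env.Signatures {
		if s.Sig == "" {
			return false
		}
	}
	return true
}

func main() {
	envelope := []byte(`{"payloadType":"application/vnd.in-toto+json","payload":"e30=","signatures":[{"keyid":"","sig":"c2ln"}]}`)
	bareSignature := []byte("not-a-json-envelope")

	fmt.Println(looksLikeDSSE(envelope))      // true: would be eligible for upload
	fmt.Println(looksLikeDSSE(bareSignature)) // false: would be skipped
}
```

A shape check like this does not verify the signature itself; it only separates envelopes from other signature formats.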

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs included if any changes are user facing
  • Has Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings)
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

Signed attestations are now able to be stored in Archivista

@linux-foundation-easycla

linux-foundation-easycla bot commented Feb 27, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: mikhailswift / name: Mikhail Swift (0b46102)

@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 27, 2025
@tekton-robot

Hi @mikhailswift. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mikhailswift
Contributor Author

Tagging @colek42 for visibility

@mikhailswift mikhailswift force-pushed the archivista-storage2 branch 2 times, most recently from 41cd51c to 986ab3b Compare February 28, 2025 04:33
@PuneetPunamiya
Member

/ok-to-test

@tekton-robot tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 4, 2025
@tekton-robot

The following is the coverage report on the affected files.
Say /test pull-tekton-chains-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/chains/storage/archivista/archivista.go | Do not exist | 61.0% | |
| pkg/chains/storage/storage.go | 33.9% | 31.7% | -2.2 |

@colek42

colek42 commented Mar 12, 2025

@PuneetPunamiya do you see any issues with this PR? Let us know if you would like to discuss it at a community meeting. FYSA, Archivista is an in-toto project, which is CNCF graduated.

"chains.tekton.dev/archivista-gitoid": uploadResp.Gitoid,
"chains.tekton.dev/archivista-url": a.url,
}
obj.SetAnnotations(annotations)
Contributor

@anithapriyanatarajan anithapriyanatarajan Mar 18, 2025

@lcarva - right now we do not set any annotation on tekton objects post storage with other backend types. Is there any concern having the info captured post upload?

Contributor

I haven't reviewed the PR yet, but this is what comes to mind:

Chains will add annotations once it is done processing a Tekton object (TaskRun or PipelineRun). That happens right at the end of reconciliation. This update triggers another reconciliation, but that's generally ok because Chains quickly detects that the object has already been processed (based on its annotations) and skips that iteration of the reconciliation loop.

If Chains updates the Tekton object before those final annotations are added, the reconciliation that is triggered as a side-effect will cause signing and attesting to happen again. This may happen multiple times.

I would advise against changing the object during reconciliation. If needed, try tying into the already existing mechanism Chains has for adding the final annotations, see here.

NOTE: The tekton storage does update the object during reconciliation. We should probably fix that, but that's less of a concern given that that storage type is meant for test/develop.
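The skip behavior described above can be sketched roughly as follows. This is a simplified model, not the Chains reconciler: the `object` type and `reconcile` function are hypothetical, and the annotation key `chains.tekton.dev/signed` is assumed here to be the marker written once processing finishes.

```go
package main

import "fmt"

// signedAnnotation is assumed to be the marker added at the end of processing.
const signedAnnotation = "chains.tekton.dev/signed"

// object is a hypothetical stand-in for a TaskRun or PipelineRun.
type object struct {
	annotations map[string]string
	signCount   int // how many times signing and attesting ran
}

// reconcile signs the object unless the final marker is already present.
func reconcile(obj *object) {
	if obj.annotations[signedAnnotation] == "true" {
		return // already processed: skip this iteration
	}
	obj.signCount++ // stands in for signing, attesting, and storing
	obj.annotations[signedAnnotation] = "true"
}

func main() {
	obj := &object{annotations: map[string]string{}}
	reconcile(obj) // first pass does the work and sets the marker
	reconcile(obj) // pass triggered by the annotation update is skipped
	fmt.Println(obj.signCount) // prints 1
}
```

In this model, an update made before the marker is written would start a pass that still sees no marker, so the signing step would run again.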

Contributor

@lcarva Are you suggesting that tekton storage should not be used in production? Isn't it included in some of the default configuration?

It might be best for us to remove this for now and follow up with the annotation in a separate PR. cc @mikhailswift

Contributor

Are you suggesting that tekton storage should not be used in production?

Yes, 100%. This storage stores payloads as annotations which can get quite large if you have a pipeline with many tasks, for example. You'll face etcd issues eventually.

Isn't it included in some of the default configuration?

Yes, it's not ideal. We could pursue changing this default to OCI which is a more suitable storage type that requires no configuration if you are already building container images.
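To make the size concern concrete: Kubernetes validation caps the combined size of all annotation keys and values on one object at 256 KiB. The sketch below estimates whether a payload would still fit; the helper names are hypothetical, and it assumes the key being added is not already present on the object.

```go
package main

import (
	"fmt"
	"strings"
)

// totalAnnotationSizeLimit is the per-object cap Kubernetes applies to
// the summed length of annotation keys and values (256 KiB).
const totalAnnotationSizeLimit = 256 * 1024

// annotationBytes sums the key and value lengths of all annotations.
func annotationBytes(annotations map[string]string) int {
	total := 0
	for k, v := range annotations {
		total += len(k) + len(v)
	}
	return total
}

// fitsAsAnnotation reports whether adding key=value keeps the object under
// the limit. It assumes key is not already in existing.
func fitsAsAnnotation(existing map[string]string, key, value string) bool {
	return annotationBytes(existing)+len(key)+len(value) <= totalAnnotationSizeLimit
}

func main() {
	existing := map[string]string{"example.com/owner": "team-a"}
	small := strings.Repeat("a", 1024)
	large := strings.Repeat("a", 300*1024)

	fmt.Println(fitsAsAnnotation(existing, "example.com/payload", small)) // true
	fmt.Println(fitsAsAnnotation(existing, "example.com/payload", large)) // false
}
```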

Contributor Author

The annotations have been removed from the Archivista storage backend

Contributor

@mikhailswift could you please help with the steps a user has to perform with cosign to verify-blob-attestation or verify-attestation accessing the uploaded content from archivista?

@tekton-robot

The following is the coverage report on the affected files.
Say /test pull-tekton-chains-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/chains/storage/archivista/archivista.go | Do not exist | 63.6% | |
| pkg/chains/storage/storage.go | 33.9% | 31.7% | -2.2 |

@tekton-robot

The following is the coverage report on the affected files.
Say /test pull-tekton-chains-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/chains/storage/archivista/archivista.go | Do not exist | 63.6% | |

@tekton-robot

The following is the coverage report on the affected files.
Say /test pull-tekton-chains-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/chains/storage/archivista/archivista.go | Do not exist | 63.6% | |
| pkg/chains/storage/storage.go | 33.9% | 31.7% | -2.2 |

return nil
}

// RetrievePayload is not implemented for Archivista.
Member

I think it would be nice to create an issue in the Archivista project asking for a function to retrieve the data

cc @lcarva @wlynch

Contributor Author

@mikhailswift mikhailswift Mar 20, 2025

I think the main thing we need here is details of what Tekton uses to retrieve the data. I looked into it a bit and I am a bit unclear on what the ShortKey is that is being used to find the payload.

Archivista can search for payloads with its GraphQL API and download the DSSE envelope. From Archivista's side, we need to make sure we are indexing the ShortKey or some other identifier that Tekton will use to retrieve the payload.

Contributor

@mikhailswift - Please look up https://github.com/tektoncd/chains/blob/main/pkg/config/options.go#L26C1-L36C1. Does this help?

// ShortKey is the short version of an artifact identifier. This is useful for annotation based storage
// because annotation key has limitations (https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/#syntax-and-character-set).
// - For OCI artifact, it is first 12 chars of the image digest.
// - For TaskRun/PipelineRun artifact, it is `<KIND>-<UID>`.
ShortKey string
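The two ShortKey forms described in the comment above could be derived along these lines. This is a sketch based only on that comment, not the Chains implementation: the helper names are hypothetical, and stripping an algorithm prefix such as `sha256:` before taking the first 12 characters is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// runShortKey builds the `<KIND>-<UID>` form used for TaskRun/PipelineRun artifacts.
func runShortKey(kind, uid string) string {
	return fmt.Sprintf("%s-%s", kind, uid)
}

// ociShortKey returns the first 12 characters of an image digest.
// Assumption: an "algorithm:" prefix, if present, is dropped first.
func ociShortKey(digest string) string {
	encoded := digest
	if i := strings.Index(digest, ":"); i >= 0 {
		encoded = digest[i+1:]
	}
	if len(encoded) > 12 {
		encoded = encoded[:12]
	}
	return encoded
}

func main() {
	fmt.Println(runShortKey("taskrun", "8f1c2d3e"))                      // taskrun-8f1c2d3e
	fmt.Println(ociShortKey("sha256:0123456789abcdef0123456789abcdef")) // 0123456789ab
}
```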

Contributor Author

Yeah, I think this helps. One thing I'm not entirely clear on -- does the UID appear in the in-toto attestation payload?

If so, Archivista can easily look for that and put it in a searchable edge off of the DSSE envelope.

Contributor

FYI, the Retrieve* functions are only used as part of the tkn chain CLI tool. They're not that useful and I'd be shocked if anyone actually uses it. So I think it's fine to leave these as not implemented.

@PuneetPunamiya
Member

PuneetPunamiya commented Mar 19, 2025

One thing I would encourage: it would be nice if you could submit a follow-up PR adding a small tutorial for the storage backend you have added, similar to the ones in https://github.com/tektoncd/chains/tree/main/docs/tutorials

@mikhailswift
Contributor Author

I addressed the few comments here -- If I can get clarification on what ShortKey is in regards to the retrieval functions I can add the indexes to Archivista to support retrieval.

I can add that and a tutorial in a followup PR.

@tekton-robot

The following is the coverage report on the affected files.
Say /test pull-tekton-chains-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/chains/storage/archivista/archivista.go | Do not exist | 63.6% | |
| pkg/chains/storage/storage.go | 33.9% | 31.7% | -2.2 |

Adds the ability to store signed TaskRun and PipelineRun results in
Archivista. Archivista currently only supports payloads that are wrapped
in a DSSE envelope, so any signatures that are not DSSE envelopes will
not be stored.
@mikhailswift
Contributor Author

Re-pushed to address the linting job errors

@tekton-robot

The following is the coverage report on the affected files.
Say /test pull-tekton-chains-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/chains/storage/archivista/archivista.go | Do not exist | 63.6% | |
| pkg/chains/storage/storage.go | 33.9% | 31.7% | -2.2 |

@mikhailswift
Contributor Author

Hi @PuneetPunamiya , just wanted to check if there's anything I need to do yet for this PR. As discussed I will follow up with a PR that includes a tutorial.

@PuneetPunamiya
Member

Hi @PuneetPunamiya , just wanted to check if there's anything I need to do yet for this PR. As discussed I will follow up with a PR that includes a tutorial.

Sorry for the late response
The PR looks good to me
/approve

cc @lcarva @chitrangpatel

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 14, 2025
@afrittoli
Member

@tektoncd/chains-maintainers this is ready for a second reviewer

@tekton-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lcarva, PuneetPunamiya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [PuneetPunamiya,lcarva]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lcarva
Contributor

lcarva commented May 1, 2025

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label May 1, 2025
@tekton-robot tekton-robot merged commit 91977df into tektoncd:main May 1, 2025
5 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

8 participants