Kafka Connect: Add Trivy CVE scan to CI#15430
Conversation
|
i think its better to do this in the docker image publishing step, similar to what Superset is doing |
|
@kevinjqliu I'm not sure I follow. How's this relate to publishing docker images? This is to identify CVEs in the Kafka Connect connector itself. thanks. |
|
Looking at how Trivy is used in Superset, its scanning the docker image only. In this PR, we're unpacking the jars to scan. I think it makes more sense to build a kafka connect image and then use Trivy to scan the image. Just my preference |
Not really clear why we would add the infra to build a docker image just to use trivy. Seems like a direct scan is more parsimonious |
rymurr
left a comment
There was a problem hiding this comment.
LGTM, I just want to see if we can avoid 2x build w/o over complicating the CI job
Just looking at general patterns from the apache repos. https://grep.app/search?f.repo.pattern=apache%2F&q=uses%3A+aquasecurity%2Ftrivy-action The only instance of From the docs, it seems like Filesystem scan and Container image scan are similar in that they both scan for Vulnerabilities, Misconfigurations, Secrets, and Licenses. I think it would be helpful here if you can run the change on your fork repo and see if the fs trivy scan catches the currently open CVE for kafka connect (#15440) |
rymurr
left a comment
There was a problem hiding this comment.
The changes themselves look good to me. I am just confused/questioning the github actions config...
My assumption is we want:
- results to be uploaded regardless of pass/fail states of steps above
- the job to fail if it detects errors
- the 2nd scan to run regardless of if the other failed
I think whats happening now is the first always looks like it fails, the second always looks like it passes and the entire job always looks like it passes.
It might be worth trying this a few times in your own fork to make sure this actaully has the edge cases we expect.
There was a problem hiding this comment.
Thanks for the PR! I think this is a great addition, however I'm concerned about supply chain risk due to the recent (second) compromise of trivy's infra.
See
- https://opensourcemalware.com/repository/https%3A%2F%2Fgithub.com%2Faquasecurity%2Ftrivy%2F
- https://labs.boostsecurity.io/articles/20-days-later-trivy-compromise-act-ii/
- https://www.wiz.io/blog/trivy-compromised-teampcp-supply-chain-attack
I see we are using aquasecurity/trivy-action@e368e328979b113139d6f9068e03accaed98a518 . Glad to see we're using the commit hash.
I believe this is still ongoing; we should wait until theres resolution to proceed with this PR.
EDIT: looks like trivy was just pulled from the list of allowed actions by asf-infra apache/infrastructure-actions#548
https://infra.apache.org/blog/trivy_security_incident.html
@kevinjqliu PR to add it: apache/infrastructure-actions#573 |
|
apache/infrastructure-actions#582 is merged ASF Infra has already allowlisted it |
Thanks @kevinjqliu. |
7f169af to
9d21bb3
Compare
|
OK, I think this PR is ready for folk to take another look please. How it works:
Here are the execution paths, expected behaviours, and test evidence:
¹ Push-event workflows only trigger when present on the default branch, so scenarios 3 & 4 reference runs from an earlier iteration where Trivy was a step in the test job. The scan configuration and SARIF upload logic are unchanged. N.B. To test "no CVEs present" condition, known CVEs were temporarily suppressed via |
ed906dd to
0264d75
Compare
| # Behaviour: | ||
| # - Flag, don't block: the scan step uses exit-code 1 so it | ||
| # "fails" when CVEs are found, but continue-on-error keeps | ||
| # the overall job green. GitHub Actions shows the step with |
There was a problem hiding this comment.
I don't think this is true. Perhaps a hallucination.
My suggestion would be to invert the flow.
- remove
continue-on-errorandexit-codefrom the first run. - change the
ifstatement toalways() && matrix.jvm==21
This job will fail adn show 'red' in the UI but its a non-blocking check and the PR can still be merged. the always clause ensures that the results are always uploaded and printed regardless of the state of the scan.
I think this has the desired ux: the user knows they have a CVE and why but it isn't blocking for the PR (eg if the patched version hasn';t been released yet)
There was a problem hiding this comment.
Makes sense, thanks @rymurr. Sounds like a good option.
Out of interest where can I see which checks will block a merge and which won't?
Re. exit-code, don't we still want this to be 1, otherwise the step won't fail (it defaults to 0) and show the job as red?
There was a problem hiding this comment.
OK, I've made this change now, and updated the above detail & testing to reflect it.
@rymurr please could you take another look?
6d19226 to
d1b2ca2
Compare
216abed to
e3a2a5d
Compare
Add a new workflow (kafka-connect-cve-scan.yml) that scans bundled Kafka Connect jars for known CVEs using Trivy in rootfs mode. The scan runs as a separate, non-required check so that CVE findings are visible without blocking merges. On push to main/release branches, results are uploaded as SARIF to the GitHub Security tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cf65721 to
8856129
Compare
Add a new workflow (kafka-connect-cve-scan.yml) that scans bundled Kafka Connect jars for known CVEs using Trivy in rootfs mode. The scan runs as a separate, non-required check so that CVE findings are visible without blocking merges. On push to main/release branches, results are uploaded as SARIF to the GitHub Security tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8856129 to
507443e
Compare
507443e to
7e1308a
Compare
Add a new workflow (kafka-connect-cve-scan.yml) that scans bundled Kafka Connect jars for known CVEs using Trivy in rootfs mode. The scan runs as a separate, non-required check so that CVE findings are visible without blocking merges. On push to main/release branches, results are uploaded as SARIF to the GitHub Security tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a new workflow (kafka-connect-cve-scan.yml) that scans bundled Kafka Connect jars for known CVEs using Trivy in rootfs mode. The scan runs as a separate, non-required check so that CVE findings are visible without blocking merges. On push to main/release branches, results are uploaded as SARIF to the GitHub Security tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7e1308a to
da8cc4c
Compare
Add a new workflow (kafka-connect-cve-scan.yml) that scans bundled Kafka Connect jars for known CVEs using Trivy in rootfs mode. The scan runs as a separate, non-required check so that CVE findings are visible without blocking merges. On push to main/release branches, results are uploaded as SARIF to the GitHub Security tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
da8cc4c to
1239c13
Compare
Add a new workflow (kafka-connect-cve-scan.yml) that scans bundled Kafka Connect jars for known CVEs using Trivy in rootfs mode. The scan runs as a separate, non-required check so that CVE findings are visible without blocking merges. On push to main/release branches, results are uploaded as SARIF to the GitHub Security tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1239c13 to
ba1fccc
Compare
|
This LGTM. I think its a good start. We can tweak to see how much people pay attention to this alert and how easy they are to imporve |
|
looks like this is blocking CI for main branch, i opened #16287 as a followup. Please take a look! |
Summary
Context
Discussion on dev@ mailing list: https://lists.apache.org/thread/kbf98950pzstzgon92st7mh9vrbv5yhb
Confluent Marketplace requires a Trivy scan before listing connectors. This has previously caught CVEs that needed patching (e.g. #14985). Running the scan in CI catches vulnerabilities during development and — critically — on RC tags before the release vote starts, when fixes can still be applied.
This is independent of #15212 (adding the KC artifact to the release process) and can land in either order.