CA certificate rotation not working correctly #744

@sbernauer

Motivation

As soon as the CA cert is rotated, Trino clusters break suddenly and silently, and they never recover on their own.
This can happen at any time.

When a customer runs into this, it is very hard to detect and debug.

Problem

Coordinator and worker Pods are all green, but they don't work because of the following error:

io.airlift.discovery.client.DiscoveryException: Announcement failed for https://trino-coordinator-default-0.trino-coordinator-default.platform.svc.cluster.local:8443
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
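
Because the Pods stay green, the logs are the only place where this failure shows up. The following is a minimal sketch of a log check; the helper name has_trust_anchor_error is hypothetical, and the Pod name and namespace in the usage comment are simply read off the URL in the error message above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed (exit 0) if the log text on stdin contains
# the trust-anchor failure quoted in this issue, fail otherwise.
has_trust_anchor_error() {
  grep -q -e 'Path does not chain with any of the trust anchors'
}

# Usage sketch (Pod name and namespace taken from the error message above):
#   kubectl logs trino-coordinator-default-0 -n platform --all-containers \
#     | has_trust_anchor_error && echo "truststore is missing a CA cert"
```

A check like this could run periodically against coordinator and worker logs, since the Pod status alone does not reveal the problem.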

Problem analysis

secret-operator correctly adds both CA certs (old and new) to the mounted truststores:

stackable@trino-coordinator-default-0 /stackable/trino-server-451 $ openssl pkcs12 -info -in /stackable/mount_server_tls/truststore.p12 -passin "pass:" -legacy
MAC: sha1, Iteration 2048
MAC length: 20, salt length: 8
PKCS7 Encrypted data: pbeWithSHA1And40BitRC2-CBC, Iteration 2048
Certificate bag
Bag Attributes
    Trusted key usage (Oracle): <No Values>
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
Certificate bag
Bag Attributes
    Trusted key usage (Oracle): <No Values>
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJAOOUHWjeHWJDMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yNDA3MjUxNjAzMTha
Fw0yNjA3MjUxNjA4MThaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKu3yDo4xXkg
w7o3DGWFgbdwgf5N/QjWNlmJiu43DP0vM9cmLOpCvCctL6qeuZytGl5HZm1kAvow
v+QrY/GAb7w3fMl8JjlGgSRhilQD+lj/pBOWhGbL0wovgMiUudED6R2ZWGpJwfKG
3TCfbi4RYu9tG/9exZZCfs92OMBRgm6R7IXOagNUFqyKSS3E/pusG+TA4G1B2l2P
Jl0OXeo7Pr+zJ9KLbVTCxrN2BaOR0JmE5rRyq1qZWzDxfyaeC4lX++Hsrg6zy8qe
aRzs5qQVDaqGcAssdso5kzb3VDRvmz6xbx2ptZ2saq4Yu471G00NRRDtaQFQUqwU
7kt5YjvwZQECAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRf
SB4bSzUxYNYEk0rTKmxyep5BqDBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQDjlB1o3h1iQzAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBAG1/Hasiwk32UcxWVUWLw2B62kav19ls
KQI0rdCA6+4/aTRZmKYYMIcbkF+Dgzq1uCOJ6DK3OrTRTLfYFCMbav3SzjB/Aqse
/79umgp11o1O4p4Ks3lR5x5mu7Ll3QXtpfc/+qXERmmqOBRlkui41/NzA23hHG+0
/qS5FpW1fJA1K71+DnsIHlDD7kvI8+XkaJdmHVa0HlDDy9FBHPNU7DVjBfNJIuaf
o24WxVNybSaNBxqjomKjXhn0EQCFwck6CqkMsOZL67es3+NLqA+n9oSyQ8DHz5c1
T0WPlKlN7yzcnfuhRgI0uBxS4iVoGQlAXgPIu41ykYphGplUFWV3hrA=
-----END CERTIFICATE-----

The output is the same for openssl pkcs12 -info -in /stackable/mount_internal_tls/truststore.p12 -passin "pass:" -legacy.

However, the generated "non-mount" truststores are missing the new cert; they contain only the old one:

stackable@trino-coordinator-default-0 /stackable/trino-server-451 $ openssl pkcs12 -info -in /stackable/server_tls/truststore.p12 -passin "pass:changeit" -legacy
MAC: sha256, Iteration 10000
MAC length: 32, salt length: 20
PKCS7 Encrypted data: PBES2, PBKDF2, AES-256-CBC, Iteration 10000, PRF hmacWithSHA256
Certificate bag
Bag Attributes
    friendlyName: d4a39ffb-8f03-42fd-88e9-86e0c4150a2b
    Trusted key usage (Oracle): Any Extended Key Usage (2.5.29.37.0)
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----

The internal TLS truststore is also broken.
It actually contains the old CA twice!

stackable@trino-coordinator-default-0 /stackable/trino-server-451 $ openssl pkcs12 -info -in /stackable/internal_tls/truststore.p12 -passin "pass:changeit" -legacy
MAC: sha256, Iteration 10000
MAC length: 32, salt length: 20
PKCS7 Encrypted data: PBES2, PBKDF2, AES-256-CBC, Iteration 10000, PRF hmacWithSHA256
Certificate bag
Bag Attributes
    friendlyName: 384cec2d-b97a-4250-a519-b4095085d8e8
    Trusted key usage (Oracle): Any Extended Key Usage (2.5.29.37.0)
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
Certificate bag
Bag Attributes
    friendlyName: 8da0af5d-bd81-48ea-9624-69125c900f4a
    Trusted key usage (Oracle): Any Extended Key Usage (2.5.29.37.0)
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
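
Comparing long PEM dumps by eye is error-prone. The following is a minimal sketch that counts the total and the distinct certificates in such a dump; the helper names pem_bodies and cert_counts are hypothetical and not part of any Stackable tooling.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `openssl pkcs12 -info` output (or any PEM bundle)
# on stdin and print one line per certificate, with its base64 body joined.
pem_bodies() {
  awk '
    /^-----BEGIN CERTIFICATE-----/ { body = ""; in_cert = 1; next }
    /^-----END CERTIFICATE-----/   { if (in_cert) print body; in_cert = 0; next }
    in_cert                        { body = body $0 }
  '
}

# Hypothetical helper: print "<total> <unique>" certificate counts
# for the bundle on stdin.
cert_counts() {
  local bodies total unique
  bodies="$(pem_bodies)"
  # grep -c exits non-zero when it counts 0 lines, hence the "|| true".
  total="$(printf '%s\n' "$bodies" | grep -c . || true)"
  unique="$(printf '%s\n' "$bodies" | sort -u | grep -c . || true)"
  echo "$total $unique"
}

# Usage sketch:
#   openssl pkcs12 -info -in /stackable/internal_tls/truststore.p12 \
#     -passin "pass:changeit" -legacy | cert_counts
```

For the internal truststore dump above, this should print "2 1" (two entries, one distinct cert), while the mounted truststores should yield "2 2".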

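This happens because the following init-container script imports only the entry with alias 1 (-srcalias 1) from each mounted truststore. The internal truststore receives that entry from both mount_internal_tls and mount_server_tls, which explains the duplicate: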

echo Importing /stackable/mount_server_tls/truststore.p12 to /stackable/server_tls/truststore.p12
keytool -importkeystore -srckeystore /stackable/mount_server_tls/truststore.p12 -srcstoretype PKCS12 -srcstorepass "" -srcalias 1 -destkeystore /stackable/server_tls/truststore.p12 -deststoretype PKCS12 -deststorepass changeit -destalias $(cat /proc/sys/kernel/random/uuid) -noprompt
echo Importing /stackable/mount_server_tls/keystore.p12 to /stackable/server_tls/keystore.p12
keytool -importkeystore -srckeystore /stackable/mount_server_tls/keystore.p12 -srcstoretype PKCS12 -srcstorepass "" -destkeystore /stackable/server_tls/keystore.p12 -deststoretype PKCS12 -deststorepass changeit -noprompt
echo Importing /stackable/mount_internal_tls/truststore.p12 to /stackable/internal_tls/truststore.p12
keytool -importkeystore -srckeystore /stackable/mount_internal_tls/truststore.p12 -srcstoretype PKCS12 -srcstorepass "" -srcalias 1 -destkeystore /stackable/internal_tls/truststore.p12 -deststoretype PKCS12 -deststorepass changeit -destalias $(cat /proc/sys/kernel/random/uuid) -noprompt
echo Importing /stackable/mount_internal_tls/keystore.p12 to /stackable/internal_tls/keystore.p12
keytool -importkeystore -srckeystore /stackable/mount_internal_tls/keystore.p12 -srcstoretype PKCS12 -srcstorepass "" -destkeystore /stackable/internal_tls/keystore.p12 -deststoretype PKCS12 -deststorepass changeit -noprompt
echo Importing /stackable/mount_server_tls/truststore.p12 to /stackable/internal_tls/truststore.p12
keytool -importkeystore -srckeystore /stackable/mount_server_tls/truststore.p12 -srcstoretype PKCS12 -srcstorepass "" -srcalias 1 -destkeystore /stackable/internal_tls/truststore.p12 -deststoretype PKCS12 -deststorepass changeit -destalias $(cat /proc/sys/kernel/random/uuid) -noprompt
echo Importing /etc/pki/java/cacerts to /stackable/client_tls/truststore.p12
keytool -importkeystore -srckeystore /etc/pki/java/cacerts -srcstoretype jks -srcstorepass changeit -destkeystore /stackable/client_tls/truststore.p12 -deststoretype pkcs12 -deststorepass changeit -noprompt

The first thought was "why not just copy the truststore files?".
But we actually need to union the truststores, and other certs, such as the S3 CA cert, are added to these truststores later on, so we likely need something a bit more advanced than cp.
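
One way to union the truststores could look like the sketch below. It is not the fix from the linked solution, only an illustration under assumptions: the helper names list_aliases and import_all_entries are hypothetical, keytool is assumed to run with an English locale (so that entries are introduced by "Alias name: "), and the loop has not been tested against a real Stackable image.

```shell
#!/usr/bin/env bash

# Extract the aliases from `keytool -list -rfc` output on stdin.
# Assumes English keytool output, where each entry starts with a line
# of the form "Alias name: <alias>".
list_aliases() {
  sed -n 's/^Alias name: //p'
}

# Hypothetical helper: import every entry of truststore $1 into truststore $2.
# Each entry gets a random destination alias, as in the existing script,
# so entries from different source truststores never overwrite each other.
import_all_entries() {
  local src="$1" dest="$2" src_alias
  keytool -list -rfc -keystore "$src" -storetype PKCS12 -storepass "" \
    | list_aliases \
    | while IFS= read -r src_alias; do
        # stdin is redirected so keytool cannot consume the alias list.
        keytool -importkeystore \
          -srckeystore "$src" -srcstoretype PKCS12 -srcstorepass "" \
          -srcalias "$src_alias" \
          -destkeystore "$dest" -deststoretype PKCS12 -deststorepass changeit \
          -destalias "$(cat /proc/sys/kernel/random/uuid)" \
          -noprompt < /dev/null
      done
}

# Usage sketch:
#   import_all_entries /stackable/mount_server_tls/truststore.p12 \
#     /stackable/server_tls/truststore.p12
```

This sketch does not deduplicate, so a CA that is present in several source truststores would still end up in the destination more than once; that is redundant but should not affect validation.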

Solution

#764

Workaround

Delete the secret-op CA-cert Secret, causing a new CA cert to be created.
Afterwards, restart all Stackable Pods that use TLS.

Environment

Everywhere the secret-op autoTls backend is used
