Description
Motivation
As soon as CA cert rotation happens, Trino clusters suddenly and silently break and do not recover on their own.
This can happen at any time.
When a customer runs into this, it is very hard to detect and debug.
Problem
The coordinator and worker Pods are all green, but the cluster does not work because of:
io.airlift.discovery.client.DiscoveryException: Announcement failed for https://trino-coordinator-default-0.trino-coordinator-default.platform.svc.cluster.local:8443
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
Problem analysis
secret-operator correctly adds both CA certs (old and new) to the mounted truststores:
stackable@trino-coordinator-default-0 /stackable/trino-server-451 $ openssl pkcs12 -info -in /stackable/mount_server_tls/truststore.p12 -passin "pass:" -legacy
MAC: sha1, Iteration 2048
MAC length: 20, salt length: 8
PKCS7 Encrypted data: pbeWithSHA1And40BitRC2-CBC, Iteration 2048
Certificate bag
Bag Attributes
Trusted key usage (Oracle): <No Values>
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
Certificate bag
Bag Attributes
Trusted key usage (Oracle): <No Values>
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJAOOUHWjeHWJDMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yNDA3MjUxNjAzMTha
Fw0yNjA3MjUxNjA4MThaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKu3yDo4xXkg
w7o3DGWFgbdwgf5N/QjWNlmJiu43DP0vM9cmLOpCvCctL6qeuZytGl5HZm1kAvow
v+QrY/GAb7w3fMl8JjlGgSRhilQD+lj/pBOWhGbL0wovgMiUudED6R2ZWGpJwfKG
3TCfbi4RYu9tG/9exZZCfs92OMBRgm6R7IXOagNUFqyKSS3E/pusG+TA4G1B2l2P
Jl0OXeo7Pr+zJ9KLbVTCxrN2BaOR0JmE5rRyq1qZWzDxfyaeC4lX++Hsrg6zy8qe
aRzs5qQVDaqGcAssdso5kzb3VDRvmz6xbx2ptZ2saq4Yu471G00NRRDtaQFQUqwU
7kt5YjvwZQECAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRf
SB4bSzUxYNYEk0rTKmxyep5BqDBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQDjlB1o3h1iQzAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBAG1/Hasiwk32UcxWVUWLw2B62kav19ls
KQI0rdCA6+4/aTRZmKYYMIcbkF+Dgzq1uCOJ6DK3OrTRTLfYFCMbav3SzjB/Aqse
/79umgp11o1O4p4Ks3lR5x5mu7Ll3QXtpfc/+qXERmmqOBRlkui41/NzA23hHG+0
/qS5FpW1fJA1K71+DnsIHlDD7kvI8+XkaJdmHVa0HlDDy9FBHPNU7DVjBfNJIuaf
o24WxVNybSaNBxqjomKjXhn0EQCFwck6CqkMsOZL67es3+NLqA+n9oSyQ8DHz5c1
T0WPlKlN7yzcnfuhRgI0uBxS4iVoGQlAXgPIu41ykYphGplUFWV3hrA=
-----END CERTIFICATE-----
Same output for openssl pkcs12 -info -in /stackable/mount_internal_tls/truststore.p12 -passin "pass:" -legacy.
However, the generated "non-mount" truststores are missing the new cert; they contain only the old one:
stackable@trino-coordinator-default-0 /stackable/trino-server-451 $ openssl pkcs12 -info -in /stackable/server_tls/truststore.p12 -passin "pass:changeit" -legacy
MAC: sha256, Iteration 10000
MAC length: 32, salt length: 20
PKCS7 Encrypted data: PBES2, PBKDF2, AES-256-CBC, Iteration 10000, PRF hmacWithSHA256
Certificate bag
Bag Attributes
friendlyName: d4a39ffb-8f03-42fd-88e9-86e0c4150a2b
Trusted key usage (Oracle): Any Extended Key Usage (2.5.29.37.0)
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
The internal TLS truststore is also broken.
It actually contains the old CA twice!
stackable@trino-coordinator-default-0 /stackable/trino-server-451 $ openssl pkcs12 -info -in /stackable/internal_tls/truststore.p12 -passin "pass:changeit" -legacy
MAC: sha256, Iteration 10000
MAC length: 32, salt length: 20
PKCS7 Encrypted data: PBES2, PBKDF2, AES-256-CBC, Iteration 10000, PRF hmacWithSHA256
Certificate bag
Bag Attributes
friendlyName: 384cec2d-b97a-4250-a519-b4095085d8e8
Trusted key usage (Oracle): Any Extended Key Usage (2.5.29.37.0)
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
Certificate bag
Bag Attributes
friendlyName: 8da0af5d-bd81-48ea-9624-69125c900f4a
Trusted key usage (Oracle): Any Extended Key Usage (2.5.29.37.0)
subject=CN=secret-operator self-signed
issuer=CN=secret-operator self-signed
-----BEGIN CERTIFICATE-----
MIIDVTCCAj2gAwIBAgIJALXTYRseLx2QMA0GCSqGSIb3DQEBCwUAMCYxJDAiBgNV
BAMMG3NlY3JldC1vcGVyYXRvciBzZWxmLXNpZ25lZDAeFw0yMzA3MjYxNDE1MDla
Fw0yNTA3MjUxNDIwMDlaMCYxJDAiBgNVBAMMG3NlY3JldC1vcGVyYXRvciBzZWxm
LXNpZ25lZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMFnNZK3ak+q
idsqpCSbSDoC/T//pd5om4DviLc7Drs2a/c0yYeOHmKkWxVauY35eKdSq1PmUI+Z
oSd8AtFEg1KXngfyUWQthKEfVKuXGrMrAYpASU+shsaDXt1ShcIuudAVWg76aqHw
wRHzDP4OPrzu08mef7cSFpg0W/ZgInmE1sOFfIoSDFPt0rN3WiaiCbAgtNATzNA0
8FxxSE3N4oe2T49Owy2CVwLSDAAiuEPc0NXbbfdptbf0mhYJ+abadXYpHX0VzVnT
QyTH0ufnbE9RB/rVGnvNEeG5nBRMJrOPvgNSioi0XwKNxql7yvTtIHu8RB24/+/B
bIEJ9stb++kCAwEAAaOBhTCBgjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSF
esSthRpbxiDMrtNQroKB7ZHQgzBABgNVHSMEOTA3oSqkKDAmMSQwIgYDVQQDDBtz
ZWNyZXQtb3BlcmF0b3Igc2VsZi1zaWduZWSCCQC102EbHi8dkDAOBgNVHQ8BAf8E
BAMCAYYwDQYJKoZIhvcNAQELBQADggEBADpv8TKnlLLLSyZntpLCvch40l3hG5D3
TxA1/K0uqMBO+emq25oJO6qDBJ3u1W1sCW+dV2CSBzSlZi/cBJ9xdHA3aWr7suwG
Dp7hVy2LrERUBQNWs/p7/vw/wJRrKT4MxN3VOl9n751TQMohKumCTvCS1ebtAGlp
JqMfZP59CUwfqXCP3ymfK1v7Bt+ZGG9EesE0hOQO8jZ3xCoI9Ub2dyzn4qIIlbFo
Bq5TJHJNnt1BusRTgzFkuLSMNxJ1BVeItjRpYeyTclfoD4Smpr3RDTHKKWloWYzi
n4xGnN4bYt3epn+xsSvQjkcC2HWc1n3OXM5HJPnUiIt5VfubamOUXTo=
-----END CERTIFICATE-----
This happens because of the following init-container script:
echo Importing /stackable/mount_server_tls/truststore.p12 to /stackable/server_tls/truststore.p12
keytool -importkeystore -srckeystore /stackable/mount_server_tls/truststore.p12 -srcstoretype PKCS12 -srcstorepass "" -srcalias 1 -destkeystore /stackable/server_tls/truststore.p12 -deststoretype PKCS12 -deststorepass changeit -destalias $(cat /proc/sys/kernel/random/uuid) -noprompt
echo Importing /stackable/mount_server_tls/keystore.p12 to /stackable/server_tls/keystore.p12
keytool -importkeystore -srckeystore /stackable/mount_server_tls/keystore.p12 -srcstoretype PKCS12 -srcstorepass "" -destkeystore /stackable/server_tls/keystore.p12 -deststoretype PKCS12 -deststorepass changeit -noprompt
echo Importing /stackable/mount_internal_tls/truststore.p12 to /stackable/internal_tls/truststore.p12
keytool -importkeystore -srckeystore /stackable/mount_internal_tls/truststore.p12 -srcstoretype PKCS12 -srcstorepass "" -srcalias 1 -destkeystore /stackable/internal_tls/truststore.p12 -deststoretype PKCS12 -deststorepass changeit -destalias $(cat /proc/sys/kernel/random/uuid) -noprompt
echo Importing /stackable/mount_internal_tls/keystore.p12 to /stackable/internal_tls/keystore.p12
keytool -importkeystore -srckeystore /stackable/mount_internal_tls/keystore.p12 -srcstoretype PKCS12 -srcstorepass "" -destkeystore /stackable/internal_tls/keystore.p12 -deststoretype PKCS12 -deststorepass changeit -noprompt
echo Importing /stackable/mount_server_tls/truststore.p12 to /stackable/internal_tls/truststore.p12
keytool -importkeystore -srckeystore /stackable/mount_server_tls/truststore.p12 -srcstoretype PKCS12 -srcstorepass "" -srcalias 1 -destkeystore /stackable/internal_tls/truststore.p12 -deststoretype PKCS12 -deststorepass changeit -destalias $(cat /proc/sys/kernel/random/uuid) -noprompt
echo Importing /etc/pki/java/cacerts to /stackable/client_tls/truststore.p12
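keytool -importkeystore -srckeystore /etc/pki/java/cacerts -srcstoretype jks -srcstorepass changeit -destkeystore /stackable/client_tls/truststore.p12 -deststoretype pkcs12 -deststorepass changeit -noprompt
Every truststore import passes -srcalias 1, so keytool copies only a single entry from the source truststore. That would explain why the new CA never arrives in the generated truststores. The internal truststore receives that one entry from both mounted truststores, which would explain the duplicated old CA.
The first thought was "why not just copy the truststore files?".
But we actually need to union the truststores, and other certs, such as the S3 CA cert, are added to these truststores later on, so we likely need something a bit more advanced than cp.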
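One possible shape for such a union step, as a rough sketch rather than the actual fix: dump each source truststore to PEM, split the dump into one file per certificate, and import every file under an alias derived from its content so that duplicates collapse. The helper names split_pem_bundle and import_cert_dir and the /tmp paths are made up for illustration. The sketch assumes sha256sum, awk and keytool are available in the init container, and it treats two certs as identical only when their PEM text matches.

```shell
# Sketch only: split_pem_bundle and import_cert_dir are hypothetical helpers,
# not part of any Stackable operator.

# Write every certificate found in a PEM bundle to its own file in outdir.
# Files are named after the SHA-256 of their PEM text, so a certificate that
# shows up in several bundles is stored only once.
split_pem_bundle() {
  bundle=$1
  outdir=$2
  mkdir -p "$outdir"
  awk -v dir="$outdir" '
    /-----BEGIN CERTIFICATE-----/ { n++; inside = 1 }
    inside { print > (dir "/tmp-" n ".pem") }
    /-----END CERTIFICATE-----/ { inside = 0; close(dir "/tmp-" n ".pem") }
  ' "$bundle"
  for f in "$outdir"/tmp-*.pem; do
    [ -e "$f" ] || continue
    digest=$(sha256sum "$f" | cut -d' ' -f1)
    mv "$f" "$outdir/$digest.pem"
  done
}

# Import every certificate file from certdir into the destination truststore,
# using the file name (the digest) as a unique alias.
import_cert_dir() {
  certdir=$1
  dest=$2
  for cert in "$certdir"/*.pem; do
    [ -e "$cert" ] || continue
    cert_alias=$(basename "$cert" .pem)
    keytool -importcert -noprompt \
      -alias "$cert_alias" -file "$cert" \
      -keystore "$dest" -storetype PKCS12 -storepass changeit
  done
}

# Usage (the /tmp paths are placeholders):
#   openssl pkcs12 -in /stackable/mount_server_tls/truststore.p12 -passin "pass:" -nokeys -legacy > /tmp/server.pem
#   split_pem_bundle /tmp/server.pem /tmp/certs
#   import_cert_dir /tmp/certs /stackable/server_tls/truststore.p12
```

Collecting the certs of all source truststores into one directory before importing keeps the union and the de-duplication in one place, and later additions such as the S3 CA cert could go through the same path.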
Solution
Workaround
Delete the secret-op CA-cert Secret, causing a new CA cert to be created.
Afterwards restart all Stackable Pods using TLS.
Environment
Everywhere the secret-op autoTls backend is used.