-
Notifications
You must be signed in to change notification settings - Fork 0
Description
After a discussion with @lfrancke we want to consolidate the way we handle encryption, authentication and authorization. This means having a unified global structure across all operators.
This will result in breaking changes for the released versions of ZooKeeper and Superset according to the feature tracker.
Problem
1. Mixing encryption and authentication
Currently Kafka and Zookeeper mix TLS encryption and authentication as follows (see the client_authentication part):
pub struct GlobalKafkaConfig {
/// Only affects client connections. This setting controls:
/// - If TLS encryption is used at all
/// - Which cert the servers should use to authenticate themselves against the client
/// Defaults to `TlsSecretClass` { secret_class: "tls".to_string() }.
#[serde(
default = "tls_secret_class_default",
skip_serializing_if = "Option::is_none"
)]
pub tls: Option<TlsSecretClass>,
/// Only affects client connections. This setting controls:
/// - If clients need to authenticate themselves against the server via TLS
/// - Which ca.crt to use when validating the provided client certs
/// Defaults to `None`
#[serde(skip_serializing_if = "Option::is_none")]
pub client_authentication: Option<ClientAuthenticationClass>,
/// Only affects internal communication. Use mutual verification between Kafka nodes
/// This setting controls:
/// - Which cert the servers should use to authenticate themselves against other servers
/// - Which ca.crt to use when validating the other server
#[serde(
default = "tls_secret_class_default",
skip_serializing_if = "Option::is_none"
)]
pub internal_tls: Option<TlsSecretClass>,
}
This was ok for now since we did not support any other authentication methods.
Now if we want to support different authentication methods e.g. LDAP, this will get a little messy.
2. No global top level config
Other products do not use a "global" config. Authentication (LDAP) resides in the top level, OPA discovery configmap (authorization) as well. Example from superset:
pub struct SupersetClusterSpec {
/// Emergency stop button, if `true` then all pods are stopped without affecting configuration (as setting `replicas` to `0` would)
#[serde(default, skip_serializing_if = "Option::is_none")]
pub stopped: Option<bool>,
/// Desired Superset version
#[serde(default, skip_serializing_if = "Option::is_none")]
pub version: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub statsd_exporter_version: Option<String>,
pub credentials_secret: String,
pub mapbox_secret: Option<String>,
#[serde(default)]
pub load_examples_on_init: Option<bool>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub authentication_config: Option<SupersetClusterAuthenticationConfig>,
#[serde(default, skip_serializing_if = "Option::is_none")]
...
}
This should be consistent across all operators.
Proposed solution
Therefore we propose the following clean "common" operator config (we need to discuss what we could move to operator-rs), where we clearly separate encryption, authentication and authorization:
version: ....
clusterConfig:
tls/encryption:
client: secret_class...
internal: secret_class
authentication:
(m)Tls: authentication_class
ldap: authentication_class
# or
authentication:
- authentication_class (tls)
- authentication_class (ldap)
authorization:
opa:
configmap: ....
Outcome
We decided to use the following spec (example represents mostly Trino/Superset, mixed with random fields). This should be properly refined in the sub tickets for each operator.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
name: simple-trino
spec:
version: 396-stackable0.1.0
clusterConfig:
catalogLabelSelector:
matchLabels:
trino: simple-trino
# Just copy paste existing structure
# In case of Trino (easiest)
authentication:
method:
ldap: ...
# We recommend the following list of authenticationClasses
# If this recommendation does not match a particular product (e.g. no authentication at all) it can do it differently
# (1) In case of Trino (intermediate solution as it is quicker to implement)
authentication: # TODO call it clientAuthentication?
- authenticationClass: authentication_class (tls)
- authenticationClass: authentication_class (ldap)
- multiUser: <secret-name>
# (2) OR in case of Trino (final solution)
authentication:
- authenticationClass: trino-ldap-authentication-class (tls) # String
- authenticationClass: authentication_class (ldap)
- authenticationClass: authentication_class (static)
# (1) In case of Superset (solution that is quicker to implement)
authentication:
authenticationClass: my-ldap (ldap)
userRegistration: true
userRegistration_role: Admin
syncRolesAt: Login
# (2) OR in case of Superset (alternative solution)
authentication:
- authenticationClass: my-ldap (ldap)
userRegistration: true
userRegistration_role: Admin
authorization: # TODO define
# Maybe a AuthorizationClass (probably not)?
# How should we handle group memberships?
# E.g. for LDAP give the option of providing a LDAP AuthenticationClass to retrieve ldap group membership?
# For now we want to start with the following opa section for all the products that currently support opa
# Individual struct in every operator, no operator-rs struct
opa:
configmap: my-opa
package: simple-trino
# We don't want to start with an operator-rs struct as we have products that don't support tls at all,
# some only server tls and some server + internal tls. We can later on non-breaking move it to a operator-rs
# struct if we want
tls:
serverSecretClass: secret_class # Option<String>. *In general* defaults to "tls"
internalSecretClass: secret_class # Option<String>. *In general* defaults to "tls"
# Not Trino/Superset related
zookeeperConfigMapName: xyz
deepStorage: ...
coodinators: ....
Since all operators are currently inconsistent, we decided to implement a final approach for the Druid operator. Druid because it has a clean slate regarding TLS and Authentication and is able to use OPA as well.
The final approach (concerning authentication) are the fields annotated with (2). This is our most sophisticated approach and therefore will be prototyped on Druid.
Furthermore all top level configuration parameters except version/image (Deepstorage, S3, Discovery configmaps etc.) should be moved under the clusterConfig top level.
Tasks
Note: Github Taskslist may break the order, check before pulling new tasks in!
### Tasks
- [x] https://github.com/stackabletech/operator-rs/issues/494
- [x] https://github.com/stackabletech/druid-operator/issues/6
- [x] https://github.com/stackabletech/druid-operator/issues/144
- [x] https://github.com/stackabletech/zookeeper-operator/issues/596
- [x] https://github.com/stackabletech/kafka-operator/issues/529
- [ ] https://github.com/stackabletech/hdfs-operator/issues/289
- [ ] https://github.com/stackabletech/hbase-operator/issues/329
- [x] https://github.com/stackabletech/hive-operator/issues/291
- [x] https://github.com/stackabletech/airflow-operator/issues/247
- [x] https://github.com/stackabletech/trino-operator/issues/395
- [x] https://github.com/stackabletech/trino-operator/issues/401
- [x] https://github.com/stackabletech/superset-operator/issues/339
- [ ] https://github.com/stackabletech/nifi-operator/pull/417
- [ ] https://github.com/stackabletech/airflow-operator/issues/298
- [ ] https://github.com/stackabletech/nifi-operator/issues/485
Followups
- How do we expose TLS/Authentication in discovery config maps?
Metadata
Metadata
Assignees
Type
Projects
Status