This Helm chart deploys a one-time data upload job or a continuous replication service on Kubernetes. It supports three deployment types:

- `SNAPSHOT_ONLY`: a one-time Job for uploading a snapshot of data.
- `INCREMENT_ONLY`: a StatefulSet for continuous data replication.
- `SNAPSHOT_AND_INCREMENT`: a Job for a one-time data snapshot upload, followed by a continuous replication StatefulSet after the Job completes.
- Kubernetes 1.18+
- Helm 3.0+
- A Docker image providing the required commands:
  - `trctl activate --file /etc/config/config.yaml` for the snapshot Job.
  - `trctl replication --file /etc/config/config.yaml` for continuous replication.
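The commands above read the transfer configuration from `/etc/config/config.yaml` inside the container. As an illustration of that wiring only — this is a hypothetical manifest, not the chart's actual rendered template — a snapshot Job could look like:

```yaml
# Illustrative sketch only — names and image are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: transfer-snapshot
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: transfer
          image: your-registry/transfer:latest # image that provides trctl
          command: ["trctl", "activate", "--file", "/etc/config/config.yaml"]
          volumeMounts:
            - name: config
              mountPath: /etc/config
      volumes:
        - name: config
          configMap:
            name: transfer-config # assumed ConfigMap holding config.yaml
```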
To install the chart, run the following command:

```shell
helm pull oci://ghcr.io/transferia/transferia-helm/transfer
```

For example:

```shell
helm install transfer ./transfer --set transferSpec.type=SNAPSHOT_ONLY
```

To uninstall the chart, run:

```shell
helm uninstall <release-name>
```

The chart is highly configurable. You can specify parameters in the `values.yaml` file or through the `--set` option when running `helm install`.
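As an alternative to repeated `--set` flags, overrides can be collected in a separate values file and passed with `-f`. A sketch with illustrative values (parameter names follow the chart's `values.yaml`):

```yaml
# my-values.yaml — illustrative overrides
transferSpec:
  id: my-transfer
  type: INCREMENT_ONLY
replication:
  worker_count: 2
```

Then install with `helm install transfer ./transfer -f my-values.yaml`.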
| Parameter | Description | Default |
|---|---|---|
| `transferSpec.id` | Unique ID for the data transfer job. | `dtttest` |
| `transferSpec.type` | Type of deployment: `SNAPSHOT_ONLY`, `INCREMENT_ONLY`, `SNAPSHOT_AND_INCREMENT`. | `SNAPSHOT_ONLY` |
| `transferSpec.src.type` | Source type (e.g., `pg`). | `pg` |
| `transferSpec.src.params` | Source parameters. | `{}` |
| `transferSpec.dst.type` | Destination type (e.g., `ch`). | `ch` |
| `transferSpec.dst.params` | Destination parameters. | `{}` |
| `snapshot.worker_count` | Number of parallel instances for the snapshot job. | `1` |
| `replication.worker_count` | Number of replicas for the continuous replication StatefulSet. | `1` |
| `resources.requests.cpu` | CPU resource requests for the pods. | `100m` |
| `resources.requests.memory` | Memory resource requests for the pods. | `128Mi` |
| `resources.limits.cpu` | CPU resource limits for the pods. | `500m` |
| `resources.limits.memory` | Memory resource limits for the pods. | `256Mi` |
| `coordinator.type` | Type of external coordinator service: `s3` or `memory`. | `s3` |
| `coordinator.job_count` | Number of parallel instances of the workload. | `1` |
| `coordinator.process_count` | Number of threads run inside each job. | `4` |
| `coordinator.bucket` | Name of the S3 bucket used for coordination. | `place_your_bucket` |
| `transferSpec.regular_snapshot.incremental` | List of objects defining incremental snapshot settings. | `[]` |
| `transferSpec.regular_snapshot.enabled` | Enable or disable the regular snapshot mechanism. | `false` |
| `transferSpec.regular_snapshot.cron_expression` | Cron expression for the scheduled snapshot job. | `0 1 * * *` |
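The `cron_expression` parameter uses standard five-field cron syntax; the default `0 1 * * *` runs the snapshot daily at 01:00. A sketch scheduling it every six hours instead (values are illustrative):

```yaml
transferSpec:
  regular_snapshot:
    enabled: true
    cron_expression: "0 */6 * * *" # at minute 0 of every 6th hour
```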
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"
coordinator:
  job_count: 1 # setting more than 1 shards the work; the coordinator must not be `memory`
  process_count: 4 # default is 4; how many threads run inside each job
  type: s3 # type of coordination, one of: s3, memory
  bucket: place_your_bucket # bucket with write access, used to store state
transferSpec:
  id: mytransfer # unique ID of a transfer
  name: awesome transfer # human-friendly name for a transfer
  type: INCREMENT_ONLY # type of transfer, one of: INCREMENT_ONLY, SNAPSHOT_ONLY, SNAPSHOT_AND_INCREMENT
  src:
    type: source_type # for example: pg, s3, kafka ...
    params:
      ... # source type params; all params can be found in `model_source.go` in the provider folder
  dst:
    type: target_type # for example: s3, ch, kafka ...
    params:
      ... # target type params; all params can be found in `model_destination.go` in the provider folder
  regular_snapshot:
    enabled: true
    incremental:
      - namespace: public
        name: playing_with_neon
        cursor_field: id
```

The `transferSpec` section configures the data source (`src`) and destination (`dst`). The `params` sections for both source and destination accept JSON-formatted strings, and the `type` defines the kind of source/destination system (e.g., PostgreSQL, ClickHouse).
- Source Example:

```yaml
src:
  type: pg # PostgreSQL source type
  params:
    Hosts:
      - YOUR_NEON_DB
    Database: YOUR_DB_NAME
    User: YOUR_DB_USER
    Password: YOUR_DB_PASSWORD
    Port: 5432
    BatchSize: 1024
    SlotByteLagLimit: 53687091200
    EnableTLS: true
    KeeperSchema: public
    DesiredTableSize: 1073741824
    SnapshotDegreeOfParallelism: 4
```

- Destination Example:
```yaml
dst:
  type: ch # ClickHouse destination type
  params:
    User: YOUR_CH_USER
    Password: YOUR_CH_PASSWORD
    ShardsList:
      - Hosts:
          - YOUR_CH_HOST
    Database: default
    HTTPPort: 8443
    SSLEnabled: true
    NativePort: 9440
    MigrationOptions:
      AddNewColumns: true
    InsertParams:
      MaterializedViewsIgnoreErrors: true
    RetryCount: 20
    UseSchemaInTableName: true
    Interval: 1000000000
    Cleanup: Drop
    BufferTriggingSize: 536870912
```

The `coordinator` section configures an external coordination service, such as an S3 bucket, used for coordinating data transfers or snapshots. This is useful when external storage or coordination mechanisms are required.
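Per the comments in the example values, sharded runs (`coordinator.job_count` greater than 1) require a non-memory coordinator. For a single, unsharded workload with no external storage, a `memory` coordinator is a minimal alternative (a sketch):

```yaml
coordinator:
  type: memory # state kept in process; not shared across jobs
  job_count: 1
```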
- Example:

```yaml
coordinator:
  type: s3
  bucket: my-coordination-bucket
```

The `regular_snapshot` feature allows periodic snapshots with optional incremental replication. This is particularly useful when a regular snapshot schedule is needed alongside continuous data replication.
- Incremental Snapshot Example:

```yaml
transferSpec:
  regular_snapshot:
    enabled: true
    incremental:
      - namespace: public
        name: playing_with_neon
        cursor_field: id
```

This example enables the `regular_snapshot` feature and specifies that incremental replication should track changes using the `id` field of the `playing_with_neon` table in the `public` namespace.
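Since `incremental` is a list, it can hold one entry per table, each with its own cursor field. A sketch with hypothetical table and column names, assuming multiple entries are supported:

```yaml
transferSpec:
  regular_snapshot:
    enabled: true
    incremental:
      - namespace: public
        name: orders # hypothetical table
        cursor_field: updated_at
      - namespace: public
        name: customers # hypothetical table
        cursor_field: id
```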
This mode deploys a one-time Job to upload a data snapshot. You can control the number of parallel job instances by setting `job.instanceCount`.

To deploy:

```shell
helm install transfer ./transfer --set transferSpec.type=SNAPSHOT_ONLY --set job.instanceCount=3
```

This mode deploys a StatefulSet for continuous data replication. You can control the number of replicas with `statefulSet.replicaCount`.

To deploy:

```shell
helm install transfer ./transfer --set transferSpec.type=INCREMENT_ONLY --set statefulSet.replicaCount=2
```

This mode first runs a Job to take a data snapshot, followed by a StatefulSet for continuous replication. The StatefulSet starts only after the Job completes successfully.

To deploy:

```shell
helm install transfer ./transfer --set transferSpec.type=SNAPSHOT_AND_INCREMENT --set job.instanceCount=2 --set statefulSet.replicaCount=2
```

You can adjust the CPU and memory allocation for the pods using the `resources` section of the `values.yaml` file. Ensure you specify both requests and limits for better resource management.
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```

You can override any configuration option in the `values.yaml` file by using the `--set` flag when installing the Helm chart. For example:
```shell
helm install transfer ./transfer --set transferSpec.id=my-custom-id --set transferSpec.type=INCREMENT_ONLY
```

To view logs of a running Job or StatefulSet pod, you can use `kubectl`:

```shell
kubectl logs <pod-name>
```

For example, to get logs from the Job:

```shell
kubectl get pods -l job-name=transfer
kubectl logs <pod-name>
```

For the StatefulSet:

```shell
kubectl get pods -l statefulset.kubernetes.io/pod-name=data-upload-statefulset
kubectl logs <pod-name>
```

Feel free to open issues and submit PRs to improve this chart!