This Helm chart deploys a one-time data upload job or a continuous replication service on Kubernetes. It supports three deployment types:

- `SNAPSHOT_ONLY`: a one-time Job for uploading a snapshot of data.
- `INCREMENT_ONLY`: a StatefulSet for continuous data replication.
- `SNAPSHOT_AND_INCREMENT`: a Job for a one-time data snapshot upload, followed by a continuous replication StatefulSet after the Job completes.
- Kubernetes 1.18+
- Helm 3.0+
- A Docker image providing the required commands:
  - `trctl activate --file /etc/config/config.yaml` for the snapshot Job.
  - `trctl replication --file /etc/config/config.yaml` for continuous replication.
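The commands above read the transfer configuration from `/etc/config/config.yaml` inside the container. As an illustration of that wiring only — this is a hypothetical manifest, not the chart's actual rendered template — a snapshot Job could look like:

```yaml
# Illustrative sketch only — names and image are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: transfer-snapshot
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: transfer
          image: your-registry/transfer:latest # image that provides trctl
          command: ["trctl", "activate", "--file", "/etc/config/config.yaml"]
          volumeMounts:
            - name: config
              mountPath: /etc/config
      volumes:
        - name: config
          configMap:
            name: transfer-config # assumed ConfigMap holding config.yaml
```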
To install the chart, run the following command:

```shell
helm pull oci://ghcr.io/transferia/transferia-helm/transfer
```

For example:

```shell
helm install transfer ./transfer --set transferSpec.type=SNAPSHOT_ONLY
```

To uninstall the chart, run:

```shell
helm uninstall <release-name>
```

The chart is highly configurable. You can specify parameters in the `values.yaml` file or through the `--set` option when running `helm install`.
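As an alternative to repeated `--set` flags, overrides can be collected in a separate values file and passed with `-f`. A sketch with illustrative values (parameter names follow the chart's `values.yaml`):

```yaml
# my-values.yaml — illustrative overrides
transferSpec:
  id: my-transfer
  type: INCREMENT_ONLY
replication:
  worker_count: 2
```

Then install with `helm install transfer ./transfer -f my-values.yaml`.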
| Parameter | Description | Default |
|---|---|---|
| `transferSpec.id` | Unique ID for the data transfer job. | `dtttest` |
| `transferSpec.type` | Type of deployment: `SNAPSHOT_ONLY`, `INCREMENT_ONLY`, `SNAPSHOT_AND_INCREMENT`. | `SNAPSHOT_ONLY` |
| `transferSpec.src.type` | Source type (e.g., `pg`). | `pg` |
| `transferSpec.src.params` | Source parameters. | `{}` |
| `transferSpec.dst.type` | Destination type (e.g., `ch`). | `ch` |
| `transferSpec.dst.params` | Destination parameters. | `{}` |
| `snapshot.worker_count` | Number of parallel instances for the snapshot job. | `1` |
| `replication.worker_count` | Number of replicas for the continuous replication StatefulSet. | `1` |
| `resources.requests.cpu` | CPU resource requests for the pods. | `100m` |
| `resources.requests.memory` | Memory resource requests for the pods. | `128Mi` |
| `resources.limits.cpu` | CPU resource limits for the pods. | `500m` |
| `resources.limits.memory` | Memory resource limits for the pods. | `256Mi` |
| `coordinator.type` | Type of external coordinator service: `s3` or `memory`. | `s3` |
| `coordinator.job_count` | Number of parallel instances of the workload. | `1` |
| `coordinator.process_count` | Number of threads run inside each job. | `4` |
| `coordinator.bucket` | Name of the S3 bucket used for coordination. | `place_your_bucket` |
| `transferSpec.regular_snapshot.incremental` | List of objects defining incremental snapshot settings. | `[]` |
| `transferSpec.regular_snapshot.enabled` | Enable or disable the regular snapshot mechanism. | `false` |
| `transferSpec.regular_snapshot.cron_expression` | Cron expression for the scheduled snapshot job. | `0 1 * * *` |
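The `cron_expression` parameter uses standard five-field cron syntax; the default `0 1 * * *` runs the snapshot daily at 01:00. A sketch scheduling it every six hours instead (values are illustrative):

```yaml
transferSpec:
  regular_snapshot:
    enabled: true
    cron_expression: "0 */6 * * *" # at minute 0 of every 6th hour
```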
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"
coordinator:
  job_count: 1 # setting more than 1 shards the work; the coordinator must not be `memory`
  process_count: 4 # default is 4; how many threads run inside each job
  type: s3 # type of coordination, one of: s3, memory
  bucket: place_your_bucket # bucket with write access, used to store state
transferSpec:
  id: mytransfer # unique ID of a transfer
  name: awesome transfer # human-friendly name for a transfer
  type: INCREMENT_ONLY # type of transfer, one of: INCREMENT_ONLY, SNAPSHOT_ONLY, SNAPSHOT_AND_INCREMENT
  src:
    type: source_type # for example: pg, s3, kafka ...
    params:
      ... # source type params; all params can be found in `model_source.go` in the provider folder
  dst:
    type: target_type # for example: s3, ch, kafka ...
    params:
      ... # target type params; all params can be found in `model_destination.go` in the provider folder
  regular_snapshot:
    enabled: true
    incremental:
      - namespace: public
        name: playing_with_neon
        cursor_field: id
```

The `transferSpec` section configures the data source (`src`) and destination (`dst`). The `params` sections for both source and destination accept JSON-formatted strings, and the `type` defines the kind of source/destination system (e.g., PostgreSQL, ClickHouse).
- Source Example:

```yaml
src:
  type: pg # PostgreSQL source type
  params:
    Hosts:
      - YOUR_NEON_DB
    Database: YOUR_DB_NAME
    User: YOUR_DB_USER
    Password: YOUR_DB_PASSWORD
    Port: 5432
    BatchSize: 1024
    SlotByteLagLimit: 53687091200
    EnableTLS: true
    KeeperSchema: public
    DesiredTableSize: 1073741824
    SnapshotDegreeOfParallelism: 4
```

- Destination Example:
```yaml
dst:
  type: ch # ClickHouse destination type
  params:
    User: YOUR_CH_USER
    Password: YOUR_CH_PASSWORD
    ShardsList:
      - Hosts:
          - YOUR_CH_HOST
    Database: default
    HTTPPort: 8443
    SSLEnabled: true
    NativePort: 9440
    MigrationOptions:
      AddNewColumns: true
    InsertParams:
      MaterializedViewsIgnoreErrors: true
    RetryCount: 20
    UseSchemaInTableName: true
    Interval: 1000000000
    Cleanup: Drop
    BufferTriggingSize: 536870912
```

The `coordinator` section configures an external coordination service, such as an S3 bucket, used for coordinating data transfers or snapshots. This is useful when external storage or coordination mechanisms are required.
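Per the comments in the example values, sharded runs (`coordinator.job_count` greater than 1) require a non-memory coordinator. For a single, unsharded workload with no external storage, a `memory` coordinator is a minimal alternative (a sketch):

```yaml
coordinator:
  type: memory # state kept in process; not shared across jobs
  job_count: 1
```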
- Example:

```yaml
coordinator:
  type: s3
  bucket: my-coordination-bucket
```

The `regular_snapshot` feature allows periodic snapshots with optional incremental replication. This is particularly useful when a regular snapshot schedule is needed alongside continuous data replication.
- Incremental Snapshot Example:

```yaml
transferSpec:
  regular_snapshot:
    enabled: true
    incremental:
      - namespace: public
        name: playing_with_neon
        cursor_field: id
```

This example enables the `regular_snapshot` feature and specifies that incremental replication should track changes using the `id` field of the `playing_with_neon` table in the `public` namespace.
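Since `incremental` is a list, it can hold one entry per table, each with its own cursor field. A sketch with hypothetical table and column names, assuming multiple entries are supported:

```yaml
transferSpec:
  regular_snapshot:
    enabled: true
    incremental:
      - namespace: public
        name: orders # hypothetical table
        cursor_field: updated_at
      - namespace: public
        name: customers # hypothetical table
        cursor_field: id
```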
This mode deploys a one-time Job to upload a data snapshot. You can control the number of parallel job instances by setting `job.instanceCount`.

To deploy:

```shell
helm install transfer ./transfer --set transferSpec.type=SNAPSHOT_ONLY --set job.instanceCount=3
```

This mode deploys a StatefulSet for continuous data replication. You can control the number of replicas with `statefulSet.replicaCount`.

To deploy:

```shell
helm install transfer ./transfer --set transferSpec.type=INCREMENT_ONLY --set statefulSet.replicaCount=2
```

This mode first runs a Job to take a data snapshot, followed by a StatefulSet for continuous replication. The StatefulSet starts only after the Job completes successfully.

To deploy:

```shell
helm install transfer ./transfer --set transferSpec.type=SNAPSHOT_AND_INCREMENT --set job.instanceCount=2 --set statefulSet.replicaCount=2
```

You can adjust the CPU and memory allocation for the pods using the `resources` section of the `values.yaml` file. Ensure you specify both requests and limits for better resource management.
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```

You can override any configuration option in the `values.yaml` file by using the `--set` flag when installing the Helm chart. For example:
```shell
helm install transfer ./transfer --set transferSpec.id=my-custom-id --set transferSpec.type=INCREMENT_ONLY
```

To view logs of a running Job or StatefulSet pod, you can use `kubectl`:

```shell
kubectl logs <pod-name>
```

For example, to get logs from the Job:

```shell
kubectl get pods -l job-name=transfer
kubectl logs <pod-name>
```

For the StatefulSet:

```shell
kubectl get pods -l statefulset.kubernetes.io/pod-name=data-upload-statefulset
kubectl logs <pod-name>
```

Feel free to open issues and submit PRs to improve this chart!