Installation
Using Homebrew
Install the oleander CLI:

Configuration
Authenticate with your API key. Find it in your oleander settings.

Oleander Managed Spark
Upload, list, and delete jobs only on the oleander managed cluster.

List your Spark jobs
List your uploaded Spark scripts and their status:

Upload your Spark script
Upload your Spark application to oleander. The script is stored and ready to run:

Include Python dependencies
If your Spark script needs additional Python modules, package them in a ZIP and include them with --py-files:
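As a sketch, assuming the upload operation lives under a subcommand such as `oleander job upload` (the subcommand path and the archive name `deps.zip` are assumptions; --py-files is the flag documented above):

```shell
# Upload a script together with its Python dependencies packaged as a ZIP
# (subcommand name and deps.zip are assumptions, not documented values)
oleander job upload process_sales_data.py --py-files deps.zip
```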
Update an existing job
To replace an existing Spark job with a new version of your script, use the --overwrite flag:
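For example, re-uploading a revised script might look like this (the `job upload` subcommand name is an assumption; --overwrite is the flag documented above):

```shell
# Replace the previously uploaded script with a new version
oleander job upload process_sales_data.py --overwrite
```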
Delete a Spark job
Delete a Spark job:

Submit and execute a Spark job
Submit your uploaded Spark script to the oleander managed cluster. Use the exact uploaded file name without the path, such as process_sales_data.py. The --wait flag keeps the command running until the job finishes.
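As a sketch, assuming submission lives under a subcommand such as `oleander job submit` (the subcommand and the sample namespace/name values are assumptions; the flags are described under Submit options below):

```shell
# Submit the uploaded script by file name and wait for the run to finish
oleander job submit process_sales_data.py \
  --namespace analytics \
  --name daily-sales \
  --sparkConf spark.default.parallelism=8 \
  --wait
```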
Submit options
--namespace (required): Namespace for the job, a logical group such as a team or project.
--name (required): Job name. Runs with the same namespace and name are grouped under the same job.
--args: Spark job entrypoint arguments.
--sparkConf: Spark configurations without --conf, for example spark.default.parallelism=8. Separate multiple configurations with whitespace.
--jobTags: Job-specific tags in key=value form. Separate multiple tags with whitespace.
--runTags: Run-specific tags.
--executionIamPolicy: IAM policy for job permissions. Final permissions are the intersection of the job execution role and this policy.
--driverMachineType: oleander Spark driver machine type.
--executorMachineType: oleander Spark executor machine type.
--executorNumbers: Number of executor instances.
--wait: Wait until the job finishes.
Registered EMR Serverless Spark
Register your EMR Serverless cluster and target it by name when submitting jobs. Include --cluster <name> and provide the S3 entrypoint (PySpark script or JAR).
Register an EMR Serverless cluster
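As a sketch (the `cluster register` subcommand, the positional cluster name, and all sample values below are assumptions; the flags are described under Register options below):

```shell
# Register an EMR Serverless application under the name "emr-prod"
# (all values here are placeholders)
oleander cluster register emr-prod \
  --region us-east-1 \
  --account-id 123456789012 \
  --controller-role-arn arn:aws:iam::123456789012:role/oleander-controller \
  --execution-role-arn arn:aws:iam::123456789012:role/spark-execution \
  --application-id 00fabcdexample \
  --log-bucket my-spark-log-bucket
```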
Register options
--region: AWS region of the EMR Serverless application.
--account-id: AWS account ID of the EMR Serverless application.
--controller-role-arn: IAM role ARN oleander assumes to start job runs. Add this to the role’s trust policy so oleander can assume it:
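For illustration, a trust policy of the standard AWS shape; the oleander principal below is a placeholder, not a documented value:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "<oleander-controller-principal>" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```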
--execution-role-arn: IAM role ARN the job uses; the Spark application runs with this role’s permissions.
--application-id: EMR Serverless application ID.
--log-bucket: S3 bucket for job logs.
Submit a job to EMR Serverless
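As a sketch (the `job submit` subcommand and the sample S3 path, cluster, namespace, and name are assumptions; the flags are described under Submit options below):

```shell
# Submit a PySpark entrypoint stored in S3 and wait for completion
oleander job submit s3://my-bucket/jobs/process_sales_data.py \
  --cluster emr-prod \
  --namespace analytics \
  --name daily-sales \
  --wait
```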
Submit options
--cluster (required): Name of the registered cluster.
--namespace (required): Namespace for the job, a logical group such as a team or project.
--name (required): Job name. Runs with the same namespace and name are grouped under the same job.
--args: Spark job entrypoint arguments.
--sparkConf: Spark configurations without --conf, for example spark.default.parallelism=8. Separate multiple configurations with whitespace.
--jobTags: Job-specific tags in key=value form. Separate multiple tags with whitespace.
--runTags: Run-specific tags.
--executionIamPolicy: IAM policy for job permissions. Final permissions are the intersection of the job execution role and this policy.
--pyFiles: Extra pyFiles for the PySpark job. Mutually exclusive with --mainClass.
--mainClass: Entrypoint main class for the Java/Scala Spark job. Mutually exclusive with --pyFiles.
--wait: Wait until the job finishes.
Registered Glue Spark
Register your Glue cluster and target it by name when submitting jobs. Include --cluster <name>. Submit uses the existing Glue job name in your environment.
Register a Glue cluster
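As a sketch (the `cluster register` subcommand, the positional cluster name, and the sample role ARN are assumptions; only the flag documented under Register options below is shown):

```shell
# Register a Glue environment under the name "glue-prod"
# (all values here are placeholders)
oleander cluster register glue-prod \
  --controller-role-arn arn:aws:iam::123456789012:role/oleander-controller
```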
Register options
--controller-role-arn: IAM role ARN oleander assumes to start job runs. Add this to the role’s trust policy so oleander can assume it:
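For illustration, a trust policy of the standard AWS shape; the oleander principal below is a placeholder, not a documented value:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "<oleander-controller-principal>" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```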
Submit a job to Glue
Use --cluster to select the registered cluster:
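As a sketch (the `job submit` subcommand and the sample values are assumptions; the entrypoint is the existing Glue job name, as noted above, and the flags are described under Submit options below):

```shell
# Run the existing Glue job "process_sales_data" on the registered cluster
oleander job submit process_sales_data \
  --cluster glue-prod \
  --namespace analytics \
  --name daily-sales \
  --wait
```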
Submit options
--cluster (required): Name of the registered cluster.
--namespace (required): Namespace for the job, a logical group such as a team or project.
--name (required): Job name. Runs with the same namespace and name are grouped under the same job.
--args: Spark job entrypoint arguments.
--sparkConf: Spark configurations without --conf, for example spark.default.parallelism=8. Separate multiple configurations with whitespace.
--jobTags: Job-specific tags in key=value form. Separate multiple tags with whitespace.
--runTags: Run-specific tags.
--executionIamPolicy: IAM policy for job permissions. Final permissions are the intersection of the job execution role and this policy.
--workerType: Glue worker type.
--numberOfWorkers: Number of Glue workers.
--enableAutoScaling: Set to true for auto scaling, false otherwise.
--executionClass: Glue execution class. Either STANDARD or FLEX.
--timeoutMinutes: Glue job timeout in minutes.
--wait: Wait until the job finishes.