---
copyright:
  years: 2017, 2022
lastupdated: "2022-11-24"
subcollection: AnalyticsEngine
---

{:new_window: target="_blank"} {:shortdesc: .shortdesc} {:codeblock: .codeblock} {:screen: .screen} {:pre: .pre} {:note: .note}

# Livy batch APIs
{: #livy-api-serverless}

The Livy batches API is a REST interface for submitting Spark batch jobs. It is very similar to the open source Livy REST interface (see Apache Livy), except for a few limitations, which are described in this topic.

The open source Livy batches log API, which retrieves the log lines from a batch job, is not supported. Instead, the logs are added to the {{site.data.keyword.cos_full_notm}} bucket that was referenced as the service instance `instance_home`. At a later time during the beta release, the logs can be forwarded to {{site.data.keyword.la_full_notm}}.
{: note}

## Submitting Spark batch jobs
{: #livy-api-serverless-1}

To submit a Spark batch job by using the Livy batches API, enter:

```sh
curl \
-H 'Authorization: Bearer <TOKEN>' \
-H 'Content-Type: application/json' \
-d '{
  "file": "cos://<application-bucket-name>.<cos-reference-name>/my_spark_application.py",
  "conf": {
    "spark.hadoop.fs.cos.<cos-reference-name>.endpoint": "https://s3.direct.us-south.cloud-object-storage.appdomain.cloud",
    "spark.hadoop.fs.cos.<cos-reference-name>.access.key": "<access_key>",
    "spark.hadoop.fs.cos.<cos-reference-name>.secret.key": "<secret_key>",
    "spark.app.name": "MySparkApp"
  }
}' \
-X POST https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches
```
{: codeblock}

Request body for a submitted batch job using the Livy batches API:

| Name | Description | Type |
|------|-------------|------|
| file | File containing the application to execute | string (required) |
| className | Application Java/Spark main class | string |
| args | Command line arguments for the application | list of string |
| jars | Jars to be used in this session | list of string |
| pyFiles | Python files to be used in this session | list of string |
| files | Files to be used in this session | list of string |
| driverMemory | Amount of memory to use for the driver process | string |
| driverCores | Number of cores to use for the driver process | int |
| executorMemory | Amount of memory to use per executor process | string |
| executorCores | Number of cores to use for each executor | int |
| numExecutors | Number of executors to launch for this session | int |
| name | The name of this session | string |
| conf | Spark configuration properties | map of key=val |
{: caption="Request body for batch jobs" caption-side="top"}

The `proxyUser`, `archives`, and `queue` properties are not supported in the request body, although they are supported in the open source Livy REST interface.
{: note}
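The supported fields above can also be assembled programmatically. The following Python sketch (the helper name `build_batch_payload` is illustrative, not part of the service or any SDK) builds a request body and rejects the properties that this API does not support:

```python
# Illustrative helper (not part of the service API): build a Livy batch
# request body and reject properties unsupported by this serverless API.
UNSUPPORTED = {"proxyUser", "archives", "queue"}

def build_batch_payload(file, conf=None, **optional):
    """Return a request body dict for POST .../livy/batches."""
    bad = UNSUPPORTED & optional.keys()
    if bad:
        raise ValueError(f"unsupported properties: {sorted(bad)}")
    payload = {"file": file}          # "file" is the only required field
    if conf:
        payload["conf"] = conf        # Spark configuration properties
    payload.update(optional)          # e.g. className, args, driverMemory
    return payload

payload = build_batch_payload(
    "cos://<bucket>.mycos/wordcount.py",
    conf={"spark.app.name": "MySparkApp"},
    className="org.apache.spark.deploy.SparkSubmit",
)
```

The resulting dict can be serialized with `json.dumps` and sent as the `-d` body of the `curl` call shown above.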

Response body of a submitted batch job using the Livy batches API:

| Name | Description | Type |
|------|-------------|------|
| id | The batch ID | int |
| appId | The Spark application ID | string |
| appInfo | Detailed application information | map of key=val |
| state | State of the submitted batch job | string |
{: caption="Response body of a submitted batch job" caption-side="top"}

## Examples using the Livy API
{: #livy-api-serverless-2}

The following sections show you how to use the Livy batches APIs.

### Submit a batch job with the job file in {{site.data.keyword.cos_full_notm}}
{: #livy-api-serverless-13}

To submit a batch job where the job file is located in an {{site.data.keyword.cos_full_notm}} bucket, enter:

```sh
curl -i -X POST https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN" -d @livypayload.json
```
{: codeblock}

The endpoint to your {{site.data.keyword.cos_full_notm}} instance in the payload JSON file should be the public endpoint.

Sample payload:

```json
{
  "file": "cos://<bucket>.mycos/wordcount.py",
  "className": "org.apache.spark.deploy.SparkSubmit",
  "args": ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
  "conf": {
    "spark.hadoop.fs.cos.mycos.endpoint": "https://s3.direct.us-south.cloud-object-storage.appdomain.cloud",
    "spark.hadoop.fs.cos.mycos.access.key": "XXXX",
    "spark.hadoop.fs.cos.mycos.secret.key": "XXXX",
    "spark.app.name": "MySparkApp"
  }
}
```
{: codeblock}

Sample response:

```json
{"id": 13, "app_info": {}, "state": "not_started"}
```
{: screen}

### Submit a batch job with the job file on local disk
{: #livy-api-serverless-4}

To submit a batch job where the job file is located on a local disk, enter:

```sh
curl -i -X POST https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN" -d @livypayload.json
```
{: codeblock}

Sample payload:

```json
{
  "file": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
  "args": ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
  "className": "org.apache.spark.deploy.SparkSubmit"
}
```
{: codeblock}

Sample response:

```json
{"id": 15, "app_info": {}, "state": "not_started"}
```
{: screen}

The `sparkUiUrl` property in the response has a non-null value when the Spark UI is available for the serverless Spark instance.
{: note}

### List the details of a job
{: #livy-api-serverless-5}

To list the job details for a particular Spark batch job, enter:

```sh
curl \
-H 'Authorization: Bearer <TOKEN>' \
-H 'Content-Type: application/json' \
-X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches/<batch-id>
```
{: codeblock}
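All of the batch operations in this topic share one URL pattern. As an illustration (the helper name `batches_url` is my own, not part of any SDK), the endpoints can be built like this:

```python
# Illustrative helper: build Livy batch endpoint URLs for a serverless
# Analytics Engine instance. Region and instance ID are placeholders.
BASE = "https://api.{region}.ae.cloud.ibm.com/v3/analytics_engines"

def batches_url(region, instance_id, batch_id=None, state=False):
    """URL for list/submit (no batch_id), details, or state of one batch."""
    url = f"{BASE.format(region=region)}/{instance_id}/livy/batches"
    if batch_id is not None:
        url += f"/{batch_id}"       # details or delete of a single batch
        if state:
            url += "/state"         # state-only endpoint
    return url
```

For example, `batches_url("us-south", "<instance-id>", batch_id=14)` yields the job-details URL used in the `curl` example below.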

The response body for listing the job details:

| Name | Description | Type |
|------|-------------|------|
| id | The batch ID | int |
| appId | The Spark application ID | string |
| appInfo | Detailed application information | map of key=val |
| state | State of the submitted batch job | string |
{: caption="Response body for listing job details" caption-side="top"}

An example:

```sh
curl -i -X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/43f79a18-768c-44c9-b9c2-b19ec78771bf/livy/batches/14 -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN"
```
{: codeblock}

Sample response:

```json
{
  "id": 14,
  "appId": "app-20201213175030-0000",
  "appInfo": {
    "sparkUiUrl": null
  },
  "state": "success"
}
```
{: screen}

The `sparkUiUrl` property in the response has a non-null value when the Spark UI is available for the serverless Spark instance.
{: note}

### Get job state
{: #livy-api-serverless-6}

To get the state of your submitted job, enter:

```sh
curl \
-H 'Authorization: Bearer <TOKEN>' \
-H 'Content-Type: application/json' \
-X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches/<batch-id>/state
```
{: codeblock}

The response body for getting the state of the batch job:

| Name | Description | Type |
|------|-------------|------|
| id | The batch ID | int |
| state | State of the submitted batch job | string |
{: caption="Response body for getting state of batch job" caption-side="top"}

For example:

```sh
curl -i -X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/43f79a18-768c-44c9-b9c2-b19ec78771bf/livy/batches/14/state -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN"
```
{: codeblock}

Sample response:

```json
{
  "id": 14,
  "state": "success"
}
```
{: screen}
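In practice, a client polls this endpoint until the job reaches a terminal state. The following Python sketch keeps the HTTP call behind a caller-supplied `fetch_state` function so the loop stays self-contained; the terminal state names follow the open source Livy conventions and are an assumption here:

```python
import time

# Terminal batch states, per open source Livy conventions (assumption).
TERMINAL_STATES = {"success", "dead", "killed", "error"}

def wait_for_batch(fetch_state, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_state() until a terminal state or the timeout is reached.

    fetch_state is expected to GET .../livy/batches/<batch-id>/state and
    return the "state" string from the response body.
    """
    waited = 0.0
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError(f"batch still in state {state!r} after {timeout}s")
        sleep(interval)
        waited += interval
```

Injecting `fetch_state` (and `sleep`) keeps the polling logic testable without real network calls.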

### List all submitted jobs
{: #livy-api-serverless-7}

To list all of the submitted Spark batch jobs, enter:

```sh
curl \
-H 'Authorization: Bearer <TOKEN>' \
-H 'Content-Type: application/json' \
-X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches
```
{: codeblock}

The `from` and `size` properties are not supported in the request body, although they are supported in the open source Livy REST interface.
{: note}

The response body for listing all submitted Spark batch jobs:

| Name | Description | Type |
|------|-------------|------|
| from | The start index of the Spark batch jobs that are retrieved | int |
| total | The total number of batch jobs that are retrieved | int |
| sessions | The details for each batch job in a session | list |
{: caption="Response body for listing all submitted batch jobs" caption-side="top"}

For example:

```sh
curl -i -X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/43f79a18-768c-44c9-b9c2-b19ec78771bf/livy/batches -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN"
```
{: codeblock}

Sample response:

```json
{
  "from": 0,
  "sessions": [{
    "id": 13,
    "appId": "app-20201203115111-0000",
    "appInfo": {
      "sparkUiUrl": null
    },
    "state": "success"
  },
  {
    "id": 14,
    "appId": "app-20201213175030-0000",
    "appInfo": {
      "sparkUiUrl": null
    },
    "state": "success"
  }],
  "total": 2
}
```
{: screen}
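When many jobs have been submitted, it can be useful to summarize the `sessions` list by state. A minimal Python sketch against the response shape shown above (the helper name `count_by_state` is illustrative):

```python
from collections import Counter

def count_by_state(response):
    """Tally the batch jobs in a .../livy/batches list response by state."""
    return Counter(session["state"] for session in response["sessions"])

# Trimmed version of the sample list response from this topic.
sample = {
    "from": 0,
    "total": 2,
    "sessions": [
        {"id": 13, "appId": "app-20201203115111-0000", "state": "success"},
        {"id": 14, "appId": "app-20201213175030-0000", "state": "success"},
    ],
}
```

For the sample response above, `count_by_state(sample)` tallies two jobs in the `success` state.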

The `sparkUiUrl` property in the response has a non-null value when the Spark UI is available for the serverless Spark instance.
{: note}

### Delete a job
{: #livy-api-serverless-8}

To delete a submitted batch job, enter:

```sh
curl \
-H 'Authorization: Bearer <TOKEN>' \
-H 'Content-Type: application/json' \
-X DELETE https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/<instance-id>/livy/batches/<batch-id>
```
{: codeblock}

For example:

```sh
curl -i -X DELETE https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/43f79a18-768c-44c9-b9c2-b19ec78771bf/livy/batches/14 -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN"
```
{: codeblock}

Sample response:

```json
{
  "msg": "deleted"
}
```
{: screen}