feat: add global support for refresh_worker

justinmerrell · justinmerrell · commit 76532c5e1530 · 2023-04-11T20:17:53.000-04:00
diff --git a/docs/serverless/infer_file.md b/docs/serverless/infer_file.md
diff --git a/docs/serverless/worker.md b/docs/serverless/worker.md
@@ -1,67 +1,10 @@
-# RunPod Pod Worker
+# The Serverless Worker
 
-To convert a Pod to a Worker, you need to add the following annotations to the Pod:
 
-## Environment Variables
 
-```bash
-# Development
-RUNPOD_DEBUG= # Set to 'true' to enable debug mode, otherwise leave blank
-RUNPOD_DEBUG_LEVEL= # ERROR, WARN, INFO, DEBUG
+## Worker Refresh
 
-# Realtime
-RUNPOD_REALTIME_PORT= # Port to listen on for realtime connections (Default: None)
-RUNPOD_REALTIME_CONCURRENCY= # Number of workers to spawn (Default: 1)
-```
+For more complex operations where you are downloading files or making changes to the worker, it can be beneficial to refresh the worker between jobs. This can be accomplished by enabling the `refresh_worker` flag in one of two ways:
 
-The following environment variables are set automatically by RunPod, you DO NOT need to set them manually:
-
-```bash
-# API Endpoints
-RUNPOD_AI_API_KEY= # Serverless API Key
-RUNPOD_WEBHOOK_GET_JOB= # URL to get job work from
-RUNPOD_WEBHOOK_POST_OUTPUT= # URL to post output to
-RUNPOD_WEBHOOK_PING= # URL to ping
-RUNPOD_PING_INTERVAL= # Interval in milliseconds to ping the API (Default: 10000)
-
-RUNPOD_ENDPOINT_ID= # Endpoint ID
-RUNPOD_ENDPOINT_SECRET= # Endpoint Secret (SET BY RUNPOD Replaces RUNPOD_AI_API_KEY, not implemented yet)
-```
-
-### Additional Variables
-
-These are variables that are accessed from the RunPod container and not required to be set manually:
-
-```bash
-# Pod Information
-RUNPOD_POD_ID= # Pod ID
-```
-
-## Error Handling
-
-If an error occurs, the worker will send a message to the API with the error message and the job will be marked as failed.
-
-To report a job error call `job.error(worker_id, job_id, error_message)`.
-
----
-
-## Worker Lifecycle Management
-
-The duration of the worker is managed within the [lifecycle](../PodWorker/modules/lifecycle.py) module.
-
-The worker starts with a TTL as specified by the environment variable `TERMINATE_IDLE_TIME` or defaults to 60 seconds as specified with `self.ttl`. When a new job is received, a `work_in_progress` flag is set. When the job is completed, the `work_in_progress` flag is cleared and the TTL is reset. If the `work_in_progress` flag is not cleared within the `work_timeout` period, the worker will exit.
-
- If the worker does not receive a new job within idle period, the worker will exit.
-
-## Local Testing
-
-To test locally, create the file `test_input.json` in the root directory that contains the following:
-
-```json
-{
-    "id": "LOCAL-TEST",
-    "input":{}
-}
-```
-
-If the required webhook environment variables are not set, the worker will default to local testing.
+   1. Enable on start with `runpod.serverless.start({"handler": handler, "refresh_worker": True})`. This refreshes the worker after every job return, even if the handler raises an error.
+   2. Return `"refresh_worker": True` as a top-level dictionary key in the handler return. This can be used to selectively refresh the worker based on the job return.
diff --git a/runpod/serverless/work_loop.py b/runpod/serverless/work_loop.py
@@ -48,6 +48,10 @@ async def start_worker(config):
 
             job_result = run_job(config["handler"], job)
 
+            # If refresh_worker is set, pod will be reset after job is complete.
+            if config.get("refresh_worker", False):
+                job_result["stopPod"] = True
+
             await send_result(session, job_result, job)
 
             set_job_id(None)
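
The two ways of enabling a refresh described in `docs/serverless/worker.md` can be sketched as follows. This is a minimal illustration, not the SDK's implementation: `handler`, `mark_for_refresh`, and the job payload are hypothetical names, and `mark_for_refresh` only mirrors the check added to `work_loop.py` in this commit. How `run_job` treats a `refresh_worker` key in the handler return is not shown in this diff and is not modeled here.

```python
# Sketch only: `handler`, `mark_for_refresh`, and the job payload are
# hypothetical; the flag check mirrors the lines added to work_loop.py.


def handler(job):
    """Hypothetical handler that requests a refresh only for some jobs."""
    result = {"output": job["input"].get("prompt", "")}
    if job["input"].get("downloads_files", False):
        # Option 2: top-level key in the handler return.
        result["refresh_worker"] = True
    return result


def mark_for_refresh(config, job_result):
    """Mirror of the diff: a global flag sets stopPod on every job result."""
    if config.get("refresh_worker", False):
        job_result["stopPod"] = True
    return job_result


# Option 1: global flag in the config passed at start.
config = {"handler": handler, "refresh_worker": True}
# With the SDK installed, this config would be passed as:
#   runpod.serverless.start(config)

job = {"id": "LOCAL-TEST", "input": {"prompt": "hi", "downloads_files": True}}
handler_return = handler(job)
job_result = mark_for_refresh(config, {"output": handler_return["output"]})
```

The global flag suits workers that always leave state behind (downloaded files, modified environment), while the per-job key keeps the worker warm for jobs that do not need a reset.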