Adding scripts & Readme steps for vLLM based workloads over IBM LSF #23

arshabbir wants to merge 2 commits into IBMSpectrumComputing:master from
Conversation
> This repository shows how to run a long-running vLLM inference service under IBM LSF,
> validate it through a standard OpenAI-compatible API, access it from a Jupyter notebook,
> and reuse the same service from a downstream batch job.
A little wordsmithing:
In this repository we demonstrate how to deploy a large-language model inference service on an LSF cluster using vLLM. The service exposes an OpenAI-compatible API. We show how various clients can use the model for interactive or batch inference.
Addressed in the latest commit
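Since the thread above is about how clients talk to the service, a minimal client sketch may help readers. The endpoint path, port, model name, and demo key below mirror values quoted elsewhere in this PR; the function names (`build_chat_request`, `send_chat`) are hypothetical and not part of the repo.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8001/v1"  # port from the PR's PORT=8001 example
API_KEY = "local-vllm-key"             # demo key quoted in the README note

def build_chat_request(prompt, model="Qwen/Qwen3-0.6B"):
    """Assemble an OpenAI-compatible chat completion request (no I/O)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    return f"{BASE_URL}/chat/completions", headers, json.dumps(body).encode()

def send_chat(prompt):
    """Send the request to a running vLLM service (only works once the LSF
    job is up and the port is reachable); not invoked in this sketch."""
    url, headers, data = build_chat_request(prompt)
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

url, headers, data = build_chat_request("Hello from LSF")
print(url)
```

The same request shape works from curl, the notebook, or the batch client, since vLLM exposes the standard OpenAI-compatible API.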
> - python3 installed
> - curl installed
> - network access from the execution host to pull the vLLM image and model
> - a single-node IBM LSF setup is sufficient for this implementation
I think we also require a shared $HOME directory, correct? That is not strictly necessary for LSF, but is a common deployment.
Added this in the latest README
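A small self-check could accompany the prerequisites list. This is a sketch, not part of the PR; note it can only confirm that `$HOME` is set on the current host, not that it is actually shared across nodes as the reviewer suggests.

```python
import os
import shutil

def check_prereqs():
    """Report on the README's prerequisites (python3, curl) plus the
    shared $HOME the review raises. Informational only: whether $HOME is
    shared across compute nodes cannot be verified from a single host."""
    return {
        "python3": shutil.which("python3") is not None,
        "curl": shutil.which("curl") is not None,
        "home_set": bool(os.environ.get("HOME")),
    }

report = check_prereqs()
print(report)
```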
> - scripts/batch_client.py
>   Reads a prompt corpus and sends requests to the registered vLLM service.
> - notebook/LSF_vLLM_Client.ipynb
>   Jupyter notebook for interactive validation against the IBM LSF-managed runtime.
There is no notebook subdirectory
Addressed in the latest commit
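Since `scripts/batch_client.py` is only described in the quoted README text, here is a sketch of the behavior it describes: turning a prompt corpus (one prompt per line) into request bodies. The function name and model default are assumptions, not taken from the repo.

```python
def build_batch(prompts, model="Qwen/Qwen3-0.6B"):
    """Turn corpus lines into OpenAI-style chat request bodies, skipping
    blank lines, mirroring what batch_client.py is described as doing."""
    bodies = []
    for line in prompts:
        prompt = line.strip()
        if not prompt:
            continue  # skip blank lines in the corpus
        bodies.append({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        })
    return bodies

corpus = ["What is LSF?\n", "\n", "Summarize vLLM in one line.\n"]
requests_out = build_batch(corpus)
print(len(requests_out))
```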
> - notebook/LSF_vLLM_Client.ipynb
>   Jupyter notebook for interactive validation against the IBM LSF-managed runtime.
> - corpus/prompts.txt
>   Sample prompt corpus for downstream batch validation.
There is no corpus subdirectory
Addressed in the latest commit
> Prerequisites
> -------------
> - IBM LSF installed and operational
> - podman installed
I guess this must be installed on all compute nodes of the cluster, right? I am not sure whether we need to use the LSF podman integration; I would guess not (which is fine).
Yeah, we don't need the LSF podman integration.
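To make the "plain podman, no LSF integration" point concrete, a submission could be an ordinary bsub of a script that runs the container. The job name, log pattern, and script path below are illustrative assumptions; the actual script in this PR may differ.

```python
def bsub_command(script="scripts/start_vllm_service.sh", job_name="vllm-service"):
    """Build an ordinary LSF submission command line; the submitted script
    is assumed to run `podman run ... vllm ...` itself, with no LSF
    container integration involved."""
    return [
        "bsub",
        "-J", job_name,              # job name, used later to find the service
        "-o", f"{job_name}.%J.log",  # per-job log file (%J expands to the job ID)
        script,
    ]

cmd = bsub_command()
print(" ".join(cmd))
```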
> ```bash
> cp corpus/prompts.txt ~/lsf_vllm_poc/corpus/prompts.txt
> ```
I suggest including a few lines telling readers to clone this repo and cd into the base directory. Just make it easy for people to cut-and-paste lines so that they can reproduce this without having to think too much.
Also, corpus needs to be updated to scripts.
I have addressed it and added the lines below. Hope this is fine; please verify:

git clone https://github.com/IBMSpectrumComputing/lsf-integrations.git
cd lsf-integrations/LSF-vLLM

After this, follow the step-by-step instructions given below.
> ```bash
> MODEL=Qwen/Qwen3-0.6B PORT=8001 API_KEY=local-vllm-key
> ```
How is this done? By grepping a line in one of the config files?
It sounds like the step should be to update API_KEY. Where do users get this key from? Should that be a prerequisite?
I have added the note below in the updated README:
NOTE:
Default demo API key: local-vllm-key
The service script uses this value unless API_KEY is explicitly set before submission.
If you choose a different value, update the curl commands, notebook cells, and batch client inputs accordingly.
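The "uses this value unless API_KEY is explicitly set" behavior from the note amounts to a simple environment-variable fallback. A sketch of that resolution logic (the function name is an assumption):

```python
import os

def resolve_api_key(env=None):
    """Mirror the README note: take API_KEY from the environment if set,
    otherwise fall back to the demo key local-vllm-key."""
    if env is None:
        env = os.environ
    return env.get("API_KEY", "local-vllm-key")

print(resolve_api_key({}))                     # falls back to the demo key
print(resolve_api_key({"API_KEY": "secret"}))  # explicit override wins
```

Whatever value wins here must match the key used in the curl commands, notebook cells, and batch client, as the note says.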
> ```
> http://127.0.0.1:8001/v1
> ```
For this one, it looks like you are starting the notebook on the cluster node and then connecting from the laptop through an ssh tunnel.
You should mention which host each command is run on (laptop vs. LSF compute host), and note that this URL is the one to use in the web browser.
Also mention that ssh access to a cluster node is a prerequisite.
Updated the README with these steps explaining where to run the commands
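The laptop-side tunnel the reviewer describes can be sketched as follows. The host and user names are placeholders, and the port matches the PORT=8001 example quoted in this PR; run the resulting ssh command on the laptop, not on the compute host.

```python
def tunnel_command(compute_host, user, local_port=8001, remote_port=8001):
    """Build the ssh port-forward run on the laptop so that the service
    listening on the LSF compute host becomes reachable locally."""
    return f"ssh -L {local_port}:127.0.0.1:{remote_port} {user}@{compute_host}"

cmd = tunnel_command("lsf-node01", "alice")  # placeholder host and user
print(cmd)
print("Then use http://127.0.0.1:8001/v1 from the laptop's browser or clients.")
```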
> ```
> bjobs
> bpeek ${BATCH_JOBID}
> cat ~/lsf_vllm_poc/results/batch_${JOBID}.jsonl
> ```
Overall, I suggest breaking this into a few sections:
1. Deploy the LLM
   - deploy
   - monitor
   - kill
2. Use the LLM
   - curl
   - Jupyter
   - LSF job
Please review the new restructured README file.
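The final step quoted above cats a per-job JSONL results file (batch_${JOBID}.jsonl). A sketch of reading such a file line by line; the record fields used here are assumptions, not taken from the repo.

```python
import json

def load_results(lines):
    """Parse JSONL results line by line, as produced per batch job
    (e.g. ~/lsf_vllm_poc/results/batch_<JOBID>.jsonl), skipping blanks.
    The prompt/completion field names are assumed for illustration."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records

sample = [
    '{"prompt": "What is LSF?", "completion": "A workload manager."}',
    "",
    '{"prompt": "What is vLLM?", "completion": "An inference engine."}',
]
records = load_results(sample)
print(len(records))
```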
76cc627 to cd5baca
No description provided.