
Added lambda that integrates Forklift #411

Draft
JaimeZepeda08 wants to merge 16 commits into open-lambda:main from JaimeZepeda08:main

Conversation

@JaimeZepeda08
Contributor

No description provided.

Member

@tylerharter tylerharter left a comment

Good work! Haven't read everything super closely yet, but leaving some initial feedback.

Let's not have big data files here, like deps.json and wl.json. In the docs, we can describe how a user would use your lambda: wget those files (from wherever they are, probably the forklift repo), and then curl to your function.

event = json.loads(event)

workload_data = event.get("workload")
if not workload_data:
Member

Do we need these checks? It seems it's all in a big try/except, and you'll return a similar error below anyway.


Expected event format:
{
"workload": { ... workload data ... },
Member

Can you show the next level of detail (below) about how workload data is structured?

}
"""
try:
if isinstance(event, str):
Member

Why do we need to check? Are there different possible inputs?


result = generate_tree(workload_data, num_nodes, multi_package)

return {
Member

Can we write a flask lambda instead? Then, we could return a status code, like 500 or something. The benefit is that somebody could do a "curl ... > tree.json" to get a tree, without needing to extract the result separately.
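A minimal sketch of that flask-lambda idea, returning real HTTP status codes so `curl ... > tree.json` gets the tree directly. The route name, key names, and `generate_tree` stub below are placeholders, not this PR's actual code:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_tree(workload):
    # stand-in for the PR's real tree-building logic
    return {"tree": workload}

@app.route("/run", methods=["POST"])
def run():
    event = request.get_json(silent=True)
    if not event or "workload" not in event:
        return jsonify({"error": "missing workload"}), 400  # bad request body
    try:
        return jsonify(generate_tree(event["workload"])), 200
    except Exception as e:
        return jsonify({"error": str(e)}), 500              # server-side failure
```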


# Testing code
if __name__ == "__main__":
with open("wl.json", "r") as file:
Member

Why not sys.argv[1]?
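For reference, a sketch of that change; the helper name and the fallback default are illustrative:

```python
import json
import sys

def load_workload(argv):
    # read the workload path from the command line instead of hardcoding it
    path = argv[1] if len(argv) > 1 else "wl.json"  # keep the old default as a fallback
    with open(path, "r") as f:
        return json.load(f)

# usage: python3 lambda.py my_workload.json  ->  load_workload(sys.argv)
```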


def generate_tree(workload_data, num_nodes, multi_package=True):
calls = parse_workload(workload_data)
deps_json = load_deps_json()
Member

Let's not load from a file. Can we include this in the POST body passed to our function?
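A sketch of that suggestion: deps arrive in the POST body next to the workload instead of being read from deps.json on disk. The key names and the `generate_tree` signature here are illustrative:

```python
def handle_event(event):
    workload_data = event["workload"]
    deps_json = event["deps"]        # was: load_deps_json() reading from disk
    return generate_tree(workload_data, deps_json)

def generate_tree(workload_data, deps_json, num_nodes=1, multi_package=True):
    # placeholder body; the point is that deps is now passed in by the caller
    return {"workload": workload_data, "deps": deps_json, "nodes": num_nodes}
```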

candidate_queue = [] # priority queue for candidate nodes


def enqueue_top_child_candidate(parent, deps, multi_package=True):
Member

It seems we pass around deps and multi_package to all our functions. Perhaps put this all in a class, and these are attributes?
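A sketch of that refactor, with deps and multi_package as attributes so helpers stop threading them through every call. The class and method names are illustrative:

```python
class TreeBuilder:
    def __init__(self, deps, multi_package=True):
        self.deps = deps
        self.multi_package = multi_package
        self.candidate_queue = []   # priority queue for candidate nodes

    def enqueue_top_child_candidate(self, parent):
        # reads self.deps / self.multi_package instead of taking parameters
        children = self.deps.get(parent, [])
        if children:
            self.candidate_queue.append(children[0])
```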

continue

pkg_name = child_pkgV.split("==")[0]
version = child_pkgV.split("==")[1] if "==" in child_pkgV else None
Member

Do we allow pkgs without a version? If our input is based on pip compile output, won't we have versions? I may misunderstand, though.

continue

# keep track of packages that would be loaded by this candidate
packages_to_load = set([child_pkgV])
Member

You can create a set without converting:

{child_pkgV}

from collections import defaultdict


# TODO: instead of loading deps from a file, use pip-compile on the fly
Member

I'm thinking let's require callers to pass deps in. Then, however we figure out those dependencies, it will be outside the nice, clean tree-building logic here.

Member

@tylerharter tylerharter left a comment

Getting closer!

@@ -0,0 +1,290 @@
'''
Member

just call "forklift", not "forklift-service"

self._enqueue_top_child_candidate(child)

def build_tree(self, desired_nodes):
self.candidate_queue = []
Member

Best to see all class attrs in init even if None at that point.

For the queue, let's use namedtuples so it's easier to see that the code is correct. Things like the following need careful checking:

_, _, best_candidate
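With named fields, unpacking like `_, _, best_candidate` becomes `entry.node`, which is much easier to check. A sketch with illustrative field names (a namedtuple still orders like a plain tuple, so heapq works unchanged):

```python
from collections import namedtuple
import heapq

Candidate = namedtuple("Candidate", ["priority", "order", "node"])

queue = []
heapq.heappush(queue, Candidate(priority=-10, order=0, node="numpy==1.26.0"))
heapq.heappush(queue, Candidate(priority=-3, order=1, node="flask==3.0.0"))

best = heapq.heappop(queue)   # lowest priority first; fields stay readable
```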

found := false
for _, p := range packages {
if p == nodePkg {
// compare only using package name
Member

separate PR

call_counts[name] += 1

# build call matrix
calls = {}
Member

Is a dict the best type for a calls matrix? We want something that is (a) sparse and (b) has column names.

Need to check, but Claude says Pandas can have a pd.SparseDtype(int, fill_value=0) type to make it sparse. Maybe Pandas DF is the best matrix format then?
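An untested sketch of that pandas idea (pandas 1.0+; worth double-checking the behavior before relying on it):

```python
import pandas as pd

# dense call matrix first: rows are functions, columns are packages
df = pd.DataFrame(0, index=["fn_a", "fn_b"], columns=["numpy", "flask"])
df.loc["fn_a", "numpy"] = 3

# convert to a sparse dtype so the zero cells are stored implicitly
sdf = df.astype(pd.SparseDtype(int, fill_value=0))
```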

Contributor Author

I believe in this case a dict would work better because the algorithm repeatedly partitions calls by rows and we only operate on rows.

From AI:
"Pandas sparse data structures (specifically SparseDtype and SparseArray introduced in version 1.0+) are designed to handle sparse data efficiently, but they do not deal well with frequent, row-wise, or random mutations."

What do you think?

Member

@tylerharter tylerharter left a comment

Nice! Getting close.

@@ -0,0 +1 @@
flask (no newline at end of file)
Member

Let's do the pattern of a requirements.in and a pip-compiled requirements.txt, as in the other example lambdas.

app = Flask(__name__)


class CallMatrix:
Member

I think a generic SparseMatrix would allow us to better mirror the logic in the paper. The class has package-specific awareness, and therefore takes on some responsibilities handled by other functions in the pseudocode.
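A sketch of what a generic SparseMatrix could look like, with no package awareness, closer to the paper's pseudocode (dict-of-dicts storage; all names illustrative). Package-specific parsing would live elsewhere:

```python
from collections import defaultdict

class SparseMatrix:
    def __init__(self):
        self._rows = defaultdict(dict)        # row label -> {col label: value}

    def incr(self, row, col, by=1):
        self._rows[row][col] = self._rows[row].get(col, 0) + by

    def get(self, row, col):
        return self._rows[row].get(col, 0)    # missing cells are implicit zeros

    def row(self, row):
        # the algorithm partitions by rows, so row access is the hot path
        return dict(self._rows[row])
```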

self.root = None
self.candidate_queue = []

def _enqueue_top_child_candidate(self, parent):
Member

This function is quite a bit longer and more complicated than the pseudocode for enqueue_top_child_candidate in the paper. I think it is because this function is taking on more concerns, not just the core logic.

E.g., you have separate parse functions. Splitting pkg name from version feels like it should be handled there, not in the core logic of this function.

Member

Maybe a helper function or two would help as well?

return jsonify(result), 200

except Exception as e:
import traceback
Member

Imports should be at the top of the file.


try:
event = request.get_json()
if event is None:
Member

We have a couple of special cases where we return different errors for specific kinds of invalid data. But given that it's all in a try/except, do we need special code for these?

'''

try:
event = request.get_json()
Member

Perhaps the body of this try should be its own function, to cleanly separate logic: one function for the core logic, another (which calls the former) that adds error handling.
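A sketch of that split, with illustrative names: the core function raises on bad input, and a thin wrapper turns exceptions into an error response:

```python
def build_tree_from_event(event):
    # core logic only; a KeyError on bad input is fine, the wrapper handles it
    workload = event["workload"]
    return {"tree": workload}

def handle_request(event):
    # wrapper owns error handling and status codes
    try:
        return build_tree_from_event(event), 200
    except Exception as e:
        return {"error": str(e)}, 500
```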
