Batch generate embeddings in `add_nodes` / `add_edges` by fabubaker · Pull Request #2201 · Pometry/Raphtory

fabubaker · 2025-07-29T20:08:50Z

What changes were proposed in this pull request?

Currently, GqlMutableGraph.add_nodes / GqlMutableGraph.add_edges call the embedding function once for every node/edge. This PR changes them to call the function just once for an entire batch of nodes/edges.
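
To illustrate the difference, here is a minimal sketch and not the code in this PR. `CountingEmbedder`, `embed_one_by_one`, and `embed_batch` are hypothetical names, the embedding is a toy value, and the real embedding function in Raphtory is asynchronous and may have a different signature. The sketch only counts how often the backend is called in each approach.

```rust
/// Hypothetical embedding type; the real Raphtory type may differ.
type Embedding = Vec<f32>;

/// Stand-in for an embedding backend that counts the calls it receives.
/// A real backend (for example a remote embedding API) would pay a
/// network round trip on every call.
struct CountingEmbedder {
    calls: usize,
}

impl CountingEmbedder {
    /// Embeds a whole slice of documents in a single call.
    fn embed(&mut self, docs: &[String]) -> Vec<Embedding> {
        self.calls += 1;
        // Toy embedding: the document length as a one-dimensional vector.
        docs.iter().map(|d| vec![d.len() as f32]).collect()
    }
}

/// Before: one backend call per document.
fn embed_one_by_one(embedder: &mut CountingEmbedder, docs: &[String]) -> Vec<Embedding> {
    docs.iter()
        .flat_map(|doc| embedder.embed(std::slice::from_ref(doc)))
        .collect()
}

/// After: a single backend call for the whole batch.
fn embed_batch(embedder: &mut CountingEmbedder, docs: &[String]) -> Vec<Embedding> {
    embedder.embed(docs)
}
```

Both functions return the same vectors in the same order; only the number of backend calls differs (one per document versus one per batch).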

Why are the changes needed?

This optimizes how embeddings are created, for example by making a single batch call to OpenAI.

Does this PR introduce any user-facing change? If yes is this documented?

No.

How was this patch tested?

Unit tests were added to mutable_graph.rs.

Are there any further changes required?

No.

raphtory-graphql/src/model/graph/mutable_graph.rs

ricopinazo · 2025-07-31T15:07:47Z

raphtory/src/vectors/vectorised_graph.rs

+        let vectors = self.cache.get_embeddings(docs).await?;
+
+        for (id, vector) in ids.iter().zip(vectors) {
+            self.edge_db.insert_vector(*id, &vector)?;


Every call to insert_vector updates the index. We should probably change that function to accept a vector of embeddings instead, so that we can write everything to arroy and update the index just once at the end. The same applies to nodes.
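
The suggestion can be sketched as follows. This is an illustration under stated assumptions, not the Raphtory or arroy API: `VectorDb`, `rebuild_index`, and `insert_vectors` are hypothetical, and an in-memory map with a counter stands in for the real store and its index.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a vector store whose index is
/// expensive to rebuild.
struct VectorDb {
    vectors: HashMap<u64, Vec<f32>>,
    index_rebuilds: usize,
}

impl VectorDb {
    fn new() -> Self {
        VectorDb {
            vectors: HashMap::new(),
            index_rebuilds: 0,
        }
    }

    /// Stand-in for the expensive index update.
    fn rebuild_index(&mut self) {
        self.index_rebuilds += 1;
    }

    /// Per-item insert: stores one vector and rebuilds the index right away.
    fn insert_vector(&mut self, id: u64, vector: &[f32]) {
        self.vectors.insert(id, vector.to_vec());
        self.rebuild_index();
    }

    /// Batch insert: stores every vector first, then rebuilds the index once.
    fn insert_vectors(&mut self, ids: &[u64], vectors: Vec<Vec<f32>>) {
        for (id, vector) in ids.iter().zip(vectors) {
            self.vectors.insert(*id, vector);
        }
        self.rebuild_index();
    }
}
```

With three vectors, calling `insert_vector` in a loop triggers three index rebuilds in this sketch, while a single `insert_vectors` call triggers one.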

CLAassistant · 2025-07-31T16:58:54Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ fabubaker
❌ github-actions[bot]

ricopinazo

LGTM, just left a small comment that you might want to have a look at, but it's not the end of the world

ricopinazo · 2025-07-31T17:09:35Z

raphtory-graphql/src/model/graph/mutable_graph.rs

-        }
+
+        // Generate embeddings
+        let edges: Vec<_> = edges.into_iter().collect::<Result<Vec<_>, _>>()?;


I'm not sure if it might be possible to collect here just once.

The first collect uses ? to return the first error it encounters in edges, so both collect calls are needed.
That said, I did collapse both into one chain and got rid of the intermediate variable.
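
A minimal, self-contained sketch of the pattern described in this reply. The function `parse_and_double` and its inputs are hypothetical and unrelated to the actual code in mutable_graph.rs; the sketch only shows why a chain can need two collect calls when the first one is used to surface an error with `?`.

```rust
use std::num::ParseIntError;

/// Parses every input and doubles it, returning the first parse error if any.
fn parse_and_double(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    let doubled: Vec<i32> = inputs
        .iter()
        .map(|s| s.parse::<i32>())
        // First collect: turns an iterator of Results into a Result of a Vec,
        // so that `?` can return the first error encountered.
        .collect::<Result<Vec<_>, _>>()?
        .into_iter()
        .map(|n| n * 2)
        // Second collect: builds the final Vec from the validated values.
        .collect();
    Ok(doubled)
}
```

Collecting into `Result<Vec<_>, _>` stops at the first `Err`, so the second half of the chain only ever sees values that parsed successfully.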

ricopinazo · 2025-07-31T17:13:52Z

raphtory-graphql/schema.graphql

 type GraphAlgorithmPlugin {
-	shortest_path(source: String!, targets: [String!]!, direction: String): [ShortestPathOutput!]!
 	pagerank(iterCount: Int!, threads: Int, tol: Float): [PagerankOutput!]!
+	shortest_path(source: String!, targets: [String!]!, direction: String): [ShortestPathOutput!]!


Just leaving a comment here as a reminder: we should make sure that whatever generates the GraphQL schema file does so in a predictable way, to avoid these kinds of conflicts.
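
One common way to make generated output predictable is to emit entries in sorted order. The sketch below is purely illustrative and says nothing about how the schema file is actually generated in this repository: `render_type` is a hypothetical helper that renders fields from a `BTreeMap`, which always iterates in key order regardless of insertion order.

```rust
use std::collections::BTreeMap;

/// Renders a GraphQL-like type definition with its fields in sorted order.
/// A BTreeMap iterates by key, so insertion order does not affect the output.
fn render_type(name: &str, fields: &BTreeMap<&str, &str>) -> String {
    let mut out = format!("type {name} {{\n");
    for (field, ty) in fields {
        out.push_str(&format!("\t{field}: {ty}\n"));
    }
    out.push_str("}\n");
    out
}
```

Because the output depends only on the set of fields, two runs that register the same fields in a different order produce an identical file and therefore no spurious diff.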


fabubaker added 2 commits July 29, 2025 14:16

Add unit tests for GqlMutableGraph

1e0bb77

Batch generate embeddings in add_nodes

b41637f

fabubaker requested a review from ricopinazo July 29, 2025 20:09

fabubaker self-assigned this Jul 29, 2025

ricopinazo requested changes Jul 30, 2025



fabubaker added 2 commits July 30, 2025 14:45

Use mock_embedding fn for tests

5ce6f0a

Batch generate embeddings in add_edges

69fba55

fabubaker changed the title ~~Batch generate embeddings in add_nodes~~ Batch generate embeddings in add_nodes / add_edges Jul 30, 2025

ricopinazo requested changes Jul 31, 2025


fabubaker added 2 commits July 31, 2025 11:33

Batch insert vectors into DB

463cc1a

Remove redundant methods

0dcf33e

fabubaker force-pushed the feat/batch-create-embeddings branch from 7e268b1 to 0dcf33e Compare July 31, 2025 15:44

fabubaker and others added 4 commits July 31, 2025 12:35

Merge branch 'master' into feat/batch-create-embeddings

144bdcd

Apply rustfmt

e6d65c0

Apply more rustfmt

5d5ec7b

chore: apply tidy-public auto-fixes

84148d8

ricopinazo approved these changes Jul 31, 2025


fabubaker and others added 3 commits July 31, 2025 14:16

Collapse collect into one chain

b5b8a80

Merge branch 'feat/batch-create-embeddings' of github.com:Pometry/Rap…

0c8e304

…htory into feat/batch-create-embeddings

chore: apply tidy-public auto-fixes

fd6a4f9

fabubaker merged commit 79dab9c into master Jul 31, 2025
5 of 6 checks passed

fabubaker deleted the feat/batch-create-embeddings branch July 31, 2025 18:50

fabubaker merged 13 commits into master from feat/batch-create-embeddings
