
Auto-generate embeddings on entity create/update#1639

Draft
EmanueleDeRossi1 wants to merge 7 commits into main from
feat/embedding-event-listeners

Conversation

Collaborator

EmanueleDeRossi1 commented Apr 14, 2026

Purpose

Embedding generation is now triggered automatically by SQLAlchemy ORM instance listeners on EmbeddableMixin (after_insert / after_update), so callers no longer need to enqueue embedding work manually after saving an entity. The pipeline passes the entity's identity and its precomputed searchable text through EmbeddingService, the generator, and the Celery task, so background work does not need to hold the ORM instance or reload it just to get the text.

What changed

EmbeddableMixin (mixins.py)

  • Still provides searchable_text_changed(), which hashes to_searchable_text() and compares the result to the Embedding.text_hash rows. It reports a change when no row matches the current hash, which covers both the first embed and stale embeddings.
  • New: handlers registered with @event.listens_for(EmbeddableMixin, "after_insert", propagate=True) and its "after_update" counterpart build EmbeddingService(session) and call enqueue_embedding(...) with entity_type, entity_id, searchable_text, user_id, and organization_id.
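
For reviewers, here is a rough sketch of the staleness check described above. It is not the code in mixins.py: SHA-256 is an assumption (the description does not name the hash), and `text_hash` and the `stored_hashes` set are hypothetical stand-ins for the hashing helper and the Embedding.text_hash rows.

```python
import hashlib


def text_hash(text: str) -> str:
    """Return a stable hex digest of the searchable text (SHA-256 is assumed)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def searchable_text_changed(current_text: str, stored_hashes: set[str]) -> bool:
    """Return True when no stored hash matches the current text.

    An empty set models the first embed; a non-matching set models a
    stale embedding.
    """
    return text_hash(current_text) not in stored_hashes


# Hypothetical stand-in for the Embedding.text_hash rows of one entity.
stored = {text_hash("Acme Corp, supplier of anvils")}

unchanged = searchable_text_changed("Acme Corp, supplier of anvils", stored)
edited = searchable_text_changed("Acme Corp, supplier of rockets", stored)
first_embed = searchable_text_changed("Acme Corp, supplier of anvils", set())
```

Only `edited` and `first_embed` come back true here, so an entity saved without a text change would not need a new embedding.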

EmbeddingService / generator / task

  • enqueue_embedding is keyword-based and takes primitives (plus searchable_text) instead of (entity, current_user).
  • EmbeddingGenerator.generate accepts optional searchable_text; when set, it can embed without loading the entity for text.
  • generate_embedding_task accepts searchable_text and forwards it to the generator.
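
To illustrate the idea behind these three changes, a simplified sketch of passing primitives plus optional text through the pipeline. The real signatures may differ: `EmbeddingJob` and `load_text` are hypothetical names, and `generate` here only returns the text that would be embedded rather than calling a model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class EmbeddingJob:
    """Primitive-only payload (hypothetical), safe to hand to a background task."""
    entity_type: str
    entity_id: str
    searchable_text: Optional[str] = None
    user_id: Optional[str] = None
    organization_id: Optional[str] = None


def enqueue_embedding(*, entity_type: str, entity_id: str,
                      searchable_text: Optional[str] = None,
                      user_id: Optional[str] = None,
                      organization_id: Optional[str] = None) -> EmbeddingJob:
    """Keyword-only entry point: no ORM entity or current_user object is passed."""
    return EmbeddingJob(entity_type, entity_id, searchable_text,
                        user_id, organization_id)


def generate(job: EmbeddingJob, load_text: Callable[[str, str], str]) -> str:
    """Return the text to embed, loading the entity only when no text was passed."""
    if job.searchable_text is not None:
        return job.searchable_text
    return load_text(job.entity_type, job.entity_id)


loads = []  # records every fallback load so the difference is visible


def load_text(entity_type: str, entity_id: str) -> str:
    """Hypothetical loader standing in for a database round trip."""
    loads.append((entity_type, entity_id))
    return f"text loaded for {entity_type} {entity_id}"


job_with_text = enqueue_embedding(entity_type="Document", entity_id="42",
                                  searchable_text="quarterly report")
job_without_text = enqueue_embedding(entity_type="Document", entity_id="43")

text_a = generate(job_with_text, load_text)     # uses the precomputed text
text_b = generate(job_without_text, load_text)  # falls back to the loader
```

Only the second job touches the loader, which is the round trip that the precomputed searchable_text is meant to avoid.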

Tests

  • Updated for the new enqueue_embedding and internal _execute_sync / _enqueue_async signatures; mocks are reset where commit runs and triggers the new listeners.
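
The mock reset pattern, roughly, for anyone adapting other tests. This is a stand-alone sketch with unittest.mock, not a copy of the updated tests: `commit` is a hypothetical stand-in for a session commit that fires the new listeners, and the entity values are made up.

```python
from unittest.mock import MagicMock

# Stands in for the patched enqueue_embedding seen by the listeners.
enqueue = MagicMock(name="enqueue_embedding")


def commit(entity_id: str) -> None:
    """Hypothetical commit: saving a fixture triggers the listener as a side effect."""
    enqueue(entity_type="Document", entity_id=entity_id,
            searchable_text="fixture text")


# Arrange: creating the fixture already records one call on the mock.
commit("fixture-1")
calls_from_setup = enqueue.call_count

# Reset so the assertion below sees only the call under test.
enqueue.reset_mock()

# Act: the call the test actually cares about.
enqueue(entity_type="Document", entity_id="doc-2",
        searchable_text="text under test")

# Assert: without the reset this would fail, because two calls were recorded.
enqueue.assert_called_once_with(entity_type="Document", entity_id="doc-2",
                                searchable_text="text under test")
```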

EmanueleDeRossi1 self-assigned this Apr 14, 2026
EmanueleDeRossi1 force-pushed the feat/embedding-event-listeners branch from 84f7ac8 to cdf4816 on April 15, 2026 14:43
EmanueleDeRossi1 force-pushed the feat/embedding-event-listeners branch from 33d68b9 to 7a13c1e on April 16, 2026 09:49
