Refork: GitHub for Recipes
Inspiration
We've always been frustrated by how recipes exist in isolation - a blog post here, a screenshot there, no way to trace why a dish evolved or how a technique spread across cuisines. When we started thinking about how developers collaborate on code, we kept coming back to one question: what if culinary knowledge had provenance?
GitHub gave programmers a way to fork, diff, and credit each other's work. We wanted to give cooks the same infrastructure. That idea became Refork.
How We Built It
The core of Refork is an ingredient embedding space - every ingredient is encoded as a high-dimensional vector capturing its flavor profile, texture, cultural context, and common pairings. Similarity between two ingredients $a$ and $b$ is computed using cosine similarity:
$$\text{similarity}(a, b) = \frac{\vec{v}_a \cdot \vec{v}_b}{\lVert \vec{v}_a \rVert \, \lVert \vec{v}_b \rVert}$$
This lets users discover non-obvious substitutions and flavor bridges - why does miso work in a chocolate cake? The vectors show you.
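As a minimal sketch of the similarity formula above, the snippet below computes cosine similarity with NumPy and ranks ingredients against a query. The three-dimensional vectors and the `most_similar` helper are invented for illustration; they are not Refork's actual embeddings or API.

```python
import numpy as np


def cosine_similarity(v_a: np.ndarray, v_b: np.ndarray) -> float:
    """Cosine of the angle between two ingredient vectors."""
    norm_product = np.linalg.norm(v_a) * np.linalg.norm(v_b)
    if norm_product == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(v_a, v_b) / norm_product)


# Toy vectors, made up for this example only (real embeddings are high-dimensional).
EMBEDDINGS = {
    "miso": np.array([0.9, 0.1, 0.4]),
    "chocolate": np.array([0.7, 0.2, 0.6]),
    "lemon": np.array([0.0, 1.0, 0.1]),
}


def most_similar(name: str, embeddings: dict) -> list:
    """Rank every other ingredient by similarity to `name`, best match first."""
    query = embeddings[name]
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for ingredient, score in most_similar("miso", EMBEDDINGS):
        print(f"{ingredient}: {score:.3f}")
```

Because cosine similarity depends only on the angle between vectors, two ingredients can score as close even when their vector magnitudes differ.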
Recipes are stored as a directed acyclic graph, where each node is a recipe version and each edge represents a fork. If user A creates a base carbonara and user B swaps in coconut cream, that fork is recorded as an edge from A's version to B's, so the variant can always be traced back to its source:
$$G = (V, E), \quad E \subseteq V \times V, \quad \text{no cycles}$$
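One way to picture this structure is the in-memory sketch below, which keeps fork edges in a dictionary and refuses any edge that would create a cycle. The `ForkGraph` class, its method names, and the version ids are hypothetical; the real system presumably persists this in PostgreSQL rather than in Python objects.

```python
class ForkGraph:
    """Recipe versions as nodes, forks as directed parent -> child edges."""

    def __init__(self):
        self.versions: set = set()
        self.children: dict = {}  # version id -> set of ids forked from it

    def add_version(self, version_id: str) -> None:
        self.versions.add(version_id)
        self.children.setdefault(version_id, set())

    def _reachable(self, start: str, target: str) -> bool:
        """Depth-first search: is `target` reachable from `start` via fork edges?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.children.get(node, ()))
        return False

    def add_fork(self, parent_id: str, child_id: str) -> None:
        """Record that `child_id` was forked from `parent_id`."""
        if parent_id not in self.versions or child_id not in self.versions:
            raise KeyError("both versions must exist before they can be linked")
        # If the parent is already downstream of the child, this edge closes a cycle.
        if self._reachable(child_id, parent_id):
            raise ValueError("fork would create a cycle")
        self.children[parent_id].add(child_id)

    def ancestors(self, version_id: str) -> set:
        """Every version that `version_id` ultimately descends from."""
        return {
            v for v in self.versions
            if v != version_id and self._reachable(v, version_id)
        }


if __name__ == "__main__":
    graph = ForkGraph()
    for vid in ("carbonara-base", "carbonara-coconut"):
        graph.add_version(vid)
    graph.add_fork("carbonara-base", "carbonara-coconut")
    print(graph.ancestors("carbonara-coconut"))
```

Checking reachability before inserting an edge is what keeps the "no cycles" condition true at all times, at the cost of a graph traversal per fork.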
The stack was a Next.js frontend, a Python FastAPI backend, PostgreSQL for recipe metadata, and a vector database (Pinecone) for ingredient search.
Challenges
The hardest problem was defining ingredient identity. Is "smoked paprika" the same node as "paprika"? What about regional variants of the same spice? We ended up building a hierarchical taxonomy where ingredients exist at multiple levels of specificity, and search queries traverse the hierarchy dynamically.
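The idea of ingredients living at several levels of specificity can be sketched with a simple parent map. The taxonomy entries and the function names below are assumptions made up for this example, not Refork's actual schema; the point is only that a query for "paprika" can match a recipe calling for "smoked paprika", but not the reverse.

```python
from typing import Optional

# Hypothetical taxonomy: ingredient -> more general parent (None marks a top level).
PARENT: dict = {
    "spice": None,
    "paprika": "spice",
    "smoked paprika": "paprika",
    "hot paprika": "paprika",
    "pimentón de la Vera": "smoked paprika",
}


def path_to_root(name: str, parent: dict = PARENT) -> list:
    """Walk from an ingredient up to its top-level category, most specific first."""
    path = []
    node: Optional[str] = name
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def matches(query: str, ingredient: str, parent: dict = PARENT) -> bool:
    """True if `ingredient` is the query itself or a more specific variant of it."""
    return query in path_to_root(ingredient, parent)


def descendants(name: str, parent: dict = PARENT) -> set:
    """Every more specific ingredient that sits under `name`."""
    return {n for n in parent if n != name and matches(name, n, parent)}


if __name__ == "__main__":
    print(path_to_root("smoked paprika"))
    print(matches("paprika", "smoked paprika"))
    print(matches("smoked paprika", "paprika"))
```

Keeping "paprika" and "smoked paprika" as separate nodes linked by a parent edge avoids having to decide once and for all whether they are the same ingredient; each query picks the level of specificity it cares about.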
There was also the cold-start problem - a graph with no forks isn't interesting. We seeded the database by scraping and parsing public-domain recipe archives, normalizing ingredient lists, and generating synthetic fork relationships based on documented recipe evolution (e.g., the historical drift from French béchamel to American cream sauce).
Finally, getting the embedding quality right took iteration. Off-the-shelf food embeddings didn't capture enough culinary nuance, so we fine-tuned on a corpus of flavor pairing research and food science literature.
What We Learned
Building Refork taught us that data modeling is product design. The decision to represent recipes as a DAG rather than flat documents wasn't just technical; it encoded a whole philosophy about creativity and attribution. Every schema decision shaped what users could and couldn't do.
We also learned that vector search is powerful but opaque. Surfacing why two ingredients are similar (not just that they are) became a UX challenge as much as a technical one - explainability matters in consumer products, not just enterprise AI.
Most of all, Refork reminded us why we build things: the moment a user discovers that saffron and vanilla sit surprisingly close in embedding space, and asks why - that's the product working.
Built With
- alembic
- asyncpg
- clerk
- d3.js
- date-fns
- docker
- eslint
- fastapi
- httpx
- lucide-react
- next-themes
- next.js
- numpy
- pgvector
- pnpm
- postgresql
- pydantic
- pyjwt
- pytest
- python
- pytorch
- radix-ui
- react
- ruff
- shadcn/ui
- sonner
- sqlalchemy
- tailwind-css
- turborepo
- typescript
- uvicorn