Inspiration

We are big fans of Shark Tank: watching passionate entrepreneurs turn their ideas into thriving businesses truly inspires us. Because we also love data and pattern recognition, we set out to use graph databases to reveal hidden investment patterns and relationships.

What it does

Our project transforms the Shark Tank US dataset into a dynamic graph of startups and investors using ArangoDB. It enables users, particularly private equity professionals, to interactively query and visualize investment relationships. Through natural language queries, the platform uncovers insights such as which investors are funding which startups, providing actionable intelligence for investment decision-making.

How we built it

We began by cleaning the dataset, setting missing numeric values to 0 and missing string fields to empty strings. Next, we transformed the data into a node–edge graph using Python’s NetworkX library:

  • Nodes: Represent startups (with attributes like industry, business description, etc.) and investors.
  • Edges: Represent investment relationships, created only when an investment amount is greater than zero.
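The cleaning and graph-building steps above can be sketched as follows. This is a minimal illustration rather than our exact code: the column names ("Startup", "Industry", "Description", "<investor> Amount"), the investor names, and the sample rows are all made up for the example.

```python
import networkx as nx

# Hypothetical investor names and column layout; the real dataset differs.
INVESTORS = ["Investor A", "Investor B"]
STRING_FIELDS = ["Industry", "Description"]
NUMERIC_FIELDS = [f"{name} Amount" for name in INVESTORS]


def clean_row(row):
    """Replace missing numeric values with 0 and missing strings with ''."""
    cleaned = dict(row)
    for field in NUMERIC_FIELDS:
        if cleaned.get(field) is None:
            cleaned[field] = 0
    for field in STRING_FIELDS:
        if cleaned.get(field) is None:
            cleaned[field] = ""
    return cleaned


def build_graph(rows):
    """Build a directed investor -> startup graph from cleaned rows."""
    graph = nx.DiGraph()
    for name in INVESTORS:
        graph.add_node(name, kind="investor")
    for raw in rows:
        row = clean_row(raw)
        startup = row["Startup"]
        graph.add_node(
            startup,
            kind="startup",
            industry=row["Industry"],
            description=row["Description"],
        )
        for name in INVESTORS:
            amount = row[f"{name} Amount"]
            # Only create an edge when money was actually invested.
            if amount > 0:
                graph.add_edge(name, startup, amount=amount)
    return graph


# Made-up sample rows, with None standing in for missing values.
rows = [
    {"Startup": "Acme Socks", "Industry": "Fashion", "Description": None,
     "Investor A Amount": 50000, "Investor B Amount": None},
    {"Startup": "Beta Snacks", "Industry": None, "Description": "Healthy snacks",
     "Investor A Amount": None, "Investor B Amount": None},
]
graph = build_graph(rows)
```

Here "Beta Snacks" still becomes a node, but it has no incoming edges because no investment amount is above zero.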

We then persisted this graph into ArangoDB:

  • Startups and investors were stored in dedicated document collections.
  • Investment relationships were stored in an EdgeCollection (type 3), ensuring that the _from and _to fields were correctly populated.
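A rough sketch of this persistence step is shown below. It assumes the python-arango driver and a `db` handle obtained from it; the collection names ("investors", "startups", "investments"), the `amount` attribute, and the helper functions are illustrative choices, not necessarily what our code uses.

```python
import re


def make_key(name):
    """Reduce a display name to characters that are safe in an ArangoDB _key."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", name)


def to_edge_document(investor, startup, amount):
    """Build an edge document; _from and _to must be 'collection/key' ids."""
    return {
        "_from": f"investors/{make_key(investor)}",
        "_to": f"startups/{make_key(startup)}",
        "amount": amount,
    }


def persist(db, investors, startups, investments):
    """Write vertices and edges; `db` is assumed to be a python-arango database handle."""
    for name in ("investors", "startups"):
        if not db.has_collection(name):
            db.create_collection(name)
    if not db.has_collection("investments"):
        # edge=True creates an edge collection (type 3), not a document collection.
        db.create_collection("investments", edge=True)

    db.collection("investors").import_bulk(
        [{"_key": make_key(n), "name": n} for n in investors]
    )
    db.collection("startups").import_bulk(
        [{"_key": make_key(n), "name": n} for n in startups]
    )
    db.collection("investments").import_bulk(
        [to_edge_document(*investment) for investment in investments]
    )
```

Calling `persist(db, ["Investor A"], ["Acme Socks"], [("Investor A", "Acme Socks", 50000)])` would create the three collections if needed and store one edge from `investors/Investor_A` to `startups/Acme_Socks`.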

Finally, we added natural language querying to our agentic app, which is built in Streamlit and uses Google Gemini and LangChain to dynamically convert plain-language questions into optimized AQL queries.
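To give a feel for the target of that translation, here is the kind of AQL a question like "Which investors funded which startups?" might map to, run through python-arango. This sketch does not show the Gemini or LangChain calls themselves; the collection names, the `name` and `amount` attributes, and the `run_investments_query` helper are assumptions for illustration.

```python
# AQL traversal from each investor along outgoing investment edges.
# Collection and attribute names are assumed, not taken from our schema.
INVESTMENTS_QUERY = """
FOR investor IN investors
    FOR startup, investment IN 1..1 OUTBOUND investor investments
        FILTER investment.amount >= @min_amount
        RETURN {
            investor: investor.name,
            startup: startup.name,
            amount: investment.amount
        }
"""


def run_investments_query(db, min_amount=0):
    """Execute the query; `db` is assumed to be a python-arango database handle."""
    cursor = db.aql.execute(
        INVESTMENTS_QUERY,
        bind_vars={"min_amount": min_amount},
    )
    return list(cursor)
```

Passing the threshold as a bind variable rather than formatting it into the string keeps generated values out of the query text, which matters when the query parameters originate from free-form user input.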

Challenges we ran into

A major challenge was with the investment edge documents: the _from and _to fields were returning None. After much debugging, we discovered that our investments collection needed to be created as type 3 (EdgeCollection) rather than type 2 (a regular document collection), which fixed the issue. This process deepened our understanding of ArangoDB’s graph model and edge handling.

Accomplishments that we're proud of

We’re proud of successfully overcoming the technical hurdles to build a working graph-based solution. Our platform not only visualizes complex relationships between investors and startups but also supports dynamic, natural language-driven queries. The integration of ArangoDB with an Agentic App to deliver real-time insights is a significant achievement for our team.

What we learned

We learned a great deal about ArangoDB’s graph capabilities and the importance of properly configuring EdgeCollections. The project reinforced our skills in data cleaning, network analysis, and natural language processing. Most importantly, it taught us how persistence, creative problem-solving, and iteration are key to building innovative data solutions.

What's next for Shark Arango

Looking forward, we plan to expand our dataset to include more investment information, further enhancing the depth of insights available to private equity professionals. We aim to add advanced analytics, real-time dashboards, and enhanced natural language query capabilities to create a comprehensive, insightful platform that empowers smarter investment decisions.
