Knowledge Graph Agent

Multi-agent structure showing the parallel and loop agents

Inspiration

Our journey with GraphCycle was sparked by a desire for collaborative learning and exploration. Matt aimed to mentor Katie, and together we wanted to dive deeper into AI, Large Language Models (LLMs), and practical applications of Retrieval-Augmented Generation (RAG), including the potential of GraphRAG. We saw the Google Agent Development Kit (ADK) as the perfect vehicle to gain hands-on experience building autonomous agent systems, turning theoretical knowledge into a tangible project.

What it does

GraphCycle is an experimental autonomous agent system designed to transform unstructured information into structured knowledge. It takes text input, either from local files or YouTube video transcripts, and processes it through a multi-agent pipeline. The core functionality involves:

Loading: Ingesting raw text data.
Parallel Refinement: Two independent agent loops concurrently generate and review RDF knowledge graphs in Turtle format from the input text. Each loop iteratively improves its graph based on AI-driven reviews.
Merging: The outputs from the parallel refinement loops are then intelligently merged by a synthesis agent into a single, more comprehensive and robust knowledge graph. The final output is a consolidated Turtle RDF file representing the extracted and organized knowledge.

How we built it

We built GraphCycle using the Google Agent Development Kit (ADK), relying on its LlmAgent and its workflow agents (SequentialAgent, ParallelAgent, and LoopAgent) to create and orchestrate autonomous agents.

Core Agents: We utilized LlmAgent for tasks requiring natural language understanding and generation, such as text loading, knowledge graph generation, review, and merging.
Orchestration: The overall workflow is managed by a SequentialAgent (KnowledgeGraphPipeline). Within this, a ParallelAgent (KGRefinementLoopParallel) runs two instances of a LoopAgent (KGRefinementLoop1 & KGRefinementLoop2) concurrently.
Iterative Refinement: Each LoopAgent implements an iterative build-review cycle using a GraphBuilder agent, a GraphReviewer agent, and a custom StopIfComplete agent to control the loop's termination.
Tools: We integrated several custom tools:
- read_file_content: To load text from files.
- download_youtube_transcript: To fetch YouTube transcripts.
- validate_turtle: To check the syntax of generated RDF.
- store_knowledge_graph & load_knowledge_graph: For managing graph data within the agent's state.
State Management: Data such as the raw text, intermediate knowledge graphs, and reviewer feedback is passed between agents primarily using the ADK's session state.
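The build-review cycle described above can be sketched in plain Python. This is a stand-in for illustration only, not ADK code: `build_graph`, `review_graph`, and `refinement_loop` are hypothetical names, the builder and reviewer are stubs where the real system calls an LLM, and a plain dictionary plays the role of session state.

```python
State = dict[str, str]


def build_graph(state: State) -> None:
    """Stub builder: the real agent would ask an LLM for Turtle output."""
    triples = ["ex:GraphCycle ex:uses ex:ADK ."]
    # React to reviewer feedback from the previous iteration, if any.
    if "missing prefix" in state.get("feedback", ""):
        triples.insert(0, "@prefix ex: <http://example.org/> .")
    state["graph"] = "\n".join(triples)


def review_graph(state: State) -> None:
    """Stub reviewer: flags a graph that lacks a prefix declaration."""
    if "@prefix" not in state["graph"]:
        state["feedback"] = "missing prefix declaration"
        state["status"] = "needs_work"
    else:
        state["feedback"] = ""
        state["status"] = "complete"


def refinement_loop(state: State, max_iterations: int = 3) -> int:
    """Run build -> review until the reviewer is satisfied or the cap is hit.

    Returns the number of iterations that were executed.
    """
    for iteration in range(1, max_iterations + 1):
        build_graph(state)
        review_graph(state)
        # Stop check: plays the role of the loop-terminating agent.
        if state["status"] == "complete":
            return iteration
    return max_iterations
```

The iteration cap matters as much as the stop check: a reviewer that never reports completion would otherwise keep the loop running indefinitely.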

The pipeline is structured as: Input -> TextLoader -> Parallel (Loop1 [GraphBuilder -> GraphReviewer -> StopIfComplete], Loop2 [GraphBuilder -> GraphReviewer -> StopIfComplete]) -> SynthesisAgent -> Output.
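The overall shape of that pipeline (one sequential chain with a parallel stage in the middle) can be illustrated with `asyncio`. This is a simplified sketch, not the ADK implementation: `load_text`, `refine`, `merge`, and `run_pipeline` are hypothetical names, the refinement branches are stubs that emit fixed triples, and the merge is a plain set union rather than an LLM-driven synthesis.

```python
import asyncio


async def load_text(state: dict) -> None:
    """Stage 1: ingest the raw text."""
    state["raw_text"] = state["source"].strip()


async def refine(state: dict, loop_id: int) -> None:
    """Stage 2 branch: stub for one refinement loop."""
    await asyncio.sleep(0)  # yield control, as a real LLM call would
    # Each branch writes only to its own key, so branches cannot
    # overwrite each other's results in the shared state.
    state[f"graph_{loop_id}"] = {
        f"ex:Doc ex:length {len(state['raw_text'])} .",
        f"ex:Doc ex:seenBy ex:Loop{loop_id} .",
    }


async def merge(state: dict) -> None:
    """Stage 3: naive merge, a set union of both branches' triples."""
    state["final_graph"] = sorted(state["graph_1"] | state["graph_2"])


async def run_pipeline(source: str) -> dict:
    state = {"source": source}
    await load_text(state)                                    # sequential
    await asyncio.gather(refine(state, 1), refine(state, 2))  # parallel
    await merge(state)                                        # sequential
    return state
```

Calling `asyncio.run(run_pipeline("some text"))` returns the final state. The merge step only runs after both branches finish, which is the dependency the sequential wrapper enforces.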

Challenges we ran into

LLM Hallucinations & Tool Use Fidelity: A major hurdle was the tendency for LlmAgents to "hallucinate." Agents would often claim to have used a tool or completed a task correctly when, in reality, the tool call was missed, failed, or the agent simply fabricated the outcome. This required careful prompt engineering and an understanding that agent self-reporting isn't always reliable.
Reliable State Management: Ensuring consistent and reliable state transfer between agents, especially within loops and across parallel branches, was challenging. While output_key offers a convenient way to pass data, we found that for more complex interactions, direct and explicit manipulation of the session state (tool_context.state or ctx.session.state) was often necessary to ensure data integrity and availability.
Prompt Engineering for Complex Tasks: Crafting precise and effective prompts for agents performing nuanced tasks like RDF generation, critical review, and knowledge graph merging required significant iteration and experimentation.
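One way to avoid relying on an agent's self-report is to have each tool record its own call in state and then check state afterwards. The sketch below is an assumption-laden illustration in plain Python: the tool name `store_knowledge_graph` comes from this project, but the signature and body shown here are guesses, `verify_stored` is a hypothetical helper, and a plain dictionary stands in for `tool_context.state`.

```python
def store_knowledge_graph(graph: str, loop_id: int, state: dict) -> dict:
    """Write the graph to an explicit, branch-specific state key.

    Also appends a record of the call, so later steps can confirm
    the tool really ran instead of trusting the agent's claim.
    """
    key = f"kg_loop_{loop_id}"
    state[key] = graph
    state.setdefault("tool_calls", []).append(
        {"tool": "store_knowledge_graph", "key": key}
    )
    return {"status": "success", "key": key}


def verify_stored(state: dict, loop_id: int) -> bool:
    """Return True only if the tool was called and left a non-empty graph."""
    key = f"kg_loop_{loop_id}"
    called = any(
        call["tool"] == "store_knowledge_graph" and call["key"] == key
        for call in state.get("tool_calls", [])
    )
    return called and bool(state.get(key, "").strip())
```

Using a separate key per loop keeps the parallel branches from clobbering each other, and the call record gives a later step something concrete to check.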

Accomplishments that we're proud of

Functional Multi-Agent System: We successfully designed and implemented a complex, multi-step agent pipeline using the Google ADK that processes data from ingestion to a final, merged knowledge graph.
Parallel Iterative Refinement: We're particularly proud of orchestrating the two parallel LoopAgents that independently build and refine knowledge graphs. Getting this concurrent, iterative process to work and feed into a subsequent merge step was a significant achievement.
End-to-End Knowledge Extraction: GraphCycle demonstrates a complete, albeit experimental, workflow for extracting and structuring knowledge, showcasing the potential of autonomous agents for this kind of task.
Overcoming Technical Hurdles: Navigating the challenges of LLM hallucinations and state management to produce a working prototype feels like a substantial accomplishment.

What we learned

This project was an immense learning experience:

The Reality of LLM Limitations: We gained a practical understanding of LLM fallibility, especially concerning truthful tool usage and factual consistency. This highlighted the need for external validation and robust error checking in agentic systems.
Principles of Agent Orchestration: We learned how to structure and coordinate a "swarm" of agents using ADK's SequentialAgent, ParallelAgent, and LoopAgent. This involved thinking critically about data flow, dependencies, and control logic between different AI components.
Effective State Management Strategies: We developed a deeper appreciation for the nuances of managing shared state in a multi-agent system, understanding when implicit mechanisms suffice and when more explicit control is required.
Iterative Development in AI: The importance of iterative prompt engineering, testing, and refinement was constantly reinforced throughout the project.

What's next for GraphCycle

We see several exciting directions for GraphCycle:

Enhanced Hallucination Mitigation: Implement more sophisticated validation steps and potentially a "supervisor" agent to cross-check tool usage and the veracity of generated content.
Richer Review & Feedback Mechanisms: Allow reviewers to provide more structured feedback or even suggest specific RDF triple corrections, enabling more targeted refinement.
True GraphRAG Integration: Explore incorporating the generated knowledge graphs into a RAG pipeline to answer questions or generate summaries based on the structured data.
Advanced Merging Strategies: Develop more sophisticated algorithms for the SynthesisAgent to resolve conflicts and fuse knowledge during the merge step, perhaps incorporating semantic reasoning.
Expanded Data Source Compatibility: Add tools to support a wider range of input data types (e.g., PDFs, web pages).
User Interface & Visualization: Develop a user-friendly interface for interacting with the agent and visualizing the generated knowledge graphs in real time (beyond the current example_output.txt and scratch experiments).

Updates

Matt Hamilton started this project — Jun 03, 2025 10:18 AM EDT