Commit2Viz - Hackathon Submission (aligned to repo scope)

Inspiration

Commit2Viz was inspired by the need to turn raw repository activity into immediately useful, human-friendly documentation. Instead of forcing developers to switch tools to understand what changed and who changed it, we automated the collection of commit context and used an LLM to produce living documentation directly inside Confluence. The project bridges the gap between repository events and readable documentation so teams can onboard faster and review code with better context.

What it actually does

Commit2Viz is a small Atlassian Forge "Hello World" app combined with a GitHub Actions automation that turns repository commits and diffs into curated Confluence documentation via OpenAI:

  • The Forge frontend is a minimal UI (uses @forge/react and @forge/bridge) that demonstrates a resolver call pattern. It invokes a backend resolver named getText.
  • The backend resolver (written with @forge/resolver) currently logs requests and returns "Hello, world!" — acting as a small example/resolver template that can be extended later with smart-blame or commit-summary endpoints.
  • The core automation is a GitHub Actions workflow (.github/workflows/curr-confluence.yml) that:
    • Detects pushes/merges into main and computes the correct compare range (merge-aware).
    • Collects commit history, diffs (code-changes.diff), file statistics, and contributors.
    • Fetches the current Confluence page content for the repository (storage.value).
    • Builds a focused prompt that combines current page content + recent commit context and sends it to OpenAI.
    • Receives clean HTML from OpenAI (prompt is engineered to return Confluence storage-ready HTML), strips any accidental quote wrappers, and PUTs the updated page body back to Confluence — with version bumping and safety/fallback handling.
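The merge-aware compare-range step above can be sketched as follows. This is a minimal sketch, assuming the workflow inspects HEAD's parent count; the function name and plumbing are illustrative, not the repo's exact script:

```shell
#!/usr/bin/env bash
# Sketch: pick a git compare range that is merge-aware.
# A merge commit has two parents; diffing against the first parent (HEAD^1)
# captures everything the merge brought in. A direct push has one parent,
# so we compare against the previous commit instead.
compute_range() {
  local parent_count="$1"   # e.g. from: git rev-list --parents -n 1 HEAD | wc -w
  if [ "$parent_count" -ge 2 ]; then
    echo "HEAD^1..HEAD"     # merge commit: compare against first parent
  else
    echo "HEAD~1..HEAD"     # ordinary push: compare against previous commit
  fi
}

# In the workflow this range would feed the diff extraction, roughly:
#   RANGE="$(compute_range "$PARENTS")"
#   git diff "$RANGE" > code-changes.diff
```

Using the first-parent range for merges is what keeps squash-free merge commits from showing an empty or misleading diff to the model.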

In short: merge or push → workflow gathers context → LLM generates clean HTML → workflow safely updates Confluence.

How we built it

Tech stack and components:

  • JavaScript / Node.js
  • Atlassian Forge:
    • Frontend: @forge/react
    • Resolver: @forge/resolver (getText)
  • GitHub Actions for orchestration (.github/workflows/curr-confluence.yml)
  • OpenAI Chat Completions API (gpt-4 or gpt-3.5-turbo, configured via secrets)
  • Confluence Cloud API (pages endpoint / storage representation)
  • Small helper scripts and packaging to create commit-data.json, encode diffs, and construct openai-prompt.json
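The packaging step can be sketched with jq; the field names in commit-data.json here are assumptions for illustration, not the repo's exact schema:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: assemble commit context into commit-data.json.
# Stand-ins for what the workflow would collect via git:
COMMITS="demo: trigger curr-confluence workflow"   # e.g. git log --oneline "$RANGE"
DIFF_B64="$(printf 'diff --git a/app.js b/app.js' | base64 | tr -d '\n')"

# jq --arg safely escapes the values into valid JSON.
jq -n \
  --arg commits "$COMMITS" \
  --arg diff "$DIFF_B64" \
  '{commit_history: $commits, diff_base64: $diff}' > commit-data.json
```

Letting jq build the JSON (rather than string interpolation) is what keeps quotes and newlines in commit messages from corrupting the artifact.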

Architecture (concise):

  • Event: push or merge to main triggers CI workflow
  • CI: extracts commit/diff context, builds JSON artifacts, fetches current Confluence page
  • AI: CI sends combined prompt to OpenAI and receives HTML
  • Publish: CI safely updates Confluence (with fallback if OpenAI fails)
  • Forge app: small frontend/resolver sample to show a live entrypoint and the intended shape of resolver-driven features
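The AI step in the architecture above can be sketched as a jq-built Chat Completions payload. The system prompt text and field layout here are assumptions, not the repo's exact prompt:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: combine current page content with commit context into openai-prompt.json.
PAGE_HTML="<h1>Project Overview</h1>"                       # stand-in for fetched storage.value
COMMIT_CONTEXT="$(cat commit-data.json 2>/dev/null || echo '{}')"

jq -n \
  --arg page "$PAGE_HTML" \
  --arg ctx "$COMMIT_CONTEXT" \
  '{model: "gpt-4",
    messages: [
      {role: "system",
       content: "Return only Confluence storage-format HTML. No markdown, no JSON, no commentary."},
      {role: "user",
       content: ("Current page:\n" + $page + "\n\nRecent changes:\n" + $ctx)}
    ]}' > openai-prompt.json

# The workflow would then send it, roughly:
#   curl -s https://api.openai.com/v1/chat/completions \
#     -H "Authorization: Bearer $OPENAI_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @openai-prompt.json > openai-response.json
```

Constraining the output format in the system message is the main lever for getting storage-ready HTML back instead of markdown or prose.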

Challenges we ran into

  • Prompt engineering: ensuring the LLM returns clean HTML (no JSON/wrapping or extraneous text)
  • Token limits: preventing overly long diffs from causing truncated or failed AI calls (we truncate and base64-encode diffs to be safe)
  • Merge handling: correctly computing compare ranges for merge commits so the AI sees correct change context
  • Confluence API quirks: managing storage.value and version numbers to avoid conflicts or overwrites
  • Secrets and access: making the workflow robust to missing or misconfigured CONFLUENCE / OPENAI secrets
  • Safety: ensuring the workflow never pushes malformed HTML to Confluence (added checks and a safe fallback path)
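The version-conflict challenge above comes down to reading the current version number and PUTting the new body with version + 1. A minimal sketch, assuming the Confluence REST content endpoint shape (the page JSON here is sample data):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: bump the page version when updating Confluence.
# Sample of what GET /rest/api/content/{id}?expand=body.storage,version returns:
CURRENT_PAGE='{"id":"123","title":"Docs","version":{"number":7},"body":{"storage":{"value":"<p>old</p>"}}}'
NEW_HTML="<p>updated by workflow</p>"

NEXT_VERSION="$(echo "$CURRENT_PAGE" | jq '.version.number + 1')"

echo "$CURRENT_PAGE" | jq \
  --arg html "$NEW_HTML" \
  --argjson v "$NEXT_VERSION" \
  '{id: .id, type: "page", title: .title,
    version: {number: $v},
    body: {storage: {value: $html, representation: "storage"}}}' > confluence-update.json

# curl -s -X PUT "$CONFLUENCE_BASE_URL/rest/api/content/123" \
#   -H "Content-Type: application/json" -u "$CONFLUENCE_USER:$CONFLUENCE_TOKEN" \
#   -d @confluence-update.json
```

If the PUT returns a 409, the page was updated concurrently; re-fetching and retrying with the new version number is the safe recovery.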

Accomplishments we’re proud of

  • End-to-end automation from repository event to Confluence update
  • A robust workflow that is merge-aware and assembles a comprehensive commit-data.json
  • Prompt refinement to return ready-to-insert Confluence HTML, minimizing manual cleanup
  • Safe update flow: version bumping, error handling, fallback to existing page content, and logging of openai-response.json for auditability
  • A minimal Forge UI + resolver example demonstrating how a resolver can be extended to expose smart-blame or commit-summary endpoints

What we learned

  • How to build a reliable orchestration pipeline in GitHub Actions that mixes git math, shell tooling (jq, base64), and HTTP calls
  • The importance of prompt constraints to get deterministic, machine-consumable output from LLMs
  • Practicalities of Confluence storage representation and API versioning
  • How to keep serverless resolvers (Forge) small and idempotent, delegating heavier context assembly to CI
  • Strategies for trimming and encoding large diffs to avoid token and JSON-escaping problems
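The trimming-and-encoding strategy can be sketched in two commands; the 60000-byte cap is an assumed value, not the repo's actual limit:

```shell
#!/usr/bin/env bash
set -euo pipefail

MAX_BYTES=60000   # assumed cap to stay inside the model's token budget

# Stand-in for the diff the workflow extracts:
printf 'diff --git a/app.js b/app.js\n+console.log("hi")\n' > code-changes.diff

# Truncate, then base64-encode so newlines and quotes never need JSON escaping.
head -c "$MAX_BYTES" code-changes.diff | base64 | tr -d '\n' > diff.b64
```

The trade-off: a hard byte cap can cut a hunk mid-line, but the model only needs enough context to summarize, and a truncated diff beats a failed API call.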

What I (mayaoden) specifically worked on

  • Designed and implemented the custom GitHub Actions workflow (.github/workflows/curr-confluence.yml) that:
    • Detects merge commits and computes correct compare ranges
    • Exports commit-history, files_changed, contributors, and base64-encoded diffs
    • Builds the OpenAI prompt payload and sends it to the OpenAI Chat endpoint
    • Processes the LLM response, strips accidental wrappers, validates HTML safety heuristics, and updates Confluence with a version bump
  • Refined prompt engineering and output-safety logic to ensure the model returns clean Confluence-ready HTML and to provide fallback behavior on failure
  • Added logging and JSON artifacts (commit-data.json, openai-prompt.json, openai-response.json, confluence-update.json) to make runs auditable and reproducible during demos
  • Drafted documentation and a Confluence seed HTML fragment to make outputs deterministic and presentable for judges
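The wrapper-stripping and safety-heuristic logic can be sketched as follows. The specific heuristic (output must start with a tag) is an assumption standing in for the repo's actual checks:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: clean model output before publishing, with a safe fallback.
RAW='"<h1>Project Overview</h1>"'   # the model sometimes wraps its HTML in quotes

# Strip one accidental leading/trailing quote if present.
CLEAN="$(echo "$RAW" | sed -e 's/^"//' -e 's/"$//')"

# Only publish output that plausibly looks like markup; otherwise keep the page as-is.
case "$CLEAN" in
  "<"*) echo "$CLEAN" > new-body.html ;;                          # starts with a tag: accept
  *)    echo "fallback: keeping current Confluence content" >&2 ;; # reject: do not overwrite
esac
```

Failing closed here is the point: a skipped update is recoverable on the next merge, while malformed HTML published to the canonical page is not.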

Why this mattered:

  • It made the documentation update pipeline demonstrably robust for the hackathon demo and for future extension into real smart-blame features.

Demo notes (Pull request → CI → Confluence) — what to show

  1. Show the staging Confluence page (before).
  2. Create a small branch and open a PR, then merge it into main (this is what triggers the workflow).
  3. Open GitHub Actions and show the "get curr confluence" run:
    • Inspect "Extract Comprehensive Commit Information" (commit-data.json printed)
    • Inspect "Get All Files Changed Since Last Main" (code-changes.diff printed)
    • Inspect "Send Data to OpenAI for Confluence Update" (openai-response.json printed — show HTML)
    • Inspect "Update Confluence Page with AI Content" (confluence-update-response.json printed; new version number)
  4. Refresh Confluence and show the updated documentation page with the newly inserted sections (Project Overview, Recent Changes, Roadmap, etc.)
  5. Optionally, demonstrate the fallback by temporarily invalidating OPENAI_API_KEY and re-running to show the safe fallback path in logs.

Commands to use during demo (copy/paste)

  • Create branch and push:
      BRANCH="demo/trigger-curr-confluence-$(date +%s)"
      git checkout -b "$BRANCH"
      echo "Demo trigger by mayaoden at $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> DEMO_TRIGGER.txt
      git add DEMO_TRIGGER.txt
      git commit -m "demo: trigger curr-confluence workflow"
      git push -u origin "$BRANCH"
  • Make PR & merge (using GitHub UI or CLI):
      gh pr create --title "demo: trigger curr-confluence workflow" --body "Trigger the workflow for demo" --base main
      gh pr merge --merge --delete-branch
  • Inspect latest workflow logs:
      gh run list --workflow "get curr confluence"
      gh run view <run-id> --log

What’s next for Commit2Viz

  • Expand the resolver to generate structured smart-blame summaries and serve them as REST endpoints for the UI
  • Add a staging Confluence page and a preview step so LLM output can be reviewed before publishing to the canonical page
  • Add unit and integration tests for the workflow components (validate commit-data.json shapes, prompt building)
  • Improve prompt determinism (section anchors, predictable headings) to make diffs between runs smaller and easier to validate
  • Add optional telemetry/metrics for Confluence updates and LLM calls (success/failure, latencies)
