Inspiration

Like many developers, I love building new apps and features, but I'm not good at (and don't enjoy) writing tutorials that explain to users and customers how to use what I've built.

That's why the Agentic MiniTutorial Generator was created. Given a simple prompt describing the goal (for example, using a new feature or finding something in the app), it generates a short, step-by-step tutorial for users to follow, including screenshots and a video. This saves developers and product managers time and effort, letting them focus on what matters most about their products: the value they add for users.

What it does

The Agentic MiniTutorial Generator lets you generate a tutorial (step-by-step text, screenshots, and video) from a simple prompt (a more detailed prompt gives the model better context). It replicates user behavior, opening the app and executing each action the model decides on, while receiving feedback from each screen and action.

The Agentic MiniTutorial Generator allows you to:

  • Autonomous Navigation - Browses websites, clicks, scrolls, and types autonomously
  • Tutorial Generation - Creates structured mini-tutorials from completed tasks
  • Session Recording - Records full video + screenshots of each step
  • Safety System - Prompts for user confirmation on sensitive actions
  • Configurable - Custom turn limits, safety rules, and headless mode
  • Organized Output - Saves each tutorial in its own folder with all media
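To make the "Tutorial Generation" and "Organized Output" ideas concrete, here is a minimal sketch of what a generated tutorial could look like as a data structure rendered to markdown. The class and field names are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass, field

@dataclass
class TutorialStep:
    """One step of a generated mini-tutorial (illustrative structure)."""
    number: int
    instruction: str   # human-readable instruction written by the model
    screenshot: str    # filename of the screenshot captured for this step

@dataclass
class Tutorial:
    title: str
    steps: list[TutorialStep] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the tutorial as a markdown document, one section per step."""
        lines = [f"# {self.title}", ""]
        for step in self.steps:
            lines.append(f"## Step {step.number}: {step.instruction}")
            lines.append(f"![Step {step.number}]({step.screenshot})")
            lines.append("")
        return "\n".join(lines)

# Example: a two-step tutorial saved alongside its media files
tutorial = Tutorial("Create a new project")
tutorial.steps.append(TutorialStep(1, "Open the dashboard", "step_1.png"))
tutorial.steps.append(TutorialStep(2, "Click 'New project'", "step_2.png"))
md = tutorial.to_markdown()
```

In this shape, each tutorial maps naturally to its own folder: the markdown file plus the screenshots and video it references.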

How we built it

The platform was built using:

  • Python
  • Vertex AI (on Google Cloud)
  • Google GenAI SDK
  • Playwright (to execute the actions specified by the model and to capture screenshots and video)
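To give a sense of how these pieces fit together, here is a simplified sketch of the execute loop: the model proposes a list of actions, and each one is dispatched to the browser with a screenshot after every step. The JSON action format is an assumption for illustration, and a stub object stands in for a real Playwright page so the sketch runs anywhere.

```python
import json

class StubPage:
    """Stands in for a Playwright page so this sketch runs without a browser."""
    def __init__(self):
        self.log = []
    def goto(self, url): self.log.append(("goto", url))
    def click(self, selector): self.log.append(("click", selector))
    def fill(self, selector, text): self.log.append(("fill", selector, text))
    def screenshot(self, path): self.log.append(("screenshot", path))

def execute_plan(page, plan_json):
    """Dispatch each model-proposed action to the page, screenshotting each step."""
    actions = json.loads(plan_json)
    for i, action in enumerate(actions, start=1):
        kind = action["action"]
        if kind == "goto":
            page.goto(action["url"])
        elif kind == "click":
            page.click(action["selector"])
        elif kind == "type":
            page.fill(action["selector"], action["text"])
        else:
            raise ValueError(f"unknown action: {kind}")
        page.screenshot(path=f"step_{i}.png")  # one screenshot per step
    return len(actions)

# A plan as the model might emit it (hypothetical format)
plan = json.dumps([
    {"action": "goto", "url": "https://example.com"},
    {"action": "click", "selector": "text=Sign up"},
])
page = StubPage()
n = execute_plan(page, plan)
```

With a real Playwright page, the same loop would also run inside a browser context configured to record video, so the full session comes out as a single recording.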

Challenges we ran into

The main challenge was blending the model's output with a user-friendly navigation flow, so the agent browses like a first-time user while explaining each step like a tutorial author. Another was making sure that whenever the model hit something it shouldn't handle on its own (such as entering passwords or solving a captcha), it would stop and ask the developer to do it.
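The "ask the developer" behavior can be sketched as a simple gate in front of each action: sensitive actions require explicit confirmation, everything else runs automatically. The pattern list and function names here are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical patterns that mark an action as sensitive
SENSITIVE_PATTERNS = ("password", "captcha", "credit card", "2fa code")

def requires_human(action_description: str) -> bool:
    """Return True if the proposed action should be handed to the developer."""
    desc = action_description.lower()
    return any(pattern in desc for pattern in SENSITIVE_PATTERNS)

def confirm_or_defer(action_description: str, ask=input) -> bool:
    """Ask the developer before a sensitive action; auto-approve safe ones."""
    if not requires_human(action_description):
        return True
    answer = ask(f"Sensitive action: {action_description!r}. Proceed? [y/N] ")
    return answer.strip().lower() == "y"
```

Passing `ask` as a parameter keeps the gate testable and lets the confirmation prompt be swapped for a UI dialog later.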

Accomplishments that we're proud of

Finding a real problem (one I deal with all the time as a developer) and building a solution not only for myself, but for any other interested developer and product manager.

What we learned

How to use the Computer Use tool to perform real actions on the web with the Gemini model, without having to over-explain tasks to it (or train a new model). I also learned how to use Vertex AI and the Google GenAI SDK to make the agent easier to build and use.

What's next for Agentic MiniTutorial Generator

  • Add voice narration to generated videos.
  • Improve video generation so recordings don't waste the idle seconds at the start.
  • Open the platform to other types of actions (not just tutorials).
