Inspiration

Ever wondered how many hours you could save as a TA if your professor just gave you the tutorial slides? Or wished you could zoom through that tedious presentation? That's precisely what we aim to make possible with Vorto.

What it does

Vorto is a mobile app that creates AI-generated slides from PDF notes. It summarizes the notes and synthesizes AI-generated images to build the slides.

How we built it

The model pipeline is as follows:

  • The user selects a PDF from their phone.
  • The PDF is run through an OCR model to extract text.
  • This text is then summarized using a text summarization service (MeaningCloud API).
  • The summarized text is further processed with Rapid Automatic Keyword Extraction (RAKE).
  • These keywords are used as slide titles and are also passed into a pre-trained HuggingFace diffusion model to generate images.

The core app is built with Flutter. Image generation is done in Python with a Flask backend.
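The keyword-extraction step of the pipeline can be illustrated with a small, self-contained version of RAKE. This is a minimal sketch of the general technique, not the code Vorto actually runs: the stopword list is deliberately tiny, and the function names `candidate_phrases` and `rake` are made up for this example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption); real RAKE setups use a much longer one.
STOPWORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "for", "is", "are",
    "with", "by", "that", "this", "it", "as", "be", "can", "from", "into",
}


def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text, top_n=5):
    """Return the top_n (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # co-occurrence count, including the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    # A phrase's score is the sum of degree/frequency over its words.
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }
    # Sort by score descending, then alphabetically so ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    summary = (
        "Gradient descent is an optimization algorithm. "
        "Gradient descent minimizes a loss function."
    )
    for phrase, score in rake(summary, top_n=3):
        print(f"{score:.1f}  {phrase}")
```

Longer phrases made of words that co-occur often score highest, which is why the top results tend to read like slide titles rather than single words.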

[Figure: Model Pipeline]

Challenges we ran into

  • The first challenge we ran into was finding a good text summarization API. We were unable to use Azure due to registration issues, and most other APIs weren't easily accessible to us.
  • This was our first time calling RESTful APIs from Flutter and using Dio. There were a few quirks we had to get used to quickly.
  • By far the biggest challenge was the image generation model. Most models of this sort are too large to run reliably without significant GPU support. We also tried OpenAI's model, but in the end chose our own lightweight model, which produced better images and can generate them on the CPU. Hosting the server was also quite difficult due to long inference times.

Accomplishments that we're proud of

We created a novel "notes to slides" app using deep learning models in 24 hours. Not only is the idea very cool, but we believe the project also has very practical uses. We built a clean, minimalistic UI and a complex deep-learning pipeline with a team mainly composed of first-time hackers.

What we learned

We learnt a lot about backend development, creating REST APIs, and building deep ML pipelines from a handful of APIs and custom models. We also learnt how to use Flask and how to integrate deep learning models with Flutter without the use of prewritten model-specific libraries.
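To give a flavour of what exposing a model to the Flutter app over REST can look like, here is a minimal Flask sketch. It is an illustration under assumptions, not our actual backend: the `/generate` route, the JSON field names, and the `generate_image` placeholder (which returns dummy bytes instead of calling a model) are all hypothetical.

```python
import base64

from flask import Flask, jsonify, request

app = Flask(__name__)


def generate_image(keyword: str) -> bytes:
    """Hypothetical stand-in for the model call; returns dummy bytes, not a real image."""
    return f"image-for:{keyword}".encode("utf-8")


@app.route("/generate", methods=["POST"])
def generate():
    # Expect a JSON body such as {"keywords": ["neural network", "loss function"]}.
    payload = request.get_json(silent=True) or {}
    keywords = payload.get("keywords")
    if not isinstance(keywords, list) or not keywords:
        return jsonify(error="expected a non-empty 'keywords' list"), 400

    # One slide per keyword: the keyword is the title, the image travels as base64 text.
    slides = [
        {
            "title": str(kw),
            "image_b64": base64.b64encode(generate_image(str(kw))).decode("ascii"),
        }
        for kw in keywords
    ]
    return jsonify(slides=slides)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Because the app only ever sees plain JSON over HTTP, the client needs nothing model-specific: it posts the keywords and decodes the base64 images it gets back.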

What's next for Vorto

There are many things in store for Vorto. Firstly, we'd like to further train our own generative models, tailored specifically to creating presentations. Secondly, we'd like to let users choose the pictures for their presentation from a selection of options; long image generation times kept us from doing this during the hackathon. We'd also like to improve how multiple slides are created. We strongly believe this project has a wide range of uses and would like to keep working on it.
