Note: The thumbnail image was made using this project!

Inspiration

Visum stands for Visual Summary.

I wanted to combine two awesome AI models to build something.

The idea of turning a paragraph into an image for use in a blog post popped into my head and was too cool not to try.

What it does

Given a paragraph, it turns it into an image by first summarizing the paragraph in one line and then feeding that line to GLIDE (https://github.com/openai/glide-text2im), a text-to-image model.
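To make the two-step flow concrete, here is a minimal sketch of it in Python. It is not the project's actual code: `visual_summary`, `fake_summarize`, and `fake_generate_image` are hypothetical names, and the two stand-in functions only mimic what the real summarizer and GLIDE would do so the example runs without downloading any weights.

```python
from typing import Callable, Tuple


def visual_summary(
    paragraph: str,
    summarize: Callable[[str], str],
    generate_image: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Summarize a paragraph in one line, then turn that line into an image."""
    # Collapse newlines and repeated spaces so the summarizer sees clean text.
    text = " ".join(paragraph.split())
    if not text:
        raise ValueError("paragraph must not be empty")
    caption = summarize(text)        # step 1: paragraph -> one-line summary
    image = generate_image(caption)  # step 2: one-line summary -> image bytes
    return caption, image


# Stand-ins (assumptions for illustration only): a real app would call the
# summarization model and GLIDE here instead.
def fake_summarize(text: str) -> str:
    # Pretend the first sentence is the summary.
    return text.split(".")[0].strip()


def fake_generate_image(prompt: str) -> bytes:
    # Pretend these bytes are an encoded image.
    return f"<image for: {prompt}>".encode("utf-8")


caption, image = visual_summary(
    "A corgi naps in the sun. It dreams of treats.",
    fake_summarize,
    fake_generate_image,
)
print(caption)
```

Passing the two models in as plain functions keeps the pipeline easy to test, and either model could be swapped out later without touching the rest.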

How we built it

- Front-end: Svelte
- Back-end: Flask
- Summarization: "snrspeaks/t5-one-line-summary" from HuggingFace
- Text to Image: GLIDE

Challenges we ran into

- Model weights are large.
- The code uses a lot of RAM and GPU memory.
- I know nothing about front-end, let alone Svelte (literally Googled how to center a div, still no idea).
- Figuring out how to make the front-end talk to the back-end, have the back-end load the model and get predictions, and then send the results back to the front-end.
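The front-end/back-end round trip from the last point can be sketched as a single request handler. This is an illustration under assumptions, not the code that shipped: `handle_generate`, the `paragraph`, `caption`, and `image_base64` field names, and `fake_pipeline` are all hypothetical. The handler is written as a plain function so it runs without a web server. In a Flask app it could be called from a route with the parsed JSON body, and its return value passed through `jsonify`.

```python
import base64
from typing import Callable, Dict, Optional, Tuple


def handle_generate(
    payload: Optional[dict],
    pipeline: Callable[[str], Tuple[str, bytes]],
) -> Tuple[Dict[str, str], int]:
    """Turn a JSON-like request body into a JSON-like response and status code."""
    if not isinstance(payload, dict):
        payload = {}
    paragraph = payload.get("paragraph", "")
    if not isinstance(paragraph, str) or not paragraph.strip():
        return {"error": "field 'paragraph' is required"}, 400

    caption, image_bytes = pipeline(paragraph)
    # Raw image bytes cannot go into JSON directly, so send them as base64
    # text that the front-end can drop into a data URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"caption": caption, "image_base64": encoded}, 200


# Stand-in for the real summarize-then-generate pipeline (assumption).
def fake_pipeline(paragraph: str) -> Tuple[str, bytes]:
    return paragraph[:20], b"\x89PNG fake bytes"


body, status = handle_generate(
    {"paragraph": "Visum turns text into pictures."}, fake_pipeline
)
print(status, body["caption"])
```

Because the weights are large, the real pipeline would most likely be loaded once when the server starts and reused across requests, rather than reloaded inside the handler.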

Accomplishments that we're proud of

It works!

What we learned

- This was my first time using Svelte.
- Making a UI is hard.
- Getting even a simple front-end working with a back-end is hard enough; adding a deep learning model on top is too much (I'm still going to keep trying, though).
- It's hard to find an ML model that works out-of-the-box (HuggingFace saved me!).

What's next for Visum

No idea!
