blogify


Inspiration

With the theme 'Explore Together', some of the first things that came to mind for us were learning and accessibility. In the current environment, a lot of our education relies on videos, which can be difficult for people with visual impairments. Text-based products work much better with screen readers and other assistive technologies, and many people may even prefer text for its ease of use and quick access to information.

From this, we decided to try to make a service that would allow people to summarize videos and customize the output. We also wanted to use some emerging technologies, such as sentence boundary detection and scene detection, to enhance the experience.

This is our first long-duration hackathon, so we decided to aim big and try to build the product we envisioned. That's where blogify came from.

What it does

blogify is a tool that helps people summarize YouTube videos (and hopefully soon other videos, such as lecture recordings) and present them in a clean format. Using various libraries, the service skims through the video, picking out parts where the scene likely changed, as well as processing the subtitles (including auto-generated ones) to add grammar and context. With this data, blogify provides a full editor that allows users to modify the sections of the video, edit the text, and select key frames to display. All of this is then exported to a Markdown-like viewer that can be saved for future use.
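
As a rough illustration of how detected scene changes and subtitles could be combined into editable sections, here is a minimal sketch. The `Cue` and `Section` shapes and the `groupCuesByScene` function are hypothetical names chosen for this example, not blogify's actual data model; the sketch assumes scene changes arrive as a list of timestamps in seconds.

```typescript
// Hypothetical shapes for illustration; not blogify's real data model.
interface Cue {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

interface Section {
  start: number; // timestamp of the scene change that opens this section
  text: string; // subtitle text that falls inside the section
}

// Assign each cue to the latest scene boundary at or before its start time.
function groupCuesByScene(cues: Cue[], sceneStarts: number[]): Section[] {
  // Always include 0 so cues before the first detected change have a section.
  const starts = Array.from(new Set([0, ...sceneStarts])).sort((a, b) => a - b);
  const sections: Section[] = starts.map((start) => ({ start, text: "" }));

  for (const cue of cues) {
    let idx = 0;
    while (idx + 1 < starts.length && starts[idx + 1] <= cue.start) {
      idx++;
    }
    const section = sections[idx];
    section.text = section.text ? `${section.text} ${cue.text}` : cue.text;
  }

  // Drop scenes that contain no speech at all.
  return sections.filter((section) => section.text.length > 0);
}
```

Keeping sections as plain data like this would let an editor merge, split, or retitle them without reprocessing the video.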

How we built it

The backend is a Node.js web server built with the Fastify library, backed by a Postgres database for persistent storage. The entire project was written in TypeScript to help us avoid bugs and write better code.

The backend handles requests to create a WebSocket, which allows for real-time progress updates during processing.
Video files are processed using ffmpeg, youtube-dl, scenedetect, and several other libraries.
Chosen frame images are uploaded to S3-compatible Linode Object Storage for fast, reliable delivery to the client.
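
The real-time progress updates described above could be represented as small JSON messages. The sketch below is an assumption for illustration only: the stage names, the `ProgressMessage` shape, and the equal weighting of stages are all hypothetical and are not taken from blogify's code.

```typescript
// Hypothetical stage names and message shape, for illustration only.
type Stage = "download" | "scenes" | "subtitles" | "upload";

const STAGES: Stage[] = ["download", "scenes", "subtitles", "upload"];

interface ProgressMessage {
  type: "progress";
  stage: Stage;
  percent: number; // overall progress, 0..100
}

// Convert progress within one stage (0..1) into an overall percentage,
// weighting every stage equally.
function toProgressMessage(stage: Stage, fraction: number): ProgressMessage {
  const clamped = Math.min(1, Math.max(0, fraction));
  const completedStages = STAGES.indexOf(stage);
  const percent = Math.round(((completedStages + clamped) / STAGES.length) * 100);
  return { type: "progress", stage, percent };
}

// The string a server could pass to a WebSocket's send() method.
function encodeMessage(message: ProgressMessage): string {
  return JSON.stringify(message);
}
```

For example, `toProgressMessage("scenes", 0.5)` reports 38 percent overall, since one of four stages is complete and the second is half done.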

The client is a single-page application (SPA) written in React. Together with the WebSockets mentioned above, this creates a seamless experience between pages.

The client requests a video from the server, and then displays real-time processing information.
Once processing finishes, the data is received from the server and displayed in an editor, which handles all of the modification client side.
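
One way to model this flow on the client is a small reducer that moves from a loading view to the editor as server messages arrive. This is only a sketch under assumed names: `ServerMessage`, `ViewState`, and `reduceView` are hypothetical and do not come from the blogify source.

```typescript
// Hypothetical message and state shapes, for illustration only.
type ServerMessage =
  | { type: "progress"; percent: number }
  | { type: "done"; sections: string[] }
  | { type: "error"; reason: string };

type ViewState =
  | { view: "loading"; percent: number }
  | { view: "editor"; sections: string[] }
  | { view: "failed"; reason: string };

// Pure state transition: given the current view and one server message,
// return the next view. Suitable for use with React's useReducer.
function reduceView(state: ViewState, message: ServerMessage): ViewState {
  switch (message.type) {
    case "progress":
      // Ignore late progress messages once we have left the loading view.
      if (state.view !== "loading") return state;
      // Never let the progress bar move backwards.
      return { view: "loading", percent: Math.max(state.percent, message.percent) };
    case "done":
      return { view: "editor", sections: message.sections };
    case "error":
      return { view: "failed", reason: message.reason };
  }
}
```

Because the reducer is a pure function, it can be exercised without a real socket by folding a list of messages over an initial `{ view: "loading", percent: 0 }` state.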

Challenges we ran into

Creating a custom editor in 2 days was very difficult, especially since we were very ambitious with our goals for it. We were able to get most of our features working though.
JS and TS can be complicated, and we definitely reached a level of callbacks that no one could ever enjoy.
We definitely overestimated how much work 2 people can get done in 36 hours (we would like some sleep!).

Accomplishments that we're proud of

Fully functioning backend storage and processing: We successfully implemented the libraries and connections to be able to download, process, and export + persist key video details.
Custom subtitle parsing: YouTube's auto-generated subtitles are super messy, so with the help of some open source code we were able to clean them up and output them in a standard format for parsing.
Our slick, feature-filled editor: We had big goals for this editor, and we ended up being able to accomplish a lot with it. It should also allow us to expand on it in the future, which was one of our goals.
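
To give a sense of the cleanup that custom subtitle parsing involves, here is a minimal sketch. It is not the open source code mentioned above. It assumes the auto-generated captions contain inline tags in angle brackets (for example per-word timestamps) and that rolling captions repeat the previous line in the next cue; `stripInlineTags` and `cleanCaptionLines` are hypothetical helper names.

```typescript
// Remove anything in angle brackets (assumed to be inline timing or styling
// tags) and normalize whitespace.
function stripInlineTags(line: string): string {
  return line.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

// Collapse rolling captions: keep a line only if it is non-empty and differs
// from the most recently kept line.
function cleanCaptionLines(rawLines: string[]): string[] {
  const kept: string[] = [];
  for (const raw of rawLines) {
    const line = stripInlineTags(raw);
    if (line === "") continue;
    if (kept.length > 0 && kept[kept.length - 1] === line) continue;
    kept.push(line);
  }
  return kept;
}
```

Given the lines `hello<00:00:00.500><c> world</c>`, `hello world`, and `this<00:00:02.500><c> is</c>`, the function returns `hello world` followed by `this is`.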

What we learned

Making a project that seemed actually useful was super enjoyable, and it didn't really feel like work because of that.
Plan good midway milestones: our whole product depended on the hardest part, so we had to spend a lot of time finishing it before anything else worked.

What's next for blogify

Here are some ideas we have for the future:

Use OCR to read text-based videos, such as lecture presentations
Polish everything to have better user flows, and other save methods (logins, etc.)
Use audio processing to avoid relying solely on downloaded captions

Submitted to

Bitcamp 2021
- Winner Best Hack Promoting Expansion of Education Opportunities - Bloomberg LP

Created by

I implemented the back-end services, including prototyping and designing endpoints and WebSockets. I researched a lot about video and subtitle formats in order to create the parser for the transcript feature.

Drew Pleat
I came up with the general architecture and performed exploratory analysis for scene transition detection. For the actual coding, I built the frontend using React and TypeScript, following the Neumorphism design language.

David Kim