Podcast Processor


Inspiration

As a hobbyist podcast host, I recently set up a website to accompany my podcast. The site is fully static and was quick to set up, but I found it tedious to create and update pages by hand whenever I released a new episode. That inspired me to build an app that uses ML to generate the page content from the podcast episode itself.

What it does

At a high level, the Podcast Processor takes an uploaded MP3 and uses Amazon Transcribe to convert the spoken conversation to text. The transcription is then passed to a summarizer: an Amazon SageMaker endpoint backed by the Mphasis DeepInsights Text Summarizer model from AWS Marketplace (https://aws.amazon.com/marketplace/pp/prodview-uzkcdmjuagetk). The summarizer tackles information overload by condensing long documents into a few sentences. Once the transcription is summarized, the episode is published to a web app, with the summary used as the episode description and the full transcription published alongside it.
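To make the flow above concrete, here is a minimal Python sketch of the step that turns a Transcribe result into a publishable episode record. It is an illustration under stated assumptions, not the project's actual code: the function names (`extract_transcript`, `build_episode`, `sagemaker_summarizer`) and the episode field names are hypothetical, and the `text/plain` content type is a guess, since the real request and response format is defined by the Marketplace model's documentation. The only parts taken from AWS itself are the shape of the Transcribe output JSON (`results.transcripts[0].transcript`) and the `invoke_endpoint` call on the `sagemaker-runtime` client.

```python
from typing import Callable


def extract_transcript(transcribe_output: dict) -> str:
    """Return the full transcript text from an Amazon Transcribe result document."""
    return transcribe_output["results"]["transcripts"][0]["transcript"]


def build_episode(title: str, transcribe_output: dict,
                  summarize: Callable[[str], str]) -> dict:
    """Build a hypothetical episode record: summary as description, plus full transcript."""
    transcript = extract_transcript(transcribe_output)
    return {
        "title": title,
        "description": summarize(transcript),
        "transcript": transcript,
    }


def sagemaker_summarizer(endpoint_name: str,
                         content_type: str = "text/plain") -> Callable[[str], str]:
    """Wrap a SageMaker endpoint as a summarize(text) callable.

    Assumption: the endpoint accepts raw UTF-8 text and returns the summary
    as UTF-8 text. Check the model's documentation for the real format.
    """
    import boto3  # imported lazily so the helpers above work without AWS access

    runtime = boto3.client("sagemaker-runtime")

    def summarize(text: str) -> str:
        response = runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType=content_type,
            Body=text.encode("utf-8"),
        )
        return response["Body"].read().decode("utf-8")

    return summarize
```

Passing the summarizer in as a callable keeps the publishing logic testable without a live endpoint: a stub function can stand in for `sagemaker_summarizer("my-endpoint")` during local development.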

How I built it

This application is built with Amazon Transcribe, Amazon SageMaker, and a series of AWS Lambda functions orchestrated with AWS Step Functions, with Amazon S3 for storage.
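As a rough picture of how such an orchestration could look, the sketch below builds a Step Functions state machine definition (Amazon States Language) as a Python dictionary. Everything specific in it is an assumption for illustration: the state names, the Lambda function names, the placeholder account ID and region in the ARNs, the `$.status` field, and the 30-second polling interval are all hypothetical, and the real workflow may be structured differently. The polling loop is included because Transcribe jobs run asynchronously.

```python
import json

# Placeholder ARN prefix: the region and account ID are made up for this sketch.
LAMBDA_ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"


def task(function_name, next_state=None):
    """Build a Task state that invokes a (hypothetical) Lambda function."""
    state = {"Type": "Task", "Resource": LAMBDA_ARN_PREFIX + function_name}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


STATE_MACHINE = {
    "Comment": "Illustrative podcast pipeline: transcribe, summarize, publish",
    "StartAt": "StartTranscription",
    "States": {
        "StartTranscription": task("start-transcription", "WaitForTranscription"),
        "WaitForTranscription": {
            "Type": "Wait",
            "Seconds": 30,
            "Next": "CheckTranscription",
        },
        "CheckTranscription": task("check-transcription", "IsTranscriptionDone"),
        "IsTranscriptionDone": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.status", "StringEquals": "COMPLETED",
                 "Next": "SummarizeTranscript"},
                {"Variable": "$.status", "StringEquals": "FAILED",
                 "Next": "TranscriptionFailed"},
            ],
            "Default": "WaitForTranscription",  # still running: poll again
        },
        "SummarizeTranscript": task("summarize-transcript", "PublishEpisode"),
        "PublishEpisode": task("publish-episode"),
        "TranscriptionFailed": {
            "Type": "Fail",
            "Error": "TranscriptionFailed",
            "Cause": "The transcription job did not complete",
        },
    },
}


def dangling_targets(definition):
    """Return any transition targets that are not defined as states."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return sorted(targets - set(states))


if __name__ == "__main__":
    print(json.dumps(STATE_MACHINE, indent=2))
```

The JSON printed by this script is the kind of definition that can be pasted into the Step Functions console or supplied when creating a state machine, once the placeholder ARNs are replaced with real ones.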

Challenges I ran into

The biggest challenge has been tuning Amazon Transcribe to better transcribe my voice and the rambling I do in my podcast episodes. It's still a work in progress.

Accomplishments that I'm proud of

It was a lot of fun putting AWS Step Functions into action to create an end-to-end process that spans multiple AWS services.

What I learned

As a full-stack developer who is not very familiar with the ML and SageMaker side of the equation, I learned how easy it can be to set up and use ML models from AWS Marketplace. The model was easy to set up, and its documentation clearly describes the structure of the API's input request and output.