Inspiration
University syllabi are dense, inconsistent, and easy to overlook. We wanted a way to turn static PDFs into something students could immediately use without manually copying deadlines into a calendar.
What it does
CourseTrack lets users upload a syllabus PDF and automatically extracts assignments, quizzes, exams, projects, and other deadlines using AI-assisted parsing.
Users can:
- Review and edit extracted events
- Generate an .ics calendar file
- Optionally sync events directly to Google Calendar
- Generate study plans
- Create study guide PDFs from selected deadlines
- Connect with other students in the same course via Discord
How we built it
The backend is built with Flask and Python.
We use pdfplumber to extract raw text from uploaded PDFs and OpenRouter (Gemini) to structure that text into normalized event data. Extracted events are stored in MongoDB, and a content-based hashing system prevents duplicate API calls for identical uploads.
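A minimal sketch of how content-based hashing can skip repeat API calls, under some assumptions: we hash the raw uploaded bytes with SHA-256, and a plain dictionary stands in for the MongoDB collection. The names `content_hash`, `get_events`, and `parse` are hypothetical and only illustrate the idea.

```python
import hashlib
from typing import Callable

# Hypothetical in-memory stand-in for the MongoDB collection of parsed events.
_cache: dict = {}


def content_hash(pdf_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the uploaded file's bytes."""
    return hashlib.sha256(pdf_bytes).hexdigest()


def get_events(pdf_bytes: bytes, parse: Callable[[bytes], list]) -> list:
    """Return cached events for an identical upload; call `parse` only on a miss.

    `parse` represents the expensive step (text extraction plus the LLM call).
    """
    key = content_hash(pdf_bytes)
    if key not in _cache:
        _cache[key] = parse(pdf_bytes)
    return _cache[key]
```

Because the key depends only on the file contents, re-uploading the same syllabus under a different filename would still hit the cache in this sketch.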
Calendar files are generated using the iCalendar (.ics) format, and Google Calendar sync is handled through OAuth.
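For illustration, here is a rough sketch of writing all-day deadline events in iCalendar format using only the standard library. The event shape (`title`, `due`), the `PRODID` string, and the `coursetrack.example` UID domain are assumptions made for this example, not a description of the project's actual code.

```python
from datetime import date, datetime, timezone
from typing import Optional


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def build_ics(events: list, now: Optional[datetime] = None) -> str:
    """Build a calendar with one all-day VEVENT per deadline."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//CourseTrack//Sketch//EN"]
    for index, event in enumerate(events):
        lines += [
            "BEGIN:VEVENT",
            f"UID:{index}-{stamp}@coursetrack.example",  # placeholder domain
            f"DTSTAMP:{stamp}",
            f"DTSTART;VALUE=DATE:{event['due'].strftime('%Y%m%d')}",
            f"SUMMARY:{escape_text(event['title'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

The sketch skips line folding for long lines and time-of-day deadlines, both of which a production exporter would likely need.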
The frontend is built with HTML, CSS, and vanilla JavaScript.
Challenges we ran into
- Syllabus formatting varies widely, which made extraction inconsistent across documents.
- LLM responses sometimes returned markdown-wrapped JSON, requiring additional cleaning and parsing logic.
- Designing a caching system that avoided redundant API calls while keeping the output structure consistent took careful handling. We keyed the cache on a hash of a value tied to each PDF, computed with Python's built-in hashlib module.
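The markdown-wrapped JSON problem can be handled with a small cleaning step before parsing. The following is a sketch of one way to do it, assuming the model wraps the whole payload in a single code fence; `parse_llm_json` is a hypothetical helper name.

```python
import json
import re

# Three backticks, built programmatically so the pattern stays easy to read.
FENCE = "`" * 3

# Matches an optional language tag after the opening fence and captures the body.
_FENCED = re.compile(
    r"^" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?" + FENCE + r"$",
    re.DOTALL,
)


def parse_llm_json(raw: str):
    """Strip a surrounding markdown code fence, if present, then parse JSON."""
    text = raw.strip()
    match = _FENCED.match(text)
    if match:
        text = match.group(1)
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    return json.loads(text)
```

Unfenced responses pass straight through to `json.loads`, so the same function works whether or not the model adds the wrapper.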
Accomplishments that we're proud of
- Building a reliable PDF → structured deadline extraction pipeline.
- Implementing content-based hashing to significantly reduce repeated API calls.
- Creating a clean review flow so users can correct AI output before exporting.
- Successfully integrating Google OAuth and direct calendar syncing.
What we learned
We learned that LLM output must always be validated and normalized before storage. We also saw how important deterministic caching is when working with external APIs, for both performance and cost control. Handling real-world PDF inconsistencies was more complex than expected.
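As a rough illustration of validating and normalizing model output before storage, the sketch below checks a single event. The field names (`title`, `type`, `due`), the ISO date requirement, and the category set are assumptions chosen for the example.

```python
from datetime import date

# Assumed categories, mirroring the deadline kinds the app extracts.
ALLOWED_TYPES = {"assignment", "quiz", "exam", "project", "other"}


def normalize_event(raw: dict) -> dict:
    """Return a cleaned event, or raise ValueError if it cannot be trusted."""
    title = str(raw.get("title", "")).strip()
    if not title:
        raise ValueError("event has no title")

    # Unknown or missing categories fall back to "other" instead of failing.
    kind = str(raw.get("type", "other")).strip().lower()
    if kind not in ALLOWED_TYPES:
        kind = "other"

    # Require an ISO date (YYYY-MM-DD); anything else is rejected.
    try:
        due = date.fromisoformat(str(raw.get("due", "")).strip())
    except ValueError as exc:
        raise ValueError(f"bad due date for {title!r}") from exc

    return {"title": title, "type": kind, "due": due.isoformat()}
```

Rejected events could be surfaced in the review step for the user to fix rather than silently dropped.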
Next Steps
Deploy!