AIntegrator/intern-fall-winter-2025-task

RFC – Timeout Handling and Retry in LLM API Calls

Background

We recently got this error in our logs:

LLM API call failed: 504 Server Error: Gateway Time-out for url: https://api.xyz.com/chat/completions

This happened in our summarization service, the microservice responsible for summarizing transcripts.

We found additional logs related to the error:

Got transcript text from database for transcript with ID: xyz.
Calling LLM API with 194 seconds timeout...

The transcript with ID xyz was a large piece of text, and the LLM API call took longer than the 194-second timeout, so the call failed.

However, timing out and losing the transcript is not ideal: it wastes expensive GPU resources and leaves the user without a summary.

We should handle this case more gracefully.

Improve the code with a retry mechanism using tenacity and a better way of estimating the timeout (in seconds) for each LLM API call.

Steps

  1. Clone the repository (if you fork, other applicants will see your code)
  2. Create a new branch for your changes
  3. Make the necessary changes to the codebase in the branch
  4. Commit your changes with a clear message and create a well-documented pull request in your clone, describing your changes and the reasoning behind them
  5. Send us the link to your pull request
