Inspiration
We (as in all of us) do not like writing documentation.
What it does
So, it writes documentation. But it doesn't stop there: it also deploys a server that hosts your documentation live for anyone else to view.
How we built it
We used Python and Flask for the back end, with NVIDIA Nemotron as our LLM of choice. We use LangChain to add agentic features on top of Nemotron. Docusaurus serves all of our documentation, with each Docusaurus server running in its own Docker container. For the front end, we use Vite + React for a fast and efficient UI.
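To give a feel for the one-container-per-site setup, here is a minimal sketch of how the back end might assemble the `docker run` command for each Docusaurus server. The image name, container naming scheme, and port mapping are illustrative assumptions, not our exact configuration.

```python
# Hypothetical sketch: each generated Docusaurus site runs in its own
# Docker container, mapped to a unique host port. Names are placeholders.

def docusaurus_run_command(project: str, host_port: int) -> list[str]:
    """Build the `docker run` command for one documentation server."""
    return [
        "docker", "run", "-d",
        "--name", f"docs-{project}",       # one container per project
        "-p", f"{host_port}:3000",         # Docusaurus serves on port 3000 inside
        "docs-image",                      # assumed image name
    ]

cmd = docusaurus_run_command("my-repo", 8101)
print(" ".join(cmd))
```

In a real deployment this list would be handed to `subprocess.run`, with the host port tracked per project so each site gets its own address.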
Challenges we ran into
The problems we ran into were mostly about LLM context size. Since this project processes many documents at once, and LLMs are infamous for hallucinating as context grows, this proved difficult to overcome. The time allotted was too short to properly set up a vector database, which the project really needed for semantic search. To remedy this, we enforced a 64k context limit and tuned our inference server to handle it. We also built a directed graph of every function used in the project and how it interconnects with the rest of the codebase. This lets us effectively cut down the number of tokens we feed to the LLM. Not only does this keep the LLM happy in terms of context size, it also keeps our HTTP request payloads small, since we aren't dumping oversized requests into the server.
Accomplishments that we're proud of
What we are most proud of is fixing issues that seemed completely detrimental to the project. We are also proud of how fast our project processes Git repositories: by cloning with depth=1 we trim the Git history and cut out most of the bloat that comes with Git repositories. Even though there are areas we could improve, we are proud of the solutions we came up with given the time crunch we were in. We're also proud of figuring out how to automatically assign a new NGINX domain to each new documentation server, so we can add as many documentation servers as we want, each with its own unique URL.
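The NGINX auto-assignment can be pictured as generating one small server block per documentation server. The domain and port scheme here are hypothetical placeholders, not our production config, but they show the shape of the idea: each new Docusaurus container gets a subdomain proxied to its own port.

```python
def nginx_server_block(subdomain: str, port: int) -> str:
    """Render an NGINX server block that proxies one docs subdomain
    to its container's host port. Domain is a placeholder."""
    return (
        "server {\n"
        "    listen 80;\n"
        f"    server_name {subdomain}.docs.example.com;\n"
        "    location / {\n"
        f"        proxy_pass http://127.0.0.1:{port};\n"
        "    }\n"
        "}\n"
    )

print(nginx_server_block("my-repo", 8101))
```

Writing the rendered block into NGINX's config directory and reloading NGINX is then all it takes to bring a new documentation URL online.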
What we learned
As a group we learned a lot about parsing grammars with tree-sitter, tuning an LLM inference server, and how context size relates to LLM recall. We learned the limitations of the 70B Nemotron model we were given and what kinds of writing it is particularly good at. We also learned a lot about Docusaurus and building for Docker, both of which worked great.
What's next for Nvidia Track
We could use some work in the optimization department: there is a big bottleneck when building the documentation server and serving it over the web. It would also be best to replace our hacky context-trimming method with a true vector database. Doing this would allow more accurate recall and better documentation generation as a whole.
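What a vector database would buy us can be sketched with plain cosine similarity: embed each code chunk, embed the query, and retrieve only the closest chunks. The toy vectors below stand in for real embeddings, which would come from an embedding model rather than being hand-written.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float],
          index: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy index: (chunk text, fake 3-d embedding)
index = [
    ("def parse(...): ...",  [1.0, 0.1, 0.0]),
    ("def render(...): ...", [0.0, 1.0, 0.1]),
    ("README intro",         [0.1, 0.0, 1.0]),
]
print(top_k([0.9, 0.2, 0.0], index, k=1))  # ['def parse(...): ...']
```

A real vector database (FAISS, Chroma, etc.) does the same ranking at scale with approximate nearest-neighbor search, which is what would replace our call-graph pruning for recall.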
Built With
- javascript
- langchain
- nemotron
- python
- react