Inspiration

We, as in all of us, do not like writing documentation.

What it does

So, it writes documentation. But it does more than that: it also deploys a live server hosting your documentation for anyone else to view.

How we built it

We used Python and Flask for the back end, with NVIDIA Nemotron as our LLM of choice. We also use LangChain to add agentic features on top of Nemotron. Docusaurus serves all of our generated documentation, with each Docusaurus server running in its own Docker container. For the front end, we use Vite + React for a fast and efficient UI.
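To show how the one-container-per-documentation-server setup might look from the back end, here is a minimal sketch. The image name, container naming scheme, and ports are illustrative assumptions, not our exact configuration:

```python
# Sketch: launching one Docusaurus container per generated documentation site.
# Image name, container name pattern, and ports are hypothetical.
import subprocess

def docusaurus_run_command(project: str, host_port: int) -> list[str]:
    """Build the `docker run` command for one documentation server."""
    return [
        "docker", "run", "-d",
        "--name", f"docs-{project}",   # one container per project
        "-p", f"{host_port}:3000",     # Docusaurus listens on 3000 inside
        "docusaurus-docs:latest",      # hypothetical image name
    ]

def launch_docs_container(project: str, host_port: int) -> None:
    """Start the container detached; raises if docker exits non-zero."""
    subprocess.run(docusaurus_run_command(project, host_port), check=True)
```

Keeping the command construction as a pure function makes it easy to test without Docker installed.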

Challenges we ran into

Our problems were mostly with LLM context size. Since this project processes many documents at once, and LLMs are infamous for hallucinating as context grows, this proved difficult to overcome. The time allotted was too short to properly set up a vector database, which this project really needed for semantic search. To remedy this, we used a 64k context limit and tuned our inference server to handle it. In addition, we built a directed graph of every function in the project and how it interconnects with the rest of the codebase. This lets us effectively cut down the number of tokens we feed to the LLM. Not only does this keep the LLM happy in terms of context size, it also keeps request payloads small, so we aren't dumping oversized HTTP requests on the server.
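The call-graph trick can be sketched in miniature: given a map of which functions call which, collect only the functions reachable from the one being documented, and send only their source to the LLM. The graph and budget below are illustrative, not our production code:

```python
# Sketch: trim LLM context using a directed call graph.
# Only functions reachable from the documentation target are included.
from collections import deque

def relevant_functions(call_graph: dict[str, list[str]], root: str) -> set[str]:
    """BFS over the directed call graph starting at `root`."""
    seen = {root}
    queue = deque([root])
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

def build_context(sources: dict[str, str], call_graph: dict[str, list[str]],
                  root: str, budget: int) -> str:
    """Concatenate source for relevant functions, stopping at a char budget."""
    parts, used = [], 0
    for fn in sorted(relevant_functions(call_graph, root)):
        src = sources[fn]
        if used + len(src) > budget:
            break
        parts.append(src)
        used += len(src)
    return "\n\n".join(parts)
```

A function with no callees pulls in only its own source, which is exactly the token saving we were after.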

Accomplishments that we're proud of

What we are most proud of is fixing the issues that seemed completely detrimental to the project. We are also proud of how fast our project processes Git repositories: by cloning with `--depth=1`, we trim the Git history and cut out most of the bloat that comes with Git repositories. Even though there are areas we could improve, we are proud of the solutions we came up with under the time crunch. We're also proud of figuring out how to automatically assign a new NGINX domain to each new documentation server, so we can add as many documentation servers as we want, each with its own unique URL.
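The NGINX piece can be sketched as generating one server block per documentation server, each proxying a unique hostname to that server's container port. The domain pattern and addresses here are hypothetical placeholders, not our actual config:

```python
# Sketch: generate an NGINX server block giving each documentation
# server its own hostname. Domain and addresses are illustrative.
def nginx_server_block(project: str, upstream_port: int,
                       base_domain: str = "docs.example.com") -> str:
    """Return an NGINX server block proxying {project}.{base_domain}."""
    return (
        f"server {{\n"
        f"    listen 80;\n"
        f"    server_name {project}.{base_domain};\n"  # unique URL per project
        f"    location / {{\n"
        f"        proxy_pass http://127.0.0.1:{upstream_port};\n"
        f"    }}\n"
        f"}}\n"
    )
```

Writing each block to a file in `conf.d` and reloading NGINX would then expose the new documentation server without touching the existing ones.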

What we learned

As a group we learned a lot about scanning grammars with tree-sitter, tuning the LLM inference server, and how context size relates to LLM recall. We learned about the limitations of the 70B Nemotron model we were given and what it is particularly good at in terms of writing. We also learned a lot about Docusaurus and building for Docker, which worked great.
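While we used tree-sitter for grammar scanning, the core idea it gave us can be illustrated self-containedly with Python's stdlib `ast` module as a stand-in: walk the syntax tree of a source file and record every function definition, which is the raw material for the call graph described above:

```python
# Stand-in for tree-sitter scanning, using Python's stdlib `ast` module:
# walk the parsed syntax tree and collect every function definition's name.
import ast

def list_functions(source: str) -> list[str]:
    """Return the names of all (sync and async) functions in `source`."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
```

tree-sitter does the same kind of traversal but is language-agnostic, which is why we reached for it instead.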

What's next for Nvidia Track

We could use some work in the optimization department. We have a big bottleneck when building the documentation server and serving it over the web. In addition, it would be best if we replaced our hacky method to optimize context with a true vector database. Doing this would allow for more accurate recall and better documentation generation as a whole.
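What a vector database would buy us can be shown in miniature: embed each documentation chunk, then retrieve only the chunks most similar to a query instead of feeding everything to the LLM. A real setup would use learned embeddings; this bag-of-words version only illustrates the retrieval step:

```python
# Toy semantic search: bag-of-words "embeddings" plus cosine similarity.
# A real vector database would use learned embeddings and an ANN index.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Naive embedding: word-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Swapping our call-graph heuristic for this kind of retrieval over real embeddings is the accuracy win we expect from a proper vector database.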
