Building AI agents with LangGraph, LangChain, and Local LLMs

I have created a website containing a series of notebooks that show how to build AI agents and use LLMs on your local machine.

All examples on the website use LLMs being run locally on a MacBook Air M1 2020, with 8GB RAM.

Using local models is an efficient way to test MVPs and generate and test new ideas. The models work well for simple use cases such as question and answering and even image caption generation, but perform poorly at tasks such as tool calling and decision making.

Identifying factors associated with diabetes from Canadian Community Health Survey data (2019-2020)

The analysis of Canadian Community Health Survey data from 2019-20 revealed that the prevalence of diabetes is much higher in some Canadian regions than others. This led to the question:

Is it possible to identify the most important factors associated with presence of diabetes from survey data (as broad as the Community Health Survey data)?

To answer the above, I built a predictive machine learning model and identified the top factors that are associated with the presence of diabetes.

General survey data was sufficient to recover the top factors associated with diabetes. The top factors, recovered in an unbiased way, were the following: age, medication for high blood cholesterol/lipids, taking medication for high blood pressure, perceived health, frequency of drinking alcohol, BMI classification, sex at birth, and being a visible minority.

You can explore the code for the ML model on github. A brief blog post on exploratory analysis of the data is also availabel here.

Analysis of clinical and molecular data for pancreatic cancer from The Cancer Genome Atlas (TCGA)

The PDAC-app is an interactive application that allows users to explore the pancreatic cancer data available from TCGA.

You can look at the overall demographic distribution of the patients, Kaplan Meier survival plots broken by clinical characteristics, and an analysis of tumor purity and molecular clusters in the data based on the article Integrated Genomics Characterization of Pancreatic Ductal Adenocarcinoma.

You can interact with the data in the shiny app here.

Analysis of political donations through electoral bonds in India

This web application was created as the final project for Harvard’s course, CS50: Introduction to Computer Science. The following tools were used:

  • Python for extracting the data from pdf to csv format
  • R for data exploration and cleaning
  • RShiny for data analysis and visualization.
  • Bootstrap library for R (bslib) to design a multi-page Shiny app
Back to top