Home agAIn

This is the repository for our final UC Berkeley MIDS Capstone project, Home agAIn. Our project leverages historical data from homelessness support programs to create individualized service plans focused on improving permanent housing outcomes.

There are three main components contained in this repo: python notebooks that we ran in SageMaker, a FastAPI backend that interacts with RDS to service the front-end application, and a React based front-end that makes use of the shadcn/ui component library.

Notebooks

The notebooks directory contains a collection of python notebooks that we used to build our data pipeline. The final logistic regression is contained within 'Full Logistic Regression Data Pipeline.ipynb'.

FastAPI Backend

The home-again-backend directory contains the FastAPI backend. It makes use of SQLAlchemy to interact with RDS and also SKLearn to provide homeless service recommendations for program participants to increase their likelihood of a permanent housing outcome.

React Frontend

The home-again-frontend directory contains the React frontend application. It provides a fully functional application that provides a simple dashboard for viewing homelessness analytics, as well as a case management capability to manage participants, log services, and view service recommendations dynamically as participant profiles are updated.

System Architecture

We leverage a range of AWS services for our system architecture. SageMaker hosts Jupyter notebooks that implement our model pipeline. We use S3 to cache prepped datasets and store saved models for operational use. A PostgreSQL RDS instance is used to house all of our data and an EC2 instance runs our FastAPI backend. Cognito provides us with turn-key user auth and session management. Our front-end is built with React and is served up by a secure HTTPS CloudFront distribution.

Data Pipeline

Our data pipeline runs fully automated. Data is queried from RDS and then goes through a data prep phase where data is cleaned, features are engineered, and categorical variables are encoded. The final dataset is split into train, validation, and test sets and cached in S3 for efficient loading across pipeline runs. A training and evaluation loop runs to find optimal models and log experiment results in RDS. The optimal model is then run against our test set and the weights are saved to S3 for use by our application.

Home agAIn Screenshots

Dashboard

Case Details

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
home-again-api		home-again-api
home-again-frontend		home-again-frontend
notebooks		notebooks
sql		sql
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Home agAIn

Notebooks

FastAPI Backend

React Frontend

System Architecture

Data Pipeline

Home agAIn Screenshots

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Home agAIn

Notebooks

FastAPI Backend

React Frontend

System Architecture

Data Pipeline

Home agAIn Screenshots

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages