pi-level-rag-curriculum-mapper

Nursing Curriculum Mapping with NLP

This repository contains an end-to-end NLP project that automates curriculum mapping for the University of Texas at Austin School of Nursing. The goal is to map BSN course syllabi to the American Association of Colleges of Nursing (AACN) competency domains using a mix of classical and modern NLP methods.

[Figure: model approach diagram]

Project Overview

Accreditation and curriculum audits require nursing faculty to manually review each course syllabus and document how it aligns with AACN competency domains. This is time-consuming, inconsistent, and hard to repeat as curricula change.

In this project, we:

  • Ingest and preprocess 29 BSN course syllabi and 10 AACN domains.
  • Experiment with multiple NLP approaches for curriculum mapping.
  • Evaluate how well each method recovers “ground truth” domain mappings.
  • Propose a higher-precision PI-level RAG pipeline for fine-grained alignment.

This work is based on a final project for I385T – Natural Language Processing at The University of Texas at Austin.

Objectives

  • Automate mapping of nursing course syllabi to AACN competency domains.
  • Compare classical NLP baselines to embedding-based and LLM-based methods.
  • Provide interpretable outputs that faculty can review and trust.
  • Lay the groundwork for a simple UI that supports live curriculum audits.

Data

We work with two primary text sources:

Course syllabi

  • 29 BSN course syllabi (e.g., Fall 2025).
  • Extracted fields:
    • course name
    • course description
    • learning objectives
    • learning outcomes
    • assessment methods
    • assignments
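
Downstream code can treat each parsed syllabus as a structured record. A minimal sketch (field and class names are illustrative, not the repository's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class SyllabusRecord:
    # One parsed BSN syllabus; list fields default to empty.
    course_name: str
    course_description: str
    learning_objectives: list[str] = field(default_factory=list)
    learning_outcomes: list[str] = field(default_factory=list)
    assessment_methods: list[str] = field(default_factory=list)
    assignments: list[str] = field(default_factory=list)

    def full_text(self) -> str:
        # Concatenate all fields into one document for downstream NLP.
        parts = [self.course_name, self.course_description]
        parts += self.learning_objectives + self.learning_outcomes
        parts += self.assessment_methods + self.assignments
        return "\n".join(parts)
```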

AACN Essentials

  • 10 high-level competency domains:
    • 1: Knowledge for Nursing Practice
    • 2: Person-Centered Care
    • 3: Population Health
    • 4: Scholarship for the Nursing Discipline
    • 5: Quality and Safety
    • 6: Interprofessional Partnerships
    • 7: Systems-Based Practice
    • 8: Informatics and Healthcare Technologies
    • 9: Professionalism
    • 10: Personal, Professional, and Leadership Development
  • Each domain has multiple progression indicators (PIs) and sub-competencies.

Methods

We implemented and compared several approaches:

1. Latent Dirichlet Allocation (LDA)

  • Topic modeling on combined syllabi + AACN domain text.
  • Documents represented as topic distributions.
  • Course–domain similarity measured with Jensen–Shannon divergence.
  • Provides interpretable “theme-based” mappings, but sensitive to boilerplate language.
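
For reference, the Jensen–Shannon divergence between two topic distributions can be computed as below (a minimal NumPy sketch, not the repository's exact code; base-2 logs bound the value to [0, 1]):

```python
import numpy as np

def jensen_shannon(p, q):
    # JS divergence between two discrete topic distributions.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # KL(a || b), skipping zero-probability topics in a.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical distributions score 0; disjoint ones score 1, so lower values mean a closer course–domain match.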

2. Named Entity Recognition (NER)

  • A biomedical NER model combined with noun-phrase extraction.
  • Augmented with a custom nursing lexicon (e.g., “therapeutic communication”, “population health”).
  • Course–domain similarity via Jaccard overlap of extracted entities.
  • Transparent term-level evidence, but brittle to synonyms and paraphrases.
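
The Jaccard overlap used for course–domain scoring is simply intersection over union of the two extracted entity sets; a minimal sketch:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    # Overlap between entity sets from a course and a domain description.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```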

3. Embedding-Based Semantic Similarity (BioWordVec)

  • Domain-specific word embeddings (BioWordVec, 200-dimensional).
  • TF-IDF weighted document embeddings for syllabi and domains.
  • Pairwise cosine similarity for syllabus–domain alignment.
  • Threshold tuning for “covered vs not covered” decisions (precision/recall tradeoff).
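
The TF-IDF-weighted document embedding step can be sketched as follows, with `word_vectors` standing in for the BioWordVec lookup and `tfidf_weights` for per-term TF-IDF scores (both hypothetical names, not the repository's API):

```python
import numpy as np

def tfidf_weighted_embedding(tokens, tfidf_weights, word_vectors, dim=200):
    # Weighted average of word vectors; out-of-vocabulary tokens are skipped.
    vec = np.zeros(dim)
    total = 0.0
    for tok in tokens:
        if tok in word_vectors and tok in tfidf_weights:
            w = tfidf_weights[tok]
            vec += w * word_vectors[tok]
            total += w
    return vec / total if total > 0 else vec

def cosine(u, v):
    # Cosine similarity between a syllabus vector and a domain vector.
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0
```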

4. BERT / Sentence-Transformer Embeddings

  • Sentence-transformer models (e.g., all-MiniLM-L6-v2, multi-qa-mpnet-base-dot-v1).
  • Whole-document and chunk-level embeddings for syllabi and AACN domains.
  • Similarity-based mapping and diagnostic keyword extraction with KeyBERT.
  • Captures deeper semantics, but can be unstable without good evaluation and calibration.
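
Chunk-level scoring can be sketched as below; here `embed` stands in for a sentence-transformer's encoding function (assumed to return unit-norm vectors, so the dot product equals cosine similarity):

```python
import numpy as np

def chunk_text(text, max_words=120):
    # Split a syllabus into roughly fixed-size word chunks.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def best_chunk_similarity(course_text, domain_vec, embed):
    # Score a course against a domain by its best-matching chunk,
    # so one strongly aligned section is not diluted by boilerplate.
    sims = [float(embed(c) @ domain_vec) for c in chunk_text(course_text)]
    return max(sims) if sims else 0.0
```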

5. FAISS + PI-Level RAG (Retrieval-Augmented Generation)

  • Build a FAISS vector store over syllabus snippets and/or AACN PIs.
  • Two search strategies:
    • Global: centroid/domain-level search for coarse classification.
    • Local: PI-level search for fine-grained evidence.
  • LLM acts as an auditor:
    • Given a syllabus snippet and top-k candidate PIs, choose the best PI.
    • Produce a short justification citing specific syllabus text.
  • Aggregate PI-level decisions to derive domain coverage for each course.
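
A minimal sketch of the local (PI-level) retrieval and vote-aggregation steps, using brute-force inner-product search in place of a FAISS index (`faiss.IndexFlatIP` produces the same ranking on normalized vectors); the LLM-auditor step is elided, with the top retrieval hit standing in for its choice:

```python
import numpy as np
from collections import Counter

def top_k_pis(snippet_vec, pi_vecs, k=3):
    # Rank all PI vectors by inner product with one syllabus snippet.
    scores = pi_vecs @ snippet_vec
    return np.argsort(scores)[::-1][:k]

def domain_coverage(snippet_vecs, pi_vecs, pi_to_domain, k=3):
    # Aggregate per-snippet PI picks into per-domain vote counts.
    votes = Counter()
    for v in snippet_vecs:
        # Top hit stands in for the LLM auditor's choice among the top-k.
        best_pi = int(top_k_pis(v, pi_vecs, k)[0])
        votes[pi_to_domain[best_pi]] += 1
    return votes
```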

Synthetic Ground Truth


The synthetic ground truth is generated via zero-shot LLM classification of each course against the ten AACN domains; see Evaluation below for how it is used.

Key Findings (High-Level)
  • Classical baselines (LDA, NER) are interpretable but:
    • LDA overweights frequent administrative/boilerplate language.
    • NER struggles with abstract competencies and paraphrased concepts.
  • Embedding-based similarity improves semantic alignment and recall, but:
    • Requires careful threshold tuning.
    • Works best as a screening tool, not a fully automatic classifier.
  • Simple centroid-based FAISS domain similarity is not sufficient alone.
  • PI-level RAG with majority voting:
    • Provides much better agreement with the reference mappings.
    • Produces human-readable explanations faculty can inspect.
  • Overall, the system functions best as a faculty assistance tool, surfacing candidates and evidence for human review.

Evaluation

We evaluate models against:

  • A synthetic ground truth generated via zero-shot LLM classification of courses to AACN domains.
  • Standard metrics:
    • Accuracy
    • Precision
    • Recall
    • F1 score
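
These metrics reduce to the usual confusion-matrix counts over binary covered/not-covered decisions; a minimal sketch (equivalent in effect to scikit-learn's metric functions):

```python
def prf1(y_true, y_pred):
    # Precision, recall, and F1 for binary covered (1) / not covered (0) labels.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```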


Limitations

  • Synthetic “ground truth” is LLM-generated and may not perfectly match expert faculty judgment.
  • Domain-level mapping can hide which specific progression indicators are covered.
  • Extensive syllabus boilerplate (policies, grading, etc.) can distort some models.
  • Sharing of raw syllabi and AACN text may be restricted; users must supply their own documents.
