Welcome to Structure Discovery, a workshop on the foundations of unsupervised machine learning for educational data.
Structure discovery uses unsupervised machine learning to uncover hidden patterns, groupings, and structures within complex educational datasets, without relying on predefined labels or outcomes. Across the four modules of this workshop, participants will gain both a conceptual foundation and practical experience with key unsupervised learning techniques that are widely used in learning analytics and educational research.
We will be using R together with an integrated development environment (IDE), either RStudio or Positron.
This guide provides step-by-step instructions for setting up your local environment for data science. The repository supports development in both RStudio and Positron (my personal favorite).
Note: While RStudio is highly mature, Positron is in active development (currently version 2026.03.0). You may encounter UI changes, but the core installation logic remains consistent across the 2026 release cycle.
You must install the R language engine before installing an IDE. Positron requires R version 4.2.0 or higher.
| Operating System | Download Link | Architecture |
|---|---|---|
| Windows 10/11 | Download R 4.4.x | x64 |
| macOS (Apple Silicon) | Download R-4.4.x-arm64.pkg | M1, M2, M3, M4 |
| macOS (Intel) | Download R-4.4.x-x86_64.pkg | Intel Macs |
| Linux | CRAN Linux Binaries | Distro-specific |
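Once R is installed, a quick check in the R console can confirm that it meets the 4.2.0 minimum stated above. This is a minimal sketch using base R only; the variable names are illustrative.

```r
# Minimum R version this guide states for Positron.
minimum_version <- "4.2.0"

# getRversion() returns the running R version as a comparable object.
installed_version <- getRversion()
meets_minimum <- installed_version >= minimum_version

message("Installed R version: ", as.character(installed_version))
if (meets_minimum) {
  message("OK: this installation meets the ", minimum_version, " minimum.")
} else {
  message("Please upgrade R to ", minimum_version, " or later before installing Positron.")
}
```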
RStudio is recommended for users focused exclusively on R, R Markdown, and Shiny applications.
- Go to the RStudio Download Page.
- Select the installer for your OS.
- Run the installer and follow the default prompts.
Positron is recommended for users who work with both R and Python and prefer a VS Code-based workflow.
- Navigate to the Positron Releases Page.
- Download the version corresponding to your OS.
- Important: After installation, launch Positron and click the Interpreter icon (top right) to select your R version.
- Conceptual Overview: A slide-based overview introducing the core concepts of structure discovery. This includes key distinctions from supervised learning, common techniques (e.g., clustering, dimensionality reduction), and examples of real-world applications.
- Case Study/Essential Reading: This research article demonstrates the application of unsupervised methods (e.g., clustering) in an educational context. 🔍 Suggested use: Students can annotate the reading, identify methods used, and reflect on findings in small group discussions.
- Code Along: A guided coding activity using a real-world dataset. Students apply multiple structure discovery algorithms (e.g., k-means) and interpret results related to student performance. 🧠 Tip: Ideal for hands-on learning following the conceptual overview. We encourage experimentation with different techniques (e.g., cluster analysis vs. factor analysis).
- Badge Activity: A self-paced reflection activity that promotes a deeper understanding of structure discovery methods.
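To make the two technique families named in this module concrete, here is a minimal sketch that runs k-means (clustering) and principal component analysis (dimensionality reduction) on the same data. It is not the workshop's code-along: the data are simulated, and the variable names (`quiz_score`, `hours_online`, `forum_posts`) are hypothetical.

```r
set.seed(42)  # reproducible synthetic data

# Hypothetical data: 60 simulated students in two groups (not real records).
n_per_group <- 30
students <- data.frame(
  quiz_score   = c(rnorm(n_per_group, 85, 5), rnorm(n_per_group, 60, 5)),
  hours_online = c(rnorm(n_per_group, 12, 2), rnorm(n_per_group, 5, 2)),
  forum_posts  = c(rnorm(n_per_group, 20, 4), rnorm(n_per_group, 6, 3))
)

# Standardize so that no variable dominates because of its scale.
students_scaled <- scale(students)

# Clustering: partition the students into k = 2 groups with k-means.
km <- kmeans(students_scaled, centers = 2, nstart = 25)
print(table(km$cluster))

# Dimensionality reduction: summarize the three variables with principal components.
pca <- prcomp(students_scaled)
print(summary(pca)$importance["Proportion of Variance", ])
```

Clustering assigns each student to a group, while PCA describes how much of the variation a few combined dimensions capture. Neither step uses an outcome label.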
- Conceptual Overview: Introductory slides covering core clustering concepts, algorithms, and use cases in education.
- Case Study/Essential Reading: A foundational article demonstrating how clustering techniques are applied in real-world educational research.
- Code Along: A guided analysis of a real-world study applying clustering to educational data.
- Badge Activity: A reflection activity to connect clustering to your own research or teaching practices.
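One recurring question in clustering is how many clusters to request. The sketch below shows one common heuristic, the "elbow" approach, which compares the total within-cluster sum of squares across several values of k. It assumes simulated data with hypothetical `engagement` and `performance` measures and is not taken from the workshop materials.

```r
set.seed(123)

# Hypothetical data: three simulated student groups on two made-up measures.
engagement  <- c(rnorm(25, 2, 0.5), rnorm(25, 5, 0.5), rnorm(25, 8, 0.5))
performance <- c(rnorm(25, 50, 4), rnorm(25, 70, 4), rnorm(25, 90, 4))
X <- scale(cbind(engagement, performance))

# Total within-cluster sum of squares for k = 1..6.
k_values <- 1:6
wss <- sapply(k_values, function(k) {
  kmeans(X, centers = k, nstart = 25)$tot.withinss
})

# Look for the k after which the decrease levels off.
print(data.frame(k = k_values, total_within_ss = round(wss, 2)))
```

The within-cluster sum of squares generally keeps shrinking as k grows, so what matters is where the improvement flattens, not the smallest value.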
- Conceptual Overview: Slides introducing key concepts in validating clustering results, including validation metrics such as silhouette analysis.
- Case Study/Essential Reading: A key reading demonstrating how clustering validation techniques are applied in practice. Through this reading, you will gain an understanding of how to assess and interpret the quality of clustering solutions.
- Code Along: A guided coding activity using real data to apply clustering validation metrics and interpret results.
- Badge Activity: A reflection activity to connect clustering validation to your own research or instructional practice.
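As an illustration of the silhouette analysis mentioned in this module, the sketch below computes the average silhouette width for several k-means solutions. To stay dependency-free it hand-rolls the calculation in base R; in practice `silhouette()` from the `cluster` package is the usual route. The data and the helper name `silhouette_widths` are hypothetical.

```r
set.seed(2024)

# Hypothetical data: two simulated student groups (not real records).
X <- scale(cbind(
  time_on_task     = c(rnorm(30, 20, 3), rnorm(30, 40, 3)),
  assignments_done = c(rnorm(30, 4, 1), rnorm(30, 9, 1))
))
dist_matrix <- as.matrix(dist(X))

# Silhouette width per observation: (b - a) / max(a, b), where
# a = mean distance to the other members of its own cluster and
# b = smallest mean distance to the members of any other cluster.
silhouette_widths <- function(clusters, dist_matrix) {
  sapply(seq_along(clusters), function(i) {
    own <- clusters == clusters[i]
    if (sum(own) == 1) return(0)  # convention for singleton clusters
    a <- sum(dist_matrix[i, own]) / (sum(own) - 1)
    others <- setdiff(unique(clusters), clusters[i])
    b <- min(sapply(others, function(cl) mean(dist_matrix[i, clusters == cl])))
    (b - a) / max(a, b)
  })
}

# Average silhouette width for k = 2..5; higher suggests better-separated clusters.
k_values <- 2:5
avg_sil <- sapply(k_values, function(k) {
  km <- kmeans(X, centers = k, nstart = 25)
  mean(silhouette_widths(km$cluster, dist_matrix))
})
names(avg_sil) <- paste0("k=", k_values)
print(round(avg_sil, 3))
```

Silhouette widths range from -1 to 1. Values near 1 indicate observations that sit well inside their cluster, and values near 0 indicate observations on a boundary between clusters.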
- Conceptual Overview: Slides introducing advanced clustering methods (e.g., hierarchical clustering, spectral clustering, Gaussian mixture models) and their applications in educational data.
- Case Study/Essential Reading: A key reading showcasing the use of advanced clustering techniques in educational research. This reading helps contextualize when and why to use more sophisticated models.
- Code Along: A guided coding activity exploring advanced clustering techniques, with space for discussion and reflection.
- Badge Activity: A reflective activity to connect advanced clustering approaches to your own research or teaching.
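Of the advanced methods listed in this module, hierarchical clustering is available in base R, so it makes a convenient first example. The sketch below builds a tree with Ward linkage and cuts it into three groups. It assumes simulated data with hypothetical `logins` and `avg_grade` variables. Ward linkage and k = 3 are illustrative choices, not the workshop's prescribed settings.

```r
set.seed(7)

# Hypothetical data: three simulated student groups (not real records).
X <- scale(cbind(
  logins    = c(rnorm(20, 5, 1), rnorm(20, 15, 2), rnorm(20, 30, 3)),
  avg_grade = c(rnorm(20, 55, 5), rnorm(20, 72, 5), rnorm(20, 88, 4))
))

# Agglomerative clustering on Euclidean distances with Ward linkage.
hc <- hclust(dist(X), method = "ward.D2")

# Cut the tree into three groups and inspect the group sizes.
groups <- cutree(hc, k = 3)
print(table(groups))

# Dendrogram with the three-group cut outlined.
plot(hc, labels = FALSE, main = "Hierarchical clustering of synthetic students")
rect.hclust(hc, k = 3, border = "red")
```

Unlike k-means, the tree is built once and can be cut at different heights, which makes it straightforward to compare solutions with different numbers of groups.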