MTech-AI-Capstone-Project

National University of Singapore - Institute of Systems Science - Master of Technology in Artificial Intelligence Systems - Capstone Project
By Richard Chai (https://www.linkedin.com/in/richardchai/)

Problem Statement

The persistent misalignment between question complexity and perceived difficulty presents a fundamental challenge in AI-driven educational technologies, where subjective human judgments frequently diverge from objective question characteristics.

“How can we improve the AI system’s ability to generate questions with complexity levels that consistently align with human expectations of difficulty?”

Challenges

Achieving proper alignment between AI understanding of question complexity and human perception of difficulty faces several interrelated challenges. This is because "difficulty" is:

A Fluid Concept
Malleable
Hard to Define and Measure

Hence, to date, there is no universal standard for defining and measuring question difficulty.

Proposed Solution - The Cerebro Index Framework

The Cerebro Index (CI) is a novel framework that integrates educational theory, cognitive science, and technical metrics into a unified system for quantifying the complexity of a question.

When given a text question as input, the Cerebro Index Framework outputs a complexity score (CI Score) that quantifies and represents the question’s inherent complexity. CI Scores are in a continuous range from 0.0 to 10.0 (inclusive). The CI framework addresses limitations in existing methods, such as subjective human judgment, rigid rule-based systems, and opaque machine learning models, by combining educational theory with scalable, interpretable metrics.

Cerebro Index Framework - System Design

Human Validation of the CI Framework

Research Questions

RQ1: Do CI scores positively correlate with human-perceived question complexity rankings?
RQ2: Do CI complexity categories align with those assigned by human evaluators?

Human evaluators frequently do not agree among themselves on complexity categorisation.
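
One simple way to put a number on this kind of disagreement is the mean pairwise agreement between raters. The sketch below is illustrative only: the function name, the category labels, and the sample ratings are hypothetical and are not taken from the study, which does not state which agreement measure was used.

```python
from itertools import combinations


def mean_pairwise_agreement(ratings_by_question):
    """Average, over questions, of the share of rater pairs that chose the same category.

    ratings_by_question maps a question id to the list of category labels
    assigned by the raters who saw that question (hypothetical structure).
    """
    per_question = []
    for labels in ratings_by_question.values():
        pairs = list(combinations(labels, 2))
        if not pairs:
            # A question seen by fewer than two raters has no pairs to compare.
            continue
        agreeing = sum(1 for a, b in pairs if a == b)
        per_question.append(agreeing / len(pairs))
    if not per_question:
        raise ValueError("need at least one question with two or more ratings")
    return sum(per_question) / len(per_question)


# Hypothetical ratings, for illustration only (not study data).
example_ratings = {
    "Q1": ["easy", "easy", "medium"],  # 1 of 3 rater pairs agree
    "Q2": ["hard", "hard", "hard"],    # 3 of 3 rater pairs agree
}
agreement = mean_pairwise_agreement(example_ratings)
print(f"mean pairwise agreement = {agreement:.3f}")
```

A value near 1.0 means raters mostly chose the same category; lower values indicate the kind of disagreement described above. Chance-corrected statistics would be a stricter alternative.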

Limitations

While the results are promising, especially for RQ1, the study has limitations. The sample was small: 51 participants evaluated 54 unique questions, for a total of 588 ranked or categorised questions, which limited the statistical power of the study. By design, the human evaluators were not given rubrics, to avoid biasing their judgements. This captured the raters’ natural rankings and category assignments, but it also introduced variability in human judgement for RQ2.

The coverage for this study was narrow, with only three domains and limited question diversity. Despite these constraints, the study confirmed the potential of the Cerebro Index Framework.

Study Conclusion and Next Steps

The Cerebro Index (CI) Framework demonstrates empirical validity as a quantitative method for assessing question complexity. The framework exhibits strong rank-order correlation between CI Scores and human-perceived difficulty (Spearman’s ρ > 0.7 in 6 of 9 domain-format conditions), with statistically significant alignment (p < 0.05) in key configurations such as Time in Personal Travel - MCQ (ρ = 0.943) and Supervised Machine Learning - True/False (ρ = 0.986).
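
For readers unfamiliar with the statistic, the following is a minimal sketch of how Spearman's ρ between CI Scores and human difficulty rankings can be computed: rank both lists (averaging tied ranks), then take the Pearson correlation of the ranks. The function names and the sample values are hypothetical and are not the study's data or code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical values, for illustration only (not study data).
ci_scores = [1.5, 3.0, 4.5, 6.0, 8.5]   # CI Scores for five questions
human_ranks = [1, 2, 4, 3, 5]           # human ranking, 1 = easiest
rho = spearman_rho(ci_scores, human_ranks)
print(f"Spearman rho = {rho:.3f}")
```

In this made-up example the human raters swap the order of two adjacent questions, so ρ falls below 1.0 while remaining strongly positive. A significance test (the p-values quoted above) would need an additional step not shown here.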

These findings support the CI Framework’s utility in:

Cognitive load modelling in educational assessments
Instructional sequencing and curriculum design
Adaptive learning systems, where CI scores can inform item selection and difficulty progression
AI evaluation, enabling benchmarking of LLMs on questions with objectively calibrated complexity
Prompt/intent routing systems

Furthermore, the standardised nature of CI scoring enables cross-domain comparability, allowing for equating difficulty across disparate knowledge areas (e.g., supermarket math vs. machine learning theory), provided format-specific adjustments are applied. The CI Framework provides a psychometrically grounded, continuous measure of question complexity that bridges subjective perception and objective quantification.

With further refinement and expanded validation, the Cerebro Index Framework has the potential to become a foundational tool for intelligent tutoring systems, automated assessment design, and rigorous AI capability benchmarking, enhancing an AI system’s ability to generate questions with complexity levels that consistently match the difficulty expected by human users.
