Natalya (Natasha) Smith drnsmith

Hi 👋, I'm Natasha — a Quantitative Data Scientist working at the intersection of statistical modelling, machine learning, and complex systems.

My background spans business and economics, econometrics, risk analysis, and computer science. I started in business economics, moved into statistical modelling and uncertainty analysis during my PhD, and later added computer science to strengthen my engineering foundations.

I now build applied data and AI systems across the full workflow: data pipelines → modelling → evaluation → deployment.

I’m especially drawn to messy, high-stakes problems where the data is imperfect, the signal is partial, and the cost of getting it wrong is not trivial. That includes work in areas such as healthcare, risk, forecasting, trust, and decision support.

My work typically focuses on:

statistical modelling, inference, and uncertainty analysis
machine learning across structured, unstructured, image, and time-series data
NLP and LLM-based systems where retrieval, evaluation, and reliability matter
data pipelines, analytics engineering, and reproducible workflows in Python and SQL
explainability, diagnostics, and decision support for real-world use

My approach is straightforward: strong statistical reasoning, practical engineering, and clear communication.

💼 Featured Projects

Spectral Drug Verification

Built a verification workflow for comparing measured spectral signatures against reference compound libraries, with a focus on chemically similar classes, concentration differences, and classification reliability. A scientifically grounded project centred on signal quality, similarity structure, and decision confidence.

Histopathology AI for Breast Cancer Detection

Developed deep learning pipelines for breast cancer image classification using transfer learning, class-balancing strategies, and interpretability methods. This work explores not just model performance, but the practical challenges of medical image analysis in imbalanced settings.

Air Quality Forecasting (PM10)

Built forecasting models for PM10 pollution using regression and neural network approaches, including LSTM-based experiments. Focused on temporal modelling, feature design, and comparing methods for environmentally meaningful prediction.

Big Data Sentiment Analysis of NASDAQ Companies

Analysed more than 4 million tweets about NASDAQ-listed firms using Hadoop, MapReduce, Hive, and NLP techniques. The project combines large-scale text processing with sentiment analysis to study public opinion in financial contexts.

Customer Churn Prediction & Retention Analytics

Built an end-to-end churn modelling pipeline using XGBoost and SHAP, with an interactive analytics layer for interpreting risk drivers and supporting retention decisions. Designed to connect predictive modelling with usable business insight.

Analytics Data Warehouse & ETL Design

Designed a structured analytics warehouse with ETL workflows and dimensional modelling principles to support reporting and downstream analysis. This project reflects the data engineering side of data science: clean inputs, reliable structure, and usable outputs.

📝 How I Think & Where I Write

I write about systems under strain: data systems, decision systems, biological systems, and the human tendency to misunderstand all three.

My work sits somewhere between data science, AI, uncertainty, physiology, and the philosophy of modern life. The themes vary, but the underlying interest is constant: complex systems, failure modes, and the gap between reality and the stories we tell about it.

I publish on Medium, Substack and LinkedIn. New connections are always welcome!

🛠 Tools & Stack

Languages

ML & Data Science

Engineering & Infrastructure

Visualisation & Dashboards
