Mohiraa_Shafreen Shaflovescoffee19

Hi there, I'm Mohiraa Shafreen!

I'm a bioengineer who got pulled into computational biology and never looked back. My background is wet lab (DNA extractions, PCR, microbial cultures), but somewhere along the way, I got fascinated by what you can learn from data that you simply cannot see at a bench.

I'm interested in building computational tools that read the story in that data accurately, and I'm currently looking for roles at the intersection of ML, multi-omics, and precision medicine.

Connect with me on LinkedIn: Mohiraa Shafreen | Check out my publications: Google Scholar | For research or project collaborations, feel free to reach out at [email protected] | 📄 Download CV

📚 Background

Bioengineer with a Gold Medal (B.Tech Biotechnology) | M.Tech First Class | Bioinformatics Industrial Internship
Published 8 peer-reviewed articles | 190+ citations
Integrated expertise in computational workflows and wet lab (both sides of the bench😉)

What I've been building

Over the past few months, I built a 10-project ML portfolio that works through the techniques that keep showing up in computational biology research. I started from scratch, went in order, and tried to actually understand each method rather than just run the code.

| Project | What I learned | Techniques | Why I built it |
| --- | --- | --- | --- |
| Heart Disease EDA | How to actually read a dataset before touching a model | pandas, seaborn, statistical analysis, visualisation | Most tutorials skip straight to modelling. I wanted to get this part right first |
| Diabetes Data Cleaning | Real medical data is messy and cleaning it properly takes longer than modelling | Missing data imputation, IQR outlier capping, feature engineering, scaling | Dirty data breaks everything downstream and I wanted to understand how to fix it properly |
| Cancer Risk Classification | When the simplest model wins and why that is not a failure | Logistic regression, Random Forest, XGBoost, AUC-ROC, cross-validation | Needed to understand the core classification algorithms and how to evaluate them honestly |
| Survival Analysis | Time-to-event modelling has its own entirely different logic from classification | Kaplan-Meier, log-rank test, Cox Proportional Hazards, C-index | This comes up constantly in clinical research and I had no idea how it worked |
| Customer Segmentation | Finding structure in data without being told what to look for | K-Means, Elbow Method, Silhouette Score, PCA | Unsupervised learning is everywhere in omics research and I had never properly done it |
| Gene Expression Clustering | RNA-Seq data has its own preprocessing rules and skipping them breaks everything | Log transformation, variance selection, hierarchical clustering, heatmaps | I work with this kind of data and wanted to understand the pipeline from raw counts to clusters |
| Explainable AI with SHAP | A model nobody can explain is a model nobody will trust or use | TreeExplainer, beeswarm, waterfall plots, bootstrap stability | Interpretability matters a lot in clinical contexts and I wanted to go beyond feature importance |
| Counterfactual Explanations | Turning a risk score into something a person can actually act on | Actionable counterfactuals, diverse CF generation | SHAP tells you why. Counterfactuals tell you what to change. Both matter |
| Multi-Modal Data Fusion | Genomic, microbiome, and clinical data together tell a story none of them can tell alone | Early/late/intermediate fusion, stacking ensemble, ablation study | Multi-omics integration is the problem I most want to work on and this is its core technical challenge |
| Transfer Learning | When your target population is small you need a model that borrows knowledge not one that starts blind | Neural network pre-training, layer freezing, fine-tuning, learning curves | Small and underrepresented cohorts are a real problem in genomics research and this is how you address it |
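
To give a flavour of one technique from the table, here is a minimal sketch of the IQR outlier capping listed under Diabetes Data Cleaning. It is an illustration only, not the code from that project: the function name `cap_outliers_iqr`, the conventional 1.5 multiplier, and the sample glucose values are all assumptions made up for this example.

```python
import pandas as pd


def cap_outliers_iqr(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to those bounds.

    Hypothetical helper for illustration; k=1.5 is the usual rule of thumb.
    """
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    # Capping keeps every row; only the extreme values are pulled in.
    return values.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Made-up readings with one implausibly high value.
glucose = pd.Series([85.0, 90.0, 95.0, 100.0, 105.0, 110.0, 400.0])
capped = cap_outliers_iqr(glucose)
print(capped.tolist())
```

Capping rather than dropping the row keeps the sample size intact, which tends to matter when a medical dataset is already small.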

💻 Technical Skills

💡 Core Bioinformatics Expertise

🧬 NGS Pipelines: Quality control → Alignment → Quantification → Analysis
🔬 Variant Analysis: VCF processing, annotation, population genetics
🦠 Metagenomics: Taxonomic profiling, diversity analysis, phylogenetics
🧬 RNA-seq Analysis: Reference-based & De novo

🔬 Wet Lab Expertise

| Domain | Techniques | Sample Types |
| --- | --- | --- |
| Microbiology | Microbial isolation, antimicrobial screening, Monod modeling, growth kinetics | Bacterial cultures, environmental samples |
| Molecular Biology | DNA/RNA extraction, PCR, qPCR, RT-PCR | Bacteria, blood, feces, plant tissues, soil, water, fungi |
| Biochemistry | Enzymatic assays, protein quantification, metabolite extraction, purification & analysis | Cellular extracts |

🤓 Fun Fact: When I'm not sciencing, I read a lot of books (and hoard them), watch a lot of movies, and analyze even more film plots on Letterboxd @manicindisguise (＾▽＾) 📊➡️🎬

PS: Currently reading Project Hail Mary by Andy Weir (microbiologists would especially LOVE it). The movie trailer blew me away… can’t wait to see if the movie is better than the book, or if the book wins out as always.

𝗨𝗽𝗱𝗮𝘁𝗲: The movie is 𝘀𝗼𝗼𝗼𝗼 𝗴𝗼𝗼𝗱. Book = Movie. No notes.

And for all the Game of Thrones fans out there, this one is actually from George R. R. Martin:

If you like a lot of science in your science fiction, Andy Weir is the writer for you....
