simantalahkar/simantalahkar

👋 Hey there! I'm Simanta Lahkar

🔬 Computational Physicist turned Scientific Software Developer and Data Engineer
🧑‍💻 Currently designing databases for large-scale atomistic simulation data at TU Eindhoven × IBM Research
🚀 Passionate about bridging scientific computing with modern data engineering and AI technologies
💡 Love building tools that make complex scientific and engineering workflows accessible and scalable
🌍 Based in Den Bosch, Netherlands
⚡ Fun fact: I enjoy cooking, drone cinematography, and swimming when not debugging simulation data pipelines!

🌐 Connect with Me:

Portfolio LinkedIn Email

💻 Tech Stack:

Core Programming & Development

Python SQL C++ Git

Data Engineering & Big Data

Apache Spark Apache Airflow PostgreSQL Docker

Data Analysis & Visualization

Matplotlib Plotly Excel

Scientific Computing & Machine Learning

NumPy Pandas scikit-learn TensorFlow MATLAB

Cloud & Infrastructure

Linux Databricks Jupyter Notebook

🚀 What I'm Working On:

🔬 Scientific Cloud-Native Data Infrastructure & AI Integration

Building an open-source, cloud-native pipeline for large-scale molecular dynamics data. Using MinIO for scalable storage, Apache Spark and Delta Lake for transforming raw trajectories into structured formats, and Trino for fast SQL querying. Integrating MLflow for reproducible AI workflows and orchestrating everything with Apache Airflow. Focused on scalable, metadata-rich infrastructure for scientific computing.
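To illustrate the "raw trajectories into structured formats" step, here is a minimal sketch using only the Python standard library. It is not the pipeline's actual code: it assumes the raw input is a LAMMPS-style text dump, and the function name `parse_dump` and the sample data are hypothetical. The real pipeline would do this at scale with Spark and Delta Lake.

```python
from typing import Iterator

# Hypothetical two-frame sample in the LAMMPS text dump layout (assumed input format).
SAMPLE = """ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 10.0
0.0 10.0
0.0 10.0
ITEM: ATOMS id type x y z
1 1 0.0 0.0 0.0
2 1 1.5 1.5 1.5
ITEM: TIMESTEP
100
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 10.0
0.0 10.0
0.0 10.0
ITEM: ATOMS id type x y z
1 1 0.1 0.0 0.0
2 1 1.5 1.6 1.5
"""


def parse_dump(text: str) -> Iterator[dict]:
    """Yield one flat record per atom per frame (hypothetical helper)."""
    lines = iter(text.strip().splitlines())
    timestep = None
    n_atoms = 0
    for line in lines:
        if line.startswith("ITEM: TIMESTEP"):
            timestep = int(next(lines))
        elif line.startswith("ITEM: NUMBER OF ATOMS"):
            n_atoms = int(next(lines))
        elif line.startswith("ITEM: ATOMS"):
            # Column names follow the "ITEM: ATOMS" prefix.
            columns = line.split()[2:]
            for _ in range(n_atoms):
                values = next(lines).split()
                record = {"timestep": timestep}
                for name, value in zip(columns, values):
                    # Treat id and type as integers, everything else as float.
                    record[name] = int(value) if name in ("id", "type") else float(value)
                yield record
        # Box-bounds lines match no branch and are skipped in this sketch.


rows = list(parse_dump(SAMPLE))
```

Each record is a flat row keyed by timestep and atom id, which is the kind of shape that maps naturally onto a columnar table queried with SQL.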

🧬 LAMMPSKit - Production-Ready Scientific Package

GitHub | PyPI

Developed a modular Python toolkit for LAMMPS simulation analysis, backed by 270+ tests (94% coverage), Dockerized for portability, and powered by robust CI/CD. Achieved 60% memory savings and 40% faster performance compared to typical scientific scripting workflows.

⚛️ LAMMPS Extension for Electrochemical Simulations

GitHub

Extended LAMMPS with C++ to integrate two open-source packages for novel electrochemical device simulations, navigating complex licensing and attribution challenges.

💼 Professional Focus:

🎯 Seeking opportunities in:

  • Scientific Software Development & Computational Materials Science
  • Data Engineering, Analytics & Data Stewardship
  • Modeling & Simulation Engineering
  • AI/ML Applications in Scientific Computing

🔧 Core Expertise:

  • Data Analysis & Insights: Statistical analysis of large scientific datasets with advanced visualization
  • Materials Science Modeling: Molecular dynamics simulations, DFT calculations, and multi-scale modeling
  • Performance Optimization: Algorithmic improvements achieving significant memory and speed gains
  • Data Pipeline Architecture: Real-time streaming and batch processing for scientific workflows
  • Data Governance: Metadata management, data quality assurance, and reproducible research practices
  • Full-Stack Scientific Computing: From Python APIs to C++ algorithms to cloud deployment
  • Production Software Development: CI/CD, automated testing, containerization, and package distribution

📊 Current Learning Journey:

🌱 Databricks Certified Data Engineer (in progress)
🌱 Cloud-native data lake architectures & data governance
🌱 Advanced statistical analysis and predictive modeling
🌱 Graph-based ML for scientific applications
🌱 Natural language interfaces for scientific databases

🎓 Background:

PhD in Materials Science & Engineering from Shanghai Jiao Tong University with expertise in computational modeling, machine learning, and numerical methods. Transitioned from pure research to building production-ready scientific software that solves real-world problems.

Key Achievement: Led IBM collaboration resulting in 10x device stability improvement through innovative simulation algorithms and data processing pipelines.

🗣️ Languages:

  • 🇬🇧 English (Professional - C2)
  • 🇳🇱 Dutch (Learning - Beginner)
  • 🇨🇳 Chinese (Basic - A1)
  • 🇮🇳 Hindi, Assamese, Bengali (Native)

💬 Let's connect! I'm always excited to discuss scientific computing, materials science research, data engineering challenges, or opportunities to make complex data more accessible through better analysis and visualization. Whether you're looking to optimize simulation workflows, design scalable data architectures, implement data governance, or bridge the gap between research and production - I'd love to hear from you!

📫 Reach out: [email protected] | LinkedIn | Portfolio

About

Coffee-powered problem solver - currently designing databases that don't hate scientists
