A comprehensive collection of 20 Jupyter notebooks covering Python programming from fundamentals to advanced topics in data science, machine learning, and computational biology.
Core Python programming concepts essential for scientific computing.
- 01_Python_Basics_DataTypes.ipynb - Data types, variables, operators, strings, lists, tuples, dictionaries, sets
- 02_Python_ControlFlow.ipynb - Conditional statements, loops, list comprehensions, generators, exception handling
- 03_Python_Functions.ipynb - Function definition, parameters, scope, lambda functions, decorators, recursion
- 04_Python_Classes_OOP.ipynb - Object-oriented programming, inheritance, encapsulation, polymorphism
- 05_Python_Modules_FileIO.ipynb - Modules, packages, file I/O, directory operations, context managers
Essential libraries for numerical computing and data manipulation.
- 06_NumPy_Fundamentals.ipynb - Arrays, broadcasting, linear algebra, random sampling, vectorization
- 07_Pandas_Basics.ipynb - Series, DataFrames, data selection, cleaning, basic operations
- 08_Pandas_Advanced.ipynb - GroupBy, merge/join, pivot tables, time series, multi-index
Creating informative and publication-quality visualizations.
- 09_Matplotlib_Seaborn_Visualization.ipynb - Matplotlib basics, Seaborn statistical plots, customization
Advanced numerical methods and statistical analysis.
- 10_SciPy_Scientific_Computing.ipynb - Statistics, optimization, interpolation, signal processing
- 11_Statistical_Analysis.ipynb - Probability distributions, hypothesis testing, confidence intervals
Comprehensive machine learning with scikit-learn.
- 12_Scikit_Learn_MachineLearning.ipynb - Classification, regression, clustering, model selection, pipelines
Domain-specific applications and tools.
- 13_Text_Processing_Regex.ipynb - Regular expressions, text cleaning, pattern matching
- 14_Biopython_Computational_Biology.ipynb - Sequence analysis, file parsing, bioinformatics workflows
Professional development and production-ready code.
- 15_Python_Standard_Library.ipynb - collections, itertools, functools, datetime, pathlib
- 16_Advanced_Data_IO.ipynb - Pickle, Parquet, HDF5, SQL databases, REST APIs
- 17_Testing_Debugging.ipynb - pytest, unittest, pdb debugging, logging, best practices
Cutting-edge tools and techniques for modern data science.
- 18_Deep_Learning_Basics.ipynb - TensorFlow/Keras, neural networks, transfer learning
- 19_Performance_Parallelization.ipynb - Profiling, multiprocessing, joblib, numba optimization
- 20_Web_Scraping_APIs.ipynb - requests, BeautifulSoup, API integration, web data collection
Start with Part 1 (01-05) to build a solid Python foundation, then progress through Parts 2-4 (06-11).
Focus on Parts 2-4 (06-11) for data manipulation, visualization, and statistics. Add Part 7 (15-17) for professional skills.
Review Part 2 (06-08) for data preprocessing, Part 5 (12) for classical ML, then Part 8 (18-20) for deep learning and production.
Complete Parts 1-2 (01-08) for fundamentals, then focus on Part 6 (13-14) for specialized bioinformatics tools.
Complete all 20 notebooks in order for comprehensive coverage: 01 → 20
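
The paths above can be expanded into concrete notebook numbers with a small lookup. This is only a sketch: the path labels (`beginner`, `data_analysis`, and so on) are hypothetical names chosen for illustration, and the split of Parts 3 and 4 into notebook 09 and notebooks 10-11 is inferred from the section listing rather than stated explicitly.

```python
# Notebook numbers per part, following the section listing above.
# Parts 3 and 4 are inferred from the listing (visualization = 09, scientific computing = 10-11).
PARTS = {
    1: range(1, 6),    # 01-05 Python fundamentals
    2: range(6, 9),    # 06-08 NumPy and Pandas
    3: range(9, 10),   # 09    visualization
    4: range(10, 12),  # 10-11 scientific computing and statistics
    5: range(12, 13),  # 12    machine learning
    6: range(13, 15),  # 13-14 text processing and Biopython
    7: range(15, 18),  # 15-17 professional development
    8: range(18, 21),  # 18-20 deep learning, performance, web scraping
}

# Hypothetical path labels; the part sequences mirror the descriptions above.
PATHS = {
    "beginner": [1, 2, 3, 4],
    "data_analysis": [2, 3, 4, 7],
    "machine_learning": [2, 5, 8],
    "bioinformatics": [1, 2, 6],
    "complete": [1, 2, 3, 4, 5, 6, 7, 8],
}


def notebooks_for(path):
    """Return the two-digit notebook prefixes for a path, in study order."""
    return [f"{n:02d}" for part in PATHS[path] for n in PARTS[part]]


if __name__ == "__main__":
    for label in PATHS:
        print(f"{label}: {', '.join(notebooks_for(label))}")
```

Calling `notebooks_for("machine_learning")` returns the prefixes for notebooks 06-08, 12, and 18-20, which can be matched against the file names in the listing.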
- Theory + Practice: Each notebook combines conceptual explanations with hands-on examples
- Progressive Complexity: Topics build from basic to advanced concepts
- Real-world Examples: Practical workflows and complete analysis pipelines
- Well-commented Code: Clear explanations using `###++++++++++` section markers
- Interactive Output: All cells configured to display results
- Production-Ready: Professional best practices throughout
# Create conda environment
conda create -n python_ds python=3.9
conda activate python_ds
# Install core packages
conda install numpy pandas matplotlib seaborn scipy scikit-learn jupyterlab
# Install additional packages
conda install -c conda-forge biopython h5py pyarrow
pip install pytest requests beautifulsoup4 tensorflow joblib numba
# Start Jupyter Lab
jupyter lab
# Or Jupyter Notebook
jupyter notebook
- Follow the order: Notebooks are numbered for progressive learning
- Run all cells: Execute cells sequentially to understand the flow
- Experiment: Modify examples to deepen understanding
- Use as reference: Return to notebooks for syntax and method references
- Check NOTEBOOK_INDEX.md: Quick reference for all topics
Notebooks are compatible with:
- Python 3.8+
- NumPy 1.20+
- Pandas 1.3+
- Matplotlib 3.3+
- Seaborn 0.11+
- SciPy 1.7+
- Scikit-learn 1.0+
- TensorFlow 2.8+
- Biopython 1.79+
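
To confirm that an environment meets these minimums before opening the notebooks, a check along the following lines may help. It is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper names `version_tuple` and `check_versions` are illustrative, and the dictionary keys assume the packages were installed under their usual PyPI distribution names.

```python
from importlib import metadata

# Minimum versions copied from the compatibility list above.
# Keys are assumed to be the PyPI distribution names.
MINIMUMS = {
    "numpy": "1.20",
    "pandas": "1.3",
    "matplotlib": "3.3",
    "seaborn": "0.11",
    "scipy": "1.7",
    "scikit-learn": "1.0",
    "tensorflow": "2.8",
    "biopython": "1.79",
}


def version_tuple(version):
    """Turn '1.20.3' into (1, 20, 3), dropping suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, version_tuple(installed) >= version_tuple(minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "ok" if ok else "missing or too old"
        print(f"{name:15s} {installed or '-':12s} {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, `"1.3"` sorts after `"1.20"`, which would wrongly flag NumPy 1.20 as older than 1.3.
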
| Metric | Value |
|---|---|
| Total Notebooks | 20 |
| Total Parts | 8 |
| Code Cells | 450+ |
| Topics Covered | 100+ |
| Estimated Time | 50-70 hours |
- ✅ Core syntax, data structures, control flow
- ✅ Functions, decorators, generators
- ✅ Object-oriented programming
- ✅ Standard library mastery

- ✅ NumPy for numerical computing
- ✅ Pandas for data manipulation
- ✅ Statistical analysis and hypothesis testing
- ✅ Data visualization

- ✅ Classical ML algorithms (scikit-learn)
- ✅ Deep learning basics (TensorFlow/Keras)
- ✅ Model selection and evaluation
- ✅ Production pipelines

- ✅ Testing and debugging
- ✅ Performance optimization
- ✅ Database and API integration
- ✅ Web scraping and data collection

- ✅ Bioinformatics with Biopython
- ✅ Text processing and NLP basics
- ✅ Time series analysis
- ✅ Parallel processing

- README.md (this file) - Complete curriculum overview
- NOTEBOOK_INDEX.md - Quick reference guide with topic index
- COVERAGE_ANALYSIS.md - What's covered and why
- MIGRATION_GUIDE.md - Change history and organization
Feel free to:
- Report issues or errors
- Suggest improvements
- Add new examples
- Request additional topics
These educational materials are provided for learning purposes.
This curriculum covers:
- ✅ Complete Python fundamentals from scratch
- ✅ Industry-standard tools (NumPy, Pandas, scikit-learn, TensorFlow)
- ✅ Professional practices (testing, debugging, optimization)
- ✅ Real-world applications (APIs, databases, web scraping)
- ✅ Specialized domains (bioinformatics, NLP, deep learning)
After completing this curriculum, you will be able to:
- Write production-quality Python code
- Perform comprehensive data analysis
- Build and deploy machine learning models
- Work with various data sources (SQL, APIs, files)
- Optimize code for performance
- Debug and test professional applications
- Apply Python to specialized domains
Current Version: 2.0 (20 notebooks - Complete Professional Edition)
Last Updated: 2025-10-06
Status: ✅ Complete comprehensive curriculum
Happy Learning! 🎓
For questions or feedback, please open an issue in this repository.