Python Machine Learning Notebooks (Tutorial style)

Authored and maintained by Dr. Tirthajyoti Sarkar, Fremont, CA. Please feel free to add me on LinkedIn here.


https://miro.medium.com/max/1838/1*92h6Lg1Bu1F9QqoVNrkLdQ.jpeg

Requirements

  • Python 3.5
  • NumPy (pip install numpy)
  • Pandas (pip install pandas)
  • Scikit-learn (pip install scikit-learn)
  • SciPy (pip install scipy)
  • Statsmodels (pip install statsmodels)
  • Matplotlib (pip install matplotlib)
  • Seaborn (pip install seaborn)
  • SymPy (pip install sympy)

You can start with this article that I wrote for Heartbeat magazine (on the Medium platform):

“Some Essential Hacks and Tricks for Machine Learning with Python”

Essential tutorial-type notebooks on Pandas and NumPy

Jupyter notebooks covering a wide range of functions and operations in NumPy, Pandas, Seaborn, Matplotlib, etc.

Regression


Classification


Clustering

  • K-means clustering (Here is the Notebook).
  • Affinity propagation (showing its time complexity and the effect of damping factor) (Here is the Notebook).
  • Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery) (Here is the Notebook).
  • DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shape, which k-means fails to do) (Here is the Notebook).
  • Hierarchical clustering with dendrograms, showing how to choose the optimal number of clusters (Here is the Notebook).
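
As a quick illustration of the DBSCAN point in the list above, here is a minimal sketch (not taken from the notebooks) that compares k-means and DBSCAN on two interleaved half-moons, a standard non-convex toy dataset. The dataset size, noise level, `eps`, and `min_samples` values are assumptions chosen for this example only; real data usually needs these tuned.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved half-moons: dense, but not convex blobs.
# All parameter values below are illustrative assumptions.
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

# k-means partitions the plane around two centroids, so it tends to
# cut each moon in half instead of following its shape.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN grows clusters through neighbouring dense points, so the
# cluster shape does not matter. Points labelled -1 are treated as noise.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true moons.
ari_kmeans = adjusted_rand_score(y_true, kmeans_labels)
ari_dbscan = adjusted_rand_score(y_true, dbscan_labels)

print(f"k-means ARI: {ari_kmeans:.2f}")
print(f"DBSCAN  ARI: {ari_dbscan:.2f}")
```

On data like this, DBSCAN should score noticeably higher than k-means, because the moons are separated by a gap in density rather than by distance to a centroid.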

Dimensionality reduction


Random data generation using symbolic expressions


Simple deployment examples (serving ML models on web API)


Object-oriented programming with machine learning

Implementing some of the core OOP principles in a machine learning context by building your own Scikit-learn-like estimator, and making it better.

Here is the complete Python script with the linear regression class, which can fit a model, make predictions, compute regression metrics, plot outliers, plot diagnostics (linearity, constant variance, etc.), and compute variance inflation factors.
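
To give a flavor of the idea, here is a deliberately small sketch of a Scikit-learn-like estimator. It is not the script linked above: the class name `SketchLinearRegression`, the `r_squared` method, and the demo data are hypothetical names made up for this illustration. It only assumes the familiar `fit`/`predict` convention and the trailing-underscore naming for learned attributes.

```python
import numpy as np


class SketchLinearRegression:
    """Hypothetical minimal estimator following the fit/predict convention."""

    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept
        self.coef_ = None        # learned slopes, set by fit()
        self.intercept_ = 0.0    # learned intercept, set by fit()

    @staticmethod
    def _as_2d(X):
        # Accept a 1-D feature vector as a single-column matrix.
        X = np.asarray(X, dtype=float)
        return X.reshape(-1, 1) if X.ndim == 1 else X

    def fit(self, X, y):
        X = self._as_2d(X)
        y = np.asarray(y, dtype=float)
        # Prepend a column of ones so the intercept is solved jointly.
        A = np.column_stack([np.ones(len(X)), X]) if self.fit_intercept else X
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
        if self.fit_intercept:
            self.intercept_, self.coef_ = beta[0], beta[1:]
        else:
            self.intercept_, self.coef_ = 0.0, beta
        return self  # returning self allows chaining: Model().fit(X, y)

    def predict(self, X):
        return self._as_2d(X) @ self.coef_ + self.intercept_

    def r_squared(self, X, y):
        # Coefficient of determination: 1 - SS_residual / SS_total.
        y = np.asarray(y, dtype=float)
        ss_res = np.sum((y - self.predict(X)) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        return 1.0 - ss_res / ss_tot


# Demo on noise-free synthetic data, so the fit should be exact.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(50, 2))
y_demo = 1.5 + X_demo @ np.array([2.0, -3.0])

model = SketchLinearRegression().fit(X_demo, y_demo)
print(model.intercept_, model.coef_, model.r_squared(X_demo, y_demo))
```

Keeping the metric as a method on the fitted object is the design choice being illustrated: diagnostics such as outlier plots or variance inflation factors can be added the same way, as further methods that reuse the stored coefficients.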

See my articles on Medium on this topic.