# IBM-DATA-SCIENCE

A collection of notebooks, projects, examples, and resources for learning and applying data science concepts, inspired by IBM's Data Science coursework and practical exercises. This repository is organized to make it easy to follow hands-on tutorials, reproduce experiments, and build small end-to-end projects.
Owner: ADVAIT135
## Table of contents

- About
- Repository structure
- Getting started
- Usage
- Notebooks & Projects
- Data
- Contributing
- License
- Contact
## About

This repository hosts educational and practical content for data science workflows, including:
- Exploratory data analysis (EDA)
- Data cleaning and feature engineering
- Statistical modeling and machine learning (scikit-learn, XGBoost, etc.)
- Model evaluation and visualization
- Short projects and capstone-style examples
- Jupyter notebooks demonstrating concepts step-by-step
Use this repo to learn and experiment, or as a starting point for small data science projects.
## Repository structure

A typical layout (adapt to the actual contents of this repo):

- `notebooks/` — Jupyter notebooks (.ipynb) for lessons, experiments, and demos
- `data/` — small sample datasets used by notebooks (not for large data)
- `src/` — reusable Python modules and helper scripts
- `reports/` — generated reports, figures, and export artifacts
- `requirements.txt` — Python package requirements for pip installs
- `environment.yml` — conda environment specification (optional)
- `README.md` — this file
- `LICENSE` — license for the repository (if present)
- `tests/` — unit / integration tests (optional)
If some of these files or folders are missing, create them as needed or update this README accordingly.
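If you want to create the missing pieces in one go, a small sketch using only the standard library (folder and file names follow the suggested layout above):

```python
from pathlib import Path

# Folders and placeholder files from the suggested layout above
FOLDERS = ["notebooks", "data", "src", "reports", "tests"]
FILES = ["requirements.txt", "environment.yml", "README.md"]

def scaffold(root: str = ".") -> None:
    """Create any missing folders and placeholder files from the layout."""
    base = Path(root)
    for name in FOLDERS:
        (base / name).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        (base / name).touch(exist_ok=True)
```

Run `scaffold()` from the repository root; existing folders and files are left untouched.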
## Getting started

Prerequisites:

- Python 3.8+ (recommended)
- Git
- Optional: Anaconda/Miniconda if you prefer conda environments
- JupyterLab or Jupyter Notebook for interactive work
1. Clone the repo:

   ```bash
   git clone https://github.com/ADVAIT135/IBM-DATA-SCIENCE.git
   cd IBM-DATA-SCIENCE
   ```

2. Create a virtual environment and install dependencies:

   ```bash
   python -m venv .venv
   source .venv/bin/activate   # macOS / Linux
   .venv\Scripts\activate      # Windows (PowerShell: .\.venv\Scripts\Activate.ps1)
   pip install --upgrade pip
   pip install -r requirements.txt
   ```

3. Start Jupyter:

   ```bash
   jupyter lab   # or: jupyter notebook
   ```
If there is an `environment.yml`, use conda instead:

```bash
conda env create -f environment.yml
conda activate ibm-data-science
jupyter lab
```
## Usage

- Open JupyterLab or Jupyter Notebook and navigate to `notebooks/`.
- Execute cells top-to-bottom to reproduce analyses.
- If a notebook requires data in `data/`, ensure the files are present (see the Data section).
Automated execution (headless):

```bash
# execute a notebook and write the output notebook
jupyter nbconvert --to notebook --execute notebooks/example.ipynb --output notebooks/example-executed.ipynb
```
- Python scripts (utility modules) live in `src/`. Run them with:

  ```bash
  python src/some_script.py
  ```

- If tests exist (pytest):

  ```bash
  pip install -r requirements-dev.txt   # if provided
  pytest -q
  ```
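A minimal sketch of what a test in `tests/` could look like; the helper `clean_column_names` is hypothetical — adapt the import and names to the actual modules in `src/`:

```python
# tests/test_cleaning.py — hypothetical example
# In a real repo you would import the helper instead of defining it here:
# from src.cleaning import clean_column_names

def clean_column_names(columns):
    """Normalize column names: strip whitespace, lowercase, spaces -> underscores."""
    return [c.strip().lower().replace(" ", "_") for c in columns]

def test_clean_column_names():
    assert clean_column_names([" Sale Price ", "Zip Code"]) == ["sale_price", "zip_code"]
```

pytest discovers any `test_*` function in files named `test_*.py`, so `pytest -q` picks this up automatically.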
- Pin dependency versions in `requirements.txt` for reproducibility.
- Where randomness affects results, set random seeds inside notebooks/scripts (e.g., `np.random.seed(42)`, `random.seed(42)`, and framework-specific seeds).
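A seed-setting cell at the top of a notebook might look like this (only the standard library is used here; NumPy and framework seeds are shown as comments since availability depends on your environment):

```python
import random

SEED = 42
random.seed(SEED)

# If NumPy or ML frameworks are in use, seed those too:
# import numpy as np; np.random.seed(SEED)
# import torch; torch.manual_seed(SEED)

# With the seed fixed, sampling is reproducible across runs:
sample_a = [random.randint(0, 99) for _ in range(5)]
random.seed(SEED)
sample_b = [random.randint(0, 99) for _ in range(5)]
assert sample_a == sample_b  # identical sequences after reseeding
```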
## Notebooks & Projects

Each notebook should include at minimum:
- Problem statement / objective
- Data source and short description
- Code cells split into logical steps (load, clean, explore, model, evaluate)
- Clear visualizations and conclusions
- Dependencies listed in the notebook metadata or a corresponding cell
Suggested naming convention: `notebooks/XX-brief-title.ipynb`, where `XX` is a two-digit ordering number, e.g. `01-exploratory-data-analysis.ipynb`.
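The convention can be checked mechanically; a small sketch using a regular expression (the pattern is one reasonable reading of the convention above):

```python
import re

# Two-digit prefix, lowercase hyphenated title, .ipynb extension
NOTEBOOK_NAME = re.compile(r"^\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*\.ipynb$")

def follows_convention(filename: str) -> bool:
    """True if filename matches the XX-brief-title.ipynb convention."""
    return NOTEBOOK_NAME.fullmatch(filename) is not None

# follows_convention("01-exploratory-data-analysis.ipynb") -> True
# follows_convention("notes.ipynb") -> False
```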
## Data

Small sample datasets can be placed in `data/`. For large datasets, prefer external links and provide instructions to download. Never commit large or sensitive datasets to the repository.
Example:

- `data/sample.csv` — small anonymized sample used for demos
- For external datasets, include a `DATA_SOURCES.md` (or a section in this README) listing download links and any required preprocessing steps.
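Notebooks can fail confusingly when expected files are absent, so a pre-flight check at the top of a notebook can help. A minimal sketch (the file names passed in are illustrative):

```python
from pathlib import Path

def missing_data_files(required, data_dir="data"):
    """Return the subset of required file names not present in data_dir."""
    base = Path(data_dir)
    return [name for name in required if not (base / name).is_file()]

# Usage at the top of a notebook (file names are illustrative):
# missing = missing_data_files(["sample.csv"])
# if missing:
#     raise FileNotFoundError(f"Download these into data/: {missing}")
```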
## Contributing

Contributions are welcome. Suggested workflow:
- Fork the repository.
- Create a feature branch: `git checkout -b feature/my-change`
- Make changes, and add tests if relevant.
- Commit and push: `git push origin feature/my-change`
- Open a pull request explaining the change.
Please follow a consistent style (PEP 8 for Python). Add or update documentation and notebooks as needed. If you add new dependencies, update `requirements.txt` or `environment.yml`.
Consider adding:

- `CONTRIBUTING.md`
- `CODE_OF_CONDUCT.md`
## License

This project is provided under the terms of the GPL-3.0 license. See the LICENSE file for details.
## Contact

Repository owner: ADVAIT135
If you find issues, please open an issue in this repository. For questions or suggestions, you can also open a discussion or reach out via GitHub.
Thank you for using the IBM-DATA-SCIENCE repository — happy data exploring and modeling!