
oumaymaih/Transport_system


Distributed Public Transport System with Sharded MongoDB

Overview

This project implements a distributed MongoDB cluster with sharding and replication to manage large-scale public transport data.

It simulates a real-world transportation system handling:

  • Millions of tickets
  • Thousands of passengers and buses
  • Multi-city data distribution

The system is designed to ensure:

  • Scalability
  • High availability
  • Fault tolerance

Architecture

The system is based on a MongoDB Sharded Cluster:

  • 3 Shards (data distribution)
  • Replica Sets (fault tolerance)
  • Config Servers (metadata)
  • Mongos Router (query routing)
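As a rough illustration of how the three shards might be registered with the router, the sketch below builds the `addShard` admin commands that a driver such as PyMongo could send to mongos. The replica set names (`shard1rs`, ...), host names, and ports are hypothetical placeholders, not values taken from this repository.

```python
# Hypothetical topology: replica set names, hosts and ports are placeholders.
SHARDS = {
    "shard1rs": ["shard1a:27018"],
    "shard2rs": ["shard2a:27018"],
    "shard3rs": ["shard3a:27018"],
}


def add_shard_commands(shards):
    """Return one addShard command document per shard replica set."""
    commands = []
    for rs_name, hosts in shards.items():
        # addShard expects "<replicaSetName>/<host:port>[,<host:port>...]"
        commands.append({"addShard": f"{rs_name}/{','.join(hosts)}"})
    return commands


def register_shards(client, shards):
    """Send the commands through a client connected to mongos.

    `client` is assumed to behave like pymongo.MongoClient.
    """
    return [client.admin.command(cmd) for cmd in add_shard_commands(shards)]


if __name__ == "__main__":
    for cmd in add_shard_commands(SHARDS):
        print(cmd)
```

With a real cluster, `register_shards(MongoClient("mongodb://localhost:27017"), SHARDS)` would be the call to try, assuming mongos listens on that address.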

Data Flow

  1. The client sends a query to mongos
  2. Mongos routes the query to the relevant shard(s)
  3. Each targeted shard executes the query on its local data
  4. Mongos merges the partial results and returns them to the client
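The four steps above can be sketched as a toy model in plain Python. This is not how mongos is implemented; it only mimics range-based routing on `ville`. The split points, shard names, sample cities, and the `ligne` field are made up for illustration.

```python
import bisect

# Toy model of range-based routing on the shard key `ville`.
# Split points and shard names are illustrative; real chunk boundaries are
# chosen by MongoDB and stored on the config servers.
SPLIT_POINTS = ["H", "Q"]  # ville < "H" | "H" <= ville < "Q" | ville >= "Q"
SHARD_NAMES = ["shard1", "shard2", "shard3"]


def shard_for(ville):
    """Return the shard whose key range contains this city."""
    return SHARD_NAMES[bisect.bisect_right(SPLIT_POINTS, ville)]


def route(query, shards):
    """Pick target shards, run the filter on each, and merge the results."""
    if "ville" in query:
        targets = [shard_for(query["ville"])]  # targeted query: one shard
    else:
        targets = list(SHARD_NAMES)            # scatter-gather: all shards
    merged = []
    for name in targets:
        # Each shard filters only its own documents.
        local = [doc for doc in shards[name]
                 if all(doc.get(k) == v for k, v in query.items())]
        merged.extend(local)
    return targets, merged


if __name__ == "__main__":
    docs = [
        {"ville": "Casablanca", "ligne": "L1"},
        {"ville": "Marrakech", "ligne": "L7"},
        {"ville": "Rabat", "ligne": "L1"},
    ]
    shards = {name: [] for name in SHARD_NAMES}
    for doc in docs:
        shards[shard_for(doc["ville"])].append(doc)

    print(route({"ville": "Rabat"}, shards))  # touches a single shard
    print(route({"ligne": "L1"}, shards))     # touches every shard
```

A query that includes the shard key touches one shard, while a query on any other field has to be broadcast, which is the main reason the choice of shard key matters.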

Dataset

Collection    Number of records
Tickets       1,000,000
Passengers    100,000
Bus           960
Lines         320
Stations      5,686

- Data Description

This folder contains sample datasets used for demonstration purposes.

- Structure

  • sample/ → small data samples for testing
  • schema/ → data structure definitions

- Note

Full datasets (millions of records) are not included due to size constraints.


Sharding Strategy

  • Shard Key: ville (the city field)
  • Strategy: Range-based sharding

Why this choice?

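  • High cardinality: enough distinct cities to split the key range into chunks across the 3 shards
  • Frequent filtering by city: queries that include ville can be routed to a single shard
  • Uniform distribution: tickets are spread roughly evenly across cities
  • Stable field: a document's city is not expected to change after insertion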
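A minimal sketch of how ranged sharding on `ville` could be enabled from Python. The `enableSharding` and `shardCollection` admin commands are standard MongoDB commands, but the database name `transport` and collection name `tickets` are assumptions, since the README does not name them.

```python
DB_NAME = "transport"   # hypothetical database name
COLLECTION = "tickets"  # hypothetical collection name


def sharding_commands(db_name=DB_NAME, collection=COLLECTION):
    """Admin commands that enable ranged sharding on the `ville` field."""
    namespace = f"{db_name}.{collection}"
    return [
        {"enableSharding": db_name},
        # {"ville": 1} requests ranged sharding;
        # {"ville": "hashed"} would request hashed sharding instead.
        {"shardCollection": namespace, "key": {"ville": 1}},
    ]


def apply_sharding(client, db_name=DB_NAME, collection=COLLECTION):
    """Run the commands via a client connected to mongos.

    `client` is assumed to behave like pymongo.MongoClient.
    """
    # A non-empty collection needs an index that starts with the shard key.
    client[db_name][collection].create_index([("ville", 1)])
    return [client.admin.command(cmd)
            for cmd in sharding_commands(db_name, collection)]


if __name__ == "__main__":
    for cmd in sharding_commands():
        print(cmd)
```

The key specification is the only place where the ranged strategy is expressed: MongoDB then splits the ordered `ville` values into chunks and balances those chunks across the shards.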

Features

  • Distributed storage using MongoDB
  • Horizontal scaling via sharding
  • Replication for high availability
  • Fault tolerance testing
  • Analytical queries (aggregation)

Fault Tolerance Test

✔ Reads continue even if a node fails
✔ Data remains accessible via secondary nodes
✔ Automatic failover is supported
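Reads only reach secondary nodes if the client asks for it: with the default read preference (`primary`), reads on a shard fail while that shard has no primary. The sketch below builds a connection string with `readPreference=primaryPreferred`, which reads from the primary when one exists and falls back to secondaries otherwise. The README does not say which read preference the tests used, so this setting, the host, and the timeout value are assumptions.

```python
from urllib.parse import urlencode


def mongos_uri(hosts, read_preference="primaryPreferred", timeout_ms=5000):
    """Build a connection string that lets reads fall back to secondaries."""
    options = urlencode({
        "readPreference": read_preference,
        # How long the driver waits for a usable server before raising.
        "serverSelectionTimeoutMS": timeout_ms,
        # Retry a failed read once, e.g. after a primary steps down.
        "retryReads": "true",
    })
    return f"mongodb://{','.join(hosts)}/?{options}"


if __name__ == "__main__":
    # Hypothetical mongos address.
    print(mongos_uri(["localhost:27017"]))
```

The resulting string can be passed to `pymongo.MongoClient`. Stopping the primary container of one shard and re-running a read query is then a simple way to repeat the test.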


Example Analytics

  • Number of tickets per city
  • Most used transport line
  • Time-based filtering of tickets
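The three analytics above could be written as aggregation pipelines along these lines. Only `ville` is named in this README; the `ligne` and `date` field names are hypothetical, so adjust them to the actual ticket schema. The `count_by` helper is a pure-Python stand-in for `$group` with `$sum`, handy for checking results on the small sample files.

```python
from collections import Counter
from datetime import datetime

# Number of tickets per city, busiest city first.
TICKETS_PER_CITY = [
    {"$group": {"_id": "$ville", "tickets": {"$sum": 1}}},
    {"$sort": {"tickets": -1}},
]

# Most used transport line (`ligne` is a hypothetical field name).
MOST_USED_LINE = [
    {"$group": {"_id": "$ligne", "tickets": {"$sum": 1}}},
    {"$sort": {"tickets": -1}},
    {"$limit": 1},
]


def tickets_between(start, end):
    """Pipeline matching tickets with start <= date < end.

    `date` is a hypothetical field assumed to hold datetime values.
    """
    return [{"$match": {"date": {"$gte": start, "$lt": end}}}]


def count_by(docs, field):
    """Pure-Python equivalent of the $group/$sum stage, for local checks."""
    return Counter(doc[field] for doc in docs)


if __name__ == "__main__":
    sample = [
        {"ville": "Rabat", "ligne": "L1"},
        {"ville": "Rabat", "ligne": "L2"},
        {"ville": "Fes", "ligne": "L1"},
    ]
    print(count_by(sample, "ville").most_common())
    print(tickets_between(datetime(2024, 1, 1), datetime(2024, 2, 1)))
```

With PyMongo, each pipeline would be run as `db.tickets.aggregate(TICKETS_PER_CITY)`, assuming a collection named `tickets`. Adding a `$match` on `ville` as the first stage lets mongos send the aggregation to a single shard.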

Technologies

  • MongoDB (Sharding + Replication)
  • Docker & Docker Compose
  • PyMongo (optional integration)
  • Python (for frontend visualization)

Additional Resources

For more details about the project setup and implementation:

  • Setup guide: see docs/setup-guide.md
  • Full command list: see scripts/commands.txt

These resources provide step-by-step instructions and cluster configuration details.
