P.S.: Apologies for the unedited and rambly video; I was running out of time and couldn't make a better one.
Inspiration
The largest and most transit-dependent city in the nation, New York City, is currently undergoing a full overhaul of its bus network in an effort to meet modern travel demands, because the existing NYC bus route network is badly outdated. However, the redesign process is extremely slow and manual; there must be a better way to serve the needs of millions of daily travelers.
What it does
Given a street grid (represented as a weighted graph), commute data (a set of start and end points), and an integer n, it attempts to find the optimal set of n bus routes under the cost function $\alpha(\text{Walk}) + \beta(\text{Bus})$. For a given set of routes, the total cost is the sum, over all commuters, of the minimum-cost trip between that commuter's start and end points.
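As a concrete sketch of that cost function, here is one way it could look in Python. The weights `ALPHA`/`BETA`, the `options` mapping, and all function names are assumptions for illustration, not the project's actual code:

```python
# Hypothetical sketch: total cost = sum over commuters of the cheapest trip,
# where each trip costs alpha * (walking distance) + beta * (bus distance).

ALPHA, BETA = 2.0, 1.0  # assumed weights: walking penalized more than riding

def trip_cost(walk_dist, bus_dist):
    return ALPHA * walk_dist + BETA * bus_dist

def network_cost(commuters, options):
    """commuters: list of (start, end) pairs.
    options: maps each (start, end) pair to a list of (walk_dist, bus_dist)
    alternatives -- one per candidate route, plus walking the whole way."""
    total = 0.0
    for pair in commuters:
        total += min(trip_cost(w, b) for w, b in options[pair])
    return total
```

For example, a commuter who can either walk 5 units or walk 1 unit and ride 3 would be assigned the cheaper of the two trips.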
I do not know of an analytical solution to this problem, so I approached it through a reinforcement learning lens, formulating the MDP as follows:
- States: a set of routes, where each route is a path in the graph with no branches or cycles.
- Actions: a single edge addition to, or deletion from, the set of routes. During a trajectory, the agent "builds" an optimal route network by directly adding or removing edges. When the action space is generated, the no-cycles, no-branches restriction is preserved.
- Rewards: the negative of the total cost delta. This way, a trajectory's total cost can be recovered as the (negated) sum of rewards, although discounting makes this only approximate.
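The action constraint above (keep every route a simple path) can be sketched as follows. The helper names and the adjacency representation are hypothetical, but the logic is what the constraint implies: edges can only be added at a route's two endpoints, to nodes not already on the route, and only end edges can be removed without splitting the path:

```python
# Hypothetical sketch of action-space generation for one route.
# A route is a simple path (no cycles or branches), so:
#  - additions are only legal at the two endpoints, to unvisited nodes;
#  - deletions are only legal at the first or last edge.

def valid_additions(route, adjacency):
    """route: list of nodes forming a path; adjacency: node -> iterable of
    neighbors. Returns edges (u, v) whose addition keeps it a simple path."""
    on_route = set(route)
    actions = []
    for end in (route[0], route[-1]):
        for nbr in adjacency[end]:
            if nbr not in on_route:  # adding it would create a cycle/branch
                actions.append((end, nbr))
    return actions

def valid_deletions(route):
    """Only an end edge may be removed without splitting the path."""
    if len(route) < 2:
        return []
    edges = [(route[0], route[1])]
    if len(route) > 2:
        edges.append((route[-2], route[-1]))
    return edges
```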
The action space and rewards are generated dynamically when the RL agent reaches a state, since it is computationally infeasible to enumerate the full MDP up front; instead, we use a model-free approach with Q-learning.
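A minimal model-free loop in that style might look like the sketch below. `get_actions`, `env_step`, and the hashable state encoding are hypothetical stand-ins for the project's route-network environment; the update rule itself is standard tabular Q-learning with epsilon-greedy exploration:

```python
import random

def q_learning_episode(q, state, get_actions, env_step,
                       alpha=0.1, gamma=0.95, eps=0.2, max_steps=50):
    """Run one episode, updating q (a dict (state, action) -> value) in place.
    Actions are generated only when a state is actually visited (model-free)."""
    for _ in range(max_steps):
        actions = get_actions(state)
        if not actions:
            break
        if random.random() < eps:  # explore
            action = random.choice(actions)
        else:                      # exploit current estimates
            action = max(actions, key=lambda a: q.get((state, a), 0.0))
        next_state, reward = env_step(state, action)  # reward = -cost delta
        best_next = max((q.get((next_state, a), 0.0)
                         for a in get_actions(next_state)), default=0.0)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = next_state
```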
The cost is computed by BFS-ing from every start and end point before the algorithm runs and caching the minimum distance to every node. Then, for a given set of routes, finding the cost for a single commuter to take a specific route reduces to cheap lookups.
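That pre-computation step could be sketched like this (assuming unit edge weights, which is what BFS requires; a weighted grid would need Dijkstra instead). The function names and adjacency format are illustrative assumptions:

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Minimum hop distance from source to every reachable node (unit weights)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def precompute(adjacency, commuters):
    """Cache distances from every distinct commuter start and end point,
    so later cost evaluations become dictionary lookups."""
    points = {p for pair in commuters for p in pair}
    return {p: bfs_distances(adjacency, p) for p in points}
```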
It obviously needs much more training and a better RL algorithm, but I was able to code a proof of concept in a day and get it to actually draw routes. For n=2, the winding route it found is probably, technically, optimal given how I generated the points. Curious to see how it does scaled up.
What's next for Algorithmic Bus Network Redesign
There's a ton of room to scale up. First of all, we can actually train it for a long time and see what happens on this small test case. We can also scale up the size of the graph, the number of routes, the number of commuter points, etc.