U-M Blue Buses
Project Leads: Ulysses O'Donnell, Michael Peng
Project Members: Agam Kohli, Andrew Smirnov, Julia Spilkin, Oliver Wu
Students depend on U-M buses, but they are sometimes unreliable. We quantify their reliability via position reports from BusTime and have created a dashboard to visualize live operational statistics, "vitals".
The University of Michigan comprises three campus areas in Ann Arbor: the Ross Athletic Complex, Central Campus, and North Campus. Traveling between these campuses on foot is often not feasible (especially in winter), so a bus system transports students and staff between and within them. As students ride the Blue Buses to catch classes and meetings, their time-sensitive schedules depend on the buses running on time, making the system's reliability important.
Unfortunately, some routes are not always reliable. Buses may not operate according to the officially published time intervals, and they can fill completely during rush hour, forcing unlucky passengers at a stop to let the full bus leave and wait for the next one. The abundance of "Not In Service" buses at the Central Campus Transit Center is a persistent source of frustration for waiting passengers, especially when their wait exceeds the published interval for their desired route. Complaints about the unreliability of the buses are widely visible on Reddit.
With this project, we seek to evaluate the reliability of Blue Buses quantitatively using aggregations of real-time position reports from Logistics, Transportation and Parking (LTP), the campus unit in charge of the bus system. We intend to identify signs of unreliability, visualize them in real time, and publish them for public access and operational feedback to LTP. In the long term, we aim to identify external factors that influence the reliability of the bus system, such as weather, class schedules, and traffic conditions. We hope that our product will help LTP improve its operational strategies in order to make the Blue Buses more timely and reliable for all.
To monitor the bus system and offer live position reports to users, LTP uses BusTime, a software system developed by Clever Devices. Bus tracking apps (both official and third-party) retrieve route specifications and live position reports from the official LTP BusTime server through a documented API. We run a custom script that polls this API once a minute for the live position of every Blue Bus and stores the reports in a local SQLite database. Over an entire semester, this script gathered a large corpus of data describing the history of every route execution by every bus, capturing many of the signs of unreliability we had encountered in person.
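The polling script can be sketched as follows. The table schema and field names below are a hypothetical subset of what the BusTime vehicle feed returns (the real response includes more columns), and `fetch_vehicle_reports` stands in for the actual HTTP call:

```python
import sqlite3

# Hypothetical subset of BusTime's vehicle report fields; the real
# response includes more columns than these.
SCHEMA = """CREATE TABLE IF NOT EXISTS reports (
    tmstmp TEXT, vid TEXT, rt TEXT, pdist INTEGER, psgld TEXT, dly INTEGER
)"""

def store_reports(conn, reports):
    """Insert one minute's batch of position reports into the local database."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO reports VALUES (:tmstmp, :vid, :rt, :pdist, :psgld, :dly)",
        reports,
    )
    conn.commit()

# The real script repeats this once a minute against the LTP BusTime API:
#   reports = fetch_vehicle_reports()  # HTTP GET against the BusTime endpoint
#   store_reports(conn, reports)
#   time.sleep(60)
```

Appending one row per bus per minute keeps the schema simple and makes later per-route aggregation a matter of straightforward SQL or dataframe grouping.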
Our immediate goal was to interpret these historical position reports and transform them into figures that summarize the bus system's reliability. We achieve this through a canonical ranking of routes based on their performance and several real-time visualizations of key statistics for each route.
Ranking Bus Routes
Our ranking of the bus system considers two factors: delays and crowding. The crowding metric comes from the 'psgld' column, which takes the values 'EMPTY', 'HALF_EMPTY', and 'FULL'. Because these labels are coarse, we converted them into a numeric score. Ideally a bus is neither empty nor packed to capacity, so 'HALF_EMPTY' is the desirable state: we assigned it a value of 0 and assigned 'EMPTY' and 'FULL' buses a value of 1. Averaging these values over all reports for a route gives, in effect, the fraction of time its buses were either empty or full, so lower is better.
For delays, the column 'dly' gives a Boolean indicating whether a bus was delayed. The API does not document what constitutes a delay, but averaging the column per route yields a delay rate. The final step combines the two metrics in a weighted sum to produce a final score for each route. We weighted each metric 0.5, but a rider who prioritizes uncrowded buses could increase the crowding weight and obtain a new ranking. These coarse columns were a good starting point; in the future, richer delay and crowding metrics, such as those used in our visualizations, could replace them. Ranking was handled by a separate sub-team, so combining each team's results and findings would likely improve the ranking.
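The scoring described above can be sketched in a few lines. The 'psgld' labels come from BusTime; the helper names and the default weighting are ours, and lower scores are better:

```python
def crowding_penalty(psgld):
    # HALF_EMPTY is the ideal load, scoring 0; EMPTY and FULL both score 1.
    return 0.0 if psgld == "HALF_EMPTY" else 1.0

def route_score(reports, crowding_weight=0.5):
    """reports: (psgld, dly) pairs for one route; dly is the delay Boolean.
    Returns the weighted sum of the crowding penalty and the delay rate."""
    n = len(reports)
    crowding = sum(crowding_penalty(p) for p, _ in reports) / n
    delay_rate = sum(1.0 if d else 0.0 for _, d in reports) / n
    return crowding_weight * crowding + (1.0 - crowding_weight) * delay_rate
```

Raising `crowding_weight` toward 1 re-ranks routes purely by crowding, which is the customization described above.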
The only delay information given by BusTime is this Boolean flag marking whether a bus is delayed. With no documentation of what constitutes a delay, we also found that many flagged delays occurred near the ends of routes, where buses tend to take breaks.
Our goal was to infer how large delays actually were, not just whether they occurred. The 'tatripid' column holds a unique ID for each trip, where a trip is one complete southbound or northbound run of a route. These IDs are unique within a day but reused from day to day, so we first group the data by day and then by 'tatripid'. Extracting the first and last timestamps from each group gives the time each bus took to complete its trip. We did this for every trip on every day in our dataset, then averaged to estimate how long each route should take. We also removed outliers with per-route thresholds: for example, if Google Maps says a route should take 15 minutes, trips below a lower threshold, say 7 minutes, or above an upper threshold, say 23 minutes, were excluded from the average. Because these thresholds were set manually for each route, future work could remove outliers more systematically.
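The grouping and thresholding steps above can be sketched as follows. The timestamp format and the example thresholds are illustrative assumptions, not values fixed by the API:

```python
from datetime import datetime

def trip_durations(reports, fmt="%Y%m%d %H:%M"):
    """reports: (tatripid, tmstmp) pairs for one route on one day.
    Returns minutes elapsed between each trip's first and last report."""
    spans = {}
    for trip_id, tmstmp in reports:
        t = datetime.strptime(tmstmp, fmt)
        lo, hi = spans.get(trip_id, (t, t))
        spans[trip_id] = (min(lo, t), max(hi, t))
    return {trip: (hi - lo).total_seconds() / 60.0
            for trip, (lo, hi) in spans.items()}

def mean_trip_time(durations, low=7.0, high=23.0):
    """Average trip time after applying the manual per-route outlier thresholds."""
    kept = [d for d in durations if low <= d <= high]
    return sum(kept) / len(kept)
```

Running this per day and averaging across days gives the per-route baseline against which individual trips are compared.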
Our visualization extracts each trip length from the past 6-10 hours of the specified date and plots the difference against the average time. An example of our results is shown below.
We infer the crowding situation in every bus using the psgld variable from BusTime, which takes on one of three values: EMPTY, HALF_EMPTY, and FULL. Given a collection of position reports for a particular route within a contiguous time period, we partition these reports into buckets representing each 10-minute interval within that period. In each bucket, we then tabulate the frequencies of each psgld value and graph these frequencies with a stacked, filled line graph. We generate a load factor for each bucket ranging from 0 to 1 based on these frequencies, where 0 represents completely empty and 1 represents completely full. This metric is intended to summarize the level of crowding for a route within a 10-minute interval regardless of the number of buses operating that route. An example of the crowding graph is shown for Bursley-Baits on April 22, 2022:
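One plausible reading of the bucketing and load factor described above is sketched below. The 0/0.5/1 mapping for the intermediate label is our assumption; the description only fixes the endpoints (0 for completely empty, 1 for completely full):

```python
from datetime import datetime

# Assumed numeric mapping for psgld labels; only the endpoints are
# fixed by the load-factor definition above.
LOAD = {"EMPTY": 0.0, "HALF_EMPTY": 0.5, "FULL": 1.0}

def load_factors(reports, bucket_minutes=10, fmt="%Y%m%d %H:%M"):
    """reports: (tmstmp, psgld) pairs for one route within one day.
    Returns {bucket index: mean load in [0, 1]} per 10-minute interval."""
    buckets = {}
    for tmstmp, psgld in reports:
        t = datetime.strptime(tmstmp, fmt)
        idx = (t.hour * 60 + t.minute) // bucket_minutes
        buckets.setdefault(idx, []).append(LOAD[psgld])
    return {idx: sum(vals) / len(vals) for idx, vals in buckets.items()}
```

Because the factor is an average over all reports in the bucket, it summarizes crowding regardless of how many buses were running the route at the time, as intended.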
Another performance metric is a bus's alignment with the published schedule. LTP sets times at which a bus should reach each stop; for example, Bursley-Baits buses should depart every 10 minutes, so ideally a bus arrives at each stop on the route every 10 minutes. To visualize this, we used the 'pdist' column in our dataset, which represents the distance traveled into the route. For each route, both southbound and northbound, we manually extracted a 'pdist' value corresponding to a stop on the route. Each time a bus's 'pdist' crossed this threshold, we recorded that timestamp as the bus's arrival at the stop.
The resulting visualization plots the timestamps at which buses reached the stop against the times they should have been there (represented by vertical lines). This shows whether buses arrive too early, too late, or on time. From this information we can infer instances of bunching (where several buses on a route travel very close together) as well as delays.
Schedule Alignment for Bursley-Baits Route
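The threshold-crossing step described above can be sketched as follows. Since 'pdist' resets at the start of each trip, a stop arrival shows up as a vehicle's 'pdist' moving from below the stop's value to at or above it:

```python
def stop_crossings(reports, stop_pdist):
    """reports: (tmstmp, vid, pdist) tuples in time order for one direction.
    Returns (tmstmp, vid) pairs recorded each time a bus passes the stop."""
    last = {}       # previous pdist seen for each vehicle
    crossings = []
    for tmstmp, vid, pdist in reports:
        prev = last.get(vid)
        if prev is not None and prev < stop_pdist <= pdist:
            crossings.append((tmstmp, vid))
        last[vid] = pdist
    return crossings
```

Plotting the returned timestamps against the scheduled arrival times yields the alignment chart; tightly clustered crossings from different vehicles are the bunching signature mentioned above.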
Each of these visualizations was built with Plotly Express, which made it easy to transfer them to a dynamic dashboard built with Plotly Dash. The main problem is that the dashboard is very slow and unoptimized due to time constraints. You are welcome to visit the dashboard, but brace for long loading times and visualizations that are not entirely accurate.
The dashboard shows each graph for six different routes over the previous 6 hours, making it easy to see how individual routes are performing. It also contains a page for all routes, displaying static rankings and visualizations of our data collection. Future work includes optimizing the dashboard and making the ranking dynamic.
An example of results from our ranking is shown in the figure below. A lower score is better: empty and full buses were scored 1, and delay values closer to 1 signal that a route experienced more delays. By our metrics the Northeast Shuttle is the best route, which makes sense given its short route through a less busy area. The most heavily used routes, such as Bursley-Baits and Commuter North, rank lower because they are naturally more crowded and run more buses, which introduces more opportunities for things to go wrong.
Throughout the semester we aggregated data, extracted metrics, and built visualizations and systems to quantify bus performance. Using these tools, it is easy to see when and where buses perform poorly and when they perform well. While our final result is far from a system that LTP could use operationally, it offers a promising start.
In the future, optimizing our dashboard and removing outliers more systematically will be necessary for a fully functioning, accurate dashboard. We extracted enough 'vitals' to predict on-time performance, so implementing such predictions to help both drivers and riders would be an exciting next step.