Project creator and Women Who Code director Charlotte Lee Jackson tweeted earlier this week that the project, which collects car crash data and compares it to DC Department of Transportation (DDOT) reports, completed a crucial step. It now automatically collects and refreshes data from multiple sources and places it in an easy-to-read dashboard that shows whether each crash appeared in DDOT data.
The database starts by pulling from an app called Pulse Point, which shows what incidents fire and emergency services responded to. The problem there, Jackson said, is that the app doesn’t show what kind of traffic collision occurred. So the project then turns to a site called OpenMHz, created by Luke Berndt, which collects crash report audio and converts it into .WAV files. The Crash Bot code runs the audio files through AWS Transcribe, which makes it possible to distinguish which crashes involved pedestrians or cyclists. On top of that, it pulls information from a few Twitter accounts that report local crashes and sources data from the Citizen app to track down as many incidents as possible and compare what’s been recorded across multiple sources.
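As a rough illustration of the step that separates pedestrian and cyclist crashes from other collisions, here is a minimal Python sketch that labels a transcript by keyword. It assumes the audio has already been transcribed to text. The keyword patterns and the `classify_transcript` function are hypothetical and are not taken from the Crash Bot code, whose actual matching rules are not described here.

```python
import re

# Hypothetical keyword patterns, chosen only for illustration.
PATTERNS = {
    "pedestrian": re.compile(r"\b(pedestrian|person struck)\b"),
    "cyclist": re.compile(r"\b(cyclist|bicyclist|bicycle|bike)\b"),
}


def classify_transcript(text):
    """Return the set of labels whose keywords appear in a dispatch transcript."""
    # Lowercase and collapse whitespace so matching ignores case and line breaks.
    normalized = " ".join(text.lower().split())
    return {label for label, pattern in PATTERNS.items() if pattern.search(normalized)}
```

A transcript that mentions neither group comes back as an empty set, so it can be set aside as a crash that involved only vehicles.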
I'm excited to finally share that we at @CodeforDC have built a pipeline that automatically collects data on 911 calls for ped/bike crashes, and checks whether they appear in @DDOTDC crash data. Crashes that were previously invisible are now tracked here: https://t.co/wRzZiWFLra
— Charlotte Lee | 미국아줌마 (@cljack) July 7, 2021
Jackson noted that open data sources were a crucial part of the project coming together.
“It’s not like I’m scraping encrypted data from a secret source or anything,” Jackson said. “It’s all here.”
Berndt’s OpenMHz in particular was key to getting the data together, since it had the audio files for all calls made. Berndt, who is based in DC, built the system using a software-defined radio, which he said works like a police or fire scanner but can tune into more than one channel, so it can hear what dispatchers are saying to each other. He uses an RTL-SDR device to receive radio signals and an open source program called GNU Radio to build a “software version of any radio receiver out there.” Some C++ code pulls the whole thing together, he added, to provide the .WAV files.
“I put it out there and then I started with just what I was capturing in DC, but the software is open source and people have just started using it and contributing from around the country,” Berndt said.
OpenMHz mostly gets used by media outlets, Berndt said, but it has expanded nationwide. Firefighters use it for trainings, and regular citizens look into what’s going on in their neighborhood.
“[OpenMHz] started off with personal curiosity and I think it’s sort of grown into—there’s lots of people who just want to be more aware of what’s happening in their community and what’s around them, and figuring out what’s happening,” he said.
For Jackson, that meant compiling crash data for more accurate policy decisions. Good Hope Road SE, she said, is an example of a problem point that often gets missed by DDOT data. The Crash Bot project showed that DDOT reported three crashes in the area in the last six weeks, while the project found at least five. She hopes that such firm examples can help residents show urgency when they submit traffic safety requests, or that DDOT will use the project data itself.
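The comparison behind a figure like that can be sketched in a few lines of Python. This is an assumption-laden illustration, not the project's method: the `Crash` record, the `unreported` function, the exact-match rule on location and the two-hour window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Crash:
    location: str   # e.g. a street or block name (hypothetical field)
    when: datetime  # time of the dispatch call or of the official report


def unreported(dispatch, official, window=timedelta(hours=2)):
    """Return dispatch crashes with no official record at the same
    location within `window` of the dispatch time."""
    missing = []
    for crash in dispatch:
        matched = any(
            record.location.lower() == crash.location.lower()
            and abs(record.when - crash.when) <= window
            for record in official
        )
        if not matched:
            missing.append(crash)
    return missing
```

This simple version lets one official record match more than one dispatch call, so it can undercount missing crashes. A stricter comparison would pair records one to one.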
“Ideally, DDOT will start using this data when they are making these infrastructure and policy decisions about what changes to make to our streets. Because it’s very different rates of unreported crashes by ward,” Jackson said. “And the data that DDOT is using right now for East of the [Anacostia] river…you can see exactly how incomplete it is.”
Berndt added that uses like the Code for DC project are an important aspect of OpenMHz that he would hate to lose if he made the site fully encrypted.
“Having that data and information be accessible, and not just as a boring record, but also the actual calls and sort of emotion and context around the things, is really important for just really telling the civic story around what cities and the people who work in them actually are doing on the day-to-day,” Berndt said.