Unearthing city government data buried on tired mainframe computers could prove beneficial in two big ways:
- It would give Baltimore’s residents a better window into how City Hall works, while also providing civic hackers the raw data they need to visualize something complicated, like the property taxes paid by homeowners in 2012.
- It would give Baltimore’s city agencies the information they need to piece together historical patterns—about government spending, for instance—in order to make better choices.
“We want enough data and enough ways to look at it to start doing predictive analytics and start fixing problems before they even occur,” she said. “I’m excited, but I’m also like: Oh God. It’s a huge effort.”
For two years inside the Mayor’s Office of Information Technology, first as a contractor and then as an official city employee, it’s been Hudson’s job to maintain and update the city’s OpenBaltimore data portal, the website where sets of raw data (on the number of vacant properties, on city employees’ salaries and more) are accessible to anyone with an Internet connection.
But now most of her attention will shift to data warehousing, a snoozer of a technical concept, albeit one with major implications. It’ll be Hudson’s job to get all the data, from all the city agencies, into one central repository. Instead of neatly summarized quarterly reports submitted by each office, all the numbers an agency would have to report will be in one database (hence “warehouse”), easy to find, and easy to pull for use by city government and for inclusion on OpenBaltimore.
In the digital age, this is how government can become smarter, leaner and more efficient. Even CitiStat, the performance-based measurement wing of city government—the very thing that tracks the effectiveness of Baltimore’s more than 50 government entities—still “relies heavily on Excel templates,” Hudson said.
And, for what it’s worth, the choice of Hudson hasn’t gone unnoticed by pivotal members of this city’s civic hacking community.
“She understands what the open data movement is all about,” said Shea Frederick, a developer at AOL/Ad.com and the creator of the Baltimore Vacants map. “She has already fought hard on the side of app developers to keep data available and flowing.”
Technically Baltimore spoke with Hudson about her new position.
TB: Is it hard to get raw data from city agencies? Do they see the need to provide numbers in addition to summary reports?
HH: They don’t necessarily get [data]: what’s the point? Why does anyone care? Especially if I ask them for raw data. They want to send us pretty, summarized reports. That’s not what I’m asking for. There’s no agency that really wants to hide anything. But there are lots of parts of the city that are understaffed, and they have things they’re focused on, and it’s not providing me data.
The bigger issue I run into is we have some antiquated systems here. The data’s in them and getting [it] out and into a nice format for OpenBaltimore is difficult, and in some cases, almost impossible.
TB: Antiquated systems—like, computer systems?
HH: Some mainframe systems have a relational database back-end. Ours do not. Our mainframe runs off of a flat file, with variable length. What it requires [for the city] is for a COBOL programmer to run a program that takes the master file out of the mainframe and dump[s] it somewhere for us. It’s a huge file. We have to parse through all that. We’ve done this with water billing.
TB: Sounds complicated.
HH: They’re solid, strong systems that have been working for 30 years.
TB: But still, that’s where the city data is? Inside those mainframes?
HH: I wouldn’t mess with them except we want the data. If I can get the data out of those mainframes and into a relational database, it’ll be a lot less painful to migrate that data into a more modern system.
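For readers curious what "parse through all that" and "into a relational database" might look like in practice, here is a minimal Python sketch. It is not the city's actual process, and nearly everything in it is an assumption made for illustration: the record layout (a 2-byte big-endian length prefix followed by EBCDIC text), the `cp037` code page, the 8-character account field, and the `water_bills` table name are all hypothetical.

```python
import sqlite3
import struct
from typing import Iterator


def iter_records(dump: bytes) -> Iterator[str]:
    """Yield decoded records from a length-prefixed dump (hypothetical layout)."""
    offset = 0
    while offset < len(dump):
        # Assumption: each record starts with a 2-byte big-endian payload length.
        (length,) = struct.unpack_from(">H", dump, offset)
        offset += 2
        payload = dump[offset:offset + length]
        if len(payload) < length:
            raise ValueError(f"truncated record ending at byte {offset + len(payload)}")
        offset += length
        # Assumption: text is EBCDIC; cp037 is one common EBCDIC code page.
        yield payload.decode("cp037")


def load_into_sqlite(dump: bytes, conn: sqlite3.Connection) -> int:
    """Split each record into two illustrative columns and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS water_bills (account_id TEXT, detail TEXT)"
    )
    # Assumption: the first 8 characters are an account ID, the rest is detail.
    rows = [(rec[:8].strip(), rec[8:].strip()) for rec in iter_records(dump)]
    conn.executemany("INSERT INTO water_bills VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def make_record(text: str) -> bytes:
    """Build one sample record in the same hypothetical layout."""
    payload = text.encode("cp037")
    return struct.pack(">H", len(payload)) + payload


# Two made-up records of different lengths stand in for the master file dump.
sample_dump = make_record("ACCT0001 42.50") + make_record("ACCT0002 17.25 LATE NOTICE")
conn = sqlite3.connect(":memory:")
loaded = load_into_sqlite(sample_dump, conn)
```

A real dump of the size Hudson describes would more likely be read from disk in chunks than held in memory at once, and the true field boundaries would come from the COBOL record definitions rather than a fixed 8-character guess.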
TB: You mentioned city agencies don’t see the point of providing raw data. How do you convince them it’s worthwhile to provide you — and, in turn, civic hackers — raw data?
HH: I try to show them examples of apps built on the kind of data I’m asking for in other cities. A great example: the Sheltr project from [Hack for Change]. I went to the Mayor’s Office of Human Services, who has that data for Baltimore. I showed them that Philly had this Sheltr app [for aggregating homeless services data], and they immediately provided me the homeless services data for Baltimore city.
The other thing that helps is when I explain to them that the agencies are users of OpenBaltimore as well. The liquor licenses [dataset] in OpenBaltimore: the police department would occasionally ask for a list of liquor licenses so they could patrol and keep an eye on different bars. They’d have to request it, and the Liquor Board would have to provide it. Now they can go online and get it whenever they want.
TB: You’ve been at the Mayor’s IT office for two years, and weathered the turnover of former CIO Rico Singleton. How do you feel about MOIT now, and the direction it’s headed?
TB: And that includes building up this data warehouse?
HH: [Before] I felt like we were doing it backwards. OpenBaltimore’s great, but we probably should’ve started with data warehousing. I almost kind of gave up on the idea. Chris [Tonjes] said, well, of course we should be doing that. There’s time savings and money savings. It’s a no-brainer.