Why Pittsburgh's Innovation and Performance team takes an open-source approach to open data - Technical.ly

Aug. 1, 2018 10:31 am

Senior Digital Services Analyst Tara Matthews on the city department's commitment to democratizing the outcomes of its work.

Pittsburgh's City-County Building.

(Photo via Wikimedia Commons)

This is a guest post by Tara Matthews, the senior digital services analyst at the City of Pittsburgh's Department of Innovation and Performance.
Picture it: Pittsburgh, Pennsylvania. October 2015.

It was the birth of what would be named the Western Pennsylvania Regional Data Center (also known as the WPRDC, also known as “Whopper Duck”), an all-star collaboration between the City of Pittsburgh, Allegheny County and the University of Pittsburgh.

This put us in the unique position of hosting not just city and county data, but data from non-governmental organizations such as the Carnegie Library and Bike PGH, as well as other local service providers such as the Port Authority of Allegheny County. This required a specialized setup, which is why WPRDC is built on CKAN, an open-source data management system that allowed for a completely custom configuration.

The Data Center launch coincided with the kickoff of the city’s Open Data program, managed by the city’s Department of Innovation and Performance. When the Open Data program was in its infancy, I was an intern tasked with crowdsourcing ideas for our first open datasets. We asked a simple question: “What data, owned by the City of Pittsburgh, would you most like to see?” This project, known as the Open Data Forum, received dozens of suggestions, hundreds of votes and helped influence our data priorities moving forward.

Three years later, I was handed the keys to the Open Data program — which has grown significantly since launch. One thing that hasn’t changed, however, is our commitment to a customized, DIY approach using open-source tools.

From the beginning, it was important that executive management gave us the freedom to develop our own tools and build exactly what we wanted. This allowed us to focus our initial efforts on an effective ETL (Extract, Transform and Load) system, so that we could easily host a variety of datasets from all kinds of systems, from complex databases to simple spreadsheets.
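
For readers who have not met the term, the three ETL stages can be sketched in a few lines of Python. This is only an illustration and not the city's actual pipeline: the column names (`id`, `request_type`, `neighborhood`), the `requests` table and the sample rows are all made up for the example, and an in-memory SQLite database stands in for a real data portal.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a spreadsheet export.
SAMPLE_CSV = """id,request_type,neighborhood
1, potholes ,Bloomfield
2,street light,Shadyside
,missing id,Oakland
"""


def extract(csv_text):
    """Extract: read rows from a spreadsheet exported as CSV text."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: trim whitespace, tidy labels, skip rows without a numeric id."""
    cleaned = []
    for row in rows:
        request_id = (row.get("id") or "").strip()
        if not request_id.isdigit():
            continue  # no usable identifier, so the row is dropped
        cleaned.append({
            "id": int(request_id),
            "request_type": (row.get("request_type") or "").strip().title(),
            "neighborhood": (row.get("neighborhood") or "").strip(),
        })
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests "
        "(id INTEGER PRIMARY KEY, request_type TEXT, neighborhood TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO requests "
        "VALUES (:id, :request_type, :neighborhood)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM requests").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    count = load(transform(extract(SAMPLE_CSV)), connection)
    print(f"Loaded {count} rows")
```

Keeping the three stages as separate functions is one way to support many kinds of sources: a database source would only need a different `extract`, while `transform` and `load` stay the same.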

But how useful is published data if no one’s using it?

Our second priority was to democratize our data: to make it available in a way that lets anyone, whether a seasoned power user or a total novice, engage with city data through maps, dashboards and interactive tools. We tried a variety of solutions, but none of them gave us what we needed: the ability to build something tailored exactly to our needs while adapting quickly to feedback from city departments and members of the public.

This led to the creation of Burgh’s Eye View (BEV), our flagship tool for visualizing city operations data. It was developed and maintained by Geoffrey Arnold, who built it entirely in the free program RStudio, using R Shiny.

The initial customers for this tool were city departments — BEV provided a simple way to show their work on an interactive map. It became clear that this would be a useful tool for the public as well. We wanted to know what people actually wanted, so we took our show on the road and visited over 30 neighborhood meetings all over the city to get community feedback. We heard a lot of requests and suggestions, many of which became new features.

The tool has grown from a map of 311 complaints to an entire suite of tools that contain information ranging from incidents of crime to the tax delinquency status of a property to the location of all city-affiliated recycling sites, all with a cute penguin mascot.

While our next major initiative in the pipeline doesn’t face the general public in the same way, it is just as important. It is called Data Rivers. Returning to our priority of making data easily accessible, we are redesigning our entire data delivery process from the ground up, using open-source tools for a streamlined solution that is managed in-house. The plan for Data Rivers is to transform our old data delivery model, which we affectionately built out of duct tape and bubble gum, into a professionalized, elegant system that will make it easier to publish the data that our users want to see.

While Data Rivers is still in the development phase, we want to keep our work as transparent as possible. In the coming months, we will be publishing our Data Publisher Resources, as well as the source code to many of our applications on our GitHub page.

In the meantime, we plan to keep that do-it-yourself attitude as we continue to build our digital services team. If we’re hoping that our residents can take data into their own hands, why can’t we?
