Why the City of Philadelphia is making GitHub part of its open gov outreach

The City of Philadelphia has embraced GitHub as a platform to share its own data releases. Here's how Chief Data Officer Mark Headd is developing it as a part of a workflow.

Hiring Mark Headd as the first chief data officer of the City of Philadelphia was supposed to be a pathway to modernizing the open data efforts of one of the region’s largest IT teams. For technologists, one of the more memorable steps along that path has been the local government becoming the most active U.S. city on GitHub, the popular software sharing and collaboration platform that is a mainstay for today’s open source developers.

Many of the City’s public GitHub repositories contain releases of data from several different City departments, addressing diverse topics including real estate tax under the Actual Value Initiative (AVI), polling locations and police complaints.

The open data initiative’s primary audience is Philadelphia’s developer and civic hacking communities. The city shares data and, in return, the general public gets access to the third-party applications, visualizations and other uses those communities build with it.

There’s no precise way of determining whether visitors are dreaming up software or whether they’re just trying to obtain information. But GitHub recently introduced traffic analytics for tracking a repository’s referring sites and its popular content.

“We want to know who’s using our data and how they’re using it,” Headd said, “because it helps us make the data better and helps us make the case that more data should be released. But at the same time we want to be mindful that we don’t want to appear to be too Big Brother-ish.”

That very line, of what will become normal workflow for local governments, is still being shaped. City CTO Adel Ebeid, who oversees Headd’s team, is weighing which corners of today’s dev community are worth building into the city’s process and which are less likely to last.

Headd says one of those corners worth building on is a popular resource for broadcasting and improving city IT releases: GitHub, which many use to host and share their Git repositories. Git itself is a distributed version control system.

GitHub users, primarily developers but in this case city agencies and IT leads, use Git to push their code to a public repository. From there, members of the public and other developers can discuss the code, review differences between versions, file bug reports or make their own copy of the repository.
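To give a rough sense of what reviewing differences between versions looks like for a data release, here is a minimal Python sketch. It uses the standard library's difflib as a stand-in for Git's own diff, and the file name and polling place rows are hypothetical, not taken from any City repository.

```python
import difflib


def diff_releases(old_text, new_text, name="polling_places.csv"):
    """Return a unified diff between two versions of a text file.

    This imitates the style of diff shown on GitHub; it is not Git itself.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
        )
    )


# Hypothetical before-and-after versions of a small CSV data release.
old = "ward,division,location\n1,1,Example Rec Center\n1,2,Example School\n"
new = "ward,division,location\n1,1,Example Rec Center\n1,2,Example Library\n"

result = diff_releases(old, new)
print(result)
```

Lines prefixed with a minus sign were removed and lines prefixed with a plus sign were added, which is the same convention a reader sees when comparing two commits.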

One feature has proven particularly enticing for Headd’s open data team.

GitHub renders certain filetypes according to the format that suits them best. CSV files display as tables, GeoJSON files display as interactive maps and STL files render as 3D models. The City collects and releases large amounts of geospatial data on subjects like bicycle thefts, city-owned buildings, farmers markets and police districts.
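For readers unfamiliar with GeoJSON, the following Python sketch shows roughly what such a file contains. The market names, days and coordinates are made up for illustration, and the function name markets_to_geojson is hypothetical; the City's actual datasets may carry different fields.

```python
import json


def markets_to_geojson(markets):
    """Convert a list of market records into a GeoJSON FeatureCollection."""
    features = []
    for market in markets:
        features.append(
            {
                "type": "Feature",
                # GeoJSON positions are ordered [longitude, latitude].
                "geometry": {
                    "type": "Point",
                    "coordinates": [market["lon"], market["lat"]],
                },
                "properties": {"name": market["name"], "day": market["day"]},
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Hypothetical records; not real farmers market locations.
sample_markets = [
    {"name": "Example Market A", "day": "Saturday", "lat": 39.95, "lon": -75.16},
    {"name": "Example Market B", "day": "Wednesday", "lat": 39.97, "lon": -75.18},
]

geojson_text = json.dumps(markets_to_geojson(sample_markets), indent=2)
print(geojson_text)
```

Saved with a .geojson extension and pushed to a repository, a file like this is the kind of input GitHub displays as a map, with each point's properties available for inspection.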

These maps can also be embedded in other websites.

“I hope we are able to leverage that embeddability that GitHub provides so that we can more easily share our data,” Headd said. “We can bring people back to one canonical source for that data [instead of] seeing different versions of it in different places.”

GitHub staff member Yaroslav Shirokov demos GeoJSON embedding using Philadelphia’s farmers market location dataset. GIF by Yaroslav Shirokov of GitHub.

Naturally, the city government is a “canonical” and centralized source of information, but decentralization is a core value of the open data team and it’s ingrained in the nature of Git, the free software that powers the entire process.

Headd notes that ideally, the city departments themselves should publish and maintain their data releases.

“We’ll create a repo for them and give them the rights to it and they can push this data whenever they believe it’s appropriate,” Headd said. “So the data may change at different intervals depending on what it is. We want to empower [staff members] to actually do this. We don’t want this all to be centralized – we want it to be as decentralized as we can make it, while still making it useful and efficient.”

But getting the data released at all can sometimes be an issue in itself. Moving the city’s data to GitHub “required some formal documentation and training,” Headd said. “It’s something we’re moving up to a little bit more slowly because [many staff members are] not as familiar with the tool, it’s not a part of their everyday workflow, and they may have managed their data differently in the past.”

A comparison of the same repository’s commit (change) log in Git’s command-line interface (left) and in Git Tower, a graphical user interface for Mac OS X (right).

While Git was originally just a tool for users comfortable with command line interfaces, its popularity spawned many graphical user interfaces (GUIs), making Git even more accessible to the general public. Without GUIs, Git would likely not be a viable option for the open data group.

“Since many city users are on Windows,” Headd said, “we’ve demoed a workflow internally that uses the GitHub for Windows client. But we don’t mandate a particular client. If a particular user is more comfortable on Linux or Mac they can use another client, or just rock the command line.”

Technically Media