
City data analysis: Just one example of how LLMs and AI can help efficiency

"Leveraging public datasets on top of LLMs is a powerful way to help workers get answers to questions at scale," writes DataGovs leader Gregory Johnson.

An Open Data Govs platform examination of San Francisco. (Courtesy image)

How can the mainstream adoption of large language models (LLMs) and other generative AI tools help workers be more productive and efficient in their work?

Data analysis is one area where prompts can let non-technical professionals work with data directly, without assistance from developers or data practitioners. This opens data analysis to workers who lack skills in SQL or other programming languages.


Leveraging public datasets on top of LLMs is a powerful way to help workers get answers to questions at scale. For example, HR databases typically contain information on salaries, which is often available in open data published by governments. Finding this data, however, can be time-consuming and challenging, requiring searching through PDFs, spreadsheets or data portals. By using natural language prompts powered by the latest LLMs, workers can get answers to their queries quickly and easily without the need for technical expertise.
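As a minimal sketch of how such a setup might work, the helper below packs a table schema and a worker's question into a single prompt that could be handed to whatever LLM API is in use. The table and column names here are hypothetical, not taken from any specific government dataset:

```python
# Sketch: composing a schema-aware prompt so an LLM can answer a
# salary question with SQL. The table/column names are illustrative.

def build_sql_prompt(question: str, table: str, columns: list[str]) -> str:
    """Give the model the table schema plus the question, and ask for SQL only."""
    schema = f"Table {table} has columns: {', '.join(columns)}."
    return (
        f"{schema}\n"
        f"Write a single SQL query that answers: {question}\n"
        "Return only the SQL."
    )

prompt = build_sql_prompt(
    "What is the median salary for city analysts?",
    table="employee_compensation",
    columns=["job_title", "salary", "department", "year"],
)
```

The LLM's SQL response would then be run against the open dataset, so the worker never writes a query by hand.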


For instance, a natural language prompt can be used to extract information from government datasets, such as the City of San Francisco’s open data portal. By asking a question like “Where are all the poop complaints located in SF?” through a prompt, the LLM interface can provide a table, an SQL query, and a map showing the locations of these complaints. This information can help workers make informed decisions and recommendations based on data insights, such as identifying areas in SF where more public restrooms should be set up based on the complaint data.
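Under the hood, a query like this could resolve to a call against the Socrata SODA API that serves data.sfgov.org. The sketch below builds such a request URL; the dataset ID and column names are assumptions for illustration, so check the portal for the current schema before relying on them:

```python
from urllib.parse import urlencode

# Sketch: the kind of filtered query an LLM interface might emit for
# SF 311 complaint data. Dataset ID and columns are assumed, not verified.
BASE = "https://data.sfgov.org/resource/vw6y-z8j6.json"  # 311 cases (assumed ID)

def complaints_query(category: str, limit: int = 100) -> str:
    """Build a Socrata SODA API URL filtering 311 cases by service category."""
    params = {
        "$select": "service_name, address, lat, long",
        "$where": f"service_name like '%{category}%'",
        "$limit": limit,
    }
    return f"{BASE}?{urlencode(params)}"

url = complaints_query("Human or Animal Waste")
```

The returned latitude/longitude pairs are what would feed the map view described above.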


The adoption of generative AI and LLMs in the workforce will undoubtedly create many tools that improve workflows and save time for workers. However, this approach also poses its own set of challenges, such as ensuring that the data used is reliable, high-quality and free of errors. Having a common data model and an adequate amount of training data is also essential for effective data analysis. Nonetheless, the availability of over 250,000 datasets on the U.S. government's data portal provides an excellent starting point for deploying internal LLM apps or testing models with data before using private data.
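For teams that want to start from that federal catalog, one hedged sketch is to search it programmatically via the CKAN API that backs catalog.data.gov, then pick candidate datasets to test against before touching private data:

```python
from urllib.parse import urlencode

# Sketch: searching the U.S. government's data catalog for candidate
# datasets using the CKAN package_search endpoint behind catalog.data.gov.
CKAN = "https://catalog.data.gov/api/3/action/package_search"

def dataset_search_url(keywords: str, rows: int = 5) -> str:
    """Build a catalog search URL returning up to `rows` matching datasets."""
    return f"{CKAN}?{urlencode({'q': keywords, 'rows': rows})}"

url = dataset_search_url("employee salaries")
```

The JSON response lists matching datasets with their resource links, which can then be pulled into the same prompt-driven workflow described above.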

This is a guest post by Gregory Johnson, the co-creator of DataGovs, an organization using automation to streamline data governance tasks. A version was first published on Johnson's website and is republished here with permission.
