Washington Post develops AI reader tool with Virginia Tech

One of the world’s best-known news outlets is developing AI-powered tools in hopes of improving reader experiences.

The Washington Post is partnering with Virginia Tech’s Sanghani Center for Artificial Intelligence and Data Analytics to develop the new tech. It’s a generative AI project that lets readers get answers to questions, drawing on the Post’s previous coverage. The plan is for the tool to understand the intent behind a user’s question, rather than relying only on keywords as some other AI platforms do.

Project development will be based at Virginia Tech’s Innovation Campus in Alexandria, but the physical space isn’t set to open until spring 2025. For now, students and professors are working out of facilities in Arlington and Falls Church.

The partnership stemmed from the Post’s desire to be a leader in the new ways people are finding and consuming information, Sam Han told Technical.ly.

Han is head of data and AI at the paper. He’s been at the Post for about seven years, and in his current role for the past three.

“People are getting used to asking questions, [getting] answers directly, instead of them reading and understanding,” Han said. “That’s the trend we are observing. And we want to be in that transformation — or, in a way, revolution — to lead as a media technology company. We want to prepare ourselves technically so that we can provide the best media experience to readers.”

The tech will consider implicit assumptions and context. Han gave the example of someone asking who won the Super Bowl: Usually, they are asking about the most recent championship, not past years.

For asks like these, among others, a technique called retrieval-augmented generation (RAG) will be used to provide responses that are more likely to actually answer someone’s question. Using RAG lets a generative AI system access new information beyond its initial training data — in this case, the paper’s up-to-date coverage, Han explained.
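To make the idea concrete, here is a minimal sketch of the retrieval half of RAG. It is not the Post’s implementation, which has not been published. The three archive entries, the function names (`retrieve`, `build_prompt`) and the word-overlap scoring are all assumptions chosen for illustration; a production system would more likely use learned embeddings and a vector index. The sketch ranks archived stories by similarity to the question, breaks ties in favor of the newest story (mirroring the Super Bowl example), and assembles a prompt that a generative model could then answer from.

```python
import math
import re
from collections import Counter

# Hypothetical mini-archive standing in for a newsroom's past coverage.
# Dates and wording are illustrative, not real article records.
ARCHIVE = [
    {"date": "2022-02-14", "text": "The Rams won the Super Bowl."},
    {"date": "2023-02-13", "text": "The Chiefs won the Super Bowl."},
    {"date": "2023-05-24", "text": "Virginia Tech is building its Innovation Campus in Alexandria."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, archive, k=1):
    """Return up to k stories ranked by similarity, newest first on ties."""
    query = Counter(tokenize(question))
    scored = []
    for doc in archive:
        score = round(cosine(query, Counter(tokenize(doc["text"]))), 6)
        if score > 0:
            scored.append((score, doc["date"], doc))
    # ISO dates sort chronologically as strings, so the later date wins a tie.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [doc for _, _, doc in scored[:k]]


def build_prompt(question, passages):
    """Combine retrieved coverage and the question into one model prompt."""
    context = "\n".join(f"[{p['date']}] {p['text']}" for p in passages)
    return (
        "Answer using only the coverage below.\n\n"
        f"Coverage:\n{context}\n\n"
        f"Question: {question}"
    )


question = "Who won the Super Bowl?"
prompt = build_prompt(question, retrieve(question, ARCHIVE))
print(prompt)
```

The generation step is left out on purpose: the resulting prompt would be passed to whatever language model the system uses, so the answer is grounded in retrieved coverage rather than only in the model’s training data.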

The Post will also employ multimodal large language model (LLM) technology, meaning the AI tool won’t just pull from text, but also be able to integrate information found in audio or video reporting products.

The New York Times is suing OpenAI and Microsoft for copyright infringement, claiming that millions of its articles were used to build the companies’ AI models. In August 2023, the Times blocked OpenAI from scraping its content to train models. BBC, CNN and Reuters followed.

In May 2023, Fred Ryan, the previous CEO and publisher at the Post, announced in a press release that AI was a “priority opportunity.” At the same time, the Post established an AI Task Force and an AI Hub, the latter led by Han.

At the moment, there is no specific timeline for when readers can expect to see the feature, Han said. Two PhD students have started a yearlong research and development effort to build the tool’s search abilities, with three Virginia Tech faculty members supervising.

The partnership will provide “one-of-a-kind educational experiences for our students,” according to Naren Ramakrishnan, director of the Sanghani Center, since it provides an opportunity to work on a real-world project with exacting demands.

It’ll also allow the Post to stay on top of the latest AI trends.

“The goal is to build up technology assets for us in this new world,” Han said, “where large language model AI plays a critical role of providing conversational information consumption.”

Tags: AI / Media / Technology
