Building an easily searchable and accessible library of documents is harder than you might think — especially when those files hold highly complex engineering and manufacturing information.
To make that idea a reality, Downtown Pittsburgh-based applied AI company Cognistx partnered with technical learning association SAE International, headquartered in the Pittsburgh suburb Warrendale, to establish what they’ve called the OnQue Digital Standards System. The new platform uses natural language processing and text intelligence to make SAE’s vast library of engineering documents and standards more accessible to workers and companies that need them. So far, Cognistx has processed over 3,000 files from SAE’s library in the AI-powered system.
Why does that matter?
Typically, associations like SAE make their documents available to clients in a narrative format, requiring engineers and technologists to obtain an entire file even when they might only need part of it.
“What we found is that customers and users were utilizing pieces of our data in a much more disaggregated way — not in this sort of regular document, you know, PDF or XML format,” Frank Menchaca, chief growth officer for SAE, told Technical.ly, “but at a very sort of micro level where they were interested in maybe finding particular specifications.”
To better meet that customer need, SAE looked to partner with an existing artificial intelligence company, leading to this new collaboration with Cognistx — a growing firm founded in 2015 by Carnegie Mellon University academics with a focus on applying AI techniques like natural language processing and machine learning to solving business problems like this.
“When this project came I said, ‘this is a great application of our capability,’” Cognistx CEO Sanjay Chopra said. “Because what we can do is take these decades old, standard documents that are used in the aerospace and automotive industry, and then be able to parse them and pick out parts, pick out materials, make it more digital, make it very searchable.”
This system aims to improve engineer awareness and increase accuracy on projects — a particularly important concern for those working on technology like an airplane engine that requires a low chance of failure. When engineers search this system for a part rather than a full document, they can now see other documents and standards associated with that part that they might not have come across before. And, as Cognistx and SAE continue to build out the number of available documents in the system, the companies can identify ones with outdated or inaccurate information.
The partnership is a pioneering move for SAE, as the standards industry has been slow to move toward a more accessible digitization of its document libraries, Menchaca said.
“In order to be able to do the types of things that working with Cognistx has allowed us to do, you really have to pull the camera back away from the document, and see the contents of the document much more as a kind of constellation data,” the exec said. “And standards organizations typically aren’t used to seeing their content that way.” Thinking about its library more as data than a collection of documents is what helped SAE move ahead on this new use of AI.
But even after understanding the need and having a strategy for how to address it, there remains a great challenge with using natural language processing itself, Chopra noted.
“We are going from large amounts of unstructured data and then converting it into searchable structured data, without missing too many things — that’s the hard part of the problem,” he said.
Despite that challenge, both Chopra and Menchaca agreed that Pittsburgh’s growing tech prowess provided the right talent to develop the new database. Menchaca said SAE didn’t necessarily limit its focus to Pittsburgh when searching for a partner — the association has sister organizations in some of the country’s larger tech hubs — yet “it’s interesting that the talent here is on such a level that we were able to do it here locally together.”
And while remote work habits remain the norm for the time being, Chopra added that there’s a benefit to having the companies in close proximity for the day when in-person work returns, from both a cost savings and innovation standpoint.
For Cognistx, a homegrown company from the storied CMU talent pipeline, this partnership will open doors to new ones and continue to bring more tech business to the city. But more importantly, while this new platform of SAE documents will be accessible to engineers everywhere, its Pittsburgh origins demonstrate a growing presence of support-focused businesses for local technologists, helping to make the city a more permanent and established home for tech.
Sophie Burkholder is a 2021-2022 corps member for Report for America, an initiative of The GroundTruth Project that pairs young journalists with local newsrooms. This position is supported by the Heinz Endowments.