The Value of Natural Language Processing in Standards Development

By Chris Harding, CEO of Lacibus

Until very recently, conversing with AI was an activity left to engineers and technical specialists. But with the increased availability and buzz around generative AI such as ChatGPT, more people and industries are working with language-based AI and discovering the benefits of natural language processing (NLP).

NLP, a branch of AI, is a method of analyzing and classifying human speech or text that enables a computer to understand human communication. NLP technology is already used for a number of functions, including spam detection, social media sentiment analysis, text summarization, and machine translation, where it produces more accurate translations between languages by capturing meaning and tone more effectively.
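As a simple illustration of one of these tasks, the short Python sketch below runs sentiment analysis over a couple of social media posts. The Hugging Face transformers library and its default model are illustrative choices, not tools discussed in this article.

# Minimal sentiment-analysis sketch using the Hugging Face transformers
# library (an illustrative choice; any NLP toolkit could play this role).
from transformers import pipeline

# Load a default pre-trained sentiment model (weights download on first use).
classifier = pipeline("sentiment-analysis")

posts = [
    "The new release is fantastic - setup took five minutes.",
    "Support never replied and the product keeps crashing.",
]

# Each result pairs a POSITIVE/NEGATIVE label with a confidence score.
for post, result in zip(posts, classifier(posts)):
    print(f"{result['label']:8} {result['score']:.2f}  {post}")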

But the most widespread application of NLP comes in the form of virtual agents and chatbots, which use speech and text recognition to identify patterns in what users say or type and respond with an appropriate action or helpful advice. This technology is already powering Facebook Messenger, the Google search platform, Amazon recommendations, and Apple’s Siri, but its latest and most talked-about application is ChatGPT.

NLP in ChatGPT

ChatGPT packages OpenAI’s state-of-the-art NLP into a free-to-use chatbot that has drawn considerable attention from the public for its potential impact on content creation, customer service, coding, and more. Beyond its widespread accessibility, ChatGPT has attracted attention because it can respond to questions on any given topic, rather than focusing on a single subject or offering a few pre-defined prompts. This is due to the sheer size of the data set that OpenAI used to train the ChatGPT language model, which puts it streets ahead of existing sales and support chatbots.

Despite its potential for disruptive change, the NLP that underlies ChatGPT has limitations. Its capabilities are constantly evolving, but NLP models still often misunderstand sarcasm, idioms, and the finer nuances of human language, which can lead to errors and irrelevancies in their output.

Depending on its training data, an NLP model can also produce outdated, biased, or inaccurate output, even making prejudicial claims. And this doesn’t begin to cover the potential data privacy issues and intellectual property infringement that can arise from ChatGPT: much of the data used to train NLP models is scraped from the internet and can infringe on the privacy of individuals or organizations. To get the most value from NLP while minimizing the risk of contextual, ethical, or privacy issues, businesses need to harness collaborative intelligence.

The power of collaborative intelligence

Rather than using AI and NLP to replace people and business processes entirely, collaborative intelligence means creating a well-organized partnership between humans and machines. Machines bring a greater body of knowledge at far higher speed than humans can manage, while humans add a moral and ethical dimension and an element of “common sense”; together they develop better ideas.

According to Harvard Business Review research involving 1,500 companies, businesses achieve the most significant performance improvements when humans and machines work together, combining the leadership, teamwork, creativity, and social skills of humans with the speed, scalability, and quantitative capabilities of machines. With humans playing a key role in training, maintaining, and sustaining AI, organizations can embrace collaborative intelligence to achieve the best of both worlds.

Using NLP for standards development

NLP is unlocking new business value for organizations in a variety of industries, one of which is standards development. Standards enable businesses and people to work together more efficiently by creating a common foundation for communication. Over time, the scope of these standards has expanded, with standards like the UNIX® standard helping software developers collaborate by defining operating systems and the TOGAF® standard helping enterprise architects collaborate by describing how they develop architecture.

The concept of Standards as Code is emerging, whereby a standard may consist of executable code, provided that the code is subject to consensus-led change control. This unlocks the potential for computer software to act as a practical standard. Such software could include NLP-driven programs like ChatGPT, with the standards consisting of their language models. Standards as Language Models would represent a huge, industry-disrupting advance in the standards world.
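As a rough sketch of what Standards as Code could look like in practice, a clause of a standard might be published as an executable conformance check, versioned under the same consensus-led change control as normative text. The example below is purely hypothetical and is not drawn from any Open Group standard.

# Hypothetical "Standards as Code" sketch: one clause of an imagined standard
# expressed as an executable conformance check. The clause, field names, and
# version number are invented for illustration.

STANDARD_VERSION = "1.0"  # changed only through the consensus process

# The imagined clause: conforming data records must carry these fields.
REQUIRED_FIELDS = {"id", "timestamp", "source"}

def conforms(record: dict) -> bool:
    """Return True if a data record satisfies this clause of the standard."""
    return REQUIRED_FIELDS.issubset(record.keys())

if __name__ == "__main__":
    sample = {"id": "42", "timestamp": "2023-05-01T12:00:00Z", "source": "crm"}
    print(f"Conforms to version {STANDARD_VERSION}: {conforms(sample)}")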

In addition to providing a way of representing standards, NLP can support their development and help people use and implement them. The Open Group Data Integration Work Group is exploring these possibilities through work on its forthcoming Guide to Data Integration using The Open Group Standards. To establish a shared basis, it is researching use cases and current trends in data integration, reviewing the corpus of The Open Group standards to identify relevant clauses.

The Data Integration Work Group uses a prototype Ideas Browser to analyze sets of webpages, enabling users to browse topics and ideas without needing to read the pages in full. Summaries are then created by a large language model such as the one used by ChatGPT and, while this won’t replace professionals, it will allow them to review much more material, make faster decisions, and ultimately produce better work. By using NLP to empower collaborative intelligence, organizations can get the most value from AI technology while supporting and upskilling their employees.
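To make the idea concrete, here is a minimal Python sketch of the kind of LLM-driven summarization described above. It is not the Ideas Browser’s actual implementation; the OpenAI client, model name, and prompt are assumptions made for illustration, and the code requires an API key.

# Minimal summarization sketch (not the Ideas Browser implementation).
# Assumes the openai Python package (version 1.x) and an OPENAI_API_KEY
# environment variable; the model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

def summarize(page_text: str, max_words: int = 100) -> str:
    """Ask a chat model for a short summary of one webpage's text."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; any chat-capable model would do
        messages=[
            {"role": "system", "content": "You summarize webpages concisely."},
            {"role": "user", "content": f"Summarize in at most {max_words} words:\n\n{page_text}"},
        ],
    )
    return response.choices[0].message.content

In a workflow like the one described above, a reviewer would skim many such summaries to decide which source pages deserve a full read.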