PRACTICAL DESIGN INNOVATION
Enterprise Generative AI — The problem of authoritative data
Problems and their remediations to make sure your enterprise Gen AI product is useful and appropriate.
Man on a wave of data trying to decide what to add to it. — Author, Midjourney
Designers and other non-technical roles should be aware of Enterprise Generative AI applications’ unique data considerations to help ensure that projects are trustworthy and usable.
What do we need to know?
LLMs’ responses are based on the popularity of token relationships. The more times something appears in the training data, the more likely the system is to use it when generating a response. On the other hand, there is generally only one authoritative version of an organization’s operating processes, the language it uses to talk about itself and its products, and the rules and regulations it must follow. That version of the data can and should be treated as authoritative.
The problem in an enterprise system comes from authoritative information being swamped or contaminated.
Let’s examine what contamination looks like. In an enterprise system, we generally care about three kinds of data:
Base LLM model data (all points) — the data that was used to create the initial model.
The enterprise’s data (the non-blue dots) — all of the data the enterprise has access to.
Authoritative data (the darker red, yellow, and green shapes) — the subset of enterprise data that actually represents the enterprise’s current point of view and processes.
Plot showing the distribution of data about subjects. — Author
Two reasons for hallucinations are sparseness and contamination. Sparseness means there is too little information about the subject to generate a meaningful response; in the diagram above, there are not enough rounded-square items related to the subject matter to produce an appropriate answer. Contamination comes from additional material in the system that is not authoritative; the blue and lighter-colored shapes are not authoritative, but they may still be sampled, swamping or distorting the response.
Let us say that the user wants to request information about the company’s financial state. For this example, the green rounded squares represent the company’s financial information, the darker green rounded squares represent this month’s information, and the nearby blue dots represent financial information about the market and competitors.
Problems and Remediations
Problem — Nonauthoritative versions can swamp authoritative information. The system draws on any sources that match the concepts requested, possibly including information in the base model.
Remediation — The system must be able to recognize which queries require an authoritative response. For those queries, it needs to know which sources are authoritative and stick to them; in our example, the dark green sources.
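For readers who want to see how this might look in practice, here is a minimal Python sketch of the idea. The `authoritative` metadata flag, the `select_sources` helper, and the list of sensitive topics are all hypothetical names, not part of any particular library; the point is simply that retrieval results get filtered before they reach the prompt.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    source: str
    authoritative: bool  # hypothetical flag set by your editorial process

# Hypothetical list of topics that must only be answered from authoritative sources.
AUTHORITATIVE_TOPICS = {"finance", "hr", "legal"}

def select_sources(topic: str, retrieved: list[Document]) -> list[Document]:
    """Keep only authoritative documents for topics that require them."""
    if topic in AUTHORITATIVE_TOPICS:
        return [doc for doc in retrieved if doc.authoritative]
    return retrieved

docs = [
    Document("Board-approved Q3 summary", "finance/q3.pdf", authoritative=True),
    Document("Analyst blog post about our sector", "web/blog.html", authoritative=False),
]
# Pass the filtered list, not the raw retrieval results, to the prompt builder.
print([d.source for d in select_sources("finance", docs)])  # -> ['finance/q3.pdf']
```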
Problem — There isn’t enough data about the subject (dark green rounded squares) in the LLM. If few documents on the subject exist and none have been declared authoritative, the LLM may look beyond the company’s own material on the subject to gather enough information to fulfill the request. This may not be obvious to the user, and it is harder to validate than answers about external matters, which can easily be corroborated with a search.
Remediation — Add authoritative documents at the modeling stage, or augment what is available via RAG (retrieval-augmented generation) at request time to ensure that the subject is covered. If neither of these is possible, you may need to fine-tune the model not to give a response on the subject.
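One simple safeguard is to check coverage before answering at all. The sketch below assumes a hypothetical coverage threshold and a hypothetical `authoritative` flag on retrieved documents; if too few authoritative documents come back, the application declines rather than letting the model improvise.

```python
MIN_AUTHORITATIVE_DOCS = 3  # hypothetical threshold; tune for your corpus

def build_context(retrieved: list[dict]) -> str | None:
    """Return prompt context from authoritative documents, or None if coverage is too thin."""
    authoritative = [d for d in retrieved if d.get("authoritative")]
    if len(authoritative) < MIN_AUTHORITATIVE_DOCS:
        return None  # signal the application to decline instead of guessing
    return "\n\n".join(d["text"] for d in authoritative)

retrieved = [
    {"text": "Board-approved Q3 summary.", "authoritative": True},
    {"text": "Analyst blog post about our sector.", "authoritative": False},
]
if build_context(retrieved) is None:
    print("I don't have enough authoritative company information to answer that.")
```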
Problem — There are old versions of documents (light green rounded squares) across the system. It is important that the system return authoritative responses. Especially in an enterprise, ideation on projects and strategy, marketing messages, financial data, and other goal-directed or legally binding communications should reflect the participants’ current thinking. Early versions of the documents involved in creating these projects will contaminate the output of an LLM, producing spurious results.
This is especially true because there are likely many versions of the early deliverables. Since the LLM relies on a probabilistic method of gathering information, those early versions will provide more samples.
While there may be rare occasions when a person wants a gestalt view that includes the thoughts and ideas generated on the way to an authorized point of view, that should be an optional path the user explicitly asks for.
Remediation — Either an automated system needs to designate the most recently updated documents as authoritative, or there needs to be a manual capability that lets users mark one document as authoritative and the others as not. The system may also need a way to find versions of documents so an editorial authority can check whether they should be considered in some fashion.
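The automated path can be as simple as keeping only the newest version of each deliverable. This is a minimal sketch under that assumption; the `family` field that groups versions of the same document is hypothetical, and a real pipeline would get it from your content management system.

```python
from datetime import date

# Hypothetical document records; `family` groups versions of the same deliverable.
docs = [
    {"family": "q3-report", "version": 1, "updated": date(2024, 7, 1), "text": "Early draft"},
    {"family": "q3-report", "version": 4, "updated": date(2024, 9, 30), "text": "Approved final"},
    {"family": "brand-guide", "version": 2, "updated": date(2024, 5, 12), "text": "Current guide"},
]

def mark_latest_as_authoritative(docs: list[dict]) -> list[dict]:
    """Flag only the most recently updated document in each family as authoritative."""
    latest = {}
    for doc in docs:
        fam = doc["family"]
        if fam not in latest or doc["updated"] > latest[fam]["updated"]:
            latest[fam] = doc
    for doc in docs:
        doc["authoritative"] = doc is latest[doc["family"]]
    return docs

for doc in mark_latest_as_authoritative(docs):
    print(doc["family"], doc["version"], doc["authoritative"])
```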
Problem — There are multiple copies of documents across the system (light green rounded squares) because people have copied them. In addition to out-of-date documents, information varies for other reasons: different versions for different audiences, different versions from different contributors, or documents simply in various states of completion. The problem remains the same: these other sources contaminate the authoritative message.
Remediation — Similar to different versions, an editorial authority must decide what should be considered authoritative.
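Editorial judgment cannot be automated away, but exact copies can at least be removed before indexing so they do not inflate sampling. A small sketch, assuming plain text documents and simple content hashing; near-duplicates, such as audience-specific variants, still need a human decision.

```python
import hashlib

def dedupe_for_indexing(texts: list[str]) -> list[str]:
    """Drop byte-for-byte copies so duplicates cannot skew what the system samples."""
    seen = set()
    unique = []
    for text in texts:
        digest = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique

samples = ["Our mission statement.", "our mission statement.", "A different document."]
print(len(dedupe_for_indexing(samples)))  # -> 2
```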
Problem — The LLM has not been trained to differentiate the context of questions to determine the appropriate tone. Some subjects, like finances or human resources, require specifically phrased responses for public consumption. You need to decide whether the system should answer questions whose responses may end up in public or legal conversations.
Remediation — Fine-tune the system so it does not respond to requests requiring styles or tones that it does not support. There may also need to be logic that tests the request’s origin and the likely forum in which the response will be used.
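That origin check can live outside the model entirely. Here is a minimal sketch of one way to express it as policy; the channel names, topics, and refusal message are all hypothetical.

```python
# Hypothetical policy: which request origins may receive answers on which topics.
ALLOWED = {
    "internal-portal": {"finance", "hr", "product", "general"},
    "public-chatbot": {"product", "general"},
}

def may_answer(origin: str, topic: str) -> bool:
    """Return True only if this origin is allowed to receive answers on this topic."""
    return topic in ALLOWED.get(origin, set())

if not may_answer("public-chatbot", "finance"):
    print("I can't discuss company financials in this channel. Please contact investor relations.")
```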
Problem — None of the source information is written in the required output tone. For example, in our financial information request, the response needs to match the company’s tone for reporting. The authoritative sources may not have the correct tone for a public response or a response to the media.
Remediation — First, the system must recognize that a request requires a specific tone. Once that has been determined, you may need to fine-tune the model to understand the circumstances in which a specific tone is required, and use fine-tuning or prompt engineering to tell the system what that tone should be.
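The prompt-engineering half of that remediation can be as simple as prepending an approved tone guide to the prompt. A small sketch, assuming hypothetical tone guides keyed by audience and a hypothetical `build_prompt` helper; the guide text itself would come from your communications or legal team.

```python
# Hypothetical tone guides maintained by communications/legal, keyed by audience.
TONE_GUIDES = {
    "media": "Respond in the company's formal external-communications voice. "
             "Avoid forward-looking statements and speculation.",
    "internal": "Respond conversationally, as a colleague summarizing for another employee.",
}

def build_prompt(audience: str, question: str, context: str) -> str:
    """Prepend the required tone instructions to the prompt before calling the model."""
    tone = TONE_GUIDES.get(audience, TONE_GUIDES["internal"])
    return f"{tone}\n\nUse only the following sources:\n{context}\n\nQuestion: {question}"

print(build_prompt("media", "How did Q3 go?", "Q3 revenue grew 4% year over year."))
```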
We must be intentional about the data available to enterprise Gen AI applications in order to make them usable and trustworthy.
Keep these authoritative data difficulties in mind when designing your enterprise Gen AI systems.
The next article in the series will cover a framework that uses these considerations to help teams identify the most valuable capabilities where Gen AI can help users achieve their goals while protecting the company’s goals and reputation.
Workshops to imagine a bold future
Generative AI UX Design Patterns
Generative AI UX Design Adjectives