Guest MarcoCardoso
Posted June 12

One thing is true for most AI applications: it's easy to get yourself in trouble if you're not careful. AI is all about probability, and the probability of it being incorrect, or behaving unexpectedly for a new input, is practically never zero. In the classic chatbot days, this often meant getting an answer about something you weren't asking about, or the good old "I did not understand" default answer we all "love" to see when we're having an issue. But with Generative AI, mistakes are much more nuanced, and may take the appearance of plain misinformation or, even worse, harmful content!

In this article, we'll cover some of the guidelines you can adopt to minimize risk in AI applications. Each section is composed of a set of actions you can take, followed by good and bad examples to illustrate their role in keeping your users - and you! - safe from unexpected AI behavior.

[HEADING=2]1. User interface guidelines[/HEADING]

Starting with UI tips - these are simple changes to the way your end-users engage with your AI application that can go a long way in preventing misuse.

Guideline: Include disclaimer text
Description: In order to interact with the AI, end-users should acknowledge the rules and limitations of the tool. A good disclaimer should mention that:
- The information provided may be generated by AI
- The information provided may be incorrect
- The user is responsible for verifying the correctness of information against the sources provided
- Any additional industry-specific disclaimers apply
Reasons: Users expect to see correct information on the platforms you provide them. The concept of a tool that can provide incorrect information is new and needs to be explicitly called out.

Guideline: Visually separate generated and retrieved content into sections
Description: Generated content is the output of the language model, and as such can be incorrect. Retrieved content is directly extracted from trusted sources, and can be expected to be correct, but possibly not relevant. This distinction should be clear to the end user. The generated content can be grounded on retrieved content, but you should always provide an original source the user can read directly. In addition, you may want to refrain from answering a question when no content was retrieved.
Reasons: Once you establish that some content must be verified by the user, you need to define a clear boundary between what information needs verification and what can be trusted without doubt. Providing both pieces of information side by side makes it easy for the user to check the information at a glance, without leaving the app. Having that separation in the application also allows you to override the generated content: even if the AI says something, you can choose not to display it through app logic if there are no sources to support it (see the sketch after these guidelines).

Guideline: Add a feature to report issues and provide feedback
Description: Users should be able to provide feedback whenever they face issues or receive unexpected responses. If you decide to let users include chat history with their feedback, make sure to get confirmation that no personal or sensitive data was shared.
Reasons: Feedback forms provide a simple way for users to tell you if the app is meeting expectations.

Guideline: Establish user accountability
Description: Inform the user that the content they submit may be subject to review when harmful content is detected.
Reasons: Having users be accountable for exploiting the tool may dissuade them from repeatedly attempting to do so.
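To make that override concrete, here is a minimal sketch of display logic that keeps generated and retrieved content in separate sections and suppresses ungrounded answers. The payload shape (generated_answer, retrieved_sources) and the wording are illustrative assumptions, not a prescribed API - adapt them to whatever your retrieval pipeline actually returns.

[CODE]
# Minimal sketch, assuming a hypothetical response payload -- not a
# standard API. Adapt the field names to your own retrieval pipeline.

DISCLAIMER = (
    "Responses are AI-generated and may be incorrect. "
    "Verify important information against the sources provided."
)

def render_response(generated_answer: str, retrieved_sources: list[dict]) -> dict:
    """Build the payload the UI renders, keeping generated and retrieved
    content in clearly separated sections."""
    if not retrieved_sources:
        # Override the model: with no supporting sources, don't display
        # the generated text at all.
        return {
            "answer": "I couldn't find a trusted source for that question. "
                      "Please try rephrasing it.",
            "sources": [],
            "disclaimer": DISCLAIMER,
        }
    return {
        "answer": generated_answer,    # generated section -- may be incorrect
        "sources": retrieved_sources,  # retrieved section -- verbatim, citable
        "disclaimer": DISCLAIMER,
    }
[/CODE]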
Good examples

Let's start with the original ChatGPT interface. Notice all elements are present:

- Disclaimer text at the bottom
- Per-message feedback option
- Clearly distinct Retrieval and Generation sections
- Terms and Conditions - though hidden under the question mark on the bottom right

All these elements are crucial to ensure the user is aware of how things can go wrong, and they set the right expectations for how to use the tool.

Microsoft Copilot for M365 has its disclaimer and all links right below the logo. Straight to the point! Don't worry about writing a huge disclaimer that contains everything - you can link to the full terms and keep a clean UI.

Bad examples

Common mistakes when setting up a UI include:

- Not having the required disclaimers, sources or highlighting
- Overstating the chatbot's usefulness - e.g. "can help with anything about ..."

While some of these safeguards may seem like they are understating the chatbot's usefulness, they are indispensable to setting the right expectations given the inherent limitations of the technology.

[HEADING=2]2. System message guidelines[/HEADING]

Next, we have system message guidelines. These are instructions that are not visible to the user, but guide the chatbot to answer questions with the right focus or style. Keep in mind that these can be partially overridden by user prompts, and as such only prevent accidental or simple misuse.

Guideline: Define a clear scope of what the chatbot should assist with
Description: The assistant should not attempt to help with all requests. Establish a clear boundary as to what conversations it should engage in. For all other topics, it should politely decline to engage.
Reasons: Failing to specify a scope will make the bot behave as a generic utility, like out-of-the-box ChatGPT. Users may take advantage of that fact to misuse the application or API.

Guideline: Do not personify the chatbot
Description: The chatbot should present itself as a tool to help the user navigate content, rather than a person. Behaving as an employee or extension of the company should also be avoided.
Reasons: When users make improper use of a personified chatbot, it may give the impression of manipulation or gullibility, rather than simple misuse.

Good example

"You are a search engine for Contoso Technology. Your role is to assist customers in locating the right information from publicly available sources like the website. Politely decline to engage in conversations about any topic outside of Contoso Technology."

Bad example

"You are Contoso's AI Assistant. You are a highly skilled customer service agent that can help users of the website with all their questions."
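As a sketch of how a scoped system message is wired up - assuming an Azure OpenAI deployment called through the openai Python SDK (v1+); the endpoint, environment variable and deployment name below are placeholders:

[CODE]
import os

from openai import AzureOpenAI

# Placeholder endpoint and deployment name -- substitute your own resource.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

SYSTEM_MESSAGE = (
    "You are a search engine for Contoso Technology. Your role is to assist "
    "customers in locating the right information from publicly available "
    "sources like the website. Politely decline to engage in conversations "
    "about any topic outside of Contoso Technology."
)

def ask(user_question: str) -> str:
    response = client.chat.completions.create(
        model="YOUR-DEPLOYMENT-NAME",  # the deployment name, not the base model
        messages=[
            # Invisible to the user, but scopes every turn of the conversation.
            {"role": "system", "content": SYSTEM_MESSAGE},
            {"role": "user", "content": user_question},
        ],
    )
    return response.choices[0].message.content

print(ask("What phones does Contoso sell?"))
[/CODE]

Remember that this scoping is best-effort: a determined user can still steer the model, which is exactly why the evaluation guidelines below matter.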
[HEADING=2]3. Evaluation guidelines[/HEADING]

Next, we have evaluation guidelines. These tools will help you quantitatively measure the correctness of responses - and the possibility of manipulating the app into generating harmful content.

Guideline: Evaluate the chatbot's accuracy, and other metrics for quality of information
Description: Define a set of "critical" questions your chatbot should be able to answer reliably. Regularly submit this dataset for inference and evaluate its accuracy, either manually or automatically. Prefer a combination of manual and automatic validations to ensure the best results.
Reasons: As chatbots evolve to meet your customers' expectations, it's common to lose track of answers it supposedly already knows. Updating the prompt or data sources may negatively impact those responses, and these regressions need to be properly tracked.

Guideline: Evaluate the chatbot's ability to avoid generating harmful content
Description: Define a set of "red-team" requests that attempt to break the chatbot, force it to generate harmful content, or push it outside its scope. As with accuracy, establish a regular re-submission of this dataset for inference.
Reasons: Unfortunately, chatbots can always be misused by an ill-intentioned user. Keep track of the most common "jailbreaking" patterns and test your bot's behavior against them. Azure OpenAI comes with built-in content safety, but it's not foolproof. Make sure you objectively measure harmful content generation.

Good examples

- Leveraging Azure AI Studio to evaluate Groundedness, Relevance, Coherence, Fluency and Similarity. More information can be found in the docs!
- Using Prompt Shields for Jailbreak and Harmful Content detection.

Bad examples

- Trying to capture exact matches when evaluating accuracy.
- Not considering evaluation as part of the release cycle.
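To make the regression idea concrete, here is a minimal harness sketch. The dataset, the ask_chatbot callable and the token-overlap scorer are all illustrative assumptions - in practice you would call your app's endpoint and plug in a proper evaluator, such as the Azure AI Studio metrics mentioned above.

[CODE]
# Minimal regression-harness sketch for the "critical questions" dataset.
# The dataset, ask_chatbot callable and scorer are placeholders.

CRITICAL_QUESTIONS = [
    {
        "question": "What is Contoso's return policy?",
        "expected": "Items can be returned within 30 days with a receipt.",
    },
    # ...one entry per answer the bot must not regress on
]

def similarity(answer: str, expected: str) -> float:
    """Crude token-overlap scorer. Replace with an embedding- or LLM-based
    metric; exact string matching is precisely the mistake the bad examples
    above warn against."""
    answer_tokens = set(answer.lower().split())
    expected_tokens = set(expected.lower().split())
    return len(answer_tokens & expected_tokens) / max(len(expected_tokens), 1)

def run_regression(ask_chatbot, threshold: float = 0.5) -> list[str]:
    """Re-submit every critical question and return those that regressed."""
    failures = []
    for case in CRITICAL_QUESTIONS:
        answer = ask_chatbot(case["question"])
        if similarity(answer, case["expected"]) < threshold:
            failures.append(case["question"])
    return failures
[/CODE]

The same loop works for the red-team dataset: re-submit each jailbreak attempt and score the responses with a harmful-content detector instead of a similarity metric.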
[HEADING=2]4. Data privacy guidelines[/HEADING]

Finally, we cover some data privacy guidelines. Data privacy is about how you receive, process, persist and discard end-user information in your applications. Be aware that this is an overview and does not cover every aspect of data privacy, but it is a good place to start considering privacy concerns.

Guideline: Don't audit all model inputs and outputs unless absolutely necessary
Description: There is typically no need to log all user interactions. Even when instructed not to, users may submit personal information, which is then at risk of exposure. Debugging and monitoring tools should focus on response status codes and token counts, rather than actual text content.
Reasons: Persisting messages often poses a more severe data privacy risk than simply not doing so. Microsoft only ever persists messages that are suspected of breaking terms and conditions; they may then be viewed by Microsoft for the sole purpose of evaluating improper use. Review with your Data Privacy team if you require this feature to be turned off.

Good examples

- Capturing HTTP response codes and error messages for debugging.
- Logging token-usage metrics to Azure Application Insights.
- Capturing user intent for continuous improvement.
- Expiring user conversation logs and metrics once they are no longer relevant for the purpose of providing the experience, as disclosed in your Privacy Statement.

Bad examples

- Capturing verbatim prompt/completion pairs.
- Persisting user information for longer than necessary.
- Failing to adhere to the Privacy Statement.
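A sketch of content-free logging along those lines - assuming the openai SDK's response object, where token counts live on response.usage; the logger name and log fields are illustrative:

[CODE]
import logging

logger = logging.getLogger("chat_app.monitoring")

def log_interaction(response, status_code: int = 200) -> None:
    """Record operational metrics only. Deliberately excludes prompt and
    completion text, which may contain personal or sensitive data."""
    usage = response.usage  # openai SDK: prompt_tokens / completion_tokens
    logger.info(
        "chat_completion status=%d model=%s prompt_tokens=%d completion_tokens=%d",
        status_code,
        response.model,
        usage.prompt_tokens,
        usage.completion_tokens,
    )
[/CODE]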
[HEADING=1]Wrap up[/HEADING]

Remember: AI misuse will happen in your applications. Your objective is to safeguard your legitimate users so they know what the application can and cannot do, while giving ill-intentioned users an experience that feels less like a failed, fragile tool and more like a robust toolset being used incorrectly.

We hope this cheat sheet provides a good overview of the tools available in Azure to help bring safety and responsibility to the use of AI. Do you have other tips or tools to safeguard AI applications? Let us know in the comments!