Building Trust in AI Products: A Product Manager's Guide
Sara Remsen
Jan 9, 2025
In the rapidly evolving world of generative AI, building user trust isn't just a nice-to-have—it's the lifeline of your product's success. As product managers, we need to create AI solutions that actually work, and ideally drive business metrics like retention and upsell.
The Rabbit r1 (From The Verge)
Imagine investing millions in an AI product, only to watch users abandon it after a few frustrating interactions. In early 2024, the ambitious Rabbit r1 device sold out in two days, promising to complete tasks via voice commands (e.g. “call me an Uber to work”). But as soon as it reached customers, reviewers called it “underwhelming, underpowered, and undercooked” and “barely reviewable.”
Generative AI is still emerging, and many users are encountering generative AI products while companies are still figuring out the technology's limitations and possibilities. This presents a unique challenge: how do you provide a great user experience when there are so many unknown variables?
After working with numerous AI product teams and diving deep into human-computer interaction best practices, we've identified key practices that transform user skepticism into trust and then into adoption.
1. Design for User Intents and Add Guardrails
We’ve misconstrued the flexibility of LLMs as the capability to support any customer interaction.
Traditional, deterministic software forces users into rigid pathways (e.g. click to add to the shopping cart, press 2 for billing support). While we all hate phone trees, they do solve one important problem: they set correct expectations about what users can and cannot do.
Generative AI has created high volumes of diverse, dynamic, and complex customer interactions.
Generative AI technologies like large language models (LLMs) promise new, infinitely customizable pathways to get work done. For a customer support AI agent, users should be able to ask for billing support in any way they want (“I need to update my billing information,” “I want billing help,” or “Something’s wrong with my recent bill”).
However, out in the real world, people ask about things the AI was never designed to do. If a user asks “can I split my bill between two credit cards?” the best answer may be “I don’t have the information to answer that question.”
When the AI instead answers confidently with false information, whether a yes or a no, it erodes the customer’s trust.
What to do instead:
Imagine you are designing a phone tree. Which customer "intents" or goals does your AI product need to handle well? Then think about how to handle requests that fall outside those main intents.
Read the rows. There’s no substitute for looking at real production data in the wild. Review the conversations where users struggle and the ones where they succeed. Look for patterns in first or last interactions.
Monitor real behavior at scale with intent classification models. There’s a point where no team can read every conversation. Instead, read the most important rows and use an intent classification model to understand behavior at scale (a minimal sketch follows below). Either build a model in-house and track it with a third-party analytics tool like Heap, Mixpanel, or Hex, or use a solution like Melodi that combines an intent classification model with analytics out of the box.
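As an illustration, here is a minimal sketch of intent classification using an off-the-shelf zero-shot classifier from Hugging Face. The intent labels and model choice are assumptions for the example, not Melodi's implementation; a production system would typically train or fine-tune a model on your own conversation data.

from transformers import pipeline

# Hypothetical intent labels for a customer-support assistant.
INTENT_LABELS = ["billing", "account access", "cancellation", "out of scope"]

# Zero-shot classification lets you start before you have labeled training data.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def classify_intent(message: str) -> str:
    """Return the most likely intent label for a single user message."""
    result = classifier(message, candidate_labels=INTENT_LABELS)
    # Labels come back sorted by score, highest first.
    return result["labels"][0]

print(classify_intent("Something's wrong with my recent bill"))  # e.g. "billing"

Classifying every message this way turns raw conversation logs into intent counts you can chart in whichever analytics tool you already use.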
2. Treat User Feedback like Bug Reports
The thumbs up / thumbs down buttons are everywhere. But what happens when you click thumbs down? Does anyone click them? Are they worth putting in your own app?
Yes, absolutely.
Feedback isn't just a feature—it's a trust-building mechanism. Unlike traditional software, where every user sees the same screen (e.g. the same checkout screen), generative AI-powered software often presents different data to each user (e.g. every user chat is different based on what the user asks). This makes it extremely difficult to find bugs and understand the extent of the issue.
Melodi's feedback inbox helps organize and take action on user feedback for AI products
Collecting feedback is a simple, efficient way to flag and fix issues affecting your users. Plus, creating visible feedback loops builds trust by showing users that their input drives real improvements.
What to do:
Add a feedback button. We’re partial to flags (vs. thumbs up / down) because a flag conveys a clear “there is a problem” message. You can build this yourself, use an off-the-shelf feedback tool like BugHerd, UserSnap, or UserPilot, or use a feedback system purpose-built for AI like Melodi. Make sure to collect user comments for additional insight into each issue (see the sketch after this list).
Treat user feedback like gold-standard bug reports. If a user took the initiative to flag an issue, it’s highly likely that the same issue exists for other users who did not report it. Keep in mind that users typically provide more negative than positive feedback, so the negative-to-positive ratio should not be your only metric of success.
Follow up. If you’ve resolved the issue, thank the user for their input and let them know that their feedback was instrumental in fixing the issue.
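To make that follow-up loop possible, each flag needs enough context to reproduce and triage. Here is a minimal sketch of what a feedback record might capture; the field names are hypothetical, not Melodi's schema or any particular tool's API.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class FeedbackEvent:
    """One user-submitted flag, treated like a bug report."""
    conversation_id: str   # which conversation the flag came from
    message_id: str        # the specific AI response being flagged
    flag_type: str         # e.g. "issue" or "positive"
    comment: str = ""      # optional free-text detail from the user
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def record_feedback(inbox: list, event: FeedbackEvent) -> None:
    # In production this would write to a database or analytics pipeline.
    inbox.append(event)

inbox = []
record_feedback(inbox, FeedbackEvent(
    conversation_id="conv_123",
    message_id="msg_456",
    flag_type="issue",
    comment="The answer cited a plan we don't offer.",
))

Storing the conversation and message IDs alongside the comment makes it possible to find every other conversation with the same problem, and to follow up with the reporting user once it is fixed.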
3. Provide Transparency with the Principle “No Data Is Better Than Bad Data”
Users would rather hear "I don't know" than receive a confidently wrong answer. When your AI can't provide a high-quality response, be clear about its limitations.
“No data is better than bad data” is one of our core product principles at Melodi. Nothing erodes user trust faster than bad data, such as an incorrect or irrelevant result.
Machine learning metrics like “accuracy” are useful for initial product development, but they often miss what counts as quality in the eyes of the user. In one recent Stanford study, users even rated an AI assistant more highly when it gave partially incorrect answers, because it still helped them accomplish their task.
A figure describing the crossword task used to evaluate human-AI interaction patterns. From Lee et al., "Evaluating Human-Language Model Interaction," Transactions on Machine Learning Research (TMLR), September 2023.
What to do:
Implement graceful fallbacks. If a question is out of scope or there’s no data to answer it, provide context and clarity around the reasoning, like “I don’t have the information to answer that question, I’m sorry!” or “I wasn’t able to find data to answer that question.” (See the sketch after this list.)
Monitor the success rates of specific intents so you know where to prioritize your efforts. If a particular intent is consistently unsuccessful and accounts for a large share of what people ask, it is probably worth fixing first.
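As a rough sketch, here is what a graceful fallback plus per-intent success tracking might look like in a retrieval-augmented assistant. The message wording, the intent labels, and the generate call are placeholders for whatever your own stack uses, not a specific product's API.

from collections import defaultdict

FALLBACK_MESSAGE = (
    "I wasn't able to find the information to answer that question. "
    "Would you like me to connect you with a support agent?"
)

# Per-intent counters so low-performing, high-volume intents surface first.
intent_stats = defaultdict(lambda: {"total": 0, "fallbacks": 0})

def answer_with_fallback(question, intent, retrieved_docs, generate):
    """Answer from retrieved context, or decline when there is nothing to ground on."""
    intent_stats[intent]["total"] += 1
    if not retrieved_docs:
        # "No data is better than bad data": decline instead of guessing.
        intent_stats[intent]["fallbacks"] += 1
        return FALLBACK_MESSAGE
    return generate(question, retrieved_docs)

def fallback_rate(intent):
    stats = intent_stats[intent]
    return stats["fallbacks"] / stats["total"] if stats["total"] else 0.0

Reviewing fallback rates alongside intent volume tells you whether an unanswerable question is a rare edge case or a gap worth closing with new content or a new capability.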
The Bottom Line: Trust Drives ROI
Trust isn't just a feel-good metric. It's the engine of user retention, product adoption, and long-term success.
Every interaction with your AI product is an opportunity to either earn or erode user trust. By designing with user intents in mind, collecting user feedback, and embracing transparency, product managers can create genuinely useful and trustworthy AI products.