As product designers and developers, listening to our users’ experiences and satisfaction is crucial for creating successful, human-centered products. Very commonly, we’ve relied on numbered rating scales to gather this information: a fast, quantifiable way of gathering research that casts a very wide net. However, this type of questioning comes with its own set of challenges that need to be acknowledged and addressed. So today, we’ll cover the problems associated with using rating scales in usability and satisfaction assessments and explore alternative strategies for more meaningful insights.
Subjectivity in the 5-point rating scale system
Tracking customer success and satisfaction with a product via numbered rating scales turns the participant’s experience into a subjective metric that can mean something different to each person. The criteria a person uses to rate something a ‘4’ may differ drastically from respondent to respondent. Some people might rate a product or an experience a ‘4’ because it was great but not perfect or above and beyond. Others might give something a ‘4’ due to problems they experienced along the way. Clearly, the context and the big “why” are missing when we ask users this simple question.
In a day and age when people’s jobs are on the line over anything less than a perfect five-star review, the pressure to achieve a ‘5’ can lead users to be overly generous, hindering the collection of genuine feedback. And if nothing went drastically wrong in the user’s or customer’s experience, why not just give a ‘5’? Well, from the perspective of researchers and marketers, there’s nothing to improve from a perfect ‘5’ experience. So why spend the time and money distributing and analyzing a survey that lacks actionable insights?
Rushed Responses and Question Misinterpretation
Yes, it’s the point in the blog where we quote The Office—I’m sorry, but it’s a great example. Remember in Season 7, Episode 2, when Michael Scott got so angry and annoyed about having to go through mandatory therapy with Toby that he quickly filled out his exit survey so he could finally leave, but unknowingly marked “strongly agree” that he was feeling extremely homicidal? Sure, in this instance, it was funny—but it happens!
People often rush through surveys or don’t read the questions carefully. Even well-intended attempts to alternate between positive and negative framing of questions can lead to confusion. Research by Sauro and Lewis (2011) highlights the challenges of participants not carefully reading statements, impacting both their responses and the subsequent analysis by researchers.
Rating Scales: No, It’s Not Just Semantics
Researchers may internally set criteria for each point on a scale, but as this is not usually laid out in a survey, users are left scratching their heads about what a ‘4’ really means. Using numbers to summarize an experience can feel too abstract for people. Without knowing why they chose that number, we’re stuck in the dark, unable to figure out the “how” for improvements and the next steps.
Even with semantic differentials, this assumes the person is able to plot their experience along a binary where the pair of adjectives aligns with their experience. If I experienced confusion about how to use a new feature, where would I plot that on a scale from “very difficult” to “very easy”? What about “it depends on when I need to use it” on a frequency scale from “never” to “often”?
Rating scales may not provide enough granularity to capture the nuances of a user’s experience. Language barriers and basic misinterpretations of a scale can also affect the reliability of the collected data.
So what should we use instead?
If you need to use a rating system in a survey or usability test, here are some tips we recommend:
- Provide more context: Offer substantial context for the features or tasks being assessed. This empowers users to provide thoughtful responses armed with a clear understanding of the specific aspects under consideration.
- Label Plot Points Clearly: Label the points between the poles specifically and clearly, especially for semantic differentials. This ensures participants have a precise understanding of the points between the extremes, fostering more accurate and nuanced responses.
- Include Neutral Options: Giving participants the ability to mark something as neutral might seem like “giving them an out,” but it can lead to higher-quality results from the users who do find they can plot their experience along a scale. Occasionally, a question on a survey may not be relevant to a participant’s experience. Rather than having them mark themselves in the middle of a scale and skew the results, give people the ability to determine whether they can confidently place themselves along the scale in the first place.
- Limit the Number of These Questions: Finally, limiting the number of rating scale questions reduces the likelihood that people will get fatigued and rush through them to finish. Fatigue can compromise the quality of the data collected, especially when participants’ mood and environment are already influencing their answers.
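For teams building their own survey tooling, the tips above can be sketched in code. This is a minimal, hypothetical example (all names here are our own invention, not any survey library’s API): every scale point carries an explicit label, and “Not applicable” is a separate opt-out rather than a forced midpoint, so it never skews the numeric results.

```python
from dataclasses import dataclass, field

NOT_APPLICABLE = "Not applicable"

@dataclass
class RatingQuestion:
    prompt: str                 # context for the feature or task being assessed
    labels: dict[int, str]      # every scale point gets a clear label
    allow_na: bool = True       # explicit opt-out instead of a forced midpoint
    responses: list = field(default_factory=list)

    def record(self, answer):
        """Accept a labeled scale point or the explicit N/A option."""
        if answer == NOT_APPLICABLE and self.allow_na:
            self.responses.append(answer)
        elif answer in self.labels:
            self.responses.append(answer)
        else:
            raise ValueError(f"Unlabeled or invalid answer: {answer!r}")

    def scored_responses(self):
        """Return only numeric answers, excluding N/A, for analysis."""
        return [r for r in self.responses if r != NOT_APPLICABLE]

# Usage: a 5-point scale where each point is spelled out
q = RatingQuestion(
    prompt="After importing your contacts, how easy was the new import flow?",
    labels={
        1: "Very difficult",
        2: "Somewhat difficult",
        3: "Neither easy nor difficult",
        4: "Somewhat easy",
        5: "Very easy",
    },
)
q.record(4)
q.record(NOT_APPLICABLE)     # a participant who never imported contacts
print(q.scored_responses())  # -> [4]
```

The design choice worth noting: because the N/A participant is kept in `responses` but excluded from `scored_responses()`, you can still report how many people opted out, without their answer dragging the average toward the middle.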
To sum it up
While rating scales can offer some insights, they should be part of a broader evaluation strategy. If your team has the time and resources, one of the best ways to gather research is to watch your users use the thing you’re testing. Whether that’s your own product or a competitor’s, getting a glimpse into how a person uses something through observational research surfaces data that participants may not think to bring up in a written survey or an interview.
Workarounds and misuses of a product are rarely uncovered in surveys, and these moments can bring insight into a part of your product that may have never been intentionally tested in the first place. So, embrace a holistic approach to user feedback—and you just might create products that not only look good but feel good for your users. And cheers to that!