How many User Testers do you need?
Say you went fishing and had one day to catch as many fish as you could from several different ponds. Each pond has some fish that are bigger and easier to catch than others. To maximise the number of fish you catch, would you spend all day fishing in one pond, or spend some time at each pond catching the easiest fish in each?
Fishing for Usability Problems
Testing for usability and maximising the problems you can discover is a lot like trying to catch fish. While you can spend all of your time in one pond and try to catch every fish, it is much more efficient to focus on the big fish in each pond. Likewise, the most effective way to catch usability problems is to conduct iterative, small-scale usability tests while continuously updating your testing conditions as your design evolves. Because the objective of usability testing is to improve the overall user experience, it cannot be a set-it-and-forget-it process. After conducting a usability test and fixing the discovered issues, you need to retest to see whether those issues were resolved and whether the fixes introduced any new ones. Spending big on a single, large-scale usability test is less effective at discovering usability problems because each tester you add gives you diminishing returns. So what is the ideal sample size when conducting a usability test?
The “Magic Number”
To determine the ideal sample size for a usability test, it is important to understand how the probability of discovering an issue is calculated. The most commonly used model to evaluate the effectiveness of a usability test is:

Proportion of problems discovered = 1 - (1 - p)^n
In this formula, p is the likelihood of problem discovery (per tester) and n is the number of testers. Jakob Nielsen, a leader in user research, used this model to determine that if the problem discovery frequency is at least 30%, then about 85% of discoverable problems will be found by the first five testers and 95% by the first eight. Many research teams have since pointed to this study and adopted the “magic number 5” as the ideal sample size for usability tests.
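As a rough illustration, the model 1 - (1 - p)^n can be evaluated directly for different numbers of testers. The function name `discovery_rate` and the sample values are assumptions made for this sketch, not part of the original study. With p set to exactly 0.30, the model gives a little under 85% at five testers, close to the figure quoted above.

```python
def discovery_rate(p: float, n: int) -> float:
    """Expected share of discoverable problems found after n testers.

    p is the per-tester likelihood of discovering a problem (0 to 1).
    Hypothetical helper: implements 1 - (1 - p)^n and nothing more.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Show how each extra tester adds less than the one before.
    for n in (1, 3, 5, 8, 15):
        print(f"{n:>2} testers: {discovery_rate(0.30, n):.0%} of problems found")
```

Comparing the jump from one to five testers with the jump from eight to fifteen makes the diminishing returns of each added tester visible.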
The Goal of User Testing
Does this mean that testing with 5 users is enough? The answer is yes and no. The discovery curve at a 30% problem discovery frequency seems to indicate that you actually need to test with at least 15 users to discover all of your product’s usability issues. However, this doesn’t mean that you need to conduct tests with 15 users at a time. While large-scale tests are great for evaluating the current state of your user experience and creating benchmarks, making measurable design improvements requires continuously searching for new usability problems. Focus on the big-fish problems with each iteration and fine-tune your user experience along the way.
Different sites have different problem discovery frequencies
One of the key assumptions in Nielsen’s magic number calculations is that the frequency of usability issues encountered must be at least 30%. This matters because the less likely a tester is to discover a problem, the lower the percentage of problems that will be discovered with the same number of testers. By rearranging the discovery model from above and solving for n, you can calculate the number of testers you need to discover a certain percentage of usability issues given your testers’ likelihood of uncovering a problem. Determine what percentage of discoverable issues you want to catch and build a test plan that best suits your needs.
How does problem discovery frequency affect your test plan?
For government, insurance, and banking sites with large amounts of information and complex functionality, there is typically a higher problem discovery rate. This means that these types of sites, especially if they have not done any UX design before, will likely get the most out of conducting iterative small-scale usability tests. The same goes for antiquated internal systems that are poorly designed and lack proper integration. However, for simpler websites with a limited number of tasks, or ecommerce sites looking to improve conversion rates, fewer usability tests with larger sample sizes is a more effective approach. Use the insights gathered from usability tests to develop hypotheses for possible design changes. Then conduct A/B tests on a large scale to gather data on which iteration increases conversions. Continue identifying improvements using further usability tests to catch big fish, but validate them with A/B tests.