How to know when your hypothesis is validated
A lot of executives are asking me – “How do we know when a hypothesis is validated?” I’m pleased, because it means rapid experimentation is starting to catch on! Even getting to the stage of caring about whether hypotheses are validated is a game-changer. But I understand that this question is tricky to answer. This post shares how we at Philosophie by InfoBeans validate or invalidate hypotheses.
Before you can validate anything, you need to run experiments on these hypotheses. Luckily, we have a whole article on that topic: How to Select the Right Experiment for Your Project
But once an experiment has been run and we have some data, how do we know if a hypothesis that we’ve prioritized has been validated or invalidated?
A key goal of early stage product validation work is to provide confidence that you are building the right thing. Each company will require different levels of confidence during these product validation cycles. So I can’t say there is a magic formula for validation – these experiments need to work within the constraints of your company culture.
Early in the product development lifecycle you will rely on smaller data sets and leverage qualitative feedback from user interviews. Naturally these data sets get bigger over time as you run multiple iterations.
The earlier you invalidate something the better, because it means you are saving time and money that would be spent building something that people may not want.
As you move from this early-stage discovery work to product-market fit validation, you will layer in analytics. This quantitative data shows you exactly how people are using the feature or the application, and it will be the ultimate test for validating or invalidating hypotheses.
Discovery
When you have low confidence in an idea and there is a high cost to build it, you don’t just want to build the thing. You want to run lower cost and faster experiments to validate the idea. Of course, this will not give you a 100% accurate answer, because people aren’t actually using the product or buying a service. But it will point you in the right direction and create a higher degree of confidence.
Typically, these experiments will rely on 5–10 one-on-one interviews. You could also run a landing page test where you drive some traffic to a landing page. Given the relatively low volume, we typically look at the data in intervals of 20% and then layer the feedback from the interviews on top of it.
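To make those intervals concrete, here is a minimal sketch of how you might bucket a small-sample result into the zones covered below. The function name and exact cut-offs are just illustrative scaffolding around the 20% and 40% ranges this post describes, not a formal statistical test.

```python
def classify_result(completed: int, total: int, expected_rate: float = 1.0) -> str:
    """Bucket a small-sample experiment result into rough zones.

    completed / total: e.g. 4 of 5 interviewees finished the task.
    expected_rate: the completion rate you hoped for (1.0 = everyone).
    The 20% / 40% cut-offs mirror the zones discussed in this post.
    """
    observed_rate = completed / total
    gap = expected_rate - observed_rate  # how far below expectation we landed

    if gap <= 0:
        return "exceeds your target"
    if gap <= 0.20:
        return "within 20% of what you expected"
    if gap >= 0.40:
        return "40% or more away from what you expected"
    return "in between - keep iterating and gather more data"


print(classify_result(4, 5))    # within 20% of what you expected
print(classify_result(3, 5))    # 40% or more away (60% vs 100% expected)
print(classify_result(4, 10))   # 40% or more away
```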
40% or more away from what you expected
If you run a usability test and only 3 out of 5 (or, even worse, 4 out of 10) people complete the task, your hypothesis is likely NOT validated.
However, your hypothesis may not be completely invalidated.
Instead, this could mean that your solution wasn’t the right one to solve the problem and you need to iterate on it further. Or you could decide to bench this idea and revisit it in the future.
Try looking into the raw data from the interviews. You may hear users subtly indicate that the feature might be useful, but not necessarily in the way you initially thought.
Recently we were working with a leading health & wellness resort. We had a hypothesis that “If we allow guests to mark activities they are interested in, we will increase the number of services they book”. When we interviewed guests with a prototype we got a mixed bag of reactions.
Looking at the specific interview notes, there was confusion about both the feature’s interface and how the content such as activities and services was organized. So we continued to explore other potential solutions.
Within 20% of what you expected
Now is where it gets fun!
Let’s say 4 of the 5 people you interviewed easily completed the task you hoped they would. This might be enough to validate the hypothesis at this stage.
Again, look at the qualitative feedback to see if there are more clues. It can also be valuable to weigh how impactful the feature would be to the business against the level of effort it will take to build. If it is super impactful, or if it is relatively trivial to build, then you can consider validating it.
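With samples this small, it also helps to keep the uncertainty in mind. The sketch below is purely illustrative (not something from this project): it computes a Wilson score interval for a 4-out-of-5 result, one common way to see how wide the plausible range really is and why the qualitative feedback carries so much weight.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95% confidence)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half_width, center + half_width


low, high = wilson_interval(4, 5)
print(f"4 of 5 completed: plausible true rate roughly {low:.0%} to {high:.0%}")
# Roughly 38% to 96% - which is why the interview notes matter as much as the count.
```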
Some hypotheses may not require every one of your test users to complete the tasks. This is where the qualitative data is again important, because you want to hear what users who don’t like the feature are saying. In this case, you want to make sure you are not impacting the brand negatively.
For example, we worked with a top 3 media company to figure out how to increase the number of people opting in to data tracking. We wanted to accomplish this in a way that would increase trust with the brand.
The existing baseline opt-in rate was low, so we never planned to create a solution with a 100% opt-in rate. As we tested our prototypes, we wanted to hear people express interest or indicate that they trusted the brand. We also wanted to make sure we didn’t get any negative feedback like “this scares me because….”
After additional iteration and an internal discussion, we felt we had enough to gather real data on whether people would actually opt in via an A/B test. The small-sample findings held up: the A/B test increased opt-ins by 30%. This work was promoted to production and eventually rolled out to their other properties.
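If you graduate a result like this to an A/B test, it is worth checking that the lift isn’t just noise. The sketch below uses a standard two-proportion z-test with made-up traffic numbers (the actual counts from this engagement aren’t shown here) to illustrate the idea.

```python
import math


def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided z-test comparing conversion rates of variant A (control) and variant B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical numbers: control opts in at 10%, the new flow at 13% (a 30% relative lift).
z, p = two_proportion_ztest(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p-value suggests the lift is not noise
```

With those made-up numbers the p-value lands well under 0.05; your own threshold should reflect the level of confidence your company requires.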
Likewise, with a top 3 credit card company, we attempted to validate a concept around people sharing their credit card with friends through an SMS-based chatbot experience. One specific thing we wanted to validate was how we might pre-approve someone for a credit card, and what the best way was to collect the information needed – including their social security number – to run a basic credit check.
It’s unlikely that every single person who comes through a text-based credit card application is going to finish the process by providing their SSN. But we were still able to get clarity on the different methods and why they would or wouldn’t work.
Other times, hypotheses in this zone need further experimentation and testing. You can add a similar prototype to the next experiment you run to see whether more data shifts your opinion one way or the other.
Exceeds your target
When you run an experiment and get resounding feedback and evidence that your interviewees love what you created, your hypothesis is likely validated.
With the health & wellness brand we referenced before, we had a hypothesis that “If guests can see when they have time in their schedule, they will be more likely to book more services.” When the guests saw our prototype, their eyes lit up. They wanted a way to easily find activities and services that were happening based on a time they were free. The user experience we created and the approach we took resonated strongly with our interviewees.
When you eventually launch, there is still a chance these hypotheses may be invalidated. However, this is generally good news and it’s still worth progressing your hypothesis to the next step.
Moving Beyond Discovery
As you build confidence in the early stages, you can move to quantitative experiments. This could involve running an A/B test on your product with and without a certain feature, or building the feature and watching how users actually use it through analytics or screen recordings from a tool like Hotjar.
If the work you are doing is to update or improve an existing product, then you can look to the baselines already established. For example, what is the conversion rate, or how much revenue does each user generate? Using these baselines as a benchmark, you can then set your validation criteria based on the improvement you hope to achieve.
If you don’t have any existing baseline data, look at standards from your industry or come up with metrics as a team. Then make sure that your application is actually recording the analytics that you need to validate your hypothesis.
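As a concrete, hypothetical example of turning a baseline into a validation criterion, the sketch below assumes your team has agreed on a baseline conversion rate and a target relative uplift, then checks a new variant’s observed rate against that threshold. The names and numbers are placeholders for whatever your own analytics and industry benchmarks suggest.

```python
BASELINE_CONVERSION = 0.042     # e.g. last quarter's conversion rate (placeholder)
TARGET_UPLIFT = 0.15            # team agreed a 15% relative improvement would validate


def is_validated(conversions: int, visitors: int) -> bool:
    """Compare the observed conversion rate against the baseline plus the agreed uplift."""
    observed = conversions / visitors
    threshold = BASELINE_CONVERSION * (1 + TARGET_UPLIFT)
    print(f"observed {observed:.2%} vs threshold {threshold:.2%}")
    return observed >= threshold


print(is_validated(conversions=250, visitors=5000))  # 5.00% vs 4.83% -> True
```

In practice you would pair a check like this with a significance test such as the one sketched earlier, so a lucky week of traffic doesn’t validate a hypothesis on its own.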
When we worked with WeWork to create a sales tool, we heard a lot of great ideas in the user interviews. One thing that came up with a couple of sales reps was a desire to easily find and view case studies. Even better, they said they wanted them in AR so they could truly experience the space.
We built a quick and dirty version of this feature into the application, including AR case studies from a partner that they already had. But when we examined how the sales reps actually used the application during pitches, we found they almost never navigated to the case studies! Instead they spent most of their time on the virtual floor plans we created. This indicated we should spend more time building out functionality around these floor plans instead of the virtual case studies.
On the other hand, while working with a major skincare company, we prototyped and tested multiple features. One of these features was a mini quiz to find the right skincare solution. In our early usability tests, the quiz performed well. This feature was also aligned with the organization’s new brand vision. We then promoted it to a production A/B test against their existing homepage. We found two things:
1) People actually completed the quiz – something that some stakeholders were concerned about
2) Conversion rate increased against our control homepage – success!
As you validate or invalidate these hypotheses, make sure to update your Experiment Dashboard. And remember, learning never ends. Always look to update hypotheses as you go and learn more – just because you validated something in a previous cycle doesn’t mean that you can’t still improve it.
Does your team believe that improving a feature could make a big impact on the business? If so, continue to iterate on it. But if you don’t think a big impact can be made, or the cost to iterate would exceed the impact, then hold off and look at other initiatives. Consider what investment makes sense and when to shift focus to new priorities so you find the right balance.