May 22, 2024

A/B Testing Review Prompts to Improve Your App Store Rating and Reviews

How to improve your App Store rating by A/B testing when you ask for reviews

We've already covered how conditional logic can improve your App Store rating. However, the most frequently asked question was: how do you determine if one version of your logic is better than another? For most optimization problems this is fairly simple: run an A/B test, measure the average outcome for a few variants, and pick the winner. App Store reviews are harder, though: since the star rating is sent directly to Apple or Google, we can't easily measure the average rating ourselves.

Unlike other App Store A/B testing guides, this guide includes strategies that measurably improve your app rating, not just your product page content.

In this blog post, we outline a strategy for A/B testing several variants of app-review logic, to improve your App Store rating and reviews. At the end, we offer an SDK that encapsulates this logic, so you can implement these ideas quickly.

Just want the code? Skip ahead to the Get Started section below.

Understanding App Rating Optimization

If you've already read our other articles, skip ahead! If not, it's important to understand the app rating optimization problem before we start making changes.

There are two types of users who review your app. First, users who visit the App Store for the sole purpose of reviewing your app. Second, users who leave a review because you asked them to.

You have little control over the first group of users who actively go to the App Store to review your app. You cannot predict their actions, nor can you prevent them. A secret we've learned from years of tuning app reviews is that the average rating from these users is usually significantly lower than the ratings from the second group, those who agree to leave a review when prompted.

Why is this so? Well, it turns out that users who are dealing with an issue or frustration are more inclined to leave a review than those who are content with your application.

Here is a great way to think about prompting users to review your app:

Soliciting reviews from your users ensures that the silently satisfied majority is represented in the App Store rating. If you don't, your rating will soon be overpowered by the vocal discontented minority.

Fortunately for quality apps, the satisfied group is generally a lot larger than the unhappy minority. Simply starting to prompt for reviews can significantly improve your app's App Store rating, and fine-tuning the frequency, user selection, and timing can improve it further.

How to Optimize Your App Store Rating and Reviews

We'll outline a strategy for improving the variables you can control:

  • The volume of people who rate your app can be increased by adding in-app prompts asking users to leave a review
  • The average rating from the people who agree to review your app when prompted can be improved by only asking users who have experienced key “ah-ha” moments in your app, asking right after a delightful moment, and avoiding negative device conditions (a minimal sketch of this kind of gating follows this list)
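
To make the second point concrete, here's a minimal sketch of that kind of gating using Apple's standard StoreKit prompt. The signals (keyFeatureUseCount, justCompletedDelight) are hypothetical stand-ins for whatever your app's “ah-ha” and delight moments are:

import StoreKit
import UIKit

// Hypothetical signals your app would track; the exact "ah-ha" moments vary by app.
struct ReviewContext {
    var keyFeatureUseCount: Int    // times the user has hit your core "ah-ha" moment
    var justCompletedDelight: Bool // e.g. finished a workout, exported a project
}

func maybeAskForReview(context: ReviewContext, in scene: UIWindowScene) {
    // Narrow targeting raises the average rating but lowers volume;
    // loosening these conditions does the opposite.
    guard context.keyFeatureUseCount >= 3, context.justCompletedDelight else { return }
    SKStoreReviewController.requestReview(in: scene)
}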

These two variables interact. While asking more people for reviews likely increases the review volume, it may decrease the average rating compared to only asking users who show strong indications of loving your app, which is a smaller group.

Once you have these two variables dialed-in, you can turn that silent satisfied majority into a 5-star review army that overwhelms the loud dissatisfied minority.

Of course, improving your app's quality should be a key part of your strategy. That said, a great app can still suffer mediocre reviews if it only receives reviews from people motivated enough to go to the App Store. It's important to also optimize the variables above (while continuing to improve your app quality).

Part 1: Use Industry Best Practices

In another post we outlined 14 best practices you can use today to improve when and who you ask for reviews. If you haven't already, start there!

Many of these tactics are proven and can be implemented without A/B testing to immediately improve your average rating. For example:

  • Don't ask on out-of-date app versions
  • Don't ask when the user's battery is < 20%
  • Don't ask when they don't have an internet connection
  • Don't ask if their native language isn't supported in your app
  • +10 more suggestions. These are all implemented for you. See our developer guide for details.
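
A couple of these checks are easy to approximate yourself with standard iOS APIs. The sketch below is illustrative only (the SDK described later ships these checks pre-built) and covers just the battery and connectivity conditions:

import UIKit
import Network

// Illustrative battery and connectivity checks before showing a review prompt.
func deviceConditionsLookGood(completion: @escaping (Bool) -> Void) {
    // Skip users with a low battery (below 20%).
    UIDevice.current.isBatteryMonitoringEnabled = true
    let battery = UIDevice.current.batteryLevel // -1.0 when unknown (e.g. Simulator)
    let batteryOK = battery < 0 || battery >= 0.2

    // Skip users with no internet connection.
    let monitor = NWPathMonitor()
    monitor.pathUpdateHandler = { path in
        completion(batteryOK && path.status == .satisfied)
        monitor.cancel()
    }
    monitor.start(queue: .main)
}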

The rest of this post focuses on how to customize and optimize this conditional logic for your app. Logic like “how many key feature uses before asking”, “how long to wait after install”, or “after what in-app event do we ask” has no global best practices, and changes from app to app. While you should start with a best guess, measurably tuning these requires the strategy below.
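
As a starting point, it helps to make those app-specific thresholds explicit so they're easy to change between releases. A rough sketch, with placeholder values rather than recommendations:

import Foundation

// Placeholder thresholds: there are no universal best values, so tune these per release.
enum ReviewGate {
    static let minKeyFeatureUses = 5
    static let minDaysSinceInstall = 7

    static func shouldAsk() -> Bool {
        let defaults = UserDefaults.standard
        let uses = defaults.integer(forKey: "keyFeatureUseCount")
        guard let installDate = defaults.object(forKey: "installDate") as? Date else { return false }
        let days = Calendar.current.dateComponents([.day], from: installDate, to: Date()).day ?? 0
        return uses >= minKeyFeatureUses && days >= minDaysSinceInstall
    }
}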

Part 2: Optimize Custom Logic with A/B Testing


Now, let's get back to A/B testing: how do you optimize a variable you can't directly measure? In our case, that variable is the average App Store star rating.

The solution: Apple provides the average App Store rating for each release! If you change your logic for when to ask for reviews from release to release, App Store Connect shows you the volume and average-rating information needed to pick an A/B test winner. Each release becomes a test variant.

Now this is a bit harder than typical A/B testing where users are randomly assigned, so here are things to keep in mind:

  • Ensure that the app releases being tested are comparable: if one release has bugs and the other does not, the results will be skewed. It's best to run this test on releases that are otherwise very similar or identical, where the only significant change is the review targeting logic.
  • Ensure comparable timeframes: avoid comparing one group over the winter holidays to another in early January. Use comparable timeframes, including the same days of the week and no major holidays.
  • Follow statistical best practices — select your timeframe before starting, avoid stopping a test early, determine a minimum sample size in advance, check the statistical significance of your results using a calculator, and avoid p-hacking.

Now, what should you test? It's best to A/B test the factors most likely to impact review volume and average rating:

  • When in the user experience do you ask for a review? After a user event or app launch? Which user event is best?
  • How long to wait before asking? You want users to have enough time to experience the app, but not so long that review volume drops significantly. Consider a measure based on number of interactions instead of a timeframe!

To A/B test, create your variants, bind them to releases, and measure the results in App Store Connect. We have more details on the technical implementation below.
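
Once the per-release ratings come in, it's worth a quick sanity check that the difference between variants isn't just noise; a proper significance calculator and a pre-registered sample size are still the right tools. Here's a rough sketch of a Welch's t-test, assuming you can tally per-star rating counts for each release (the numbers below are made up):

import Foundation

// Per-release star counts, e.g. [5: 420, 4: 60, 3: 12, 2: 5, 1: 9].
struct RatingSample {
    let starCounts: [Int: Int]

    var n: Int { starCounts.values.reduce(0, +) }
    var mean: Double {
        let total = starCounts.reduce(0) { $0 + $1.key * $1.value }
        return Double(total) / Double(n)
    }
    var variance: Double {
        let m = mean
        let sumSq = starCounts.reduce(0.0) { $0 + Double($1.value) * pow(Double($1.key) - m, 2) }
        return sumSq / Double(n - 1)
    }
}

// Welch's t statistic for the difference in mean rating between two variants.
func welchT(_ a: RatingSample, _ b: RatingSample) -> Double {
    let se = (a.variance / Double(a.n) + b.variance / Double(b.n)).squareRoot()
    return (a.mean - b.mean) / se
}

// Made-up example data; for large samples, |t| of roughly 2 or more suggests a real difference.
let variantA = RatingSample(starCounts: [5: 420, 4: 60, 3: 12, 2: 5, 1: 9])
let variantB = RatingSample(starCounts: [5: 510, 4: 48, 3: 10, 2: 4, 1: 6])
print(welchT(variantA, variantB))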

Optional: Volume Analytics

As we mentioned earlier, the volume of people who see your app review prompt is important. Targeting only your most loyal users will likely earn a higher rating per user, but may reach too few people to move your average rating significantly. You can quickly assess the potential volume using analytics before you embark on more time-consuming A/B testing.

Simply send events to your preferred analytics system to see how much of your user base lands in each cohort. This can help inform which targeting ideas are worth A/B testing, and which are simply too small to have an impact:

// Pseudocode: adapt to your analytics tool of choice
analytics.sendEvent("targetingCohortA", value: user.isInCohortA)
analytics.sendEvent("targetingCohortB", value: user.isInCohortB)
analytics.sendEvent("targetingCohortC", value: user.isInCohortC)
analytics.sendEvent("targetingCohortD", value: user.isInCohortD)
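
Each cohort predicate should mirror a candidate targeting rule you're considering. A hypothetical example of what those predicates might look like:

// Hypothetical user model; each cohort mirrors a candidate targeting rule.
struct User {
    var keyFeatureUseCount: Int
    var daysSinceInstall: Int
    var finishedOnboarding: Bool

    var isInCohortA: Bool { keyFeatureUseCount >= 3 }
    var isInCohortB: Bool { keyFeatureUseCount >= 5 }
    var isInCohortC: Bool { keyFeatureUseCount >= 5 && daysSinceInstall >= 7 }
    var isInCohortD: Bool { finishedOnboarding && daysSinceInstall >= 14 }
}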

Get Started

Creating custom logic and changing it release to release can be a lot of work. Our Critical Moments SDK simplifies the process of optimizing your app ratings! It includes:

  • Built-in targeting: our 14 suggestions like avoiding users with low battery or no network are already implemented. Add them to your app without additional coding.
  • Less code: launch your app review prompt from config, after specific user actions
  • Powerful targeting: use our conditional targeting engine to define new targeting variants in one line, such as: eventCount('primary_feature_use') > 5 && app_start > 3 && app_install_date < now() - duration('168h')
  • Over the air updates: change your targeting over the air. This can include rolling out successful experiments to all users, rolling back targeting that triggered negative reviews, or creating A/B tests without updating your app.
  • A/B test selection: control which A/B test variant is running on each user's device, such as: app_version == '2.3.3' && (logic to test variant A)

If you're ready to take your app review game up a notch, check out our App Rating Developer Guide for step-by-step instructions on how to implement these ideas.

Other Articles about App Reviews

Want to learn more about how Critical Moments can improve your app reviews? We have several additional resources that will help you improve your app's rating.

The Mobile Growth SDK

Critical Moments helps you increase conversions, improve app ratings, and make bugs disappear.