If you haven’t read Jonathan Hsu’s 8-part Medium series on Social Capital’s diligence process, add it to your reading list. I didn’t immediately grok all of the concepts in the series, but it has had an incredible impact on how I look at product metrics.
It also appears to be a big part of their recent announcement about how they are able to fund early-stage companies based exclusively on their metrics.
One of the concepts that struck me was the depth of engagement. It shows you how engaged different portions of your user base are. You don’t need a ton of fancy data science techniques to get a glimpse into what your user base is doing. All you need is a fairly straightforward SQL query to get you started.
It starts with a fairly simple concept: how many users are active for 1 day in the past month? How many are active for 2 days in the month? It’s really simple to generate a histogram (this is fake data) that looks like this:
In this fake example, the hypothetical product has 100k monthly active users (MAUs). I think this is very telling and interesting from a strategy and operational perspective, but there’s a different view that I now prefer. I prefer to look at this chart on a percentage basis (the % of MAUs), and look at it cumulatively. This is what it looks like:
How to read this chart: 33% of the MAUs are active for a single day of the month. It may be the first day of the month or the last, but the people that fall into this bucket were only active for a single day in the month. 53% of the MAUs were active for 2 or fewer days – you add up the 33k and the 20k from the histogram to get the 53%. In Jonathan’s example there’s a little bit of a spike of users that are active every day of the month – in a bunch of the examples I’ve seen in the B2B space there’s a nice healthy bump around 20 days, which makes sense when you consider that B2B apps are most likely used every business day, rather than every day.
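To make the two views concrete, here is a minimal Python sketch of how both could be computed from raw activity events. It is not the code from the notebook mentioned later in this post; the function names, the `(user_id, day)` event shape, and the tiny sample data are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def active_days_per_user(events):
    """Map each user to the number of distinct days they were active."""
    days_by_user = {}
    for user_id, day in events:
        # A set means several events on the same day count once.
        days_by_user.setdefault(user_id, set()).add(day)
    return {user_id: len(days) for user_id, days in days_by_user.items()}


def depth_of_engagement(events, days_in_period=31):
    """Return (histogram counts, cumulative % of MAUs) indexed by days active.

    Index 0 corresponds to users active on exactly 1 day, index 1 to 2 days, etc.
    """
    per_user = active_days_per_user(events)
    total_users = len(per_user)
    histogram = Counter(per_user.values())
    counts = [histogram.get(n, 0) for n in range(1, days_in_period + 1)]
    if total_users == 0:
        return counts, [0.0] * days_in_period
    cumulative_pct = [100.0 * c / total_users for c in accumulate(counts)]
    return counts, cumulative_pct


# Hypothetical sample data: (user_id, day the user was active).
events = [
    ("a", date(2018, 1, 2)),
    ("a", date(2018, 1, 2)),  # same day again, still 1 active day
    ("b", date(2018, 1, 3)),
    ("b", date(2018, 1, 4)),
    ("c", date(2018, 1, 5)),
    ("d", date(2018, 1, 1)),
    ("d", date(2018, 1, 2)),
    ("d", date(2018, 1, 3)),
]

counts, cumulative_pct = depth_of_engagement(events)
print(counts[:3])          # users active for exactly 1, 2, and 3 days
print(cumulative_pct[:3])  # % of MAUs active for at most 1, 2, and 3 days
```

The first list is the histogram view and the second is the cumulative view; each cumulative value is just the running sum of the histogram divided by the total number of monthly active users.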
This is a powerful way to slice up your install base very quickly. I push for taking the MAU install base and slicing it up into types of users. Here’s a hypothetical set of groupings:
- Low engaged users (66%): 3 days of activity or less
- Medium engaged users (14%): 4-10 days of activity
- Highly engaged users (10%): 11+ days of activity
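As a rough sketch, the grouping above could be expressed in Python like this. The thresholds come from the hypothetical groupings in the list; the function names and the sample `days_per_user` mapping are made up for illustration.

```python
from collections import Counter


def engagement_bucket(active_days):
    """Assign a bucket using the hypothetical thresholds from the list above."""
    if active_days <= 3:
        return "low"
    if active_days <= 10:
        return "medium"
    return "high"


def bucket_shares(days_per_user):
    """Return the percentage of monthly active users in each bucket."""
    buckets = Counter(engagement_bucket(d) for d in days_per_user.values())
    total_users = len(days_per_user)
    if total_users == 0:
        return {"low": 0.0, "medium": 0.0, "high": 0.0}
    return {
        name: 100.0 * buckets.get(name, 0) / total_users
        for name in ("low", "medium", "high")
    }


# Hypothetical input: user_id -> number of distinct active days in the month.
days_per_user = {"a": 1, "b": 2, "c": 5, "d": 20}
shares = bucket_shares(days_per_user)
print(shares)
```

Keeping the thresholds in one small function makes it easy to adjust the cut-offs once you see where your own product's usage clusters.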
There are a bunch of plays that I could see happening for each of these buckets:
- Sales: I could see sales following up with customers that fall into the highly engaged bucket. If they’re free, I could see them seeing value in paid tiers of your product. If they’re already paid customers, they are probably the most likely bucket to see value in additional paid options.
- Services: I could see customer success reaching out to the low engaged bucket to understand why they aren’t using the tool more frequently. In a B2B company where customer success is focused on retention, this is an area of high potential churn.
- Product: I could see the product team looking to build features that address the missing functionality users need to use it more. They could also work on retention hooks that pull users back into the product / get them to see more value in the tool.
- Marketing: I could see the marketing group targeting users based on the bucket they fall into and how they might see value from additional features.
If you’re interested in doing this yourself, check out this Jupyter notebook for sample code.
January 8, 2018 at 6:13 pm
Great write up, Dan! The Jupyter notebook is especially helpful.
What other user segmentation approaches do you most commonly use?
January 8, 2018 at 7:49 pm
Since I’m in B2B, I like to look at MAU on an organization level.
I like segmenting on a bunch of other criteria:
Geo
Industry
SKU
Free vs. Paid
Language
MAU
I try to keep it simple to start since you can spend a lot of time going down rabbit holes.
December 22, 2020 at 11:33 am
Hey Dan, again, very well explained!
I’m looking at the frequency histogram in my company with very similar numbers to your fake example. Would you look at the histogram or the CDF curve when deciding what your product’s frequency of use is? What would be the frequency of use in your fake example?
Thank you!
December 28, 2020 at 5:01 pm
The charts show the same information in slightly different ways, so you should come to the same conclusion regardless of which chart you look at. The first chart shows the number of users counted by their days of usage in the past month, while the second chart shows the cumulative percentage of all of the active users in the past month. The first data point is shown as an absolute number in one chart and as a cumulative percentage of monthly users in the second. The first data point is around 33,000 users, which represents about 1/3 of the active user base in that month. The second data point is about 20,000 users, and 33,000 + 20,000 = 53,000, or about 53% of your monthly active users. Let me know if that makes sense as an explanation of how the data points are calculated and match up in the charts.
I would use the histogram to measure the proper frequency (daily, weekly, monthly) and the CDF as another lens to understand the high level trends. Sometimes it can be hard to understand where the inflection point is, for example, where do you cross the 50% barrier.
I think the example represents monthly usage, because 66% of your monthly users are active for 3 or fewer days in a month.
January 8, 2021 at 11:06 am
Yes, that totally makes sense.
In my hope to see a weekly frequency instead of a monthly one, I had tried to see patterns that didn’t really exist.
Thanks for clarifying!