Author: danwolch

Snapchat’s Secrecy and DAU Metrics

I was interested to read about Snapchat’s DAU numbers and its culture of secrecy. At first, I was shocked by the stories employees told about the lengths the company goes to in order to keep its information private.

It’s consistent with anecdotal stories I’ve heard about Snap (they’re serious about privacy and keeping their own information a secret), but I always try to take these stories with a grain of salt.

I immediately thought about how we try to do things differently at HubSpot. Two elements of the HubSpot culture code are using metrics and being transparent across the organization. I thought “we’re totally different from that here at HubSpot”, and I bet many of you thought the same thing when reading the Snapchat article. While I think we strive to be different, we’re far from perfect and are constantly trying to improve. Here are some of the questions I asked myself (and ways I want to hold myself and our teams accountable):

  • Does everyone in the organization have access to data (behavioral analytics, data warehouse) that helps them make better decisions?
  • For those that aren’t technical, is it accessible with non-technical tools?
  • Just because they have access to the data, do they leverage it in their ideas, analysis, and proposals?
  • Do we have sufficient documentation about how to use the data that’s available to all employees?
  • Do we make an effort to train people on using the data so they are as self-sufficient as possible?
  • Do we create a culture of sharing and encouraging others to showcase their findings?
  • Do we enable others to reproduce analysis that has been done in the past?

 

While I like to think we’re better than the portrayal of Snapchat in the article, I’m not 100% satisfied with the answers to the questions above.

Choosing a behavioral analytics system: our journey to Amplitude

As part of my role at HubSpot, I run a team of analysts and data scientists who use quantitative analysis to inform our product development and improve the customer experience. It’s our goal to make teams self-sufficient in answering questions like “How many people are using this feature?”, “What percentage of signups do X?”, or “How sticky is this feature?” In addition, we perform analysis and build models to identify and act upon areas of opportunity. One of the main tools we use on a daily basis is our behavioral analytics system, which helps us understand what our customers are doing inside our product.

I’ve become increasingly obsessed with behavioral analytics over the years. Here’s a brief timeline of my experience with them:

  • 2013:
    • Join HubSpot, start building a new product with Mixpanel
    • Blown away by the type of analysis it enables. Mind. Blown. It revolutionizes how I think about building products
  • 2014:
    • Grow the new product to hundreds of thousands of users, start to get nervous about our Mixpanel bill (this was a huge mistake in hindsight)
  • 2014 – 2015:
    • HubSpot decides to build its own internal behavioral analytics system
    • Rationale:
      • HubSpot is a public company at this stage, and it’s a competitive advantage to have complete ownership and control over this system
      • If it costs as much as an engineer’s salary, why not pay someone to build a system customized for us?
      • We could solve our own problem, then turn the solution into a product that could be sold to customers
  • 2016:
    • Perform a vendor assessment of our internal tool vs. a vendor (for a variety of reasons, to be explained in a future post)
    • We choose to go with Amplitude as our new behavioral analytics system
  • 2017:
    • Finish our migration to Amplitude. We currently have 250-300 HubSpotters using Amplitude on a monthly basis

Why did we pick Amplitude? Some key reasons:

  • They allowed us to create charts that count by users or by other arbitrary identifiers. Since HubSpot is a B2B company, we want to track active companies, look at the conversion rates for key actions across all users in a company, and look at company retention. Amplitude had the best solution: it allowed us to change one option in an existing chart to toggle between users and organizations. Other vendors could technically solve this, but I thought their approaches were too cumbersome.
  • They had an option to store our data in a SQL database (at the time Redshift, now it’s Snowflake). The important piece is that it allows our business intelligence team to ingest the data at a regular interval so it could be combined with other data sources. We use Looker internally, and we want to take behavioral data and combine it with financial data, CRM data, support data, and any other data loaded into our data warehouse.
  • They were focused on product analytics. We felt that their roadmap aligned perfectly with our priorities and long-term goals.
  • We had a team of 3 engineers and some of a PM’s time devoted to our internal tool. Amplitude has a much bigger engineering team and we didn’t think the customizations we would build were worth it. We felt the product team’s efforts were better spent generating value for the company, not in building a tool that was (at best and probably not the case) slightly better than Amplitude.
  • Their dashboard and behavioral cohort features were just what we wanted
  • It was fast. Our internal system had been plagued by slowness and outages (we had turnover on the team that built the internal tool and had then understaffed the team)

No solution is a panacea and I won’t say that Amplitude is perfect in every way, but I have been personally very happy with the decision we made. I’m pretty bullish on all of the companies in this space (I think they’re all powerful and worth the money), and unless there’s a fundamental shift in the technology required for these kinds of systems, I don’t want to be involved in building another one from scratch.

Flying blind: not setting or measuring product metric goals

I love building new products. Ever since I was building junky web apps as a geeky high schooler, I have gotten excited the first time something actually works. It has always felt like magic. Now that I’m older, I increasingly feel the pressure of showing my impact. After the initial euphoria passes, I now immediately measure the metrics that represent success. Something that has been bothering me lately is that, regardless of the methodology people follow (waterfall, agile, scrum, burndown, Trello anarchy, etc.), I rarely hear them talk enough about product success metrics.

When I joined HubSpot I learned from many others about behavioral analytics. Sadly, when I speak with friends in the industry, I find myself constantly fighting responses such as:

  • “We forgot to add tracking”
  • “We want to ship it and see how it does”
  • “We don’t have any specific goals for this release other than to improve the design”
  • “What should we measure?”
  • “We can’t afford to use behavioral analytics, it’s too expensive”

This is how I want to react every time I hear one of those answers:

[Image: a guy in a panda suit breaks a computer on someone's desk]

Just kidding. I am always asking questions to understand the rationale so I can try to help add perspective.

These are the tough questions I want to ask in response:

  • What’s more expensive? A behavioral analytics system or shipping the wrong features / wasting the time of your product and engineering team?
  • If you hear feedback from a couple of customers, is that representative of all users?
  • How do you know that the users are actually doing what they say they’re doing?
  • Do you think you’ll get a team’s best work if the only goal is to release their work?
  • What do you think will garner more resources in the future? “We improved the experience, just look at it!” vs. “I increased conversion rates of signup to value by 10%, with an expected lift in revenue of Y”.


I don’t think you need to spend weeks off in a corner crunching numbers to come up with the answers to these questions. My suggestion is to spend 30 minutes thinking about a goal, why you’re working on something, and then a simple mechanism to measure success.

I push teams to answer these questions:

  1. What represents success for this release/feature?
  2. What is the current baseline?
  3. What is the hypothetical ceiling of improvement?
  4. Given the baseline and ceiling, how much do you think you can improve the metric?
  5. What will be the mechanism to track success/failure?
  6. When should you evaluate progress?

You don’t have to be super fancy and build Excel models, but at least spend 15-30 minutes thinking through the basics for a new feature. Regardless of whether you’re building something brand new or iterating on an old feature, I always think it’s worth considering the above questions.

As the saying goes, “if you can’t measure it, you can’t improve it”.

Segment Your User Base: Depth of Engagement

If you haven’t read Jonathan Hsu’s 8-part Medium series on Social Capital’s diligence process, add it to your reading list. I didn’t immediately grok all of the concepts in the series, but it has had an incredible impact on how I look at product metrics.

It appears to be a big part of their recent announcement about how they are able to fund early-stage companies based exclusively on their metrics.

One of the concepts that struck me was the depth of engagement. It shows you how engaged different portions of your user base are. You don’t need a ton of fancy data science techniques to get a glimpse into what your user base is doing. All you need is a fairly straightforward SQL query to get you started.

It starts with a fairly simple concept: how many users are active for 1 day in the past month? How many are active for 2 days in the month? It’s really simple to generate a histogram (this is fake data) that looks like this:

Count of Users by Days Active
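As a rough illustration of how such a histogram could be computed, here is a minimal Python sketch. It assumes you can export one (user_id, activity_date) row per event from your analytics system; the `events` data and the `days_active_histogram` name are made up for this example, and the same counting could just as well be done in SQL.

```python
from collections import Counter
from datetime import date

# Made-up activity log: one (user_id, activity_date) row per event.
events = [
    ("u1", date(2018, 1, 2)),
    ("u1", date(2018, 1, 2)),  # a second event on the same day counts once
    ("u1", date(2018, 1, 3)),
    ("u2", date(2018, 1, 5)),
    ("u3", date(2018, 1, 5)),
    ("u3", date(2018, 1, 6)),
    ("u3", date(2018, 1, 9)),
]


def days_active_histogram(events):
    """Return {days_active: number_of_users} for the given events."""
    days_by_user = {}
    for user_id, day in events:
        # A set keeps only the distinct days each user was active.
        days_by_user.setdefault(user_id, set()).add(day)
    counts = Counter(len(days) for days in days_by_user.values())
    return dict(sorted(counts.items()))


histogram = days_active_histogram(events)
print(histogram)
```

The sketch assumes the events have already been filtered to the month you care about.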

This hypothetical product has 100k monthly active users (MAUs). I think this view is very telling and interesting from a strategy and operational perspective, but there’s a different view that I now prefer. I prefer to look at this chart on a percentage basis (the % of MAUs), and look at it cumulatively. This is what it looks like:

CDF of Monthly Active Users by Days Used

How to read this chart: 33% of the MAUs are active for a single day of the month. It may be the first day of the month or the last, but the people that fall into this bucket were only active for a single day in the month. 53% of the MAUs were active for 2 or fewer days – you add up the 33k and the 20k from the histogram to get the 53%. In Jonathan’s example there’s a little bit of a spike of users that are active every day of the month – in a bunch of the examples I’ve seen in the B2B space there’s a nice healthy bump around 20 days, which makes sense when you consider that B2B apps are most likely used every business day, rather than every day.
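The cumulative view is just a running sum over the histogram, divided by the total. Here is a minimal Python sketch under that assumption. Only the first two buckets (33k and 20k out of 100k MAUs) come from the fake example above; the remaining 47k users are lumped into a single placeholder bucket purely so the numbers add up, and the `cumulative_share` name is hypothetical.

```python
def cumulative_share(histogram):
    """Return {days_active: % of users active for that many days or fewer}."""
    total = sum(histogram.values())
    running = 0
    result = {}
    for days in sorted(histogram):
        running += histogram[days]
        result[days] = 100 * running / total
    return result


# 33k and 20k are from the fake example; the 20-day bucket is a placeholder
# that holds the rest of the 100k MAUs.
histogram = {1: 33_000, 2: 20_000, 20: 47_000}

cdf = cumulative_share(histogram)
print(cdf[1], cdf[2])
```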

This is a powerful way to slice up your install base very quickly. I push for taking the MAU install base and slicing it up into types of users. Here’s a hypothetical set of groupings:

  • Low engaged users (66%): 3 days of activity or less
  • Medium engaged users (14%): 4-10 days of activity
  • Highly engaged users (10%): 11+ days of activity
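Once you have the days active per user, assigning buckets is a simple threshold check. The sketch below uses the thresholds from the groupings above (3 days or fewer, 4-10 days, 11+ days); the user data and the `engagement_bucket` name are made up for illustration, and the right cutoffs for your product may differ.

```python
from collections import Counter


def engagement_bucket(days_active):
    """Map days active in the month to a bucket, using the thresholds above."""
    if days_active <= 3:
        return "low"
    if days_active <= 10:
        return "medium"
    return "high"


# Made-up input: user id -> distinct days active this month.
days_active_by_user = {"u1": 1, "u2": 3, "u3": 4, "u4": 10, "u5": 11, "u6": 22}

bucket_counts = Counter(engagement_bucket(d) for d in days_active_by_user.values())
bucket_share = {
    bucket: 100 * count / len(days_active_by_user)
    for bucket, count in bucket_counts.items()
}
print(bucket_counts)
```

A list of user ids per bucket is what you would hand to sales, services, or marketing for the plays described below.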

There are a bunch of plays that I could see happening for each of these buckets:

  • Sales: I could see sales following up with customers that fall into the highly engaged bucket. If they’re free, I could see them seeing value in paid tiers of your product. If they’re already paid customers, they are probably the most likely bucket to see value in additional paid options.
  • Services: I could see customer success reaching out to the low engaged bucket to understand why they aren’t using the tool more frequently. In a B2B company where customer success is focused on retention, this is an area of high potential churn.
  • Product: I could see the product team looking to build features that address the missing functionality users need to use it more. They could also work on retention hooks that pull users back into the product / get them to see more value in the tool.
  • Marketing: I could see the marketing group targeting users based on the bucket they fall into and how they might see value from additional features.

If you’re interested in doing this yourself, check out this Jupyter notebook for sample code.

How to quickly format retention data in less than 5 minutes to maximize learning

Formatting a retention data set is critical whether it’s for yourself or in a situation like a job interview. Once you’ve asked all of the questions necessary to understand the data set, you should format it to maximize your ability to analyze it. Here’s my step-by-step process:

Export the raw data from your analytics system

It should look like the table below. The cohorts are in the first column, the cohort size is next, and then the number of users from the original cohort who were retained in each subsequent period.

retention-1

Create another table that computes the percentage values of the original cohort size

retention-1
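If you would rather script this step than do it in a spreadsheet, the percentage table is each retained count divided by its cohort's original size. Here is a minimal Python sketch; the cohort labels, sizes, and counts are made up, and the `retention_percentages` helper is a hypothetical name. Newer cohorts have fewer periods of data, which is why the rows get shorter.

```python
# Made-up export: cohort label -> original size and retained counts per period.
cohorts = {
    "week of Jan 1": {"size": 200, "retained": [100, 80, 60]},
    "week of Jan 8": {"size": 250, "retained": [150, 100]},
    "week of Jan 15": {"size": 300, "retained": [120]},
}


def retention_percentages(size, retained):
    """Express each retained count as a percentage of the original cohort size."""
    return [round(100 * count / size, 1) for count in retained]


percent_table = {
    label: retention_percentages(row["size"], row["retained"])
    for label, row in cohorts.items()
}

for label, percentages in percent_table.items():
    print(label, percentages)
```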

Add a conditional formatting element to show the size of the cohorts over time (conditional formatting -> data bars)

retention-4

Apply conditional formatting to the table that contains the percentages. It should now look like this:

retention-3

Adding summary rows

Then I create two new rows at the bottom of the percentage table. The first is the average of each column, and the second is the percentage decrease for each additional week.

retention-5

retention-6
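The two summary rows can also be sketched in Python. This assumes the "percentage decrease" row means the relative drop from the previous week's average, which is one reasonable reading of the step above; the percentage rows and both function names are made up for this example. Columns are averaged only over the cohorts that have data for that week.

```python
def column_averages(rows):
    """Average each column, skipping cohorts that have no value for it yet."""
    width = max(len(row) for row in rows)
    averages = []
    for i in range(width):
        values = [row[i] for row in rows if len(row) > i]
        averages.append(sum(values) / len(values))
    return averages


def week_over_week_drop(averages):
    """Relative % drop from the previous week's average (None for week one)."""
    drops = [None]
    for previous, current in zip(averages, averages[1:]):
        drops.append(100 * (previous - current) / previous)
    return drops


# Made-up percentage table: one row per cohort, newest cohorts are shortest.
percent_rows = [
    [50.0, 40.0, 30.0],
    [60.0, 40.0],
    [40.0],
]

averages = column_averages(percent_rows)
drops = week_over_week_drop(averages)
print(averages, drops)
```

A large value in the second row points at the week where retention falls off the most, and values near zero in later weeks suggest retention is leveling off.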

Wrap it up

Now you have a nicely formatted retention table that shows you:

  1. The size of cohorts over time
  2. At a glance how your cohorts retain over time
  3. Where there are good pockets and bad pockets of retention
  4. The average of cohort retention over time
  5. Which weeks have the biggest drops in retention
  6. Whether your retention levels off over the long term

While there are many more graphs you could create off of this data set, I think this sets the right foundation for how to quickly look at the data and answer some important questions.
Are there other ways you visualize this information? Do you recommend doing this another way or formatting it differently? Let me know in the comments below.
Disclaimer: all of the data in these screenshots was made up for this example.

© 2018 Dan Wolchonok
