CategoryStartups

Analyzing Metabase Usage

I am the head of data at Reforge (we’re hiring!) and data is critical to our success. We use it to track our business, individual teams leverage it in their work, and we believe in empowering people in the company with data to help them do great work. We use all sorts of great tools, but as the head of data I felt it was important to understand whether people were actually using our BI (business intelligence) tool.

I built a weekly report that lays out how my company is using our BI tool, Metabase. It has the following reports:

  • Dashboards created in the past 7 days
  • Questions created in the past 7 days
  • Individual Dashboard views by our exec team in the past 7 days
  • Dashboard view counts by user in the past 7 days
  • Activity by my boss (our CTO)
  • Metabase MAUs
  • Metabase WAUs

It helps me to understand at a high level what is happening inside the company from a data perspective:

  • Who is building reports (questions / dashboards)
  • What questions people are asking of our data
  • What is most important to our exec team
  • Who is looking at information on a regular basis
  • What is top of mind for my boss
  • How engaged our company is with our data & analytics

It was surprisingly easy to set this up. All we had to do was add the Metabase internal database as a source to Metabase (sounds confusing, but actually quite simple). It wouldn’t shock me if this became a paid feature long-term for Metabase on their enterprise plan, but it took less than 30 minutes to figure out their data model and build some quick and dirty SQL questions that produced answers to our most important questions. Metabase is already storing all of the data we needed to visualize to get answers to our questions.

I set this up as a dashboard subscription so it is delivered to my email every week and I can quickly scan it to understand what’s top of mind for the org.

Reforge is Data Driven

Since the beginning of the company we’ve looked at data to understand how well we’re tracking against key goals for the company (revenue, applications, content completion, weekly activity, etc). These stats span across the entire company in terms of usefulness (support, sales, marketing, product, engineering, design, finance, etc) and we strive to make as much information accessible to everyone so that they can be data informed.

If you’re considering joining a company – do you want to be at one that uses data to help inform decisions? You can see below that Reforge employees are using data on a regular basis and you can see how it scales as we’ve grown the team and hit key milestones. We’ve typically seen 75-100% of employees leveraging Metabase on a monthly basis.

Our Metabase monthly active users along with company milestones

The queries that we use to power the charts:

Dashboards created in the past week

select created_at, name, u.email, left(description, 40) as description, 'https://BI_HOSTNAME/dashboard/' || d.id as url
from report_dashboard d
    inner join core_user u 
        on d.creator_id = u.id
where created_at > current_date - interval '7 days'
order by 1 desc

Questions created in the past week

select created_at, name, u.email, 'https://BI_HOSTNAME/question/' || d.id as url
from report_card d
    inner join core_user u 
        on d.creator_id = u.id
where created_at > current_date - interval '7 days'
order by 1 desc

Exec team dashboard views in the past week

For this report, I wanted to know what the exec team was looking at in the past week. I want to know how much they’re looking at the metrics and whether they’re looking at the dashboards I think they should be looking at.

select u.email, d.name, count(v.id)
from view_log v
    inner join core_user u
        on v.user_id = u.id
    inner join report_dashboard d
        on v.model_id = d.id
where model = 'dashboard'
    and u.email in 
        (LIST_OF_EMAILS)
    and v.timestamp > (current_date - interval '7 days')
group by 1, 2
order by 3 desc
limit 200

Dashboard views leaderboard (views in the past week)

This shows the number of views by user in the past 7 days. It’s always a good sign when your head of data isn’t the most prolific viewer of dashboards.

List of users and their views
select u.email, count(v.id)
from view_log v
    inner join core_user u
        on v.user_id = u.id
    inner join report_dashboard d
        on v.model_id = d.id
where model = 'dashboard'
    and v.timestamp > (current_date - interval '7 days')
group by 1
order by 2 desc
limit 200

Most popular dashboards by views

select d.name, count(v.id)
from view_log v
    inner join core_user u
        on v.user_id = u.id
    inner join report_dashboard d
        on v.model_id = d.id
where model = 'dashboard'
    and v.timestamp > (current_date - interval '7 days')
group by 1
order by 2 desc
limit 200

Questions created by a single person

select created_at, name, u.email, 'https://BI_URL/question/' || d.id as url
from report_card d
    inner join core_user u 
        on d.creator_id = u.id
where created_at > current_date - interval '28 days'
    and u.email = EMAIL_OF_YOUR_BOSS
order by 1 desc

Metabase MAUs

select date_trunc('month', timestamp), count(distinct user_id) 
from view_log 
group by 1
order by 1

Metabase WAUs

select date_trunc('week', timestamp), count(distinct user_id) 
from view_log 
group by 1
order by 1

Flying blind: not setting or measuring product metric goals

I love building new products. Ever since I was building junky web apps as a geeky high schooler, I always get excited the first time something actually works. It has always felt like magic. Now that I’m older, I increasingly feel the pressure of showing my impact. After the initial euphoria passes, I now immediately measure the metrics that represent success. Something that has been bothering me lately is that regardless of your methodology (waterfall, agile, scrum, burndown, trello anarchy, etc), I never hear others talk enough about product success metrics.

When I joined HubSpot I learned from many others about behavioral analytics. Sadly, I find myself constantly fighting responses when I speak with friends in the industry such as:

  • “We forgot to add tracking”
  • “We want to ship it and see how it does”
  • “We don’t have any specific goals for this release other than to improve the design”
  • “What should we measure?”
  • “We can’t afford to use behavioral analytics, it’s too expensive”

This is how I want to react every time I hear one of those answers:

A guy in a panda suit breaks a computer on someone's desk

Just kidding. I am always asking questions to understand the rationale so I can try to help add perspective.

These are the tough questions I want to ask in response:

  • What’s more expensive? A behavioral analytics system or shipping the wrong features / wasting the time of your product and engineering team?
  • If you hear feedback from a couple of customers, is that representative of all users?
  • How do you know that the users are actually doing what they say they’re doing?
  • Do you think you’ll get a team’s best work if the only goal is to release their work?
  • What do you think will garner more resources in the future? “We improved the experience, just look at it!” vs. “I increased conversion rates of signup to value by 10%, with an expected lift in revenue of Y”.


I don’t think you need to spend weeks off in a corner crunching numbers to come up with the answers to these questions. My suggestion is to spend 30 minutes thinking about a goal, why you’re working on something, and then a simple mechanism to measure success.

I push teams to answer these questions:

  1. What represents success for this release/feature?
  2. What is the current baseline?
  3. What is the hypothetical ceiling of improvement?
  4. Given the baseline and ceiling, how much do you think you can improve the metric?
  5. What will be the mechanism to track success/failure?
  6. When should you evaluate progress?

You don’t have to be super fancy and build Excel models, but at least spend 15-30 minutes thinking through the basics for a new feature. Regardless if you’re building something brand new or iterating on an old feature, I always think it’s worth considering the above questions.

As the saying goes, “if you can’t measure it, you can’t improve it”.

How to quickly format retention data in less than 5 minutes to maximize learning

Formatting a retention data set is critical whether it’s for yourself or in a situation like a job interview. Once you’ve asked all of the questions necessary to understand the data set, you should format it to maximize your ability to analyze it. Here’s my step by step process:

Export the raw data from your analytics system

It should look like the below. The cohorts are in the first column, the size is next, and then the number from the original cohort that retained in each subsequent period.

retention-1

Create another table that computes the percentage values of the original cohort size

retention-1

Add a conditional formatting element to show the size of the cohorts over time (conditional formatting -> data bars)

retention-4

Apply conditional formatting to the table that contains the percentages. It should now look like this:

retention-3

Adding summary rows

Then I create two new rows at the bottom of the percentage table. The first is the average of each column, and the second is the percentage decrease for each additional week.

retenion-5

retention-6

Wrap it up

Now you have a nicely formatted retention table that shows you:

  1. The size in cohorts over time
  2. At a glance how your cohorts retain over time
  3. Where there are good pockets and bad pockets of retention
  4. The average of cohort retention over time
  5. Which weeks have the biggest drops in retention
  6. Whether your retention levels off over the long term

Where there are many more graphs you could create off of this data set, I think this sets the right foundation for how to quickly look at the data and answer some important questions.
Are there other ways you visualize this information? Do you recommend doing this another way or formatting it differently? Let me know in the comments below.
Disclaimer: all of the data in these screenshots was made up for this example.

The Retention Questions You Should Be Asking in a Job Interview

In my experience, job applicants rarely ask the right questions when faced with a retention exercise. You can create a retention chart for anything, but I think these are the questions you simply have to ask:

  1. What do the cohorts measure?
  2. What is the criteria for retention (the % of the cohort that returns)?
  3. How can I segment the retention data?

There are many follow up questions you should be asking based on the answer.

Question 1: What do the cohorts measure?

  • Is it any People? Companies? Revenue? People using it at a company? Is it a subset of any of those concepts?
  • If it’s people — what kind of people? Is it people who did some action? Example: is it people who signed up for a product? Is it people who saw some value from your product? Bought a product? What is the trend of the cohorts over time? What could affect that? How do you expect the numbers to change over time?

Question 2: What is the criteria for retention?

  • How is retention measured? If it’s people, what action do they need to take to be counted as having been retained? Why was that chosen? Why is it important?
  • If it’s revenue, how is it measured?
  • If it’s an attribute of a person or a company (maybe number of people active), how is that measured? Why was that chosen? How is it important to the business?
  • Does it level off? If you have a budget to acquire people in the cohorts, how long must they stick around in order to sustain a profitable business?
  • How does the data change over time? Are there patterns that you would expect to see, or cliffs where things drop off?
  • Do you expect to see patterns horizontally (step function changes from one cohort to another), vertically (time barriers where cohorts worsen or improve), or diagonally (holidays, press spikes, re-engagement campaigns)?

Question 3: How can I segment the retention data?

  • Are there pockets in the data that are better or worse than the overall average?
  • Is there a feature that results in better retention?
  • Is there a platform (ex: Android, iOS) that is better than another one?
  • Is there a cohort acquisition source that performs better?
  • Is there a geography that performs better in retention?
  • Is there a cohort attribute that results in higher retention? Age / sex / profession of people? Size of company? Industry of company?

If you’re doing a retention exercise in an interview, you don’t even really need to look at the data, you could just ask questions the whole time.

Identify users with the most valuable feedback

In my time as a product manager, I was constantly reminding myself to talk to customers more. It might have been to talk about existing features, something under development, or customer pain and processes for product research. It was easy in the beginning, because we knew most of the people using the tool as we worked on the initial version. Even as we got bigger, my feelings usually boiled down to these four words: talk to customers more. I think a critical skill, however, is learning how to talk to the right customers.

As I worked on the HubSpot sales products and we grew it to hundreds of thousands of active users, I quickly realized I couldn’t speak to everyone. I needed to be strategic about who I reached out to, and the type of feedback I was looking for. It’s a dangerous path to go down, because you can end up wasting time by over-analyzing your data and getting into analysis paralysis. Even worse, in my opinion, is taking the existing feedback as representative of your user base, and only solving for the loudest segment of the customers or the customers that are most convenient to interact with.

I want to share the process that Brian Balfour showed me. It’s fairly easy (and inexpensive!) to identify who you should reach out to, and collect their feedback.

Step 1: Identify who you want to speak with

Are you working on activation? If so, you want to target people who sign up but don’t really get started or see value. If you’re working on retention, you need to dig into why someone would start to use it but end up quitting your product. Not sure which one you should be solving for? It depends.

Step 2: Find those people

Any good behavioral analytics system will allow you to look at the number of events that someone has done over time. Let’s say we want to find users who start to use the product, but then stop. Most analytics systems will tell you the raw number of times an event happened, or the unique users that did it. Here’s a sample from the Amplitude demo for when someone plays a song:

The graph above is great for evaluating trends, but the goal is to speak with individual users about their experience. You need to go a level deeper and visualize this data on an individual user level. You should be able to get it by the user’s email. Once you have that, export it to Excel so you can play with it more closely.

For the next images, imagine I exported data that has the number of events someone performed on a daily basis for my app’s most important events. I then would create a pivot table:

This pivot table has a list of email addresses in the first column, alongside the number of times an event happened per day. You could do the same on an hourly / daily / weekly / monthly basis, depending on what makes sense for your app.

Next, I typically do some more filtering to evaluate who I could speak with. I set up this filter to find the people that weren’t active on 6/18, but were active on 6/16 or 6/17. I expect people to use my app every day, and I am shocked when they don’t. I’m curious to know why they didn’t use it.

You could spend a lot of time doing this type of analysis. I’d try to limit yourself to less than thirty minutes of playing with a spreadsheet like the above. I’d bucket your search into a couple of categories:

Super active users:

  • They use the crap out of your product, consistently. I’d be curious to speak with them to understand what they like about your product and why it’s so valuable to them.

Drive by users:

  • People who check out your application quickly and then leave, never to return. What was their impression? Why didn’t they stick around? If you’re using a product like Fullstory, are you able to see what they did in your product?

Engaged people who quit

  • I’d look for people who used it for a minimum of some period (a month? Multiple weeks?) and then stopped using it. These people presumably understand at least a portion of your product, but then took action to stop using it.
  • As you can see in the screenshot, I added a column where I computed the number of days someone had performed an action. If done over weeks / months, this is helpful to quickly find the users who were long term users.

Step 3: Reach out to users

This should produce some users that you want to speak with. Now comes the fun part! I took the list of emails, and I send them an email soliciting feedback. This is where I hear a ton of complaints from product managers like “I don’t have enough people to email” or “people never click through to the survey.”

I typically get 10–20% response rates on the emails that I send out. My secret is that I send out the emails from my Gmail account. I BCC lists of users and when I contacted them so I have a record and don’t email people multiple times. This works for multiple reasons:

  1. It ends up in their inbox, not the promo inbox
  2. It feels more personal, there’s no professional email template
  3. One final key: they only have to respond to give you feedback. Nobody wants to click on a link to fill out a survey.

Here’s the email that I send out:

Nobody wants to click on a link to fill out a survey, even if it’s a one question survey. I often get back soliloquies from users with incredibly valuable feedback. I’m grateful, and I also reply back to the emails multiple times doing a typical five whys analysis. This is another reason why it’s superior to embedding a google form into an email.

Step 4: Collect / Analyze the feedback

If you choose to send out this survey to a large group of people, it’ll quickly become difficult to report off the trends and high level information. This is where I use Zapier.

I connect the Gmail and Google Docs zaps like this:

I filter the responses to the survey into a custom label so it doesn’t overload my inbox. I use Zapier to pull the responses into a Google Spreadsheet, so I can easily read them and bucket the responses:

You can see that I have manually gone through and added a feedback bucket for each of the responses. I try to bucket reasons in a handful of categories, and look for common language patterns for how people describe problems. That allows me to create reports for development teams that look like the following:

The value isn’t in doing this with free or low friction tools — the value is in the insight you get into your users and what they love or hated about your product. I’m sure that there are fancy tools that help with this type of analysis, but the bottom line is that you don’t have an excuse for not doing this type of process. It works incredibly fast, produces results, and is free (other than your existing analytics system). I like this because it gets you really close to your users and their behavior, and allows you to quickly get your hands dirty and get some actionable feedback.

How could this process be better? How do you find and the solicit feedback from the users that will have the biggest impact on your company? Let me know through a comment, or drop me an email (it’s on my personal blog).

Thanks to Magdalena Georgieva, Lars Osterberg, and Brian Balfour for reading drafts of this post.

© 2022 Dan Wolchonok

Theme by Anders NorénUp ↑