Analyzing Metabase Usage

I am the head of data at Reforge (we’re hiring!) and data is critical to our success. We use it to track our business, individual teams leverage it in their work, and we believe in empowering people in the company with data to help them do great work. We use all sorts of great tools, but as the head of data I felt it was important to understand whether people were actually using our BI (business intelligence) tool.

I built a weekly report that lays out how my company is using our BI tool, Metabase. It has the following reports:

  • Dashboards created in the past 7 days
  • Questions created in the past 7 days
  • Individual Dashboard views by our exec team in the past 7 days
  • Dashboard view counts by user in the past 7 days
  • Activity by my boss (our CTO)
  • Metabase MAUs
  • Metabase WAUs

It helps me to understand at a high level what is happening inside the company from a data perspective:

  • Who is building reports (questions / dashboards)
  • What questions people are asking of our data
  • What is most important to our exec team
  • Who is looking at information on a regular basis
  • What is top of mind for my boss
  • How engaged our company is with our data & analytics

It was surprisingly easy to set this up. All we had to do was add the Metabase internal database as a source to Metabase (sounds confusing, but actually quite simple). It wouldn’t shock me if this became a paid feature long-term for Metabase on their enterprise plan, but it took less than 30 minutes to figure out their data model and build some quick and dirty SQL questions that produced answers to our most important questions. Metabase is already storing all of the data we needed to visualize to get answers to our questions.

I set this up as a dashboard subscription so it is delivered to my email every week and I can quickly scan it to understand what’s top of mind for the org.

Reforge is Data Driven

Since the beginning of the company we’ve looked at data to understand how well we’re tracking against key goals for the company (revenue, applications, content completion, weekly activity, etc). These stats span across the entire company in terms of usefulness (support, sales, marketing, product, engineering, design, finance, etc) and we strive to make as much information accessible to everyone so that they can be data informed.

If you’re considering joining a company – do you want to be at one that uses data to help inform decisions? You can see below that Reforge employees are using data on a regular basis and you can see how it scales as we’ve grown the team and hit key milestones. We’ve typically seen 75-100% of employees leveraging Metabase on a monthly basis.

Our Metabase monthly active users along with company milestones

The queries that we use to power the charts:

Dashboards created in the past week

select created_at, name, u.email, left(description, 40) as description, 'https://BI_HOSTNAME/dashboard/' || d.id as url
from report_dashboard d
    inner join core_user u 
        on d.creator_id = u.id
where created_at > current_date - interval '7 days'
order by 1 desc

Questions created in the past week

select created_at, name, u.email, 'https://BI_HOSTNAME/question/' || d.id as url
from report_card d
    inner join core_user u 
        on d.creator_id = u.id
where created_at > current_date - interval '7 days'
order by 1 desc

Exec team dashboard views in the past week

For this report, I wanted to know what the exec team was looking at in the past week. I want to know how much they’re looking at the metrics and whether they’re looking at the dashboards I think they should be looking at.

select u.email, d.name, count(v.id)
from view_log v
    inner join core_user u
        on v.user_id = u.id
    inner join report_dashboard d
        on v.model_id = d.id
where model = 'dashboard'
    and u.email in 
        (LIST_OF_EMAILS)
    and v.timestamp > (current_date - interval '7 days')
group by 1, 2
order by 3 desc
limit 200

Dashboard views leaderboard (views in the past week)

This shows the number of views by user in the past 7 days. It’s always a good sign when your head of data isn’t the most prolific viewer of dashboards.

List of users and their views
select u.email, count(v.id)
from view_log v
    inner join core_user u
        on v.user_id = u.id
    inner join report_dashboard d
        on v.model_id = d.id
where model = 'dashboard'
    and v.timestamp > (current_date - interval '7 days')
group by 1
order by 2 desc
limit 200

Most popular dashboards by views

select d.name, count(v.id)
from view_log v
    inner join core_user u
        on v.user_id = u.id
    inner join report_dashboard d
        on v.model_id = d.id
where model = 'dashboard'
    and v.timestamp > (current_date - interval '7 days')
group by 1
order by 2 desc
limit 200

Questions created by a single person

select created_at, name, u.email, 'https://BI_URL/question/' || d.id as url
from report_card d
    inner join core_user u 
        on d.creator_id = u.id
where created_at > current_date - interval '28 days'
    and u.email = EMAIL_OF_YOUR_BOSS
order by 1 desc

Metabase MAUs

select date_trunc('month', timestamp), count(distinct user_id) 
from view_log 
group by 1
order by 1

Metabase WAUs

select date_trunc('week', timestamp), count(distinct user_id) 
from view_log 
group by 1
order by 1

Hiring a Senior Data Engineer at Reforge

It is an exciting time to be joining the data team at Reforge. In the past 16 months, we created the data team and scaled it to 8 people. On the team today, there are five analysts, a data engineer, an analytics manager, and me (I’m the Head of Data at Reforge).

We’ve been busy in this time. Some of the work that we’ve done:

  • Migrated our data warehouse from postgres to snowflake
  • Rebuilt our entire data warehouse schema organization
  • Built countless models to model our programs
  • Enabled the switch to a subscription model (reporting on subscriptions, change over time, renewals, etc)
  • Worked with our R&D team to adopt and use a behavioral analytics system
  • Defined our data dictionary
  • Hired and onboarded 8 people
  • Built data pipelines to enrich data in our system via Airflow
  • Migrated our email automation system

Reforge as a company has also been doing very well during this time. Our business has grown tremendously, we raised our $60M series B round, and we have more than doubled the team in that time.

We had the good fortune of having raised our Series B earlier this spring before the recent economic turmoil and have the the money, time, and resources now to invest heavily in creating new experiences and products. It’s an exciting time to be investing and building when it feels like many companies are cutting back or deprioritizing investments.

Data engineering will be a critical component to our upcoming growth and I’m particularly excited about the projects we are undertaking:

  • We will build data pipelines to enrich data about our prospects and customers. This will be the foundation for personalization and recommendations in the future. Having a reliable and performant system that can help us understand who someone is, where they work, and build classification models for their attributes will be critical for future product experiences as well as analyzing how our business is performing.
  • We will build more sophisticated foundational reporting for our growing analyst team to work with. Our analysts work with stakeholders in every department to help find insight in data, build visualizations, and impact our strategy and future roadmap. You will have a ton of leverage in creating impact – the data sources you create will be used by every department.
  • We will invest in additional data syncs between our systems to ensure that every customer interaction is excellent. Reforge is teaching how the best technology companies execute, and we aspire to be the best example of our own frameworks. Our customers are the leading practitioners in the tech industry and love to hold us accountable.
  • Revamp the infrastructure to enable reporting for our marketing organization. When the data team was spun up we didn’t have anyone in the marketing org, yet were growing 100% year over year due to word of mouth. The team has grown a lot since then, and we are becoming much more sophisticated in how we approach our marketing efforts. We need to empower our analysts and the marketing team with the data they need to understand how well we’re attracting people to Reforge, where our best return on investments are, and what best explains the value of Reforge.

We use best of breed tools in our data stack. Here’s a video of our current data stack. Most of these tools have been implemented in the past year and we will continue to invest heavily as our business grows and we become even more sophisticated.

If you know of anyone who wants more responsibility and has a strong vision for how data engineering should work today, please send them this post or let me know if I should reach out to them. This is a critical role for us to fill before the end of the year.

Apply here for the role.

The Reforge Data Team is hiring

We are hiring for three analysts on the Data Team at Reforge. We’re looking for a Product Analyst, a Senior Product Analyst, and a Senior Marketing Analyst.

It’s a very exciting time for Reforge and for the Data Team.

Helping People do the Best Work of Their Careers

Reforge is an amazing opportunity to work on a product that has a significant impact on people’s lives. We are taking the untapped knowledge, frameworks, and practices from industry leaders and making it accessible to our customers. Our goal is to help our members do the best work of their career by unlocking insights and then helping them apply it back to their jobs right away.

We routinely here from our members that it was the best educational experience they’ve had (better than their MBA if they got one), and that it has helped them be more confident in their role and drive a big impact for their business.

$21 Million Series A Investment

We raised our Series A investment from A16Z in February. At the time, our CEO Brian Balfour wrote about:

  • The history of Reforge
  • Why there’s a real need for our offering
  • The solution we are building

I was lucky to be one of the earliest employees that took this from a nascent concept through today. It was an incredible time to be at the company, but today’s inflection point is even more exciting. While we bootstrapped the company to eight figures of revenue, our recent fundraising round gives us the capital to invest even more aggressively.

We are the rare startup that has real revenue with generous margins and profits, yet also has raised venture capital and the ambition to continue growing 100% each year.

The Data Team at Reforge

The data team is just getting started here at Reforge. We’re looking for people who want to be a part of a small team that is growing fast and are comfortable with ambiguity and fast paced change. It’s a great opportunity for people who want to be a part of building a strong data practice at a company that sees data as invaluable in operations as well as strategic decisions.

The nature of our product is that it’s cross functional. We create content, host events, and build relationships between people through marketing, community, operations, product, design, and engineering.

We have a central data team so that every part of the org has access to all of the data to deliver an exceptional experience. Every part of the experience should be tailored to your role, your company’s business model, where you’re located, how senior you are, what you’ve done in the platform, and what your goals are today. The data team is here to make sure teams have the information they need to deliver and iterate on an exceptional experience.

The data org owns the growth model for the business. It’s our job to help bring together the metrics for the whole business to understand how we’re trending towards our goals, what is performing well, where there is opportunity to improve, and what the greatest points of leverage are. This is especially important for a cross functional product like ours where opportunities require multiple groups within the company to collaborate.

What we’re looking for

We’re looking for someone who can take ambiguous questions, run independent analyses, and then clearly communicate their findings.

  • Analysts will be required to field questions from many potential sources: the leadership team, PMs, Marketers, Engineering, Design, Operations, or support. Being able to listen to the questions of teams, clarify their goals, and then come up with the right way to analyze and summarize findings is critical.
  • Often times the initial question asked is not the right one or needs to be clarified and adjusted. It is not an analysts job to mindlessly give teams the answers to their questions, but to push back when necessary and be a collaborator with their teams to gain insights that will push the product experience forward and improve the business.

An analyst should be deeply quantitative but naturally curious. This person should have opinions about software experiences and be passionate about finding ways to quantify value to end users, impact to the business, areas for improvement, and insights into behavior.

  • Teams may not always know the right question to ask and the best analysts have their own opinions that they pursue.
  • Analysts should understand how the product works, the user psychology of end users, and how the product is different from its alternatives. The quantitative metrics that are used to measure its performance are a direct result of finding ways to express a deep understanding of the product.
  • The best analysts are looking around corners to help uncover pockets of strength, quantify areas that are under-performing, and are thinking about the best ways to influence people about what’s important and should be focused on.

An analyst should be an excellent communicator, story teller, and consultant.

  • We are not looking for analysts that are simply responsible for producing charts. They will need to be able to understand the motivation behind a request, suggest alternatives, and have the conviction to push back when they disagree to foster debate.
  • We want analysts who are able to tell a story with the data, explain the technical elements of an analysis but communicate why it’s important.
  • We want analysts to be consultative with their peers – they need to understand what will be impactful and resonate with an audience and tailor the summary accordingly.

The Tools we Use

As much as possible, we use the best of breed technology stack available today. We have roots as a profitable, bootstrapped company so we haven’t upgraded all of our tools yet but we are constantly evaluating each of the tools to ensure we’re using the best tool for the job.

Our core sources of data:

  • We have a read-replica of our production database so that we can query the latest and greatest production data in real time.
  • We have a data warehouse that is populated hourly by our customer data platform, Segment.
  • We are able to seamlessly run queries that mix and match data between the two systems so that we can use the latest production data with an analysis that uses raw event based data or models computed in the data warehouse.

Some of the tools we use:

  • We use DBT for modeling and data transformation in our data warehouse
  • We use github for our DBT models and all of our code
  • We use metaplane to ensure that we are the first to know of any structural changes to our schemas as well as unexpected changes or trends within our databases.
  • We use iterable as our email marketing tool
  • We use Segment for our event pipeline, reverse ETL tool, and to populate our data warehouse with event level information from sources of data.
  • We use Airflow for any jobs that connect to external services and to create more sophisticated models and jobs that can’t be created in pure SQL.
  • We use Amplitude as our behavioral analytics tool to get insights about how people are using our product, key conversion paths and funnels, and retention behavior.
  • We use Metbase as our BI tool. I’ve written about examples of how we use it here and here.
A simplified version of our data systems

How Teams are Structured

We are rapidly building our teams and how they are structured, but our strategy is a product led experience. That means that our product team is responsible for the core experience of our members. We are structuring teams in pods that own discrete elements of the experience. While every pod may vary, they will typically have:

  • A product manager
  • A product designer
  • A Tech Lead
  • Multiple software engineers
  • A marketer
  • A product analyst

It will be the product analysts responsibilities to:

  • Be the expert on the team about data
  • Empower the team to do their own reporting and answer the vast majority of their own questions
  • Do deeper analyses than any other team member can do

If this is interesting to you, please apply here.

Stripe + Segment + Metabase

I’ve written about how we use Metabase at Reforge, and how I’m a big fan. It has allowed us to make data accessible to anyone in the organization, whether it’s for a deep analysis or for a quick status update on an important initiative.

We use Segment as one of the key pieces of data our data infrastructure, and I recently turned on the Stripe integration. I was pleasantly surprised by how well it works. I authenticated into our Stripe account to configure it as a source within Segment. Segment then updates tables in our data warehouse with the latest and greatest data from Stripe. This is a screenshot of our database and the tables under the Stripe schema:

On one of our key dashboards showing our progress in generating revenue, I wanted a cumulative revenue chart. This is helpful to see how quickly we’re generating revenue, what our total revenue in a period of time is, and how our revenue growth compares to previous periods.

This is a sample chart that’s easy to setup in Metabase:

To generate it, this is the SQL I wrote to generate it:

The way it works:

  • The first statement generates a table (that’s the generate_series function) that has a start date and end date
  • I join from that table with a left join to the stripe payments table. This ensures that if we don’t have a day of revenue, the day still shows up in our table. This table gives me a total amount of revenue per day.
  • In the final part of the statement, I use a window function to do a cumulative sum of all of the revenue per day, so that day 2 has all of day 1’s revenue along with day 2’s revenue added in.

I hope this helps you create similar charts of revenue per day or cumulative revenue reports. I’ve worked with teams where we’ve manually hit the Stripe API to pull all of this information, which is always a pain in the butt. The nice part of this solution is that Segment keeps the table up to date automatically, and then anyone can run these reports on the latest values in the data warehouse.

While we’re still an early stage company, the type of integration is super powerful. There are a lot of interesting applications of being able to access failed charges, successful charges, refunds, and the various types of paying customers when joined to other data sources.

How we use Metabase

When I joined Reforge a year ago, I found that we were querying our databases manually to do routine analysis. If we wanted to update the team on the number of people who had applied or paid for our programs, we’d run a query against the database and then put the results in a spreadsheet. If we wanted a list of users from our programs by company, we’d run a query and put it in a spreadsheet. While this answered our questions at the time, I felt like we could do a lot better. After having used Looker in my time at HubSpot, I wanted a lightweight solution to help us enable the entire company to have access to critical data about the business and make data-informed decisions. We started using metabase.

Metabase has been a huge help for me in democratizing access to our data. Metabase connects directly to any databases you want, and it allows anyone in the company (I have chosen not to set up advanced permissions yet) to manually inspect data, do advanced reporting, or view dashboards.

This is an example of what it looks like when someone looks at our program applications table (simple database table that keeps track of applications to our growth programs):

Benefits:

  • Team members can look at the table as if it were a spreadsheet
  • They can apply filters as if it was a spreadsheet
  • They can visualize the results more easily than in a spreadsheet

I routinely build reports in metabase that filter to people with a certain condition and send it to teammates. It’s so easy to report on people who work at company X that are in participating in one of our programs. Much better to generate a simple report and then share it with a colleague knowing that it should always be up to date, even if our underlying data is updated.

You can also easily switch between a table view, and many other ways to visualize the data (table, line chart, area chart, bar chart, row chart, scatter chart, pie chart, and a map):

Once you’ve filtered your data set and chosen how to visualize it, it’s easy to then add it to a dashboard of other reports. It’s really nice that you can combine data from multiple databases into the same dashboard, and drag and drop the charts in any configuration you want. This is a dashboard that I setup to monitor the performance of applications submitted to our recent cohort of programs, as well as how people were paying for their spot. It has a segmentation of which programs they’re applying to, how much revenue we’re generated, how we’re comparing to previous periods, and where people are submitting applications from:

There are a ton of other features that I am a huge fan of. Some of them:

  • Posting questions to slack at a regular interval:

Or via email:

It has been a huge help for me personally, and this doesn’t even cover all of the ways in which we use it. Best of all, it’s free and open source. We pay to host it ourselves via Amazon Elastic Beanstalk.

This kind of solution comes in incredibly handy in our overall data pipeline, especially when we can point it to a copy of our production database and our analytics data warehouse that is populated by Segment.

© 2022 Dan Wolchonok

Theme by Anders NorénUp ↑