Solving LeetCode Problems with PreQL
Solving LeetCode Problems with PreQL
Monthly Active Users Calculation
Given a table with the below definition.
user_actions Table: Column Name Type user_id integer event_id integer event_type string ("sign-in, "like", "comment") event_date datetime
Write a query that will return "Monthly Active Users" (MAU) for each month in the table.
The twist is that MAU is defined as someone who was active in a month AND the immediately prior month, so this is not as straightforward as select month, count(distinct user_id) from user_actions group by month.
Setup
Setting up the table is straightforward; we have 4 columns we need to define, with stragithtforward types. The table can be argued to be at the event_id grain rather than user_id + event_id, but it doesn't matter for this problem (either will uniquely define a row).
key user_id int;
key event_id int;
property event_id.event_type string;
property event_id.event_date datetime;
datasource user_actions (
user_id:user_id,
event_id:event_id,
event_type:event_type,
event_date:event_date
)
grain (user_id, event_id)
address user_actions ;
Now let's think about how to model 'active in the last two months'. This requires knowing a month and the prior month, so a window function (such as a lag) seems tempting.
First let's find the month: key event_month <- date_trunc(event_date, MONTH);
Now we need to find the last month... per user. Let's start by finding the months a user was active.
key monthly_user_id <- group(user_id) by event_month;
Now we can find out the last month they were active:
key monthly_user_id_prior_active_month <- lag event_month over monthly_user_id order by event_month asc;
And finally we can write the select statement.
We'll get the event month, the count of distinct monthly active users, the trailing month, and we'll filter to where the trailing month is the prior month.
(since we don't need to return the trailing month int he select, we'll hide it with the '--' modifier - this means it can still be used in the where clause without being returned in the tabular output. )
SELECT
event_month,
count_distinct(monthly_user_id) as monthly_active_users,
--monthly_user_id_prior_active_month,
WHERE
event_month = date_add(monthly_user_id_prior_active_month, month, -1)
;
Expand below to see the full SQL query.