In this section, we go over some more complicated queries that can answer questions you may have about your applications. These queries may contain multiple sub queries or make use of some DataMine or HIVE-specific functions.
One thing to note is that the queries here may not match metrics on your dashboard as your dashboard metrics may have specific definitions for your applications. In most circumstances, your metrics should be the most correct source of information, though these examples should produce results that are close unless your specific metric differs significantly.
Some of these examples include
app_id in the
WHERE clause and some do not. In general, you will likely want to limit your queries to a specific application or add
app_id to the
GROUP BY to get per-app information.
This query calculates the average revenue per paying user (ARPPU). It sums the total revenue generated from validated purchases made in the month of January 2017 divided by the number of users who made those purchases.
SELECT SUM(revenue) / COUNT(1) as ARPPU FROM ( SELECT SUM(upsight_data_value_usd) as revenue, sid FROM monetization WHERE month = 201701 and type = ‘upsight.monetization.validated’ GROUP BY sid )t1
This query calculates the average revenue per daily active user (ARPDAU). For ARPDAU, we need to look at two tables, one with spend data and one with session data. There are many ways to merge these two datasets such as a
UNION ALL is another method that is often more efficient as done in the example below. We
sid after unioning the two tables to calculate the total revenue each user has spent. Then, we can count the users and sum this revenue to calculate ARPDAU.
SELECT SUM(user_revenue) / COUNT(1) as ARPDAU FROM ( SELECT sid, SUM(revenue) as user_revenue FROM ( SELECT upsight_data_value_usd as revenue, sid FROM monetization WHERE month = 201701 and type = ‘upsight.monetization.validated’ UNION ALL SELECT 0 as revenue, sid FROM session WHERE month = 201701 )t1 GROUP BY sid )t1
This query counts the number of users who have purchased each individual product over a single day. If a user buys two different items, we want to count that user twice (once for each item). Doing the query with two steps using
SELECT DISTINCT and
count(1) instead of using
COUNT DISTINCT spreads the load more evenly which results in a faster running query.
SELECT upsight_data_product, COUNT(1) as users FROM ( SELECT DISTINCT sid, upsight_data_product FROM monetization WHERE day = 20170125 ) t1 GROUP BY upsight_data_product
This query returns a row for each country and the number of users that have made a purchase in them. Much like how we can not simply sum the number of users each day to get MAU (because a user can play multiple days), here we have to consider the case where a user can spend in two different countries this month.
If we do the simple query, just replacing day by country, we will count that user in BOTH countries. This means you can not simply sum over the results in excel and get a total spender number. In the query below we count each user once by choosing place the user into whatever country they spent most recently. This is just one of many choices we could make, however the benefit is that the total spenders calculated by the query below will also match the number from the query in the previous section.
SELECT location_country_ip as Country, COUNT(1) FROM ( SELECT sid, Location_country_ip, row_number() OVER( PARTITION BY sid ORDER BY seq_id ) as rank FROM monetization WHERE month = 201701 and type = ‘upsight.monetization.validated’ ) t1 WHERE rank = 1 GROUP BY location_country_ip