Funnel and Subscription Retention Analysis

1. Funnel Analysis

Chose which metrics:

  • Click through rate (Growth):
    • (i.e. how many customer clicked on it after seeing the ad)
      • # clicked / # shown
    • Goal: bring customer to the site.
    • Pros:
      • able to identify demand from users
      • allow estimate to test and improve ad features to maximize CTR (since it’s only take account user behavior after seeing ad)
    • Cons:
      • No information of the 2nd layer and 3rd layer behavior, thus does not know what user does once they come to the site
      • Cost of bringing user to site is not considered
      • User conversion remains unsolved, even with high CTR
      • (might be click-bait ads)
  • Conversion Rate (Growth and Product)
    • (i.e. how many customer converted after seeing the ad)
    • # converted / # shown
    • Pros:
      • the final goal of an ad is to get customer, this metric is represents it
    • Cons:
      • No info about cost
      • Highly depend on product: Ad could be great, fair product, low conversion
      • Conversion might happen after a long time, even after visit multiple times(it’s hard to link conversion to click)
  • Profit, revenue from conversions (Business / Product)
    • Not useful for growth, used by large companies (main goal: being profitable)
    • total_revenue – clicked * avg_cost_per_click
    • Pros:
      • Business driven
    • Cons:
      • Profit might happen after a long time(hard to link profit to click)
      • Only consider users after they clicked on ad
      • No information about top funnel (people see ad did not click)

a. Data:

Two format of data is available for funnel analysis.

  1. each row represents a user visited website
    • column: user_id, layer1, layer2, layer3, date
    • Value: layer1, layer2, layer3 is binary
      • if one user was only able to get to layer 2, then it’s 1, 1, 0
  2. each row represents a campaign or ad
    • column: ad, layer1 , layer2, layer3, date
    • Value: layer1, layer2, layer3 is continuous variable indicates total amount of user able to get to that layer on that day

b. Steps:

  • data preparation: merge / logic check
    • note: always check with the flow of funnel
    • i.e. layer1 > layer2 > layer3
  • plot a total count of visit vs date
    • for each campaign / ads/ device/ feature you are interested in
    • check for sudden drop, it’s usually a bug, or error in data collection
      • if count after sudden drop went back to normal:
        • remove those drops using count > 0
      • if count after sudden drop do not went back to normal:
        • bug need to fix
# some useful code

# 1. plot date vs layer, hue = interested_feature
data = data.query('layer1 > 0')
g=sns.lineplot(x="date", y="layer1",hue="interested_feature", data=data, legend="")
g.xaxis.set_major_formatter(mdates.DateFormatter('%b %d'))

# 2. plot date vs multiple layers, hue = interested_feature
#funnel as a time series
funnel = data.groupby(['date', 'interested_feature']).apply(
                    lambda x: pandas.Series({
                            'to_layer2' : x['layer2'].mean()/x['layer1'].mean(),
                            'to_layer3': x['layer3'].mean()/x['layer2'].mean(),
                            'to_layer4': x['layer4'].mean()/x['layer3'].mean()
# Make the dataset long for the plot
funnel = pandas.melt(funnel,id_vars=['date', 'interested_feature'])
# Plot it
# the row will be to_layer2, 3, 4
g = sns.FacetGrid(funnel, hue="interested_feature", row="variable", aspect=4, sharey=False), "date", "value")
g.set_axis_labels("", "Conversion Rate")
plt.legend(loc='upper right')

  • draw business insights:
    • Note: sometimes difference might come from the other feature
      • i.e. when grouping, group by 2 features to investigate where the difference come from.
      • Remember, a feature could always be a proxy for the other
test = stats.ttest_ind(data.loc[data['interested_feautre'] == 'level1']['layer3'],
                       data.loc[data['interested_feautre'] != 'level1']['layer3'], equal_var=False)
  • Some useful index:
    • CTR: clicked number / shown number
    • Profit: Total Revenue – Clicked * Average cost per click
    • Count for layer interested
  • Some useful model:
    • RF for business insights
    • Time series for future prediction

c. General rule:

  • More mobile user than desktop
    • Growing mobile should be priority for company
  •  If one device has lower conversion rate:
    • Product
      • Product team look into the difference among different version and see if UI/UX can be improved
    • Marketing
      • Different platform may attract different users
      • Maybe a proxy for user distribution
      • If a platform has higher conversion rate but with low amount of user:
        • Maybe it’s only attracting power users, who is likely to convert
      • Note:
        • this will be more reasonable
        • if one platform has lower absolute amount of users
      • Solution:
        • Collect more data about users,
        • find if user distribution is the same across the devices
          • Same: product problem
          •  Not same: not product problem
  •  Sudden drop of Conversion Rate:
    • Bug / Bad product changes
  •  Expected Bouncer from home page to search page: 0.5
    • If rate is not reasonable for Search page to payment page:
    • Look into the characteristics of search queries
    • See if highly searched item is not available in our site
    • Understand lack of supply compared to user demand.

2. Subscription Retention Problem

  • Data:
    • Pick a interested billing cycling
    • Collect user information subscribed in that same billing cycling
    • Column: user_id, sign_up_date, service_rate, ….., billing cycles
    • Value: billing cycles is continuous, indicates how many billing cycle that user stays, a.k.a., how long that user subscribed to our service.
  • Terms:
    • Retention Rate for a billing cycle :
      • How many people is still subscript to service at the end of that billing cycle
    • Life Time Value (LTV)
      • Can be defined as one year average customer value
        • i.e. for data less than one year and model is really good,
        • Predict monthly retention
        • Sum it up for 12 month, get the yearly average customer value
        • (Note: if it’s subscription retention problem, need to add one month fee after sum)
  • Insights:
    • Build a table with interested feature + retention rate + count + revenue
      • Retention rate: percentage of current active user
      •  Count: total amount of  user
      • Revenue: average of customer revenue (avg(price * month for each customer))
  • Plot with y= i or ii or iii, x = interested feature, hue = different pricing
    • Focus on Revenue plot:
      • This will give general idea of LTV for each level
      • See which level is doing the best, and pay attention to abnormal levels
    • Combine with retention rate plot:
      • Is a particular level is doing well because of a high retention rate?
  • Goal:
    • Shift users towards the optimal price point ,
    • Thus increase LTV, revenue gain, more click on ads, grow faster
  • Solution
    • UI design:
      • Change the UI so that, for each level, the optimal price (highest LTV) is more prominent
    • Make the best price has most attractive features:
      • Look into how users from each level are using the product
      • Figure out the tool features they like the most,
        • So that best LTV price has the most attractive tool for each level
    • Coupon:Targeted / regional coupons
    • Example:
      • Levels with low count but high LTV and retention_rate:
      • Big opportunity, try to increase those customer
      • (discount, campaign)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: