Funnel and Subscription Retention Analysis

1. Funnel Analysis. Choosing metrics: Click-through rate (Growth), i.e. the share of customers who clicked on the ad after seeing it: # clicked / # shown. Goal: bring customers to the site. Pros: it identifies demand from users and allows testing and improving ad features to maximize CTR (since it only takes account …
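The ratio in the excerpt above (# clicked / # shown) can be sketched in a few lines. This is only an illustration: the function name `click_through_rate`, the variant names, and the counts are hypothetical and do not come from the post.

```python
def click_through_rate(clicked: int, shown: int) -> float:
    """Return # clicked / # shown, or 0.0 when the ad was never shown."""
    if clicked < 0 or shown < 0:
        raise ValueError("counts must be non-negative")
    if clicked > shown:
        raise ValueError("clicks cannot exceed impressions")
    return clicked / shown if shown else 0.0


# Hypothetical (clicked, shown) counts for two ad variants.
variants = {"ad_a": (120, 4000), "ad_b": (150, 4200)}

# CTR per variant, then the variant with the highest CTR.
ctr_by_variant = {
    name: click_through_rate(clicked, shown)
    for name, (clicked, shown) in variants.items()
}
best_variant = max(ctr_by_variant, key=ctr_by_variant.get)
```

Comparing CTR across variants like this is one simple way to test which ad features perform better, under the assumption that the impression counts are large enough for the difference to be meaningful.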

Random Forest — Business Insights

Draw Business Insights from RF. 1. Variable importance: Look at the ranking of important variables. If the top one is the least actionable variable, meaning that the company cannot change it, delete it and rebuild the RF. Check whether the top variables are continuous or categorical; continuous variables tend to show up …

Random Forest — Method and Application (Python)

Advantages of RF: Little time is needed for tuning (the default parameters are good enough). It is robust to outliers and correlated variables. For continuous variables, it is able to segment them. Method: Create a bootstrapped dataset (sample with replacement). Create a decision tree using the bootstrapped dataset, but use only a random subset of variables at each split …
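The two sampling steps named in the excerpt above (a bootstrapped dataset, and a random subset of variables at each split) can be sketched with the standard library alone. This is not a full random forest and not the post's own code: the helper names, the toy rows, and the feature names are all hypothetical.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (a bootstrapped dataset)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(feature_names, k, rng):
    """Pick k distinct candidate variables to consider at one split."""
    return rng.sample(feature_names, k)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical toy data: each row is (age, pages_visited, converted).
rows = [(25, 3, 0), (31, 12, 1), (42, 7, 0), (19, 15, 1), (37, 2, 0)]
features = ["age", "country", "source", "pages_visited"]

sample = bootstrap_sample(rows, rng)                 # some rows may repeat
candidates = random_feature_subset(features, 2, rng)  # variables for one split
```

In a real forest these two steps are repeated for every tree and every split. Libraries such as scikit-learn handle this internally, for example through the `max_features` parameter of `RandomForestClassifier`.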

EDA and Feature Engineering

Data Preparation. Before building an optimization or recommendation model, we need to make sure our data is in “ready-to-go” status. Here, I summarize some ways to clean data for future reference: descriptive statistics; query and merge; group and plot data; fill NA, replace and assign values; data transformation on columns (log, datetime); check …
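Two of the cleaning steps listed above, filling missing values and transforming a column (log, datetime), can be illustrated with a minimal standard-library sketch. The post itself most likely uses a dataframe library. The records, the column names, and the choice of the median as fill value are assumptions made only for this example.

```python
import math
from datetime import datetime
from statistics import median

# Hypothetical raw records; None marks a missing revenue value.
records = [
    {"user_id": 1, "revenue": 120.0, "signup": "2020-01-15"},
    {"user_id": 2, "revenue": None, "signup": "2020-02-03"},
    {"user_id": 3, "revenue": 30.0, "signup": "2020-02-20"},
]

# Fill NA: use the median of the observed values (one common choice).
observed = [r["revenue"] for r in records if r["revenue"] is not None]
fill_value = median(observed)

cleaned = []
for r in records:
    revenue = r["revenue"] if r["revenue"] is not None else fill_value
    signup = datetime.strptime(r["signup"], "%Y-%m-%d")  # parse the date string
    cleaned.append({
        "user_id": r["user_id"],
        "revenue": revenue,
        "log_revenue": math.log1p(revenue),  # log(1 + x) is safe for zero revenue
        "signup_month": signup.month,        # datetime-derived feature
    })
```

With pandas, the same steps would typically be written with `fillna`, `numpy.log1p`, and `pandas.to_datetime` on whole columns rather than in a loop.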