Blue Apron CLV

Consider the dataset D2.3, which contains simulated transactional data inspired by the subscription
meal delivery service Blue Apron (https://www.blueapron.com/). The dataset records the activity of a
random sample of Blue Apron’s subscribers (∼22,400 individuals) during January 2019. A detailed
codebook is available in document C2.3.
Consider the following two specifications:
Specification 1:
Specification 2:
Using the Blue Apron data, perform the following tasks:

  1. Task 1:
    a. [2 points] Estimate the two listed specifications using the churn indicator as the
    outcome and implementing f() as the logistic model.
    b. [1 point] Select a model based on predictive performance criteria. Justify your
    decision.
    c. [1 point] Use the selected model to predict churn probabilities for every customer
    in the sample. Present a histogram of these probabilities.
  2. Task 2:
    a. [2 points] Estimate the two listed specifications using MonthlyAddons as the outcome and
    implementing f() as linear regression.
    b. [1 point] Select a model based on predictive performance criteria. Justify your
    decision.
    c. [1 point] Use the selected model to predict MonthlyAddons for every customer in
    the sample. Make sure these predictions are within range. Present a histogram
    thereof.
  3. Task 3:
    a. [1 point] Export the full dataset to a csv file. The exported data must include
    individual predictions for churn probabilities (task 1) and monthly add-ons (task 2),
    each from their respectively preferred specification. After this file is saved as csv,
    convert it into xls or xlsx so that formulas can be saved (this last step is not in R,
    just a simple change of extension).
    BU.450.760 Assignment 2 Blue Apron CLV
  4. Task 4: Consider the following policy currently being evaluated by BA’s leadership: by
    making a one-time $20 expenditure on each targeted customer (e.g., mailing a gift
    champagne bottle), BA can reduce each targeted customer’s probability of churn by 0.01.
    a. [2 points] Compute baseline CLV values for each customer in the initial scenario
    (i.e., if the new policy was not implemented). Plot their distribution.
    b. [2 points] Determine the optimal targeting policy. That is, determine the set of
    customers to whom the firm should send the one-time gift. How many customers does
    the firm target?
    c. [2 points] Compute the total financial gains/losses derived from implementing the
    campaign as the before/after difference between the total CLV values in the entire
    portfolio of customers.
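As a rough illustration of the workflow in Tasks 1 and 2 (not the official solution), the R sketch below runs a 70/30 training/validation split, fits two candidate specifications for each outcome, and keeps the one with the lower validation error. The data are simulated. The names `ba`, `Churn`, `x1`, `x2`, both placeholder formulas, and the choice of log loss and RMSE as selection criteria are assumptions made for the example; only `MonthlyAddons` comes from the assignment text. Clipping predicted add-ons at zero assumes the valid range is non-negative.

```r
set.seed(450)  # arbitrary seed so the split is reproducible

# Simulated stand-in for D2.3 (all columns except MonthlyAddons are hypothetical)
n <- 1000
ba <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
ba$Churn <- rbinom(n, 1, plogis(-2 + 0.8 * ba$x1))
ba$MonthlyAddons <- pmax(0, 10 + 3 * ba$x1 + 2 * ba$x2^2 + rnorm(n, sd = 4))

# 70/30 training/validation split
train_idx <- sample(n, size = round(0.7 * n))
train <- ba[train_idx, ]
valid <- ba[-train_idx, ]

# Validation criteria (assumed here; other criteria are equally defensible)
log_loss <- function(y, p) {
  p <- pmin(pmax(p, 1e-12), 1 - 1e-12)  # guard against log(0)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))

# Task 1: two placeholder logistic specifications, compared on validation log loss
churn_specs <- list(spec1 = Churn ~ x1,
                    spec2 = Churn ~ x1 + x2 + I(x2^2))
churn_fits <- lapply(churn_specs, function(f) glm(f, family = binomial, data = train))
churn_loss <- sapply(churn_fits, function(fit) {
  log_loss(valid$Churn, predict(fit, newdata = valid, type = "response"))
})
best_churn <- churn_fits[[which.min(churn_loss)]]
ba$p_churn <- predict(best_churn, newdata = ba, type = "response")

# Task 2: two placeholder linear specifications, compared on validation RMSE
addon_specs <- list(spec1 = MonthlyAddons ~ x1,
                    spec2 = MonthlyAddons ~ x1 + x2 + I(x2^2))
addon_fits <- lapply(addon_specs, function(f) lm(f, data = train))
addon_rmse <- sapply(addon_fits, function(fit) {
  rmse(valid$MonthlyAddons, predict(fit, newdata = valid))
})
best_addon <- addon_fits[[which.min(addon_rmse)]]

# Keep predictions within range (assumption: add-ons cannot be negative)
ba$pred_addons <- pmax(predict(best_addon, newdata = ba), 0)
```

From here, `hist(ba$p_churn)` and `hist(ba$pred_addons)` would give the requested histograms, and `write.csv(ba, "some_file.csv", row.names = FALSE)` (file name hypothetical) would cover the export in Task 3.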
    Guidance
    • Use a 70/30 training/validation data split
    • For CLV calculations use the formula used in class,
    $$\mathit{CLV} = \mathit{MonthlyNetContribution} \times \frac{1}{1 - r d}$$
    where:
    o MonthlyNetContribution = (Monthly base payments + predicted add-ons) × 0.3
    (that is, the firm has a 30% margin)
    o Retention rates r vary individual by individual, as reflected by predicted churn
    o Periods are months and the discount factor is d = 0.98
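To make the CLV formula and the targeting logic in Task 4 concrete, here is a minimal R sketch on four made-up customers. It assumes the retention rate is one minus the predicted churn probability, that the gift lowers churn by 0.01 with a floor at zero, and that a customer is worth targeting when the CLV increase exceeds the one-time $20 cost. The data frame `customers`, its column names, and all input values are hypothetical.

```r
# Hypothetical inputs; in the assignment these come from D2.3 and the Task 1-2 predictions
customers <- data.frame(
  base_payment = c(60, 60, 120, 240),        # monthly base payments ($)
  pred_addons  = c(5, 10, 20, 40),           # predicted monthly add-ons ($)
  p_churn      = c(0.30, 0.10, 0.05, 0.02)   # predicted monthly churn probability
)

margin    <- 0.3    # 30% margin, from the guidance
d         <- 0.98   # monthly discount factor, from the guidance
gift_cost <- 20     # one-time cost per targeted customer
churn_cut <- 0.01   # reduction in churn probability from the gift

# CLV = MonthlyNetContribution * 1 / (1 - r * d), with r = 1 - churn probability (assumed)
clv <- function(contribution, p_churn, d) {
  r <- 1 - p_churn
  contribution / (1 - r * d)
}

customers$contribution <- margin * (customers$base_payment + customers$pred_addons)

# Baseline CLV (no campaign) and CLV if the customer receives the gift
customers$clv_base <- clv(customers$contribution, customers$p_churn, d)
customers$clv_gift <- clv(customers$contribution,
                          pmax(customers$p_churn - churn_cut, 0), d)

# Net gain from targeting: CLV increase minus the gift cost
customers$gain   <- customers$clv_gift - customers$clv_base - gift_cost
customers$target <- customers$gain > 0

n_targeted <- sum(customers$target)                  # size of the targeted set
total_gain <- sum(customers$gain[customers$target])  # portfolio gain, net of gift costs
```

In this toy data only the two low-churn, high-contribution customers clear the $20 hurdle, because a 0.01 drop in churn raises `1 / (1 - r * d)` far more when retention is already high. Whether the reported portfolio difference should be net of the gift cost is one reading of Task 4c, so it is worth stating which convention is used.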
    Submission guidelines
    • Submit via Blackboard by 8 AM EST on the day of class 2
    ▪ Late submissions will be penalized
    ▪ Late corrections will not be accepted
    • Note that assignments are automatically checked for similarity—it is ok to discuss with other students,
    it is not ok to copy
    • Submit two files (one submission per individual):
  1. Slide deck (MS PowerPoint or PDF)
    ▪ In the slide deck, I expect you to present results in an executive way. You need to clearly
    describe:
    • what the goal is (question/problem at hand)
    • what you did to achieve the goal (analysis procedures)
    • why you did it (rationales behind key steps)
    • what you obtained (results)
    ▪ Use as many slides as you need.
    ▪ The title page must include your name.
    ▪ If you have worked/discussed with someone else, please also include their name(s) on a separate
    line on the title page.
  2. R script file containing the code that you used for your analysis.
    ▪ Include comments in the script to help the TA follow your procedures.
    ▪ The script file should be understood as a companion: you are encouraged to include
    screenshots of the command lines (with command line #) in your slide deck to
    demonstrate your key steps. This way TAs can easily go back and double check that
    your answers in the slide deck are well supported.

Sample Solution