Blue Apron CLV

Consider dataset D2.3, which contains simulated transactional data inspired by the subscription
meal delivery service Blue Apron (https://www.blueapron.com/). The dataset records the activity
of a random sample of Blue Apron subscribers (∼22,400 individuals) during January 2019. A detailed
codebook is available in document C2.3.
Consider the following two specifications:
Specification 1:
Specification 2:
Using the Blue Apron data, perform the following tasks:

  1. Task 1:
    a. [2 points] Estimate the two listed specifications using the churn indicator as the
    outcome and implementing f() as a logistic model.
    b. [1 point] Select a model based on predictive performance criteria. Justify your
    decision.
    c. [1 point] Use the selected model to predict churn probabilities for every customer
    in the sample. Present a histogram of these probabilities.
  2. Task 2:
    a. [2 points] Estimate the two listed specifications using MonthlyAddons as the outcome
    and implementing f() as a linear regression.
    b. [1 point] Select a model based on predictive performance criteria. Justify your
    decision.
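    c. [1 point] Use the selected model to predict MonthlyAddons for every customer in
    the sample. Make sure these predictions fall within the admissible range of
    MonthlyAddons, since a linear regression can predict values outside it. Present a
    histogram of these predictions.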
  3. Task 3:
    a. [1 point] Export the full dataset to a CSV file. The exported data must include
    individual predictions for churn probabilities (Task 1) and monthly add-ons (Task 2),
    each from its respectively preferred specification. After this file is saved as CSV,
    open it in a spreadsheet application and save it as xls or xlsx so that formulas can
    be saved (this last step is done outside R; renaming the file extension alone does
    not convert the format).
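
The selection, range check, and export steps in the tasks above can be sketched as follows. This is only an illustration under stated assumptions, not the required solution. It is written in Python with the standard library, although the assignment itself is carried out in R. Hold-out log loss (churn) and RMSE (MonthlyAddons) are one common choice of predictive performance criteria; the assignment does not prescribe them. The bounds 0 and 10, the column names pred_churn_prob and pred_monthly_addons, and all numbers are hypothetical placeholders.

```python
import csv
import math


def log_loss(y_true, p_hat, eps=1e-12):
    """Mean negative log-likelihood of binary outcomes; lower is better."""
    total = 0.0
    for y, p in zip(y_true, p_hat):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def rmse(y_true, y_hat):
    """Root mean squared error; lower is better."""
    squared = [(y - f) ** 2 for y, f in zip(y_true, y_hat)]
    return math.sqrt(sum(squared) / len(squared))


def clamp(values, lower, upper):
    """Force every prediction into the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def export_predictions(path, rows, churn_prob, addons_pred):
    """Write the original rows plus two prediction columns to a CSV file.

    The two new column names are hypothetical, not taken from the codebook.
    """
    fieldnames = list(rows[0].keys()) + ["pred_churn_prob", "pred_monthly_addons"]
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        for row, p, a in zip(rows, churn_prob, addons_pred):
            writer.writerow({**row, "pred_churn_prob": p, "pred_monthly_addons": a})


# Toy hold-out values (made up) standing in for the two specifications.
churn = [1, 0, 0, 1]
p_spec1 = [0.9, 0.2, 0.1, 0.6]
p_spec2 = [0.6, 0.4, 0.5, 0.5]
better = "spec1" if log_loss(churn, p_spec1) < log_loss(churn, p_spec2) else "spec2"

# Linear predictions can leave the admissible range; the bounds are assumed.
raw_addons = [-0.4, 1.7, 3.2, 12.5]
addons = clamp(raw_addons, 0.0, 10.0)
```

Calling export_predictions("predictions.csv", rows, probs, addons), with rows given as a list of dictionaries keyed by the dataset's column names, writes one line per customer. Clamping is the simplest way to keep predictions within range. A model that respects the bounds by construction is an alternative, but it would depart from the linear regression requested in Task 2.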