Estimating incremental outcome using regression

Under the exchangeability and consistency assumptions, the conditional expectation of any potential outcome \(\overset \sim Y_{g,t}^{ \left(\left\{ x_{g,t,i}^{(\ast)} \right\}\right) }\) can be written in terms of a conditional expectation that can be estimated by a regression model, where \(x_{g,t,i}^{(\ast)}\) represents the set of intervenable treatment variables: paid media, organic media, and non-media treatments. For demonstration purposes, we assume the paid and organic media channels here are impression-based, although the following also holds for reach-and-frequency-based channels.

From the definitions described in Input data, the outcome \(\overset \sim Y_{g,t}\) can be written as:

$$ \begin{align*} \overset \sim Y_{g,t} &= u_{g,t}^{[Y]} \overset {\cdot \cdot} Y_{g,t} \\ &= u_{g,t}^{[Y]}L_{g,t}^{[Y]-1}(Y_{g,t}) \end{align*} $$

Meridian also uses the fact that the pre-modeling KPI transformation function \(L_{g,t}^{[Y]}(\cdot)\) is linear, so its inverse \(L_{g,t}^{[Y]-1}(\cdot)\) is also linear and can be passed outside the conditional expectation operator. The result is the following equality, whose right-hand side is a quantity that can be estimated from a regression model, such as the Meridian model:

$$ \begin{align*} E\left(\overset \sim Y_{g,t}^{(\left\{ x_{g,t,i}^{(\ast)} \right\})} \Big| \bigl\{ z_{g,t,i} \bigr\} \right) &= E\left( \overset \sim Y_{g,t} \Big| \bigl\{x_{g,t,i}^{(\ast)}\bigr\}, \bigl\{z_{g,t,i}\bigr\} \right) \\ &= E\left( u_{g,t}^{[Y]}L_{g,t}^{[Y]-1}(Y_{g,t}) \Big| \bigl\{ x_{g,t,i}^{(\ast)} \bigr\}, \bigl\{z_{g,t,i}\bigr\} \right) \\ &= u_{g,t}^{[Y]}L_{g,t}^{[Y]-1} E\left( Y_{g,t} \Big| \bigl\{ x_{g,t,i}^{(\ast)} \bigr\}, \bigl\{z_{g,t,i}\bigr\} \right) \end{align*} $$
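The last step above can be checked numerically. The following is a minimal sketch, not Meridian code: it assumes the inverse transformation is an affine map (a scale and a shift), which is one reading of "linear" here, and the names `inverse_kpi_transform`, `center`, `scale`, and `unit_value` are hypothetical.

```python
import random
from statistics import fmean


def inverse_kpi_transform(y, center, scale):
    """Hypothetical inverse transform: undo a centering and scaling step."""
    return y * scale + center


random.seed(0)
unit_value = 2.5               # stands in for u_{g,t}; illustrative value
center, scale = 100.0, 20.0    # illustrative constants, not Meridian defaults

# Simulated draws of the transformed KPI Y_{g,t} for one geo and time period.
y_draws = [random.gauss(0.3, 1.0) for _ in range(10_000)]

# Expectation of the back-transformed outcome.
lhs = fmean(unit_value * inverse_kpi_transform(y, center, scale) for y in y_draws)

# Back-transform of the expectation.
rhs = unit_value * inverse_kpi_transform(fmean(y_draws), center, scale)
```

Here `lhs` and `rhs` agree up to floating-point error, which is the last equality above.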

Based on this, regression can be used to estimate the incremental outcome between any two counterfactual scenarios \(\left\{ x_{g,t,i}^{(1)} \right\}\) and \(\left\{ x_{g,t,i}^{(0)} \right\}\):

$$ \begin{align*} \text{IncrementalOutcome} \left( \bigl\{ x_{g,t,i}^{(1)} \bigr\}, \bigl\{ x_{g,t,i}^{(0)} \bigr\} \right) &= E\left( \sum\limits_{g,t}\left( \overset \sim Y_{g,t}^{ \left( \left\{ x_{g,t,i}^{(1)} \right\} \right) } - \overset \sim Y_{g,t}^{ \left( \left\{ x_{g,t,i}^{(0)} \right\} \right) } \right) \Bigg| \bigl\{ z_{g,t,i} \bigr\} \right) \\ &= \sum\limits_{g,t}u_{g,t}^{[Y]}L_{g,t}^{[Y]-1} \left( E\left( Y_{g,t} \Big| \bigl\{ x_{g,t,i}^{(1)} \bigr\}, \bigl\{ z_{g,t,i} \bigr\} \right)\right) - \sum\limits_{g,t}u_{g,t}^{[Y]}L_{g,t}^{[Y]-1} \left( E\left( Y_{g,t} \Big| \bigl\{ x_{g,t,i}^{(0)} \bigr\}, \bigl\{ z_{g,t,i} \bigr\} \right) \right) \end{align*} $$
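As an illustration of the formula above, the sketch below sums the back-transformed difference in expected KPI over geos and time periods. It is not Meridian's implementation: the function name, the nested-list layout (`[geo][time]`), and the affine `inverse_transform` are assumptions made for the example.

```python
def incremental_outcome(expected_kpi_1, expected_kpi_0, unit_value, inverse_transform):
    """Sum over geos g and times t of u * (L^-1(E[Y | x1]) - L^-1(E[Y | x0]))."""
    total = 0.0
    for g in range(len(unit_value)):
        for t in range(len(unit_value[g])):
            y1 = inverse_transform(expected_kpi_1[g][t], g, t)
            y0 = inverse_transform(expected_kpi_0[g][t], g, t)
            total += unit_value[g][t] * (y1 - y0)
    return total


# Illustrative inputs: 2 geos, 2 time periods.
scale = [10.0, 20.0]      # hypothetical per-geo scale
center = [100.0, 200.0]   # hypothetical per-geo shift


def inverse_transform(y, g, t):
    # Hypothetical affine inverse of the KPI transformation (constant over t here).
    return y * scale[g] + center[g]


expected_kpi_1 = [[0.5, 0.75], [0.25, 0.5]]   # E[Y | scenario 1] in model units
expected_kpi_0 = [[0.25, 0.25], [0.0, 0.25]]  # E[Y | scenario 0] in model units
unit_value = [[2.0, 2.0], [3.0, 3.0]]

result = incremental_outcome(expected_kpi_1, expected_kpi_0, unit_value, inverse_transform)
print(result)  # 45.0
```

In this sketch the shift `center` cancels in the difference, so only the scale and the unit value affect the result.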

Under the Meridian model specification:

$$ \begin{align*} E\left( Y_{g,t} \Big| \bigl\{ x_{g,t,i}^{(\ast)} \bigr\}, \bigl\{ z_{g,t,i} \bigr\} \right) = \mu_t &+ \tau_g + \sum\limits_{i=1}^{N_C} \gamma^{[C]}_{g,i}z_{g,t,i} \\ &+ \sum\limits_{i=1}^{N_N} \gamma^{[N]}_{g,i}x^{[N] (\ast)}_{g,t,i} \\ &+ \sum\limits_{i=1}^{N_M} \beta^{[M]}_{g,i} \text{HillAdstock} \left( \bigl\{ x^{[M] (\ast)}_{g,t-s,i} \bigr\}^L_{s=0};\ \alpha^{[M]}_i, ec^{[M]}_i, \text{slope}^{[M]}_i \right) \\ &+ \sum\limits_{i=1}^{N_{OM}} \beta^{[OM]}_{g,i} \text{HillAdstock} \left( \bigl\{ x^{[OM] (\ast)}_{g,t-s,i} \bigr\}^L_{s=0};\ \alpha^{[OM]}_i, ec^{[OM]}_i, \text{slope}^{[OM]}_i \right) \end{align*} $$
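The sketch below evaluates this expectation for a single geo. It is illustrative only: this section does not define `HillAdstock`, so the code assumes a geometric adstock with normalized weights \(\alpha^s\) followed by a Hill curve \(x^{\text{slope}} / (x^{\text{slope}} + ec^{\text{slope}})\). All function names, the dictionary layout, and the numeric values are hypothetical.

```python
def adstock(media, alpha, max_lag):
    """Geometric adstock: weighted average of current and lagged media."""
    weights = [alpha ** s for s in range(max_lag + 1)]
    norm = sum(weights)
    return [
        sum(w * media[t - s] for s, w in enumerate(weights) if t - s >= 0) / norm
        for t in range(len(media))
    ]


def hill(x, ec, slope):
    """Hill saturation curve; equals 0.5 when x == ec."""
    return x ** slope / (x ** slope + ec ** slope)


def expected_kpi(mu, tau_g, gamma_c, controls, gamma_n, non_media, channels):
    """Expected transformed KPI for one geo, one value per time period.

    controls[i][t] and non_media[i][t] are per-variable time series.
    channels is a list of dicts (paid and organic media alike) with keys
    beta, media, alpha, ec, slope, max_lag.
    """
    effects = [
        [c["beta"] * hill(a, c["ec"], c["slope"])
         for a in adstock(c["media"], c["alpha"], c["max_lag"])]
        for c in channels
    ]
    return [
        mu[t] + tau_g
        + sum(g * z[t] for g, z in zip(gamma_c, controls))
        + sum(g * x[t] for g, x in zip(gamma_n, non_media))
        + sum(e[t] for e in effects)
        for t in range(len(mu))
    ]


# Illustrative values only.
mu = [0.1, 0.2, 0.15, 0.05]
channel = {"beta": 0.8, "media": [1.0, 2.0, 0.0, 3.0],
           "alpha": 0.5, "ec": 1.0, "slope": 1.0, "max_lag": 2}
no_media = dict(channel, media=[0.0] * 4)  # counterfactual: channel switched off

with_media = expected_kpi(mu, 0.3, [0.5], [[1.0, 0.0, 1.0, 0.0]], [], [], [channel])
baseline = expected_kpi(mu, 0.3, [0.5], [[1.0, 0.0, 1.0, 0.0]], [], [], [no_media])
```

Comparing `with_media` against `baseline` gives the per-period difference in expected KPI between two counterfactual scenarios, still in model units. Every input other than the data (`mu`, `tau_g`, the `gamma` and `beta` coefficients, `alpha`, `ec`, `slope`) is a model parameter.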

The incremental outcome is therefore a function of the model parameters and has a posterior distribution, which Meridian samples using Markov Chain Monte Carlo (MCMC). ROI, mROI, and response curves can all be calculated from the incremental outcome definition, so each of these quantities also has a posterior distribution.
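To make the last point concrete, the sketch below pushes a set of parameter draws through a function of the parameters and summarizes the resulting draws. It is a toy example: the draws are simulated rather than produced by MCMC, the incremental outcome is reduced to a function of a single coefficient, ROI is taken as incremental outcome divided by spend, and every name and constant is hypothetical.

```python
import random
from statistics import fmean, quantiles

random.seed(1)

# Stand-in for MCMC output: hypothetical posterior draws of one coefficient.
beta_draws = [random.lognormvariate(0.0, 0.3) for _ in range(2000)]


def incremental_outcome(beta, saturated_media_total=40.0, kpi_scale=10.0, unit_value=2.0):
    """Toy incremental outcome, linear in beta (all constants are illustrative)."""
    return unit_value * kpi_scale * beta * saturated_media_total


# Push every draw through the same function to get draws of the derived quantity.
outcome_draws = [incremental_outcome(b) for b in beta_draws]

spend = 500.0  # hypothetical channel spend
roi_draws = [d / spend for d in outcome_draws]

cuts = quantiles(roi_draws, n=20)    # 19 cut points: 5%, 10%, ..., 95%
roi_mean = fmean(roi_draws)
roi_interval = (cuts[0], cuts[-1])   # central 90% interval
```

The same pattern applies to any quantity derived from the incremental outcome: evaluate it once per posterior draw, then report a point summary and an interval from the resulting draws.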