Version: 3.15

Unified Attribution Model for impressions and clicks

A unified approach in marketing attribution for impressions and clicks

Marketing Attribution is a crucial topic for companies that need to allocate budget across multiple marketing channels. Every day, billions of dollars are shifted between the platforms that companies use to advertise their products or services, with the aim of increasing sales and/or improving marketing efficiency.

When Customer Journeys Are Limited or Missing

When customer journeys are available only for some channels or not available at all, the Media-Mix Model (MMM) is often seen as the only available methodology for marketing attribution.

However, if the aggregated number of touchpoints (e.g., impressions or clicks) is available for all involved channels at fixed time intervals, a new alternative becomes viable: the Unified Attribution Model (UAM).

1. Attribution Methodologies

Media-Mix Modeling (MMM)

The first statistical approach introduced for marketing attribution was the Marketing-Mix Model, also known as Media-Mix Model (MMM). This methodology dates back to the late 1940s.

MMM is a regression-based model that takes as input:

  • Marketing spend over time for each channel
  • A target variable (e.g., number of sales or revenue)
  • Optional control variables (e.g., economic indicators, promotions, weather)

These inputs allow the model to estimate how much each channel contributes to the target outcome.

📰 For an introduction to MMM, you can refer to this article

Multi-Touch Attribution (MTA)

With the rise of web tracking and cookies, a new wave of attribution methodologies emerged around 2010. These were based on customer journeys — sequences of interactions that users have across multiple marketing channels before converting.

This shift marked the beginning of the Multi-Touch Attribution (MTA) era. Common MTA approaches include:

  • Logistic Regression
  • Shapley Value
  • Markov Models

These models assign conversion credit across channels based on how frequently and effectively each channel appears in the paths to conversion.

📰 For an introduction to Multi-Touch Attribution, see this article

The Impact of Privacy Restrictions

In recent years, privacy regulations and browser changes have begun to reshape digital tracking:

  • Several browsers have already blocked third-party cookies
  • Google Chrome (which holds ~60% of the global browser market) still supports third-party cookies; Google announced in 2024 that it would no longer phase them out as previously planned

These changes threaten the availability of rich customer journey data that powers multi-touch attribution models.

However, many companies are transitioning to first-party cookie systems to preserve access to user-level data on their own domains.

🔍 This does not mean the end of multi-touch attribution, as some have feared.
But it does increase the need for attribution methodologies that work with aggregated time series data.

Why UAM?

Until now, the Media-Mix Model has been the predominant solution for working with aggregated data.
But now, the Unified Attribution Model (UAM) offers a robust alternative — capable of working with:

  • Channels tracked via customer journeys (when available)
  • Channels with only aggregated touchpoints (clicks or impressions)
  • Mixed digital and traditional media
  • Privacy-friendly data sources

UAM fills the gap between MTA and MMM, offering flexibility and interpretability even when journey-level data is incomplete or missing.

2. Introduction to UAM

Unlike Marketing Mix Modeling (MMM), which fits a regression model between the target variable (e.g., amount sold or number of conversions) and a set of independent variables (e.g., marketing spend per channel, seasonality, etc.), UAM takes a different approach.
UAM uses impressions and clicks to evaluate the contribution of each marketing channel to the observed incremental number of conversions, leveraging a reward model inspired by the Shapley value.

UAM requires the presence of touchpoints (i.e., impressions and/or clicks) for each channel involved. These touchpoints can be provided in either:

  • Aggregated form: e.g., number of clicks or impressions per channel at fixed time intervals
  • Customer journey form: i.e., sequences of touchpoints linked to individual users

Thanks to this flexibility, UAM can be applied to both digital and traditional marketing channels, as long as they can be expressed through measurable touchpoints.

UAM can be effectively applied in the following scenarios:

  • When no customer journeys are available, but the aggregated number of touchpoints for each channel is available at fixed time intervals

  • When customer journeys are available for some channels, while aggregated touchpoint data is available for others

3. UAM when no customer journeys are available

If customer journeys are not available, then the two required inputs to run a UAM attribution analysis are:

  • Aggregated Conversions and Traffic per Channel

This is a table that reports, for each time interval, the number of conversions and the observed traffic (e.g., impressions or clicks) on each marketing channel:

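| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | 110 | 1210 | 840 | 255 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 120 | 160 | 1100 | 820 | 224 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 150 | 150 | 1345 | 660 | 220 |
| ... | ... | ... | ... | ... | ... | ... |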

In this table, we have four channels (A, B, C, D), and for each of them the number of touchpoints is stored at fixed time intervals.

  • Click-Through Rates per Channel

This table provides, for the same time intervals, the click-through rates (CTR) for each channel, when available:

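| timestamp_from | timestamp_to | A | B | C | D |
| --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 0.15 | NA | NA | 0.12 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 0.17 | NA | NA | 0.11 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 0.21 | NA | NA | 0.10 |
| ... | ... | ... | ... | ... | ... |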

From the table above, we infer that:

  • Channels A and D are expressed in clicks, because their CTR values are present.
  • Channels B and C are expressed in impressions, since their CTR values are missing (NA).

UAM is capable of mixing clicks and impressions by converting all values into a common unit of measure.

UAM first converts the number of clicks into impressions by dividing the number of clicks by the click-through rate (CTR). This allows every channel to be represented in terms of impressions, which we refer to as touchpoints.

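| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | 110 / 0.15 | 1210 | 840 | 255 / 0.12 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 120 | 160 / 0.17 | 1100 | 820 | 224 / 0.11 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 150 | 150 / 0.21 | 1345 | 660 | 220 / 0.10 |
| ... | ... | ... | ... | ... | ... | ... |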

This results in:

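| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | 733 | 1210 | 840 | 2125 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 120 | 941 | 1100 | 820 | 2036 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 150 | 714 | 1345 | 660 | 2200 |
| ... | ... | ... | ... | ... | ... | ... |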
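The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name `to_touchpoints`, the use of `None` for NA, and rounding to the nearest integer are assumptions of this sketch, not part of UAM's documented interface.

```python
from typing import Optional


def to_touchpoints(value: float, ctr: Optional[float]) -> int:
    """Express one channel value in impressions.

    A channel measured in clicks (CTR given) is divided by its CTR;
    a channel already measured in impressions (CTR is None) is kept as is.
    Rounding to the nearest integer is an assumption of this sketch.
    """
    if ctr is None:
        return round(value)
    if ctr <= 0:
        raise ValueError("CTR must be positive to convert clicks to impressions")
    return round(value / ctr)


# Values copied from the two input tables above (None stands for NA).
traffic = [
    {"A": 110, "B": 1210, "C": 840, "D": 255},
    {"A": 160, "B": 1100, "C": 820, "D": 224},
    {"A": 150, "B": 1345, "C": 660, "D": 220},
]
ctr = [
    {"A": 0.15, "B": None, "C": None, "D": 0.12},
    {"A": 0.17, "B": None, "C": None, "D": 0.11},
    {"A": 0.21, "B": None, "C": None, "D": 0.10},
]

# One dict of touchpoints (impressions) per time interval.
touchpoints = [
    {channel: to_touchpoints(row[channel], rates[channel]) for channel in row}
    for row, rates in zip(traffic, ctr)
]
```

Treating a missing CTR as "already in impressions" mirrors the inference made from the CTR table.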

To evaluate incremental effects, UAM calculates first differences on the time series:

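| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | - | - | - | - | - |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | -70 | +208 | -110 | -20 | -89 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | +30 | -227 | +245 | -160 | +164 |
| ... | ... | ... | ... | ... | ... | ... |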

These differences will be used in the reward model to assign positive and negative rewards per channel, based on how changes in conversions correlate with changes in touchpoints.
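As a minimal sketch, the first differences can be computed as follows. The helper name `first_differences` and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List


def first_differences(series: List[float]) -> List[float]:
    """Return series[t] - series[t-1] for every interval after the first."""
    return [current - previous for previous, current in zip(series, series[1:])]


# Conversions and touchpoints (in impressions) for the three intervals above.
conversions = [190, 120, 150]
touchpoints: Dict[str, List[float]] = {
    "A": [733, 941, 714],
    "B": [1210, 1100, 1345],
    "C": [840, 820, 660],
    "D": [2125, 2036, 2200],
}

delta_conversions = first_differences(conversions)
delta_touchpoints = {ch: first_differences(values) for ch, values in touchpoints.items()}
```

The first interval has no predecessor, so each differenced series is one element shorter than its input.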

The table of first differences above serves as the input for the reward model. The reward model computes two quantities for each channel: positive reward and negative reward.
A positive reward is assigned to a channel at a given time instant when conversions and the number of touchpoints for that channel move in the same direction (i.e., both increase or both decrease).
A negative reward is assigned when conversions and the number of touchpoints move in opposite directions (i.e., conversions increase while touchpoints decrease, or vice versa).
The total positive reward for a channel is the sum of all its positive rewards over time, and the total negative reward is the sum of all its negative rewards.

Let $t$ be a generic time instant and $k$ a generic channel. Let $\text{number\_touchpoints}_{t,k}$ denote the observed number of touchpoints for channel $k$ at time $t$, and let $\text{number\_conversions}_t$ denote the observed number of conversions at time $t$. The reward function is defined as:

$$
\text{reward}_{t,k} = \min \left( \text{abs\_delta\_number\_conversions}_t,\ \frac{ \text{abs\_delta\_number\_touchpoints}_{t,k} }{ \text{avg\_number\_touchpoints\_per\_conversion} } \right), \quad t = 1, \dots, T,\quad k = 1, \dots, K
$$

where:

$$
\text{abs\_delta\_number\_conversions}_t = \left| \text{number\_conversions}_t - \text{number\_conversions}_{t-1} \right|
$$

$$
\text{abs\_delta\_number\_touchpoints}_{t,k} = \left| \text{number\_touchpoints}_{t,k} - \text{number\_touchpoints}_{t-1,k} \right|
$$

$$
\text{avg\_number\_touchpoints\_per\_conversion} = \frac{ \sum\limits_{t,k} \text{number\_touchpoints}_{t,k} }{ \sum\limits_t \text{number\_conversions}_t }
$$
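The definitions above translate almost directly into code. The sketch below is illustrative only; the function names and argument layout are assumptions, not UAM's actual API.

```python
from typing import Dict, List


def avg_touchpoints_per_conversion(
    touchpoints: Dict[str, List[float]], conversions: List[float]
) -> float:
    """Total touchpoints over all channels and intervals, divided by total conversions."""
    total_conversions = sum(conversions)
    if total_conversions == 0:
        raise ValueError("at least one conversion is needed")
    return sum(sum(values) for values in touchpoints.values()) / total_conversions


def reward(delta_conversions: float, delta_touchpoints: float, avg_tp_per_conv: float) -> float:
    """Unsigned reward for one channel and one interval.

    The touchpoint change is first expressed in 'conversion units' by dividing
    it by the average number of touchpoints per conversion; the reward is then
    capped by the change in conversions.
    """
    return min(abs(delta_conversions), abs(delta_touchpoints) / avg_tp_per_conv)
```

Whether a reward counts as positive or negative depends on the signs of the two changes, which this function deliberately ignores.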

Suppose that we found:

$$
\text{avg\_number\_touchpoints\_per\_conversion} = 10
$$

We demonstrate in this example how to calculate positive rewards.

  1. Zeroing mismatched signs

For each row, we set the reward to zero for all channels where the sign of the value does not match the sign of the value in the conversions column. This means we compare the sign of conversions with the sign of each channel column (A, B, C, D) and keep only those with the same sign; others are set to zero.

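| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | - | - | - | - | - |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | -70 | 0 | -110 | -20 | -89 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | +30 | 0 | +245 | 0 | +164 |
| ... | ... | ... | ... | ... | ... | ... |
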
  2. Taking the absolute values

We now take the absolute value of all numerical entries, while keeping the 0 values unchanged.

| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | - | - | - | - | - |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 70 | 0 | 110 | 20 | 89 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 30 | 0 | 245 | 0 | 164 |
| ... | ... | ... | ... | ... | ... | ... |

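  3. Calculating positive rewards

Assuming avg_number_touchpoints_per_conversion = 10, we apply the reward function to each channel, capping each value at the absolute change in conversions:

| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | - | - | - | - | - |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 70 | min(70, 0/10) | min(70, 110/10) | min(70, 20/10) | min(70, 89/10) |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 30 | min(30, 0/10) | min(30, 245/10) | min(30, 0/10) | min(30, 164/10) |
| ... | ... | ... | ... | ... | ... | ... |

  4. Final reward values

| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | - | - | - | - | - |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 70 | 0 | 11 | 2 | 8.9 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 30 | 0 | 24.5 | 0 | 16.4 |
| ... | ... | ... | ... | ... | ... | ... |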

In the table above, columns A, B, C, and D contain the positive rewards assigned to each channel based on their contribution to the observed conversions.

  5. Aggregating rewards

We compute the total positive reward per channel:

| channel | total positive rewards |
| --- | --- |
| A | 1,200 |
| B | 5,400 |
| C | 3,500 |
| D | 1,800 |

The same procedure can be applied to compute negative rewards, by reversing the sign-matching condition.
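The whole procedure, for both positive and negative rewards, can be sketched as a single pass over the first differences. This is a minimal illustration that applies the reward function exactly as defined earlier; the function name `total_rewards` and the data layout are assumptions, and only the two differenced intervals shown in this example are used.

```python
from typing import Dict, List, Tuple


def total_rewards(
    delta_conversions: List[float],
    delta_touchpoints: Dict[str, List[float]],
    avg_tp_per_conv: float,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (total positive reward, total negative reward) per channel."""
    positive = {ch: 0.0 for ch in delta_touchpoints}
    negative = {ch: 0.0 for ch in delta_touchpoints}
    for ch, deltas in delta_touchpoints.items():
        for d_conv, d_tp in zip(delta_conversions, deltas):
            value = min(abs(d_conv), abs(d_tp) / avg_tp_per_conv)
            if d_conv * d_tp > 0:
                # Conversions and touchpoints moved in the same direction.
                positive[ch] += value
            elif d_conv * d_tp < 0:
                # Conversions and touchpoints moved in opposite directions.
                negative[ch] += value
    return positive, negative


# First differences from the example (intervals 2 and 3).
delta_conversions = [-70, 30]
delta_touchpoints = {
    "A": [208, -227],
    "B": [-110, 245],
    "C": [-20, -160],
    "D": [-89, 164],
}
positive, negative = total_rewards(delta_conversions, delta_touchpoints, avg_tp_per_conv=10)
```

When either change is zero the reward is zero as well, so such intervals contribute to neither total.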

Suppose we found:

| channel | total negative rewards |
| --- | --- |
| A | 2,400 |
| B | 8,400 |
| C | 2,000 |
| D | 2,000 |

After computing the total positive and negative rewards for each channel, we calculate the weights to be used for attribution as the ratio between them. To avoid division by zero, we apply a smoothing factor by adding 1 to both the numerator and the denominator:

$$
\text{weight}_k = \frac{ 1 + \text{total\_positive\_reward}_k }{ 1 + \text{total\_negative\_reward}_k } \quad \forall k
$$

These weights can then be used to assign attribution scores proportionally across channels, based on how consistently their activity correlates with conversion trends.

In our example:

| channel | weight |
| --- | --- |
| A | (1 + 1,200) / (1 + 2,400) ≈ 0.50 |
| B | (1 + 5,400) / (1 + 8,400) ≈ 0.64 |
| C | (1 + 3,500) / (1 + 2,000) ≈ 1.75 |
| D | (1 + 1,800) / (1 + 2,000) ≈ 0.90 |
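A short sketch of the weight computation, using the totals from the two reward tables. The function name `channel_weights` is an assumption made for illustration.

```python
from typing import Dict


def channel_weights(
    total_positive: Dict[str, float], total_negative: Dict[str, float]
) -> Dict[str, float]:
    """Smoothed ratio (1 + positive) / (1 + negative) for every channel."""
    return {
        ch: (1 + total_positive[ch]) / (1 + total_negative[ch])
        for ch in total_positive
    }


# Totals from the positive and negative reward tables above.
total_positive = {"A": 1200, "B": 5400, "C": 3500, "D": 1800}
total_negative = {"A": 2400, "B": 8400, "C": 2000, "D": 2000}

weights = channel_weights(total_positive, total_negative)
rounded = {ch: round(w, 2) for ch, w in weights.items()}
```

With totals in the thousands the added 1 barely changes the ratio, but it keeps the weight finite for a channel with no negative reward at all.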

Once the attribution weights are defined, the final attribution for each channel at time $t$ is computed using the following formula:

$$
\text{final\_attribution}_{t,k} = \frac{ \text{attribution\_weight}_{t,k} }{ \sum\limits_{k'} \text{attribution\_weight}_{t,k'} } \times \text{number\_conversions}_{t} \quad \forall t,\ \forall k
$$

This formula distributes the total number of conversions observed at time $t$ across all channels proportionally to their attribution weights.

The attribution weight for each channel $k$ at time $t$ is defined as:

$$
\text{attribution\_weight}_{t,k} = \min \left( \text{number\_conversions}_{t},\ \frac{ \text{number\_touchpoints}_{t,k} }{ \text{avg\_number\_touchpoints\_per\_conversion} } \right) \times \text{weight}_k \quad \forall t,\ \forall k
$$
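The two formulas above can be sketched for a single time interval as follows. The function names, the argument layout, and the all-zero fallback when every attribution weight is zero are assumptions of this illustration, not documented UAM behavior.

```python
from typing import Dict


def attribution_weights(
    conversions_t: float,
    touchpoints_t: Dict[str, float],
    weights: Dict[str, float],
    avg_tp_per_conv: float,
) -> Dict[str, float]:
    """Attribution weight of every channel for one time interval."""
    return {
        ch: min(conversions_t, touchpoints_t[ch] / avg_tp_per_conv) * weights[ch]
        for ch in touchpoints_t
    }


def final_attribution(
    conversions_t: float,
    touchpoints_t: Dict[str, float],
    weights: Dict[str, float],
    avg_tp_per_conv: float,
) -> Dict[str, float]:
    """Split the interval's conversions proportionally to the attribution weights."""
    raw = attribution_weights(conversions_t, touchpoints_t, weights, avg_tp_per_conv)
    total = sum(raw.values())
    if total == 0:
        # Assumption of this sketch: nothing to distribute without any signal.
        return {ch: 0.0 for ch in raw}
    return {ch: conversions_t * value / total for ch, value in raw.items()}


# Inputs for the first interval of the running example.
attribution = final_attribution(
    conversions_t=190,
    touchpoints_t={"A": 110, "B": 1210, "C": 840, "D": 255},
    weights={"A": 0.50, "B": 0.64, "C": 1.75, "D": 0.90},
    avg_tp_per_conv=10,
)
```

By construction the attributed values of an interval always sum to the conversions observed in that interval.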

In our example, the attribution weights, computed from the channel values in the first input table, are:

| timestamp_from | timestamp_to | conversions | A | B | C | D | ROW TOTAL |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | min(190, 110/10) × 0.50 = 5.50 | min(190, 1210/10) × 0.64 = 77.44 | min(190, 840/10) × 1.75 = 147.00 | min(190, 255/10) × 0.90 = 22.95 | 252.89 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 120 | min(120, 160/10) × 0.50 = 8.00 | min(120, 1100/10) × 0.64 = 70.40 | min(120, 820/10) × 1.75 = 143.50 | min(120, 224/10) × 0.90 = 20.16 | 242.06 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 150 | min(150, 150/10) × 0.50 = 7.50 | min(150, 1345/10) × 0.64 = 86.08 | min(150, 660/10) × 1.75 = 115.50 | min(150, 220/10) × 0.90 = 19.80 | 228.88 |
| ... | ... | ... | ... | ... | ... | ... | ... |

and the final attribution is:

| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | 190 × (5.50 / 252.89) = 4.13 | 190 × (77.44 / 252.89) = 58.18 | 190 × (147.00 / 252.89) = 110.44 | 190 × (22.95 / 252.89) = 17.24 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 120 | 120 × (8.00 / 242.06) = 3.97 | 120 × (70.40 / 242.06) = 34.90 | 120 × (143.50 / 242.06) = 71.14 | 120 × (20.16 / 242.06) = 9.99 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 150 | 150 × (7.50 / 228.88) = 4.92 | 150 × (86.08 / 228.88) = 56.41 | 150 × (115.50 / 228.88) = 75.69 | 150 × (19.80 / 228.88) = 12.98 |
| ... | ... | ... | ... | ... | ... | ... |

4. UAM when customer journeys are available only for some channels

When customer journeys are available for some channels, while only an aggregated number of clicks or impressions is available for others, the Unified Attribution Model (UAM) performs hybrid attribution by combining:

  • The output of the reward model, applied to aggregated time series data
  • The output of a Markov model, applied to customer journey data

This blended approach enables UAM to make the most of the available information from both structured journey data and time-based aggregation, providing reliable attribution in heterogeneous measurement environments.

In this example, we suppose that customer journeys are available for channels A and B, while channels C and D are only tracked through aggregated touchpoints.

Here is a sample of observed user paths:

| id_path | timestamp | channel |
| --- | --- | --- |
| 0 | 2019-01-01 00:19:05 | A |
| 0 | 2019-01-01 00:29:18 | B |
| 1 | 2019-01-01 00:39:20 | A |
| 1 | 2019-01-01 00:44:37 | A |
| 1 | 2019-01-01 00:49:34 | ((CONV)) |
| 2 | 2019-01-01 00:19:31 | B |
| 2 | 2019-01-01 00:24:38 | B |
| 2 | 2019-01-01 00:29:44 | A |
| 2 | 2019-01-01 00:31:08 | B |
| ... | ... | ... |

We can apply a Markov model to this dataset to estimate the odds of each channel. The result is a table of channel odds:

| Channel | Odds |
| --- | --- |
| A | 1.1 |
| B | 0.8 |

These odds will be used to adjust the attribution values originally generated by the reward model.

| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | 4.13 × 1.1 = 4.54 | 58.18 × 0.8 = 46.54 | 110.44 | 17.24 |
| 2019-01-02 00:00:00 | 2019-01-02 23:59:59 | 120 | 3.97 × 1.1 = 4.37 | 34.90 × 0.8 = 27.92 | 71.14 | 9.99 |
| 2019-01-03 00:00:00 | 2019-01-03 23:59:59 | 150 | 4.92 × 1.1 = 5.41 | 56.41 × 0.8 = 45.13 | 75.69 | 12.98 |
| ... | ... | ... | ... | ... | ... | ... |

Once adjusted, we normalize the contributions from A and B while keeping C and D unchanged. Here's an example for the first row:

Total adjusted value for A and B:

  • A: 4.54
  • B: 46.54
  • Sum: 4.54 + 46.54 = 51.08
  • Raw total for A+B: 4.13 + 58.18 = 62.31

We then distribute the original sum (4.13 + 58.18) proportionally:

| timestamp_from | timestamp_to | conversions | A | B | C | D |
| --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 00:00:00 | 2019-01-01 23:59:59 | 190 | (4.13 + 58.18) × 4.54 / (4.54 + 46.54) = 5.54 | (4.13 + 58.18) × 46.54 / (4.54 + 46.54) = 56.77 | 110.44 | 17.24 |
| ... | ... | ... | ... | ... | ... | ... |

This way, we combine the reward model (aggregated signals) with the Markov model (journey data) for a consistent and unified attribution.
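The adjustment and renormalization steps can be sketched as one function applied to a single row of reward-model attributions. The name `blend_with_odds` and the input values in the usage lines are hypothetical; estimating the odds from the journeys with a Markov model is outside the scope of this sketch.

```python
from typing import Dict


def blend_with_odds(attribution: Dict[str, float], odds: Dict[str, float]) -> Dict[str, float]:
    """Rescale journey-tracked channels by their odds, preserving their combined total.

    Channels without odds (tracked only through aggregated touchpoints)
    keep the attribution assigned by the reward model.
    """
    journey_channels = [ch for ch in attribution if ch in odds]
    original_total = sum(attribution[ch] for ch in journey_channels)
    adjusted = {ch: attribution[ch] * odds[ch] for ch in journey_channels}
    adjusted_total = sum(adjusted.values())

    result = dict(attribution)
    if adjusted_total > 0:
        for ch in journey_channels:
            # Redistribute the original combined total in proportion to the adjusted values.
            result[ch] = original_total * adjusted[ch] / adjusted_total
    return result


# Hypothetical reward-model attributions for one interval, with odds for A and B only.
row = {"A": 10.0, "B": 30.0, "C": 50.0, "D": 10.0}
blended = blend_with_odds(row, odds={"A": 1.1, "B": 0.8})
```

Because only the split between journey-tracked channels changes, the row still sums to the conversions of that interval.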

5. Differences between MMM and UAM

| Frequentist MMM | Bayesian MMM | UAM |
| --- | --- | --- |
| Parametric approach based on a linear model. | Parametric approach based on a Bayesian linear model. | Non-parametric approach inspired by the Shapley value. |
| Requires long time series. | Requires long time series. | Works well with short time series. |
| Small channels are penalized: their estimated effects tend to be 0. | Coefficient estimates for small channels benefit from prior distributions. | Estimated effects for small channels benefit from the implicit assumption that, before observing data, every channel has the same effect on conversions. This acts like a prior distribution. |
| Feasible only for a few channels. If the number of channels is high, a very long time series is required. | Feasible only for a few channels. If the number of channels is high, the Bayesian approach is slow and requires a very long time series. | Feasible for a high number of channels. |
| Takes a long time to implement and fine-tune. | Takes a long time to implement and fine-tune. It also requires subjective assumptions in the choice of prior distributions. | Automatic approach. |