Unified Attribution Model for impressions and clicks

A unified approach in marketing attribution for impressions and clicks

Marketing Attribution is a crucial topic for companies that need to allocate budget across multiple marketing channels. Every day, billions of dollars are shifted between the platforms that companies use to advertise their products or services, with the aim of increasing sales and/or improving marketing efficiency.

When Customer Journeys Are Limited or Missing

When customer journeys are available only for some channels or not available at all, the Media-Mix Model (MMM) is often seen as the only available methodology for marketing attribution.

However, if the aggregated number of touchpoints (e.g., impressions or clicks) is available for all involved channels at fixed time intervals, a new alternative becomes viable: the Unified Attribution Model (UAM).

1. Attribution Methodologies

Media-Mix Modeling (MMM)

The first statistical approach introduced for marketing attribution was the Marketing-Mix Model, also known as Media-Mix Model (MMM). This methodology dates back to the late 1940s.

MMM is a regression-based model that takes as input:

Marketing spend over time for each channel
A target variable (e.g., number of sales or revenue)
Optional control variables (e.g., economic indicators, promotions, weather)

These inputs allow the model to estimate how much each channel contributes to the target outcome.
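
In its simplest linear form, such a model can be sketched as follows. The notation is an assumption introduced here for illustration: $y_t$ is the target at time $t$, $x_{t,k}$ is the spend on channel $k$, $z_{t,j}$ are the optional control variables, and $\varepsilon_t$ is an error term. Practical MMMs usually also transform spend (for example, to capture carry-over and saturation effects) before fitting.

$$ y_t = \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{t,k} + \sum_{j=1}^{J} \gamma_j \, z_{t,j} + \varepsilon_t $$

In this sketch, the estimated coefficient $\beta_k$ is what the model reads as the contribution of channel $k$.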

📰 For an introduction to MMM, you can refer to this article

Multi-Touch Attribution (MTA)

With the rise of web tracking and cookies, a new wave of attribution methodologies emerged around 2010. These were based on customer journeys — sequences of interactions that users have across multiple marketing channels before converting.

This shift marked the beginning of the Multi-Touch Attribution (MTA) era. Common MTA approaches include:

Logistic Regression
Shapley Value
Markov Models

These models assign conversion credit across channels based on how frequently and effectively each channel appears in the paths to conversion.

📰 For an introduction to Multi-Touch Attribution, see this article

The Impact of Privacy Restrictions

In recent years, privacy regulations and browser changes have begun to reshape digital tracking:

Several browsers have already blocked third-party cookies
Google Chrome (which holds ~60% of the global browser market) has not yet fully phased out third-party cookies, but plans to do so

These changes threaten the availability of rich customer journey data that powers multi-touch attribution models.

However, many companies are transitioning to first-party cookie systems to preserve access to user-level data on their own domains.

🔍 This does not mean the end of multi-touch attribution, as some have feared.
But it does increase the need for attribution methodologies that work with aggregated time series data.

Why UAM?

Until now, the Media-Mix Model has been the predominant solution for working with aggregated data.
The Unified Attribution Model (UAM) offers a robust alternative, capable of working with:

Channels tracked via customer journeys (when available)
Channels with only aggregated touchpoints (clicks or impressions)
Mixed digital and traditional media
Privacy-friendly data sources

UAM fills the gap between MTA and MMM, offering flexibility and interpretability even when journey-level data is incomplete or missing.

2. Introduction to UAM

Unlike Marketing Mix Modeling (MMM), which fits a regression model between the target variable (e.g., amount sold or number of conversions) and a set of independent variables (e.g., marketing spend per channel, seasonality, etc.), UAM takes a different approach.
UAM uses impressions and clicks to evaluate the contribution of each marketing channel to the observed incremental number of conversions, leveraging a reward model inspired by the Shapley value.

UAM requires the presence of touchpoints (i.e., impressions and/or clicks) for each channel involved. These touchpoints can be provided in either:

Aggregated form: e.g., number of clicks or impressions per channel at fixed time intervals
Customer journey form: i.e., sequences of touchpoints linked to individual users

Thanks to this flexibility, UAM can be applied to both digital and traditional marketing channels, as long as they can be expressed through measurable touchpoints.

UAM can be effectively applied in the following scenarios:

When no customer journeys are available, but the aggregated number of touchpoints for each channel is available at fixed time intervals
When customer journeys are available for some channels, while aggregated touchpoint data is available for others

3. UAM when no customer journeys are available

If customer journeys are not available, then the two required inputs to run a UAM attribution analysis are:

Aggregated Conversions and Traffic per Channel

This is a table that reports, for each time interval, the number of conversions and the observed traffic (e.g., impressions or clicks) on each marketing channel:

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	190	110	1210	840	255
2019-01-02 00:00:00	2019-01-02 23:59:59	120	160	1100	820	224
2019-01-03 00:00:00	2019-01-03 23:59:59	150	150	1345	660	220
...	...	...	...	...	...	...

In this table, we have four channels (A, B, C, D), and for each of them the number of touchpoints is stored at fixed time intervals.

Click-Through Rates per Channel

This table provides, for the same time intervals, the click-through rates (CTR) for each channel, when available:

timestamp_from	timestamp_to	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	0.15	NA	NA	0.12
2019-01-02 00:00:00	2019-01-02 23:59:59	0.17	NA	NA	0.11
2019-01-03 00:00:00	2019-01-03 23:59:59	0.21	NA	NA	0.10
...	...	...	...	...	...

From the table above, we infer that:

Channels A and D are expressed in clicks, because their CTR values are present.
Channels B and C are expressed in impressions, since their CTR values are missing (NA).

UAM is capable of mixing clicks and impressions by converting all values into a common unit of measure.

UAM first converts the number of clicks into impressions by dividing the number of clicks by the click-through rate (CTR). This allows every channel to be represented in terms of impressions, which we refer to as touchpoints.

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	190	110 / 0.15	1210	840	255 / 0.12
2019-01-02 00:00:00	2019-01-02 23:59:59	120	160 / 0.17	1100	820	224 / 0.11
2019-01-03 00:00:00	2019-01-03 23:59:59	150	150 / 0.21	1345	660	220 / 0.10
...	...	...	...	...	...	...

This results in (values rounded to the nearest integer):

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	190	733	1210	840	2125
2019-01-02 00:00:00	2019-01-02 23:59:59	120	941	1100	820	2036
2019-01-03 00:00:00	2019-01-03 23:59:59	150	714	1345	660	2200
...	...	...	...	...	...	...
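
The conversion from clicks to impressions can be sketched in a few lines of Python. This is a minimal illustration, not the model's actual implementation: the names `traffic`, `ctr`, and `to_touchpoints` are hypothetical, and a missing CTR (NA in the table) is represented as `None`.

```python
# Figures copied from the two input tables; container names are hypothetical.
traffic = [
    {"A": 110, "B": 1210, "C": 840, "D": 255},
    {"A": 160, "B": 1100, "C": 820, "D": 224},
    {"A": 150, "B": 1345, "C": 660, "D": 220},
]
ctr = [
    {"A": 0.15, "B": None, "C": None, "D": 0.12},
    {"A": 0.17, "B": None, "C": None, "D": 0.11},
    {"A": 0.21, "B": None, "C": None, "D": 0.10},
]


def to_touchpoints(traffic_row, ctr_row):
    """Express every channel in impressions: clicks / CTR where a CTR exists."""
    out = {}
    for channel, value in traffic_row.items():
        rate = ctr_row.get(channel)
        # A missing CTR means the channel is already measured in impressions.
        # A CTR of exactly zero is not handled in this sketch.
        out[channel] = value if rate is None else value / rate
    return out


touchpoints = [to_touchpoints(t, c) for t, c in zip(traffic, ctr)]
rounded = [{ch: round(v) for ch, v in row.items()} for row in touchpoints]
```

Rounding each value to the nearest integer, as `rounded` does, gives the figures shown in the touchpoint table above.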

To evaluate incremental effects, UAM calculates first differences on the time series:

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	-	-	-	-	-
2019-01-02 00:00:00	2019-01-02 23:59:59	-70	+208	-110	-20	-89
2019-01-03 00:00:00	2019-01-03 23:59:59	+30	-227	+245	-160	+164
...	...	...	...	...	...	...
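
As a minimal sketch, the first differences can be computed as below. The helper name `first_differences` and the dictionary layout are assumptions made for this example; the first interval has no predecessor, so its difference is left undefined (`None`), matching the "-" entries in the table.

```python
# Touchpoints (in impressions) and conversions from the tables above.
conversions = [190, 120, 150]
touchpoints = {
    "A": [733, 941, 714],
    "B": [1210, 1100, 1345],
    "C": [840, 820, 660],
    "D": [2125, 2036, 2200],
}


def first_differences(series):
    """Return x[t] - x[t-1] for each interval; the first one is undefined."""
    return [None] + [curr - prev for prev, curr in zip(series, series[1:])]


delta_conversions = first_differences(conversions)
delta_touchpoints = {ch: first_differences(v) for ch, v in touchpoints.items()}
```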

These differences will be used in the reward model to assign positive and negative rewards per channel, based on how changes in conversions correlate with changes in touchpoints.

The table of first differences above serves as the input for the reward model, which computes two quantities for each channel: a positive reward and a negative reward.
A positive reward is assigned to a channel at a given time instant when conversions and the number of touchpoints for that channel move in the same direction (i.e., both increase or both decrease).
A negative reward is assigned when conversions and the number of touchpoints move in opposite directions (i.e., conversions increase while touchpoints decrease, or vice versa).
The total positive reward for a channel is the sum of all its positive rewards over time, and the total negative reward is the sum of all its negative rewards.

Let $t$ be a generic time instant and $k$ a generic channel. Let $\text{number\_touchpoints}_{t,k}$ denote the observed number of touchpoints for channel $k$ at time $t$, and let $\text{number\_conversions}_t$ denote the observed number of conversions at time $t$. The reward function is defined as:

$$ \text{reward}_{t,k} = \min \left( \text{abs\_delta\_number\_conversions}_t,\ \frac{ \text{abs\_delta\_number\_touchpoints}_{t,k} }{ \text{avg\_number\_touchpoints\_per\_conversion} } \right), \quad t = 1, \dots, T,\quad k = 1, \dots, K $$

where:

$$ \text{abs\_delta\_number\_conversions}_t = \left| \text{number\_conversions}_t - \text{number\_conversions}_{t-1} \right| $$

$$ \text{abs\_delta\_number\_touchpoints}_{t,k} = \left| \text{number\_touchpoints}_{t,k} - \text{number\_touchpoints}_{t-1,k} \right| $$

$$ \text{avg\_number\_touchpoints\_per\_conversion} = \frac{ \sum\limits_{t,k} \text{number\_touchpoints}_{t,k} }{ \sum\limits_t \text{number\_conversions}_t } $$
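
The definitions above translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, and it evaluates the average on just the three sample rows shown earlier, so the result differs from the value of 10 assumed in the worked example that follows.

```python
def avg_touchpoints_per_conversion(touchpoints, conversions):
    """Total touchpoints over all channels and intervals, per conversion."""
    total_touchpoints = sum(sum(series) for series in touchpoints.values())
    return total_touchpoints / sum(conversions)


def reward(prev_conv, curr_conv, prev_tp, curr_tp, avg_tp_per_conv):
    """Unsigned reward for one channel between two consecutive intervals."""
    abs_delta_conv = abs(curr_conv - prev_conv)
    abs_delta_tp = abs(curr_tp - prev_tp)
    # The touchpoint change is rescaled into "conversion units" before the min.
    return min(abs_delta_conv, abs_delta_tp / avg_tp_per_conv)


# Sample rows only (three intervals, four channels).
conversions = [190, 120, 150]
touchpoints = {
    "A": [733, 941, 714],
    "B": [1210, 1100, 1345],
    "C": [840, 820, 660],
    "D": [2125, 2036, 2200],
}
avg_on_sample = avg_touchpoints_per_conversion(touchpoints, conversions)

# Channel B between the first and second interval, with the average set to 10.
reward_b = reward(190, 120, 1210, 1100, avg_tp_per_conv=10)
```

Dividing the touchpoint change by the average number of touchpoints per conversion puts both arguments of the minimum on the same scale, so a channel can never be credited with more than the observed change in conversions.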

Suppose that we found:

$$ \text{avg\_number\_touchpoints\_per\_conversion} = 10 $$

We demonstrate in this example how to calculate positive rewards.

Zeroing mismatched signs

For each row, we set the reward to zero for all channels where the sign of the value does not match the sign of the value in the conversions column. This means we compare the sign of conversions with the sign of each channel column (A, B, C, D) and keep only those with the same sign; others are set to zero.

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	-	-	-	-	-
2019-01-02 00:00:00	2019-01-02 23:59:59	-70	0	-110	-20	-89
2019-01-03 00:00:00	2019-01-03 23:59:59	+30	0	+245	0	+164
...	...	...	...	...	...	...

Taking the absolute values

We now take the absolute value of all numerical entries, while keeping the 0 values unchanged.

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	-	-	-	-	-
2019-01-02 00:00:00	2019-01-02 23:59:59	70	0	110	20	89
2019-01-03 00:00:00	2019-01-03 23:59:59	30	0	245	0	164
...	...	...	...	...	...	...

Calculating positive rewards

Assuming avg_number_touchpoints_per_conversion = 10, we calculate:

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	-	-	-	-	-
2019-01-02 00:00:00	2019-01-02 23:59:59	70	min(70,0/10)	min(70,110/10)	min(70,20/10)	min(70,89/10)
2019-01-03 00:00:00	2019-01-03 23:59:59	30	min(30,0/10)	min(30,245/10)	min(30,0/10)	min(30,164/10)
...	...	...	...	...	...	...

Final reward values

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	-	-	-	-	-
2019-01-02 00:00:00	2019-01-02 23:59:59	70	0	11	2	8.9
2019-01-03 00:00:00	2019-01-03 23:59:59	30	0	24.5	0	16.4
...	...	...	...	...	...	...

In the table above, columns A, B, C, and D contain the positive rewards assigned to each channel based on their contribution to the observed conversions.
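
The three steps (sign matching, absolute values, minimum) can be folded into one function. The sketch below is a hedged illustration with hypothetical names (`signed_reward`, `total_rewards`); it uses the first differences computed earlier (conversion deltas of -70 and +30) and the assumed average of 10 touchpoints per conversion.

```python
AVG_TP_PER_CONV = 10  # value assumed in the worked example
CHANNELS = ["A", "B", "C", "D"]

# First differences for 2019-01-02 and 2019-01-03.
deltas = [
    {"conversions": -70, "A": 208, "B": -110, "C": -20, "D": -89},
    {"conversions": 30, "A": -227, "B": 245, "C": -160, "D": 164},
]


def signed_reward(delta_conv, delta_tp, avg_tp_per_conv, kind="positive"):
    """Reward for one channel at one time step.

    kind="positive": credit only if both deltas move in the same direction.
    kind="negative": credit only if they move in opposite directions.
    A zero delta on either side yields no reward of either kind.
    """
    product = delta_conv * delta_tp
    matches = product > 0 if kind == "positive" else product < 0
    if not matches:
        return 0.0
    return min(abs(delta_conv), abs(delta_tp) / avg_tp_per_conv)


def total_rewards(rows, kind):
    """Sum the per-interval rewards of each channel over all rows."""
    totals = {ch: 0.0 for ch in CHANNELS}
    for row in rows:
        for ch in CHANNELS:
            totals[ch] += signed_reward(
                row["conversions"], row[ch], AVG_TP_PER_CONV, kind
            )
    return totals


total_positive = total_rewards(deltas, "positive")
total_negative = total_rewards(deltas, "negative")
```

Passing `kind="negative"` reverses the sign-matching condition, which is all that changes between the positive and the negative reward.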

Aggregating rewards

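We compute the total positive reward per channel by summing its positive rewards over all time intervals. Suppose that, over the full time series, we found: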

channel	total positive rewards
A	1,200
B	5,400
C	3,500
D	1,800

The same procedure can be applied to compute negative rewards, by reversing the sign-matching condition.

Suppose we found:

channel	total negative rewards
A	2,400
B	8,400
C	2,000
D	2,000

After computing the total positive and negative rewards for each channel, we calculate the weights to be used for attribution as the ratio between them. To avoid division by zero, we apply a smoothing factor by adding 1 to both the numerator and the denominator:

$$ \text{weight}_k = \frac{ 1 + \text{total\_positive\_reward}_k }{ 1 + \text{total\_negative\_reward}_k } \quad \forall k $$

These weights can then be used to assign attribution scores proportionally across channels, based on how consistently their activity correlates with conversion trends.

In our example (the +1 smoothing term is negligible at this scale and is omitted from the arithmetic):

channel	weight
A	1,200/2,400=0.50
B	5,400/8,400=0.64
C	3,500/2,000=1.75
D	1,800/2,000=0.90
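
A minimal sketch of the weight computation, including the smoothing term, is shown below. The function name `channel_weights` and the `smoothing` parameter are assumptions made for this example; the totals are the ones supposed in the text.

```python
total_positive = {"A": 1200, "B": 5400, "C": 3500, "D": 1800}
total_negative = {"A": 2400, "B": 8400, "C": 2000, "D": 2000}


def channel_weights(positive, negative, smoothing=1.0):
    """Ratio of smoothed positive to smoothed negative reward per channel."""
    return {
        ch: (smoothing + positive[ch]) / (smoothing + negative[ch])
        for ch in positive
    }


weights = channel_weights(total_positive, total_negative)
rounded_weights = {ch: round(w, 2) for ch, w in weights.items()}
```

With totals in the thousands, the smoothing term changes the weights only from the fourth decimal place onward; it matters mainly for channels whose negative reward is zero or very small.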

Once the attribution weights are defined, the final attribution for each channel at time $t$ is computed using the following formula:

$$ \text{final\_attribution}_{t,k} = \frac{ \text{attribution\_weight}_{t,k} }{ \sum\limits_{k} \text{attribution\_weight}_{t,k} } \times \text{number\_conversions}_{t} \quad \forall t,\ \forall k $$

This formula distributes the total number of conversions observed at time $t$ across all channels proportionally to their attribution weights.

The attribution weight for each channel $k$ at time $t$ is defined as:

$$ \text{attribution\_weight}_{t,k} = \min \left( \text{number\_conversions}_{t},\ \frac{ \text{number\_touchpoints}_{t,k} }{ \text{avg\_number\_touchpoints\_per\_conversion} } \right) \times \text{weight}_k \quad \forall t,\ \forall k $$
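
The two formulas can be sketched together as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the channel weights are the rounded values from the example (0.50, 0.64, 1.75, 0.90), the average number of touchpoints per conversion is set to 10, and the per-channel counts are those used in the worked example.

```python
AVG_TP_PER_CONV = 10
WEIGHTS = {"A": 0.50, "B": 0.64, "C": 1.75, "D": 0.90}

# (conversions, per-channel counts) for the three sample intervals.
rows = [
    (190, {"A": 110, "B": 1210, "C": 840, "D": 255}),
    (120, {"A": 160, "B": 1100, "C": 820, "D": 224}),
    (150, {"A": 150, "B": 1345, "C": 660, "D": 220}),
]


def attribution_weights(conversions_t, touchpoints_t):
    """Capped, rescaled touchpoints multiplied by the channel weight."""
    return {
        ch: min(conversions_t, tp / AVG_TP_PER_CONV) * WEIGHTS[ch]
        for ch, tp in touchpoints_t.items()
    }


def final_attribution(conversions_t, touchpoints_t):
    """Split the interval's conversions proportionally to the weights."""
    aw = attribution_weights(conversions_t, touchpoints_t)
    total = sum(aw.values())
    if total == 0:
        # No channel activity: nothing can be attributed in this sketch.
        return {ch: 0.0 for ch in aw}
    return {ch: conversions_t * w / total for ch, w in aw.items()}


attributions = [final_attribution(conv, tp) for conv, tp in rows]
```

Because each row is normalized by the sum of its attribution weights, the attributed conversions of every interval add up to the conversions observed in that interval.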

In our example, the attribution weights are:

timestamp_from	timestamp_to	conversions	A	B	C	D	ROW TOTAL
2019-01-01 00:00:00	2019-01-01 23:59:59	190	min(190,110/10) × 0.50 = 5.50	min(190,1210/10) × 0.64 = 77.44	min(190,840/10) × 1.75 = 147.00	min(190,255/10) × 0.90 = 22.95	252.89
2019-01-02 00:00:00	2019-01-02 23:59:59	120	min(120,160/10) × 0.50 = 8.00	min(120,1100/10) × 0.64 = 70.40	min(120,820/10) × 1.75 = 143.50	min(120,224/10) × 0.90 = 20.16	242.06
2019-01-03 00:00:00	2019-01-03 23:59:59	150	min(150,150/10) × 0.50 = 7.50	min(150,1345/10) × 0.64 = 86.08	min(150,660/10) × 1.75 = 115.50	min(150,220/10) × 0.90 = 19.80	228.88
...	...	...	...	...	...	...	...

and the final attribution is:

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	190	190 × (5.50 / 252.89) = 4.13	190 × (77.44 / 252.89) = 58.18	190 × (147.00 / 252.89) = 110.44	190 × (22.95 / 252.89) = 17.24
2019-01-02 00:00:00	2019-01-02 23:59:59	120	120 × (8.00 / 242.06) = 3.97	120 × (70.40 / 242.06) = 34.90	120 × (143.50 / 242.06) = 71.14	120 × (20.16 / 242.06) = 9.99
2019-01-03 00:00:00	2019-01-03 23:59:59	150	150 × (7.50 / 228.88) = 4.92	150 × (86.08 / 228.88) = 56.41	150 × (115.50 / 228.88) = 75.69	150 × (19.80 / 228.88) = 12.98
...	...	...	...	...	...	...

4. UAM when customer journeys are available only for some channels

When customer journeys are available for some channels, while only an aggregated number of clicks or impressions is available for others, the Unified Attribution Model (UAM) performs hybrid attribution by combining:

The output of the reward model, applied to aggregated time series data
The output of a Markov model, applied to customer journey data

This blended approach enables UAM to make the most of the available information from both structured journey data and time-based aggregation, providing reliable attribution in heterogeneous measurement environments.

In this example, we suppose that customer journeys are available for channels A and B, while channels C and D are only tracked through aggregated touchpoints.

Here is a sample of observed user paths:

id_path	timestamp	channel
0	2019-01-01 00:19:05	A
0	2019-01-01 00:29:18	B
1	2019-01-01 00:39:20	A
1	2019-01-01 00:44:37	A
1	2019-01-01 00:49:34	((CONV))
2	2019-01-01 00:19:31	B
2	2019-01-01 00:24:38	B
2	2019-01-01 00:29:44	A
2	2019-01-01 00:31:08	B
...	...	...

We can apply a Markov model to this dataset to estimate the odds of each channel. The result is a table of channel odds:

Channel	Odds
A	1.1
B	0.8

These odds are used to adjust the attribution values originally generated by the reward model. Only channels A and B, for which journeys are available, are multiplied by their odds:

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	190	4.13 × 1.1 = 4.54	58.18 × 0.8 = 46.54	110.44	17.24
2019-01-02 00:00:00	2019-01-02 23:59:59	120	3.97 × 1.1 = 4.37	34.90 × 0.8 = 27.92	71.14	9.99
2019-01-03 00:00:00	2019-01-03 23:59:59	150	4.92 × 1.1 = 5.41	56.41 × 0.8 = 45.13	75.69	12.98
...	...	...	...	...	...	...

Once adjusted, we rescale the contributions of A and B so that their sum equals their original combined attribution, while keeping C and D unchanged. Here's an example for the first row:

Adjusted values for A and B:

A: 4.54
B: 46.54
Sum: 4.54 + 46.54 = 51.08
Raw total for A+B: 4.13 + 58.18 = 62.31

We then distribute the original sum (4.13 + 58.18) proportionally:

timestamp_from	timestamp_to	conversions	A	B	C	D
2019-01-01 00:00:00	2019-01-01 23:59:59	190	(4.13 + 58.18) × 4.54 / (4.54 + 46.54) = 5.54	(4.13 + 58.18) × 46.54 / (4.54 + 46.54) = 56.77	110.44	17.24
...	...	...	...	...	...	...

This way, we combine the reward model (aggregated signals) with the Markov model (journey data) for a consistent and unified attribution.
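
The adjustment and rescaling step can be sketched as below. The function name `adjust_with_odds` is hypothetical, and the input attributions are round illustrative numbers rather than values from the tables above; only the odds (1.1 for A and 0.8 for B) come from the example.

```python
def adjust_with_odds(attribution_t, odds):
    """Reweight journey-tracked channels by their odds, preserving their sum.

    Channels without odds (aggregated-only channels) are left unchanged.
    """
    journey_channels = [ch for ch in attribution_t if ch in odds]
    raw_total = sum(attribution_t[ch] for ch in journey_channels)
    adjusted = {ch: attribution_t[ch] * odds[ch] for ch in journey_channels}
    adjusted_total = sum(adjusted.values())

    result = dict(attribution_t)
    if adjusted_total > 0:
        for ch in journey_channels:
            # Redistribute the original A+B total proportionally to the
            # odds-adjusted values.
            result[ch] = raw_total * adjusted[ch] / adjusted_total
    return result


# Illustrative reward-model output for one interval (hypothetical values).
reward_model_row = {"A": 10.0, "B": 30.0, "C": 40.0, "D": 20.0}
odds = {"A": 1.1, "B": 0.8}

hybrid_row = adjust_with_odds(reward_model_row, odds)
```

Rescaling within the journey-tracked channels only means the Markov model shifts credit between A and B without taking any conversions away from, or giving any to, the channels measured in aggregate.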

5. Differences between MMM and UAM

Frequentist MMM	Bayesian MMM	UAM
Parametric approach based on a linear model.	Parametric approach based on a Bayesian linear model.	Non-parametric approach inspired by the Shapley value.
Requires long time series.	Requires long time series.	Works well with short time series.
Small channels are penalized: their estimated effects are often 0.	Coefficient estimates for small channels benefit from prior distributions.	Estimated effects for small channels benefit from the implicit assumption that, before observing data, every channel has the same effect on conversions. In this sense, UAM behaves as if it had a prior distribution.
Feasible only for a few channels. With many channels, a very long time series is required.	Feasible only for a few channels. With many channels, the Bayesian approach is slow and requires a very long time series.	Feasible for a large number of channels.
Takes a long time to implement and fine-tune.	Takes a long time to implement and fine-tune. It also requires subjective choices of prior distributions.	Automatic approach.
