Making Aggregated Attribution More Useful
Marketing data is rarely clean, balanced, or measured on a single scale. One channel may produce millions of impressions, another may produce a few thousand clicks, and a third may appear only as branded searches, email interactions, store visits, or offline activity. Yet, at the end of the analysis, all of these signals are expected to answer the same question: how much did each channel contribute?
That is the practical problem addressed by mmm_attribution.
The function is designed for cases where path-level journeys are unavailable, incomplete, or insufficient to explain the full marketing picture. Instead of requiring user-level journeys, it works with aggregated signals over time: clicks, impressions, searches, views, sessions, leads, revenue proxies, or custom indicators built by the business.
The hidden difficulty: not all signals speak the same language
A click and an impression should not usually be treated as interchangeable. A direct search and a video view may both contain information, but they do not represent the same level of intent. Even two variables with the same measure can behave differently depending on the platform, campaign type, funnel position, and business context.
This is why mmm_attribution uses a prior_weight for each input variable: the business weight assigned to that raw signal. A common way to estimate it is conversions / touchpoints, but it can also be set from domain knowledge or historical benchmarks.
A prior weight is a compact way to bring business knowledge into the model before the attribution engine starts comparing signals. It does not need to be perfect. It simply gives the model a more realistic starting point than raw volumes alone.
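As a minimal sketch of the conversions / touchpoints estimate mentioned above, the snippet below derives a prior weight per variable from historical totals. The helper name `estimate_prior_weight`, the zero-touchpoint fallback, and the totals in `history` are assumptions made for illustration; they are not part of mmm_attribution.

```python
def estimate_prior_weight(conversions: float, touchpoints: float) -> float:
    """Estimate a prior weight as conversions per touchpoint.

    Returns 0.0 when there are no touchpoints, so an unobserved signal
    does not receive an arbitrary weight (an assumption of this sketch).
    """
    if conversions < 0 or touchpoints < 0:
        raise ValueError("conversions and touchpoints must be non-negative")
    if touchpoints == 0:
        return 0.0
    return conversions / touchpoints


# Hypothetical historical totals: (conversions, touchpoints) per variable.
history = {
    "google_clicks": (500, 2_000),
    "google_impressions": (1_500, 100_000),
    "email_clicks": (300, 1_000),
}

prior_weights = {
    name: estimate_prior_weight(conversions, touchpoints)
    for name, (conversions, touchpoints) in history.items()
}

for name, weight in prior_weights.items():
    print(f"{name}: {weight:.3f}")
```

An estimate like this is only a starting point. It can be overridden wherever experiments or domain knowledge suggest a different value.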
For example:
| variable | channel | measure | prior_weight |
|---|---|---|---|
| google_clicks | google_ads | clicks | 0.25 |
| google_impressions | google_ads | impressions | 0.015 |
| facebook_impressions | facebook_ads | impressions | 0.015 |
| email_clicks | email | clicks | 0.30 |
| direct_searches | direct | searches | 0.50 |
The exact values depend on the business. They may come from conversions divided by touchpoints, conversions divided by cost, previous experiments, incrementality studies, historical attribution, historical benchmarks, domain or platform knowledge, or conservative assumptions agreed with the client. The important point is that the analyst is no longer forced to pretend that every unit of every signal has the same meaning.
Why this matters
Without prior weights, high-volume upper-funnel variables can easily dominate the input space. That may produce results that are mathematically stable but commercially hard to accept. With prior weights, the model can still learn from the data, but it starts from a more credible representation of signal strength.
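To make the volume problem concrete, the sketch below compares each variable's share of raw volume with its share after multiplying by a prior weight. The weights are the example values used in this article. The weekly volumes are invented, and multiplying volume by weight is only an illustration of the starting representation, not a description of how mmm_attribution computes its results.

```python
# Example prior weights from this article.
prior_weight = {
    "google_clicks": 0.25,
    "google_impressions": 0.015,
    "facebook_impressions": 0.015,
    "email_clicks": 0.30,
    "direct_searches": 0.50,
}

# Hypothetical weekly volumes, invented for illustration only.
raw_volume = {
    "google_clicks": 40_000,
    "google_impressions": 600_000,
    "facebook_impressions": 900_000,
    "email_clicks": 15_000,
    "direct_searches": 20_000,
}


def shares(values: dict) -> dict:
    """Normalize non-negative values so that they sum to 1."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {name: value / total for name, value in values.items()}


# Scale each raw volume by its prior weight before normalizing.
weighted_volume = {
    name: raw_volume[name] * prior_weight[name] for name in raw_volume
}

raw_share = shares(raw_volume)
weighted_share = shares(weighted_volume)

for name in raw_volume:
    print(f"{name:22s} raw={raw_share[name]:6.1%}  weighted={weighted_share[name]:6.1%}")
```

With these made-up volumes, the two impression variables account for most of the raw input space but for less than half of the weighted one, which is the effect described above.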
This is especially useful when a channel is represented by multiple signals. For example, a paid social channel may have impressions, clicks, landing page views, and spend-related activity. A search channel may have impressions, clicks, and branded searches. A CRM channel may have sent messages, opens, and clicks.
The goal is not to force all of those variables into a single simplistic unit. The goal is to make them comparable enough for the attribution process to produce useful channel-level estimates.
A more realistic view of channels
Modern marketing channels are not single signals. A channel is often a bundle of measures: exposure, engagement, intent, and conversion proximity. mmm_attribution treats this structure explicitly through D_variables, where each input signal is mapped to a final channel and to a measure type.
This makes the model easier to explain:
- variables remain visible;
- channels remain the final attribution level;
- measure types provide context;
- prior weights encode business assumptions;
- diagnostics can show how the final attribution differs from the starting signal distribution.
The result is an attribution workflow that is less dependent on raw volume and more aligned with how marketers actually reason about signal quality.
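The variable-to-channel structure and the last diagnostic in the list can be sketched as follows. The list-of-dicts layout for `d_variables` is an assumed representation that mirrors the columns of the example table; the format that mmm_attribution actually expects is not shown here. The signal totals and the `final` shares are placeholders invented so the comparison has something to work with.

```python
from collections import defaultdict

# Assumed representation: one record per input variable.
d_variables = [
    {"variable": "google_clicks", "channel": "google_ads", "measure": "clicks", "prior_weight": 0.25},
    {"variable": "google_impressions", "channel": "google_ads", "measure": "impressions", "prior_weight": 0.015},
    {"variable": "facebook_impressions", "channel": "facebook_ads", "measure": "impressions", "prior_weight": 0.015},
    {"variable": "email_clicks", "channel": "email", "measure": "clicks", "prior_weight": 0.30},
    {"variable": "direct_searches", "channel": "direct", "measure": "searches", "prior_weight": 0.50},
]

# Hypothetical signal totals over the analysis window.
signal_totals = {
    "google_clicks": 40_000,
    "google_impressions": 600_000,
    "facebook_impressions": 900_000,
    "email_clicks": 15_000,
    "direct_searches": 20_000,
}


def starting_channel_shares(rows, totals):
    """Prior-weighted share of signal per channel, before any modelling."""
    weighted = defaultdict(float)
    for row in rows:
        weighted[row["channel"]] += totals[row["variable"]] * row["prior_weight"]
    grand_total = sum(weighted.values())
    return {channel: value / grand_total for channel, value in weighted.items()}


def attribution_shift(start, final):
    """Difference (final minus start) per channel, in share points."""
    channels = sorted(set(start) | set(final))
    return {c: final.get(c, 0.0) - start.get(c, 0.0) for c in channels}


start = starting_channel_shares(d_variables, signal_totals)

# Placeholder for the channel shares a fitted model would return (invented).
final = {"google_ads": 0.35, "facebook_ads": 0.25, "email": 0.15, "direct": 0.25}

shift = attribution_shift(start, final)
for channel, delta in shift.items():
    print(f"{channel:13s} start={start[channel]:6.1%}  shift={delta:+.1%}")
```

A large shift for a channel is not an error in itself. It shows where the data moved the result away from the business assumptions, which is usually the part worth explaining to stakeholders.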
What the analyst controls
The analyst controls three important choices:
- which variables enter the model;
- how each variable maps to a channel and measure;
- what prior weight each variable receives.
This creates a useful balance between automation and judgment. The model is not a black box that blindly consumes raw columns, but it is also not a manual scoring table. It combines structured business knowledge with data-driven attribution.
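Because these three choices are made by hand, it can help to check them before running any model. The following is a hypothetical pre-flight check, assuming the variable map is held as a list of dicts with the keys `variable`, `channel`, `measure`, and `prior_weight`. The rule that weights must be strictly positive is an assumption of this sketch, not a documented requirement of mmm_attribution.

```python
REQUIRED_KEYS = ("variable", "channel", "measure", "prior_weight")


def validate_variable_map(rows):
    """Return a list of readable problems; an empty list means no issue was found."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        # Choices 1 and 2: every variable needs a channel and a measure.
        missing = [key for key in REQUIRED_KEYS if row.get(key) in (None, "")]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
            continue
        if row["variable"] in seen:
            problems.append(f"row {index}: duplicate variable {row['variable']!r}")
        seen.add(row["variable"])
        # Choice 3: the prior weight must be a positive number (assumed rule).
        weight = row["prior_weight"]
        if not isinstance(weight, (int, float)) or weight <= 0:
            problems.append(f"row {index}: prior_weight must be a positive number, got {weight!r}")
    return problems


good_rows = [
    {"variable": "google_clicks", "channel": "google_ads", "measure": "clicks", "prior_weight": 0.25},
    {"variable": "email_clicks", "channel": "email", "measure": "clicks", "prior_weight": 0.30},
]

bad_rows = [
    {"variable": "google_clicks", "channel": "google_ads", "measure": "clicks", "prior_weight": 0.25},
    {"variable": "google_clicks", "channel": "google_ads", "measure": "clicks", "prior_weight": 0.25},
    {"variable": "email_opens", "channel": "email", "measure": "", "prior_weight": 0.10},
    {"variable": "push_sent", "channel": "push", "measure": "push_sent", "prior_weight": -0.1},
]

print(validate_variable_map(good_rows))
for problem in validate_variable_map(bad_rows):
    print(problem)
```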
Choosing prior weights
A practical way to think about a prior weight is to ask: how much conversion-equivalent value should one unit of this signal carry before the model evaluates the data?
For some businesses, a click may be much more valuable than an impression. For others, impressions from a highly qualified audience may deserve more credit than generic clicks. For email, a click may represent strong intent. For direct searches, the signal may be close to demand capture. For video views, the value may depend heavily on campaign objective and audience quality.
The following values are example prior weights that can serve as a conservative starting point before they are adapted to the specific business, market, and data source.
| Measure | Description | Example prior_weight (conversions / touchpoints) |
|---|---|---|
| impressions | Ad impressions served or exposures to the message. | 0.01 |
| views | Generic views of content or media assets. | 0.01 |
| video_views | Video views, including partial views, on paid or owned channels. | 0.01 |
| reach | Unique users reached by a campaign or content. | 0.01 |
| clicks | Clicks on ads, links, content, or calls to action. | 0.25 |
| sessions | Website or app sessions generated by a touchpoint. | 0.30 |
| visits | Visits to a website, landing page, app, or digital property. | 0.30 |
| landing_page_views | Effective landing page views after an interaction. | 0.25 |
| engaged_sessions | Sessions with qualified engagement signals. | 0.40 |
| add_to_cart | Add-to-cart events or equivalent strong intent actions. | 0.80 |
| checkout_started | Checkout starts or beginning of a final conversion process. | 0.90 |
| direct_searches | Direct searches for the brand, product, or website. | 0.50 |
| brand_searches | Searches explicitly related to the brand. | 0.70 |
| generic_searches | Non-brand searches related to needs, categories, or generic keywords. | 0.30 |
| emails_sent | Emails sent or delivered in a campaign. | 0.05 |
| email_opens | Recorded email opens. | 0.10 |
| email_clicks | Clicks generated from emails. | 0.30 |
| sms_sent | SMS messages sent or delivered. | 0.08 |
| sms_clicks | Clicks generated from SMS messages. | 0.35 |
| push_sent | Push notifications sent. | 0.03 |
| push_opens | Opens or initial interactions with push notifications. | 0.10 |
| affiliate_clicks | Clicks from affiliates or partners. | 0.30 |
| store_visits | Physical visits to a store or point of sale. | 0.50 |
| calls | Phone calls generated by campaigns or touchpoints. | 0.80 |
| app_installs | App installs attributable to campaigns or touchpoints. | 0.80 |
| app_opens | App opens after install or re-engagement. | 0.20 |
There is no universal table that is correct for every company. The best prior weights are usually business-specific and should improve over time as experiments, benchmarks, and historical evidence accumulate.
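One way to keep generic defaults and business-specific values side by side is a lookup with overrides, sketched below. `DEFAULT_PRIOR_WEIGHTS` holds a few of the example values from the table, and `resolve_prior_weight` and the override values are hypothetical names and numbers introduced only for this illustration.

```python
# A subset of the example values from the table; extend as needed.
DEFAULT_PRIOR_WEIGHTS = {
    "impressions": 0.01,
    "clicks": 0.25,
    "sessions": 0.30,
    "email_clicks": 0.30,
    "brand_searches": 0.70,
}


def resolve_prior_weight(measure, overrides=None):
    """Return the business override for a measure if present, else the default."""
    overrides = overrides or {}
    if measure in overrides:
        return overrides[measure]
    if measure in DEFAULT_PRIOR_WEIGHTS:
        return DEFAULT_PRIOR_WEIGHTS[measure]
    # Failing loudly avoids silently giving an unknown measure a made-up weight.
    raise KeyError(f"no prior weight defined for measure {measure!r}")


# Hypothetical overrides, for example from an incrementality study.
business_overrides = {"clicks": 0.40}

print(resolve_prior_weight("impressions", business_overrides))
print(resolve_prior_weight("clicks", business_overrides))
```

Keeping overrides in a separate mapping makes it easy to see which weights are still generic defaults and which already reflect evidence from the business.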