Version: 3.20.1

Making Aggregated Attribution More Useful

Marketing data is rarely clean, balanced, or measured on a single scale. One channel may produce millions of impressions, another may produce a few thousand clicks, and a third may appear only as branded searches, email interactions, store visits, or offline activity. Yet, at the end of the analysis, all of these signals are expected to answer the same question: how much did each channel contribute?

That is the practical problem addressed by mmm_attribution.

The function is designed for cases where path-level journeys are unavailable, incomplete, or not enough to explain the full marketing picture. Instead of requiring user-level journeys, it works with aggregated signals over time: clicks, impressions, searches, views, sessions, leads, revenue proxies, or custom indicators built by the business.

The hidden difficulty: not all signals speak the same language

A click and an impression should not usually be treated as interchangeable. A direct search and a video view may both contain information, but they do not represent the same level of intent. Even two variables with the same measure can behave differently depending on the platform, campaign type, funnel position, and business context.

This is why mmm_attribution uses a prior_weight for each input variable. prior_weight is the business weight assigned to each raw signal. A common way to estimate it is conversions / touchpoints, but it can also be set from domain knowledge or historical benchmarks.

A prior weight is a compact way to bring business knowledge into the model before the attribution engine starts comparing signals. It does not need to be perfect. It simply gives the model a more realistic starting point than raw volumes alone.

For example:

| variable | channel | measure | prior_weight |
| --- | --- | --- | --- |
| google_clicks | google_ads | clicks | 0.25 |
| google_impressions | google_ads | impressions | 0.015 |
| facebook_impressions | facebook_ads | impressions | 0.015 |
| email_clicks | email | clicks | 0.30 |
| direct_searches | direct | searches | 0.50 |

The exact values depend on the business. They may come from conversions divided by touchpoints, conversions divided by cost, previous experiments, incrementality studies, historical attribution, domain knowledge, platform knowledge, historical benchmarks, or conservative assumptions agreed with the client. The important point is that the analyst is no longer forced to pretend that every unit of every signal has the same meaning.
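As a minimal sketch of the conversions / touchpoints estimate, the snippet below derives a prior weight per variable from historical totals. The helper name `estimate_prior_weight`, the `fallback` argument, and all totals are hypothetical and chosen only for illustration; they are not part of mmm_attribution.

```python
def estimate_prior_weight(conversions, touchpoints, fallback=0.0):
    """Return conversions per touchpoint, or `fallback` when there are no touchpoints."""
    if touchpoints <= 0:
        return fallback
    return conversions / touchpoints


# Hypothetical historical totals per variable (illustrative numbers only).
history = {
    "google_clicks": {"conversions": 500, "touchpoints": 2_000},
    "google_impressions": {"conversions": 600, "touchpoints": 40_000},
    "email_clicks": {"conversions": 90, "touchpoints": 300},
}

prior_weights = {
    name: estimate_prior_weight(totals["conversions"], totals["touchpoints"])
    for name, totals in history.items()
}

for name, weight in prior_weights.items():
    print(f"{name}: {weight:.3f}")
```

An estimate like this is only a starting point. Where the history is thin or unreliable, a benchmark or an agreed conservative value may be a better choice than the computed ratio.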

Why this matters

Without prior weights, high-volume upper-funnel variables can easily dominate the input space. That may produce results that are mathematically stable but commercially hard to accept. With prior weights, the model can still learn from the data, but it starts from a more credible representation of signal strength.
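The effect can be illustrated with a small calculation. The sketch below compares each variable's share of raw volume with its share after multiplying volume by a prior weight. The volumes are hypothetical, and the multiplication is only an illustration of the starting representation; it is an assumption, not a description of how mmm_attribution uses the weights internally.

```python
# Hypothetical one-period volumes (illustrative numbers only).
volumes = {
    "google_impressions": 1_000_000,
    "google_clicks": 20_000,
    "direct_searches": 5_000,
}

# Prior weights taken from the example table above.
prior_weight = {
    "google_impressions": 0.015,
    "google_clicks": 0.25,
    "direct_searches": 0.50,
}


def shares(values):
    """Normalize a dict of non-negative values so that they sum to 1."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


raw_shares = shares(volumes)
weighted_signal = {key: volumes[key] * prior_weight[key] for key in volumes}
weighted_shares = shares(weighted_signal)

for key in volumes:
    print(f"{key}: raw {raw_shares[key]:.1%}, weighted {weighted_shares[key]:.1%}")
```

With these numbers, impressions account for almost all of the raw volume but only about two thirds of the weighted signal, which leaves room for clicks and direct searches to register.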

This is especially useful when a channel is represented by multiple signals. For example, a paid social channel may have impressions, clicks, landing page views, and spend-related activity. A search channel may have impressions, clicks, and branded searches. A CRM channel may have sent messages, opens, and clicks.

The goal is not to force all of those variables into a single simplistic unit. The goal is to make them comparable enough for the attribution process to produce useful channel-level estimates.

A more realistic view of channels

Modern marketing channels are not single signals. A channel is often a bundle of measures: exposure, engagement, intent, and conversion proximity. mmm_attribution treats this structure explicitly through D_variables, where each input signal is mapped to a final channel and to a measure type.

This makes the model easier to explain:

  • variables remain visible;
  • channels remain the final attribution level;
  • measure types provide context;
  • prior weights encode business assumptions;
  • diagnostics can show how the final attribution differs from the starting signal distribution.
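To make the mapping concrete, here is a minimal sketch of a D_variables-style table as a list of plain Python records, rolled up to a channel-level starting distribution. The field names follow the example table earlier in this page, but the exact input format expected by mmm_attribution is not shown here, so the structure, the volumes, and the roll-up logic should all be read as assumptions.

```python
from collections import defaultdict

# One record per input signal: variable -> channel, measure, prior_weight.
d_variables = [
    {"variable": "google_clicks", "channel": "google_ads", "measure": "clicks", "prior_weight": 0.25},
    {"variable": "google_impressions", "channel": "google_ads", "measure": "impressions", "prior_weight": 0.015},
    {"variable": "facebook_impressions", "channel": "facebook_ads", "measure": "impressions", "prior_weight": 0.015},
    {"variable": "email_clicks", "channel": "email", "measure": "clicks", "prior_weight": 0.30},
    {"variable": "direct_searches", "channel": "direct", "measure": "searches", "prior_weight": 0.50},
]

# Hypothetical total volume per variable over the analysis window.
volumes = {
    "google_clicks": 20_000,
    "google_impressions": 1_000_000,
    "facebook_impressions": 800_000,
    "email_clicks": 3_000,
    "direct_searches": 5_000,
}

# Roll weighted signals up to the channel, the final attribution level.
channel_signal = defaultdict(float)
for row in d_variables:
    channel_signal[row["channel"]] += volumes[row["variable"]] * row["prior_weight"]

total_signal = sum(channel_signal.values())
starting_distribution = {
    channel: signal / total_signal for channel, signal in channel_signal.items()
}

for channel, share in starting_distribution.items():
    print(f"{channel}: {share:.1%}")
```

A starting distribution of this kind is the natural baseline for the diagnostics mentioned above: the final channel attribution can be compared against it to see where the data moved credit away from the prior assumptions.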

The result is an attribution workflow that is less dependent on raw volume and more aligned with how marketers actually reason about signal quality.

What the analyst controls

The analyst controls three important choices:

  1. which variables enter the model;
  2. how each variable maps to a channel and measure;
  3. what prior weight each variable receives.

This creates a useful balance between automation and judgment. The model is not a black box that blindly consumes raw columns, but it is also not a manual scoring table. It combines structured business knowledge with data-driven attribution.
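Because these three choices are made by hand, it can help to check them before running the model. The sketch below is a hypothetical pre-flight check, not part of mmm_attribution: the function name, the record layout, and the rule that a prior weight must be a positive number are assumptions made for illustration.

```python
def validate_variables(rows, data_columns):
    """Return a list of human-readable problems found in the variable mapping."""
    problems = []
    seen = set()
    for row in rows:
        name = row.get("variable")
        # Choice 1: every mapped variable should exist in the data, exactly once.
        if name in seen:
            problems.append(f"{name}: listed more than once")
        seen.add(name)
        if name not in data_columns:
            problems.append(f"{name}: not found in the input data")
        # Choice 2: every variable needs a channel and a measure.
        for field in ("channel", "measure"):
            if not row.get(field):
                problems.append(f"{name}: missing {field}")
        # Choice 3: every variable needs a usable prior weight (assumed > 0 here).
        weight = row.get("prior_weight")
        if not isinstance(weight, (int, float)) or weight <= 0:
            problems.append(f"{name}: prior_weight must be a positive number")
    return problems


# Hypothetical inputs: one valid row and one row with two problems.
columns = {"google_clicks", "email_clicks"}
rows = [
    {"variable": "google_clicks", "channel": "google_ads", "measure": "clicks", "prior_weight": 0.25},
    {"variable": "email_clicks", "channel": "", "measure": "clicks", "prior_weight": None},
]

for problem in validate_variables(rows, columns):
    print(problem)
```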

Choosing prior weights

A practical way to think about prior weights is to ask:

How much conversion-equivalent value should one unit of this signal carry before the model evaluates the data?

For some businesses, a click may be much more valuable than an impression. For others, impressions from a highly qualified audience may deserve more credit than generic clicks. For email, a click may represent strong intent. For direct searches, the signal may be close to demand capture. For video views, the value may depend heavily on campaign objective and audience quality.

The following values are example prior weights that can serve as a conservative starting point. They should then be adapted to the specific business, market, and data source.

| Measure | Description | Example prior_weight = conversions / touchpoints |
| --- | --- | --- |
| impressions | Ad impressions served or exposures to the message. | 0.01 |
| views | Generic views of content or media assets. | 0.01 |
| video_views | Video views, including partial views, on paid or owned channels. | 0.01 |
| reach | Unique users reached by a campaign or content. | 0.01 |
| clicks | Clicks on ads, links, content, or calls to action. | 0.25 |
| sessions | Website or app sessions generated by a touchpoint. | 0.30 |
| visits | Visits to a website, landing page, app, or digital property. | 0.30 |
| landing_page_views | Effective landing page views after an interaction. | 0.25 |
| engaged_sessions | Sessions with qualified engagement signals. | 0.40 |
| add_to_cart | Add-to-cart events or equivalent strong intent actions. | 0.80 |
| checkout_started | Checkout starts or beginning of a final conversion process. | 0.90 |
| direct_searches | Direct searches for the brand, product, or website. | 0.50 |
| brand_searches | Searches explicitly related to the brand. | 0.70 |
| generic_searches | Non-brand searches related to needs, categories, or generic keywords. | 0.30 |
| emails_sent | Emails sent or delivered in a campaign. | 0.05 |
| email_opens | Recorded email opens. | 0.10 |
| email_clicks | Clicks generated from emails. | 0.30 |
| sms_sent | SMS messages sent or delivered. | 0.08 |
| sms_clicks | Clicks generated from SMS messages. | 0.35 |
| push_sent | Push notifications sent. | 0.03 |
| push_opens | Opens or initial interactions with push notifications. | 0.10 |
| affiliate_clicks | Clicks from affiliates or partners. | 0.30 |
| store_visits | Physical visits to a store or point of sale. | 0.50 |
| calls | Phone calls generated by campaigns or touchpoints. | 0.80 |
| app_installs | App installs attributable to campaigns or touchpoints. | 0.80 |
| app_opens | App opens after install or re-engagement. | 0.20 |

There is no universal table that is correct for every company. The best prior weights are usually business-specific and should improve over time as experiments, benchmarks, and historical evidence accumulate.
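One way to combine the example table with business-specific evidence is to treat the table as a set of defaults keyed by measure and let per-variable overrides take precedence. The sketch below assumes this pattern: the names `DEFAULT_PRIOR_WEIGHT` and `resolve_prior_weight` and the override value are hypothetical, and only a subset of the table is included.

```python
# Defaults by measure, copied from a subset of the example table above.
DEFAULT_PRIOR_WEIGHT = {
    "impressions": 0.01,
    "clicks": 0.25,
    "sessions": 0.30,
    "direct_searches": 0.50,
    "email_clicks": 0.30,
    "add_to_cart": 0.80,
}


def resolve_prior_weight(variable, measure, overrides=None):
    """Prefer a business-specific override for the variable, else the measure default."""
    overrides = overrides or {}
    if variable in overrides:
        return overrides[variable]
    if measure in DEFAULT_PRIOR_WEIGHT:
        return DEFAULT_PRIOR_WEIGHT[measure]
    raise KeyError(f"no prior weight available for measure '{measure}'")


# Hypothetical override, for example from an incrementality study.
business_overrides = {"google_clicks": 0.18}

print(resolve_prior_weight("google_clicks", "clicks", business_overrides))
print(resolve_prior_weight("facebook_impressions", "impressions", business_overrides))
```

Keeping overrides separate from the defaults makes it easy to see which weights rest on evidence from the business and which are still generic assumptions.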