mmm_attribution
mmm_attribution is the unified public function for attribution from aggregated marketing signals.
It estimates channel attribution from time series data such as impressions, clicks, direct searches, sessions, costs, leads, revenue proxies, or custom business signals.
The attribution engine is selected with the model parameter:
model = "reward" | "copula" | "linear"
Conceptually:
- model="reward" uses directional movement/coherence scoring;
- model="copula" uses non-linear dependence scoring based on copula/Spearman-style relationships;
- model="linear" uses a positive linear relationship score.
The user does not need to choose whether the score is computed on levels or on increments. The public function automatically blends level-based and delta-based evidence internally.
What the function does
mmm_attribution estimates how much each channel contributes to the target, using only aggregated input signals.
The function supports:
- one or more variables per final attribution channel;
- arbitrary measure labels in D_variables;
- multiple measures in the same model;
- lag evaluation through max_p;
- automatic level/delta score blending;
- conservative score multipliers;
- optional diagnostics.
When multiple measures are provided, the model works hierarchically:
raw signals
→ measure-level saturation
→ attribution across measures
→ attribution inside each measure
→ final channel attribution
Measure labels are free-form. They are used only to group variables and to estimate measure-level behavior. They are not validated against a predefined cap table.
When the same channel receives contributions from multiple measures, the final channel-level raw contribution is assembled conservatively by taking the maximum contribution across measures rather than summing all measure contributions. This reduces the risk of double counting when the same channel is represented by multiple related signals.
Input structure
Data must contain:
- a timestamp column;
- the target column passed through target;
- one or more signal columns used as attribution inputs.
target is required and must be the name of an existing column in Data. The target column must not be included in D_variables.
D_variables must contain one row per non-target signal variable and these columns:
| COLUMN | DESCRIPTION |
|---|---|
| variable | Name of the signal column in Data. |
| channel | Final attribution channel associated with the signal. Multiple variables can map to the same channel. |
| measure | Measure label associated with the signal, such as impressions, clicks, direct_searches, sessions, cost, or any custom label. |
Example:
| Signal variable | Final channel | Measure type |
|---|---|---|
| direct_searches | direct | direct_searches |
| facebook_impressions | facebook_ads | impressions |
| facebook_clicks | facebook_ads | clicks |
| google_impressions | google_ads | impressions |
| google_clicks | google_ads | clicks |
If multiple variables map to the same channel, the final output is aggregated at channel level.
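The input rules above can be checked before calling the function. The following is a minimal sketch, not part of the package: `check_inputs` is a hypothetical helper, and it assumes D_variables is available as a column-oriented dict of lists (the same shape passed to `pd.DataFrame` in the Python example) and that the column names of Data are available as a list.

```python
REQUIRED_COLUMNS = ("variable", "channel", "measure")


def check_inputs(data_columns, d_variables, target):
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    columns = list(data_columns)

    # target must be the name of an existing column in Data.
    if target not in columns:
        problems.append(f"target '{target}' is not a column in Data")

    # D_variables must provide the variable, channel, and measure columns.
    missing = [c for c in REQUIRED_COLUMNS if c not in d_variables]
    if missing:
        problems.append(f"D_variables is missing columns: {missing}")
        return problems

    variables = list(d_variables["variable"])
    if len({len(d_variables[c]) for c in REQUIRED_COLUMNS}) != 1:
        problems.append("D_variables columns have different lengths")

    # The target column must not be included in D_variables.
    if target in variables:
        problems.append("target must not be listed in D_variables")

    # One row per signal variable, so duplicates are flagged.
    if len(set(variables)) != len(variables):
        problems.append("D_variables lists a variable more than once")

    # Every listed variable must exist as a column in Data.
    unknown = [v for v in variables if v not in columns]
    if unknown:
        problems.append(f"variables not found in Data: {unknown}")
    return problems
```

With pandas objects, the helper could be called as `check_inputs(Data.columns, D_variables.to_dict("list"), "conversions")`; an empty list means the inputs are consistent with the rules stated in this section.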
Parameters
| PARAMETER | TYPE | DEFAULT | DESCRIPTION |
|---|---|---|---|
| Data | data.frame | required | Aggregated dataset with timestamp, target, and input signal columns. |
| D_variables | data.frame / table | required | Table with variable, channel, and measure columns. |
| target | str | required | Name of the target column in Data. |
| model | str | "linear" | Aggregated attribution engine. Allowed values are "reward", "copula", and "linear". |
| max_p | int | 12 | Maximum lag length considered when evaluating delayed effects. |
| nsim | int | 1000 | Simulation parameter used by the copula engine and kept for a consistent public API. |
| seed | int | 1234567 | Random seed for reproducible results. |
| verbose | int / bool | 1 | Controls runtime logging. |
| server | str / list[str] | hosted endpoint | Server endpoint used by the hosted computation service. |
| password | str | NULL / None | Authentication token for the hosted service. |
| return_diagnostics | bool | FALSE / False | If enabled, returns attribution together with diagnostic objects. |
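To illustrate what max_p bounds, the sketch below scans lags from 0 to max_p and keeps the lag with the highest score. It is an illustration only: the scoring rule (a Pearson correlation clipped at zero), the helper names `pearson` and `best_lag`, and the `min_overlap` guard are assumptions, and the actual lag selection used by the engines is not described here.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(signal, target, max_p=12, min_overlap=3):
    """Return (lag, score) for the lag in 0..max_p with the highest clipped score.

    Assumes signal and target have the same length and are aligned in time.
    """
    best = (0, 0.0)
    for lag in range(max_p + 1):
        overlap = len(signal) - lag
        if overlap < min_overlap:
            break  # too few paired observations left to score
        x = signal[:overlap]  # signal at time t - lag
        y = target[lag:]      # target at time t
        score = max(pearson(x, y), 0.0)
        if score > best[1]:
            best = (lag, score)
    return best
```

A larger max_p lets slower channels show their effect, but each additional lag shortens the overlapping window, so very large values leave few observations to score.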
Model options
| MODEL | DESCRIPTION |
|---|---|
| reward | Directional movement/coherence attribution. The delta component rewards aligned movements between the signal and the target: ΔX > 0, ΔY > 0 and ΔX < 0, ΔY < 0 are coherent; opposite signs are incoherent. |
| copula | Non-linear dependence-based attribution. The copula score requires positive Spearman dependence between the signal and the target; non-positive dependence is not rewarded. |
| linear | Positive linear relationship-based attribution. If the estimated relationship between the signal and the target is not positive, the linear score is set to zero. |
linear is the default public model because it provides a stable and interpretable baseline. reward is useful when aligned movements are the main signal of interest. copula is useful when non-linear positive dependence is expected.
Legacy aliases may still be accepted for compatibility, but new documentation and examples should use reward, copula, or linear.
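The difference between the three engines can be made concrete with toy scores. The sketch below is an illustration under stated assumptions, not the package's scoring code: `reward_score` counts the share of coherent movements, `copula_score` uses a Spearman rank correlation clipped at zero, and `linear_score` uses a Pearson correlation clipped at zero. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def linear_score(x, y):
    """Toy linear score: a non-positive relationship is set to zero."""
    return max(pearson(x, y), 0.0)


def copula_score(x, y):
    """Toy copula score: Spearman dependence, rewarded only when positive."""
    return max(pearson(average_ranks(x), average_ranks(y)), 0.0)


def reward_score(x, y):
    """Toy reward score: share of periods where dX and dY have the same sign."""
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    coherent = sum(1 for a, b in zip(dx, dy) if a * b > 0)
    incoherent = sum(1 for a, b in zip(dx, dy) if a * b < 0)
    moved = coherent + incoherent
    return coherent / moved if moved else 0.0
```

For a signal that rises steadily while the target grows quadratically, the rank-based and movement-based scores treat the relationship as perfect, while the linear score is slightly lower because the relationship is monotone but not a straight line. A signal that moves against the target scores zero under all three.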
Automatic level/delta blending
The public function does not expose a parameter for choosing level or delta scoring. For every signal, the core computes both:
score_level = score on the original levels
score_delta = score on increments / differences
Then it estimates a dynamic weight from the signal variation:
delta_mass = sum_t |X[t] - X[t-1]|
level_mass = sum_t |X[t]|
w_raw = delta_mass / (level_mass + eps)
w_delta = w_raw / (1 + w_raw)
For robustness, w_delta is constrained to the interval [0.30, 0.70]. Stable signals receive more level weight; dynamic or intermittent signals receive more delta weight.
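The weight formulas above translate directly into code. In the sketch below, `delta_weight` follows the stated definitions and the [0.30, 0.70] constraint; the value of `eps` and the convex combination in `blended_score` are assumptions, since the exact blending rule is not spelled out here. Both function names are hypothetical.

```python
def delta_weight(x, eps=1e-12, lo=0.30, hi=0.70):
    """Weight given to the delta-based score for one signal."""
    delta_mass = sum(abs(b - a) for a, b in zip(x, x[1:]))  # sum_t |X[t] - X[t-1]|
    level_mass = sum(abs(v) for v in x)                     # sum_t |X[t]|
    w_raw = delta_mass / (level_mass + eps)
    w_delta = w_raw / (1.0 + w_raw)
    return min(max(w_delta, lo), hi)  # constrain to [lo, hi]


def blended_score(score_level, score_delta, w_delta):
    """Assumed blend: convex combination of the level and delta scores."""
    return (1.0 - w_delta) * score_level + w_delta * score_delta
```

A nearly flat series such as `[100, 101, 100, 101, 100]` has very little delta mass relative to its level mass, so its weight lands on the 0.30 floor. An on/off series such as `[0, 10, 0, 10, ...]` moves by its full level at every step and receives a delta weight well above 0.5.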
Multi-measure attribution
When D_variables contains more than one measure, attribution is computed in two stages, followed by a channel-level assembly step.
First, the model aggregates raw signals by measure, estimates measure-level saturation, and attributes the target across measures.
Then, for each measure, the model attributes the measure-level contribution across the variables/channels belonging to that measure.
Finally, channel-level contributions are assembled from the measure-level results. If a channel appears in multiple measures, the model uses the strongest measure-level contribution for that channel rather than summing every contribution:
channel_raw[t, c] = max_m contribution[t, c, m]
The final attribution is then normalized back to the target value for each time period.
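The assembly step for a single time period can be sketched as follows. The max-across-measures rule comes from the formula above; the proportional rescaling to the target value is an assumption about how the normalization is done, and `assemble_channels` is a hypothetical name. Contributions are assumed to be non-negative.

```python
def assemble_channels(contribution, target_value):
    """Assemble channel attribution for one time period.

    contribution: {measure: {channel: value}} with non-negative values.
    Returns {channel: attributed value}, summing to target_value.
    """
    channel_raw = {}
    for per_channel in contribution.values():
        for channel, value in per_channel.items():
            # channel_raw[c] = max over measures m of contribution[c, m]
            if channel not in channel_raw or value > channel_raw[channel]:
                channel_raw[channel] = value

    total = sum(channel_raw.values())
    if total == 0:
        return {c: 0.0 for c in channel_raw}

    # Assumed normalization: rescale proportionally so the period sums to the target.
    return {c: target_value * v / total for c, v in channel_raw.items()}
```

Taking the maximum means that a channel observed through both impressions and clicks is credited once, through its strongest measure, instead of being credited twice for what may be the same underlying activity.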
Output
If return_diagnostics = FALSE, the function returns an attribution data frame.
If return_diagnostics = TRUE, the function returns a list/dict that may include:
| OUTPUT | TYPE | DESCRIPTION |
|---|---|---|
| attribution | data.frame | Final channel attribution by time period. |
| base_raw | data.frame / object | Raw attribution output before final normalization and adjustments. |
| model | str | Resolved model: reward, copula, or linear. |
| score_multiplier | dict / object | Channel score multipliers. |
| applied_score_multiplier | dict / object | Effective multipliers applied in attribution assembly. |
| individual_scores | dict / object | Channel-level scores selected from lag evaluation. |
| measure_weights | dict / object | Measure-level weights when multiple measures are present. |
| input_saturation_alpha_by_measure | dict / object | Estimated saturation alpha by measure. |
| input_measure_scaled_share_pct | dict / object | Scaled/saturated input share by measure. |
| base_raw_source | str | Internal source of the raw attribution assembly. In multi-measure mode, this may indicate max-measure channel aggregation. |
| mode | str | Execution mode metadata. |
| core_diagnostics | list / dict | Additional model diagnostics. |
Python example
import pandas as pd
from ChannelAttributionPro import mmm_attribution

# Aggregated dataset: timestamp, target ("conversions"), and signal columns.
Data = pd.read_csv("data_mmm.csv")

# One row per signal variable, mapped to its final channel and measure label.
D_variables = pd.DataFrame({
    "variable": [
        "direct_searches",
        "facebook_impressions",
        "facebook_clicks",
        "google_impressions",
        "google_clicks",
    ],
    "channel": [
        "direct",
        "facebook_ads",
        "facebook_ads",
        "google_ads",
        "google_ads",
    ],
    "measure": [
        "direct_searches",
        "impressions",
        "clicks",
        "impressions",
        "clicks",
    ],
})

# Run the default linear engine, evaluating lags up to max_p = 12.
res = mmm_attribution(
    Data=Data,
    D_variables=D_variables,
    target="conversions",
    model="linear",
    max_p=12,
    return_diagnostics=False,
)
R example
library(ChannelAttributionPro)

# Aggregated dataset: timestamp, target ("conversions"), and signal columns.
Data = read.csv("data_mmm.csv")

# One row per signal variable, mapped to its final channel and measure label.
D_variables = data.frame(
  variable = c(
    "direct_searches",
    "facebook_impressions",
    "facebook_clicks",
    "google_impressions",
    "google_clicks"
  ),
  channel = c(
    "direct",
    "facebook_ads",
    "facebook_ads",
    "google_ads",
    "google_ads"
  ),
  measure = c(
    "direct_searches",
    "impressions",
    "clicks",
    "impressions",
    "clicks"
  )
)

# Run the default linear engine, evaluating lags up to max_p = 12.
res = mmm_attribution(
  Data = Data,
  D_variables = D_variables,
  target = "conversions",
  model = "linear",
  max_p = 12,
  return_diagnostics = FALSE
)