choose_order
ChannelAttribution Pro includes an out-of-sample algorithm for choosing the best Markov model order. First, the data are split into a train set and a test set. Using the train set, a Markov model is estimated for each order considered. Each Markov model is then used to predict the end state (conversion/no conversion) of each customer journey in the test set. For each Markov model, a ROC curve is built and the area under the curve (AUC) is calculated. The procedure is repeated on multiple test sets, each chosen at random from the full data set (cross-validation procedure). For each order, the average AUC over all the test sets is calculated. The order with the maximum average AUC is finally chosen.
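The procedure above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the scoring rule (share of conversions among training journeys that end in the same last `order` channels), the rank-based AUC used in place of an explicit ROC curve, the row-level split, and the names `fit`, `predict`, `weighted_auc` and `choose_order_sketch` are all hypothetical. The toy data is made up.

```python
import random
from collections import defaultdict

def last_state(path, order):
    """State at the end of a journey: its last `order` channels."""
    return tuple(path[-order:])

def fit(train, order):
    """Count conversions and non-conversions per final state (assumed scoring model)."""
    counts = defaultdict(lambda: [0, 0])  # state -> [conversions, nulls]
    for path, conv, null in train:
        state = last_state(path, order)
        counts[state][0] += conv
        counts[state][1] += null
    return counts

def predict(counts, path, order, prior):
    """Estimated conversion probability; falls back to the train-set rate for unseen states."""
    conv, null = counts.get(last_state(path, order), (0, 0))
    total = conv + null
    return conv / total if total else prior

def weighted_auc(scores, pos, neg):
    """Rank-based AUC: chance that a converting journey outscores a non-converting one."""
    num = 0.0
    for s_i, p_i in zip(scores, pos):
        if p_i == 0:
            continue
        for s_j, n_j in zip(scores, neg):
            if s_j < s_i:
                num += p_i * n_j
            elif s_j == s_i:
                num += 0.5 * p_i * n_j  # ties count as half
    den = sum(pos) * sum(neg)
    if den == 0:
        raise ValueError("AUC is undefined without both outcomes in the test set")
    return num / den

def choose_order_sketch(data, max_order=3, nfold=5, perc_test=0.3, seed=1234567):
    """Average the test-set AUC over random splits and return (mean AUC per order, best order)."""
    rng = random.Random(seed)
    splits = []
    for _ in range(nfold):
        rows = data[:]
        rng.shuffle(rows)
        n_test = max(1, int(len(rows) * perc_test))
        splits.append((rows[n_test:], rows[:n_test]))  # (train, test)
    mean_auc = {}
    for order in range(1, max_order + 1):
        aucs = []
        for train, test in splits:
            counts = fit(train, order)
            tot_conv = sum(c for _, c, _ in train)
            tot_null = sum(n for _, _, n in train)
            prior = tot_conv / (tot_conv + tot_null)
            scores = [predict(counts, p, order, prior) for p, _, _ in test]
            aucs.append(weighted_auc(scores, [c for _, c, _ in test], [n for _, _, n in test]))
        mean_auc[order] = sum(aucs) / len(aucs)
    best = max(mean_auc, key=mean_auc.get)  # on ties, the lowest order wins
    return mean_auc, best

# Made-up toy data: (path, total conversions, total non-conversions)
data = [
    (["search", "email"], 8, 2), (["display"], 1, 9),
    (["display", "search"], 5, 5), (["email", "email", "search"], 6, 4),
    (["social"], 2, 8), (["social", "display"], 1, 9),
    (["search"], 7, 3), (["email", "display"], 2, 8),
    (["display", "email"], 6, 4), (["social", "search", "email"], 7, 3),
]
auc_by_order, best_order = choose_order_sketch(data, max_order=3, nfold=5)
print(auc_by_order, best_order)
```

Fixing the seed makes the random splits, and therefore the selected order, reproducible across runs, which mirrors the role of the `seed` parameter described below.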
The best Markov model order in ChannelAttribution Pro can be chosen with the choose_order function, which implements the out-of-sample procedure explained above.
Parameters
PARAMETER | TYPE | DEFAULT | DESCRIPTION |
---|---|---|---|
Data | data.frame/str | | data.frame or a file address where customer journeys are stored |
var_path | str | | name of the column containing paths |
var_conv | str | | name of the column containing total conversions |
var_value | str | None | name of the column containing total conversion value |
var_null | str | None | name of the column containing total paths that do not lead to conversion |
row_sep | str | "," | if Data is a file address then row_sep is the line separator |
cha_sep | str | ">" | separator between channels |
roc_npt | int | 100 | number of points in ROC |
max_order | int | 10 | maximum Markov model order to be considered |
nfold | int | 10 | number of folds for cross-validation |
perc_test | double | 0.3 | share of customer journeys included in the test set, given as a fraction (0.3 means 30%) |
seed | int | 1234567 | random seed. Giving this parameter the same value over different runs guarantees that results will not vary |
perc_tol | double | 0.01 | tolerance, given as a fraction. If AUC(o) for order o is greater than (1-perc_tol) x AUC(o+1), then order o is considered better than order o+1 |
plot | bool | True | if True, a plot of the AUC will be displayed |
type | str | "auc-roc" | if "auc-roc", area under ROC curve will be calculated, if "auc-prerec" area under Precision-Recall curve will be calculated |
server | str | "app.channelattribution.io" | address of the server where password will be checked to authorize the execution of the function |
password | str | None | user password |
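
The tolerance rule described for perc_tol can be illustrated with a short sketch. The documentation does not say how the pairwise comparison is applied across all orders, so the loop below is one possible reading (walk up from the lowest order and stop at the first order that is not beaten by the next one beyond the tolerance). The function name `pick_order_with_tolerance` and the AUC values are hypothetical.

```python
def pick_order_with_tolerance(auc_by_order, perc_tol=0.01):
    """Return the first order o with AUC(o) > (1 - perc_tol) * AUC(o + 1).

    Assumed reading of the rule: a higher order is accepted only if it
    improves the AUC by more than the tolerance.
    """
    orders = sorted(auc_by_order)
    for o, nxt in zip(orders, orders[1:]):
        if auc_by_order[o] > (1 - perc_tol) * auc_by_order[nxt]:
            return o
    return orders[-1]  # every step up improved the AUC beyond the tolerance

# Made-up average AUC values per order
example_auc = {1: 0.70, 2: 0.80, 3: 0.805, 4: 0.81}
print(pick_order_with_tolerance(example_auc, perc_tol=0.01))
print(pick_order_with_tolerance(example_auc, perc_tol=0.0))
```

With a tolerance of 0.01, order 2 is kept because order 3 improves the AUC by less than 1%; with a tolerance of 0, the rule reduces to following every strict improvement.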
Output
OUTPUT | TYPE | DESCRIPTION |
---|---|---|
auc | data.frame | AUC for each analyzed order |
best_order | int | best order selected by the procedure |