Keywords: Relative Importance Analysis 🞄 Dominance Analysis 🞄 Shapley Value Decomposition 🞄 Owen Value Decomposition
Comparing independent variables (IVs) in terms of how each contributes to predicting a dependent variable is a common practice when evaluating statistical models like linear regression and is known as relative importance analysis (e.g., Tonidandel & LeBreton, 2011). Historically, many research applications of relative importance analysis have used simple approaches to evaluating IV contributions to prediction such as comparing standardized regression coefficients, the change in the \(R^2\), or \(\Delta R^2\), when each IV is included first in the model, or the \(\Delta R^2\) when an IV is included last after all other IVs are included (Grömping, 2007; Johnson & LeBreton, 2004). Relative importance analysis has, however, been defined as a method that evaluates contributions that an IV makes when alone as well as the contribution it makes when included along with other IVs (Johnson & LeBreton, 2004). The definition of relative importance analysis has led statisticians to recommend methods that can accommodate the contributions IVs make to prediction in the context of different subsets of IVs simultaneously like Dominance Analysis (DA; Budescu & Azen, 2004). DA is one of the most widely used relative importance methods in the literature and has been applied to research in multiple behavioral science domains including understanding response styles on surveys (e.g., Miller, Kirby, & Stevens, 2025), communication patterns in educational settings (e.g., Yin & Zhou, 2025), and longitudinal injury risks (e.g., McLaurin, West, & Thomson, 2025).
DA differs from other recommended relative importance methods (e.g., relative weights, Pratt’s method/geometrical decomposition; Johnson & LeBreton, 2004; Thomas, Zumbo, Kwan, & Schweitzer, 2014) in that it generates a hierarchy of pairwise dominance designations that describe the relative importance of two IVs. Across all pairs of IVs, the hierarchy of dominance designations results in a rank ordering of the IVs from which importance can be determined. The three dominance designations, in order of the strength of evidence they provide about the relative importance of two IVs, are: complete, conditional, and general (Azen & Budescu, 2003). The three strength of evidence levels reported by DA provide the researcher much detail about the predictive utility of each IV relative to each other IV in the model. However detailed, the hierarchy of designations generated by DA is only possible because the method uses \(\Delta R^2\) values for all possible combinations of IVs included in the model. Computing \(\Delta R^2\) values for all possible IV combinations is a computationally expensive methodology and, as the number of IVs gets large, DA grows computationally intractable.
Restructuring DA by making inseparable groups of IVs can mitigate the effect of having many IVs in a model as the \(\Delta R^2\) values required will derive from all combinations of IV groups instead of all combinations of IVs (e.g., Bittmann, 2024; Gu, 2023; Luchman, 2021). This is because all members of an IV group are included simultaneously and the predictive usefulness/\(\Delta R^2\) value of the group as a whole is used in the dominance statistics and designations. Grouping IVs is not a full solution to the issue of having many IVs as it results in a strong constraint on the determinations from the DA. The constraint imposed by grouping IVs is that a researcher will not be able to determine the importance of individual IVs within an IV group as all IVs in an IV group are inseparable.
The purpose of this manuscript is to develop a within-group extension of DA that uses IV groups yet will allow relative importance determinations for IVs within an IV group. As I will show, the proposed within-group extension of DA combines aspects of the DA method considering all IVs in the model separately, referred to from this point on as the traditional DA method, with aspects of the DA method where IVs are grouped into inseparable IV groups, referred to from this point on as the grouped DA method. The primary benefit of the proposed within-group method is that it retains the traditional DA method’s ability to compare all IVs to one another while also substantially reducing the number of required combinations for obtaining dominance designations similar to the grouped DA method. For example, I show later in this manuscript that a statistical model with 20 IVs could be grouped such that the within-group DA method requires only .01% of the combinations required for the traditional DA method yet still would allow each IV to be compared to each other IV. As a result, the proposed within-group DA method offers researchers a way to apply relative importance analysis to statistical models with a large number of IVs—so long as those IVs can be grouped together.
I discuss the proposed within-group DA method in a series of five sections. First, I review computational details of the traditional DA method. Extending from the review of the traditional DA methodology, I discuss how the traditional DA method’s general dominance statistics derive from the Shapley value solution concept in cooperative game theory (Shapley, 1953). Third, I discuss the grouped DA/Shapley value method where IVs are bundled into inseparable IV groups. The discussion of grouped DA/Shapley values leads directly to introducing Owen values (Owen, 1977), an extension of Shapley values where the computation is broken into multiple stages that depend on the structure of the IV groups. Finally, I show how Owen values can be translated back into DA computations and define the within-group DA methodology.
Following the five conceptual sections, this manuscript provides an analytic example which includes a detailed account of computing traditional, grouped, and within-group DA statistics and designations using the ability.cov data from the datasets package in the R statistical computing environment.
Imagine I am using variables \(\alpha \), \(\beta \), \(\xi \), and \(\zeta \) as IVs in a linear regression model and that I am seeking to determine the relative importance of these four IVs in predicting some dependent variable. As I mentioned in the introduction, approaches to determining relative importance have historically focused on computing \(\Delta R^2\) for IVs when they are included in the model and I begin by taking that approach. As opposed to the approaches discussed in the introduction (i.e., comparing IVs when included first or last), another way I could determine the relative importance of all four IVs is to use \(\Delta R^2\) values for each IV as included in a specific sequence. If I were to use the sequence of the four IVs described above, I would then obtain four values:
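Written in the \(\Delta R^2\) notation adopted later in this manuscript (an assumed rendering; other notations are possible), the four values for the sequence \(\alpha \), \(\beta \), \(\xi \), \(\zeta \) would be: \[ R^2_{\{\alpha \}}, \quad \Delta R^2_{\{\alpha \} \cup \beta } = R^2_{\{\alpha ,\beta \}} - R^2_{\{\alpha \}}, \quad \Delta R^2_{\{\alpha ,\beta \} \cup \xi } = R^2_{\{\alpha ,\beta ,\xi \}} - R^2_{\{\alpha ,\beta \}}, \quad \Delta R^2_{\{\alpha ,\beta ,\xi \} \cup \zeta } = R^2_{\{\alpha ,\beta ,\xi ,\zeta \}} - R^2_{\{\alpha ,\beta ,\xi \}}. \]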
The \(R^2\) and \(\Delta R^2\) values in this sequence are comparable to one another as relative importance results and produce a full decomposition of the model’s \(R^2\) that can be ascribed to each IV (e.g., Kruskal, 1987). An added benefit of this method is that each value is simple to interpret: each is a part of the whole model \(R^2\) and can be described as a percentage of the \(R^2\).
The sequential method described in the previous paragraph requires a researcher to predetermine an inclusion precedence sequence for the IVs in their model. I define an inclusion precedence sequence to be a determination by the researcher as to the order in which the IVs will be included in the model to compute, and compare, \(R^2\) or \(\Delta R^2\) values. In addition, I define the inclusion precedence position of an IV as an IV’s sequential position of inclusion in a specific inclusion precedence sequence. IVs with a higher inclusion precedence position precede, and are thus included before, IVs with a lower inclusion precedence position.
Predetermining a single IV inclusion precedence sequence is a simple and compelling, but infrequently used, method for determining IV relative importance. This method is infrequently used as it is often difficult to determine a best, most plausible inclusion precedence sequence for a set of IVs. That is, there may be reasonable disagreements that different researchers would have about the inclusion precedence sequencing for the IVs in a model. When IVs are correlated with one another, such disagreements over inclusion precedence sequencing can lead to different conclusions about IV relative importance as IVs that are positioned earlier in the sequence are ascribed components of the \(R^2\) that overlap among multiple IVs. Hence, if \(\alpha \), \(\beta \), \(\xi \), and \(\zeta \) are correlated, their positions in the sequence can strongly affect their relative importance determinations.
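The sensitivity to sequencing described above can be illustrated with a minimal Python sketch. The `r2` function below is a hypothetical toy stand-in for fitting regressions (it is not from this manuscript): each IV is assumed to "cover" some variance components, and overlapping components play the role of correlated IVs. All names and weights are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical toy model (not from the manuscript): each IV "covers" some
# variance components, weighted in hundredths of R^2. The R^2 of a set of IVs
# is the total weight of the components covered by at least one of its members.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}


def r2(ivs):
    """Toy R^2 for a collection of IV names (exact fractions, no rounding)."""
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def sequential_delta_r2(sequence):
    """Delta R^2 ascribed to each IV when IVs are included in the given order."""
    ascribed, included = {}, []
    for iv in sequence:
        before = r2(included)
        included.append(iv)
        ascribed[iv] = r2(included) - before
    return ascribed


forward = sequential_delta_r2(["alpha", "beta", "xi", "zeta"])
backward = sequential_delta_r2(["zeta", "xi", "beta", "alpha"])

for name, result in (("forward", forward), ("backward", backward)):
    print(name, {iv: float(value) for iv, value in result.items()})
```

In this toy setup both sequences decompose the same full-model \(R^2\), yet the value ascribed to \(\alpha \) drops from .24 to .10 when it is moved from first to last, while the uncorrelated \(\zeta \) keeps the same value in both sequences.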
The DA methodology derives directly from the sequential/predetermined inclusion precedence sequence approach by averaging across all possible IV inclusion precedence sequences (Budescu, 1993). DA then needs no predetermined inclusion precedence sequence as all sequences are incorporated into the dominance designations used to determine IV relative importance.
All three dominance designations (i.e., complete, conditional, and general) evaluate the inclusion precedence sequences of a set of IVs, \(\textbf {X}\), with a comparison methodology for any given pair of IVs that follows Equation 1. \begin {equation} D_{\alpha \, > \, \beta } = \sum _{i=1}^{K} \frac { \left \{ \begin {array}{l} 1, \ V_{i}^{\alpha } > V_{i}^{\beta } \\ 0, \ Otherwise \end {array} \right .} {K} \label {eq:dm} \end {equation} where \(\alpha \in \textbf {X}\) (i.e., \(\alpha \) is a member of \(\textbf {X}\)), \(\beta \in \textbf {X}\), \(K\) is a number of comparisons, \(V_{i}^{\alpha }\) is a value (e.g., a \(\Delta R^2\)) that is associated with \(\alpha \) given comparison \(i\), and \(V_{i}^{\beta }\) is a value that is associated with \(\beta \) given the same comparison \(i\). Equation 1 shows that each pairwise dominance designation is built on the proportion of comparisons between \(\alpha \) and \(\beta \) that favor \(\alpha \). A dominance designation is achieved by \(\alpha \) when, for that level of dominance, the proportion produced by Equation 1 is a value of 1. Thus, dominance designations are only achieved when all the comparisons favor \(\alpha \) over \(\beta \).
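Equation 1 can be sketched in a few lines of Python. The function names and the \(\Delta R^2\) values below are illustrative assumptions, not results from this manuscript; ties are counted as not favoring \(\alpha \), following the "otherwise" branch of Equation 1.

```python
from fractions import Fraction


def dominance_proportion(values_a, values_b):
    """Equation 1: share of the K paired comparisons in which a's value exceeds b's."""
    if len(values_a) != len(values_b) or not values_a:
        raise ValueError("need K >= 1 paired comparisons")
    wins = sum(1 for va, vb in zip(values_a, values_b) if va > vb)
    return Fraction(wins, len(values_a))


def dominates(values_a, values_b):
    """A designation is achieved only when every comparison favors a."""
    return dominance_proportion(values_a, values_b) == 1


# Hypothetical Delta R^2 values for K = 4 comparisons (illustrative numbers only).
v_alpha = [0.24, 0.18, 0.24, 0.18]
v_beta = [0.22, 0.10, 0.22, 0.10]
v_xi = [0.16, 0.20, 0.16, 0.20]

print(dominance_proportion(v_alpha, v_beta), dominates(v_alpha, v_beta))
print(dominance_proportion(v_beta, v_xi), dominates(v_beta, v_xi))
```

With these made-up values every comparison favors the first IV over the second, so the designation is achieved, whereas only half of the comparisons favor the second IV over the third, so it is not.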
Recall that each dominance designation differs in terms of the strength of evidence it offers about a pair of IVs. The key difference between the dominance designations is the number of comparisons, \(K\), that are made between the two IVs. Complete dominance has the most comparisons (i.e., \(2^{|\textbf {X}|-2}\); where \(|\textbf {X}|\) is the cardinality, or number of members in, \(\textbf {X}\)) which is the reason why it is the strongest designation as it is the hardest to achieve. General dominance has the fewest comparisons (i.e., 1) which is the reason why it is the weakest designation as it is the easiest to achieve. Conditional dominance falls in-between complete and general in terms of the number of comparisons (i.e., \(|\textbf {X}|\)) and in terms of its strength of evidence.
The \(V^{\alpha }_i\) and \(V^{\beta }_i\) values in Equation 1 also differ across dominance designations and incorporate the inclusion precedence sequences differently. For instance, complete dominance uses \(\Delta R^2\) values directly which produces a larger \(K\) value as many specific comparisons, \(i\), are required. By contrast, general dominance averages all the \(\Delta R^2\) values for an IV into a single statistic. In the sections to come, I provide details on the computation of the \(V^{\alpha }_i\) and \(V^{\beta }_i\) values for each of the dominance designations beginning with complete dominance. Note that following the five conceptual sections, section six provides computational details for most of the dominance designations discussed in the coming three subsections.
Complete dominance seeks to determine whether \(\alpha \) produces larger \(\Delta R^2\) values than \(\beta \) across all possible inclusion precedence sequences in which the other \(|\textbf {X}|-2\) IVs would have higher inclusion precedence positions. Computationally, the focus of complete dominance is on combinations of other IVs as opposed to sequences of other IVs. This is because the \(R^2\) is agnostic about the order in which IVs are included. For example, imagine I’m seeking to compare the \(\Delta R^2\) of \(\alpha \) and \(\beta \) beyond \(\xi \) and \(\zeta \). Because the \(R^2\) value when \(\xi \) and \(\zeta \) are included prior to \(\alpha \) or \(\beta \) does not change depending on whether \(\xi \) or \(\zeta \) is included as the first IV in the sequence, the specific inclusion precedence positions of \(\xi \) and \(\zeta \) for complete dominance designations are irrelevant. Thus, the number of comparisons for complete dominance focuses on unique \(\Delta R^2\) values which will be all combinations of other IVs that have a higher inclusion precedence position than, and thus precede, \(\alpha \) and \(\beta \).
I will refer to the comparison IV subsets that include all other IVs as \(\textbf {X}^{\setminus \{\alpha ,\beta \}}_i\) where \(\textbf {X}^{\setminus \{\alpha ,\beta \}}_i \in \mathcal {P}(\textbf {X} \setminus \{\alpha ,\beta \})\). This means that IV subset \(\textbf {X}^{\setminus \{\alpha ,\beta \}}_i\) is one member, \(i\), of the power set (i.e., \(\mathcal {P}()\)) of \(\textbf {X}\) omitting \(\alpha \) and \(\beta \). The power set is a set that contains all possible subsets of members of a set. Thus, \(\textbf {X}^{\setminus \{\alpha ,\beta \}}_i\) refers to one IV subset of all the possible ways of combining the IVs in \(\textbf {X} \setminus \{\alpha ,\beta \}\) into IV subsets.
The \(V_{i}^{\alpha }\) values used in Equation 1 for complete dominance are computed as \(R^2_{\textbf {X}^{\setminus \{\alpha ,\beta \}}_i \cup \alpha } - R^2_{\textbf {X}^{\setminus \{\alpha ,\beta \}}_i} = \Delta R^2_{\textbf {X}^{\setminus \{\alpha ,\beta \}}_i \cup \alpha }\). The value \(\Delta R^2_{\textbf {X}^{\setminus \{\alpha ,\beta \}}_i \cup \alpha }\) is then the increment to the \(R^2\) that \(\alpha \) makes when it is included with the set of IVs in \(\textbf {X}^{\setminus \{\alpha ,\beta \}}_i\). The \(V_{i}^{\beta }\) values are computed as \(\Delta R^2_{\textbf {X}^{\setminus \{\alpha ,\beta \}}_i \cup \beta }\) where \(\beta \) is substituted for \(\alpha \). Complete dominance is then determined by comparing \(V_{i}^{\alpha }\) and \(V_{i}^{\beta }\) across all \(K = 2^{(|\textbf {X}|-2)} = |\mathcal {P}(\textbf {X} \setminus \{\alpha ,\beta \})|\) comparisons of the different members of \(\mathcal {P}(\textbf {X} \setminus \{\alpha ,\beta \})\).
Because the complete dominance designation requires every \(\alpha \) versus \(\beta \) \(\Delta R^2\) comparison to favor \(\alpha \), complete dominance is a comprehensive, non-compensatory designation and is difficult to meet for most IV pairs. When IV relative importance cannot be determined using complete dominance, it may still be possible to determine IV relative importance using conditional dominance.
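The complete dominance comparison can be sketched in Python as follows. The `r2` function is a hypothetical toy stand-in for fitting regressions (overlapping "variance components" mimic correlated IVs); it and all names and weights are assumptions for illustration, not part of this manuscript.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
IVS = ("alpha", "beta", "xi", "zeta")


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def delta_r2(iv, preceding):
    """Increment to R^2 when `iv` joins the IVs in `preceding`."""
    return r2(set(preceding) | {iv}) - r2(preceding)


def complete_dominance(a, b, ivs=IVS):
    """Proportion of subsets of the other IVs in which a's Delta R^2 exceeds b's."""
    others = [v for v in ivs if v not in (a, b)]
    subsets = [s for k in range(len(others) + 1) for s in combinations(others, k)]
    assert len(subsets) == 2 ** (len(ivs) - 2)  # K = |P(X \ {a, b})|
    wins = sum(1 for s in subsets if delta_r2(a, s) > delta_r2(b, s))
    return Fraction(wins, len(subsets))


print(complete_dominance("alpha", "beta"))  # proportion for alpha over beta
print(complete_dominance("beta", "xi"))     # proportion for beta over xi
```

In this toy setup \(\alpha \) completely dominates \(\beta \) (the proportion is 1), whereas \(\beta \) is favored over \(\xi \) in only half of the four comparisons, so complete dominance cannot be established for that pair in either direction.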
Conditional dominance seeks to determine whether \(\alpha \), on average, produces larger \(\Delta R^2\) values at a specific inclusion precedence position, \(P\), than does \(\beta \) across all possible \(|\textbf {X}|\) inclusion precedence positions.
The \(V_{i}^{\alpha }\) values used in Equation 1 for conditional dominance are known as conditional dominance statistics and are computed as in Equation 2. \begin {equation} C_{\alpha }^P = \sum _{i = 1}^{B^P} \frac {\Delta R^2_{\textbf {X}^{\setminus \alpha :P}_i\cup \alpha }} {B^P} \label {eq:cdl} \end {equation} where \(\textbf {X}^{\setminus \alpha :P}_i \in \mathcal {P}(\textbf {X} \setminus \alpha ): |\textbf {X}^{\setminus \alpha :P}_i| = P-1\) and \(B^P = \frac {(|\textbf {X}|-1)!}{(P-1)!([|\textbf {X}|-1]-[P-1])!}\).
The set \(\textbf {X}^{\setminus \alpha :P}_i\) is then defined as one member, \(i\), of the power set of \(\textbf {X}\) excluding \(\alpha \) with the constraint that it must have \(P-1\) members. Similar to complete dominance, conditional dominance then also considers combinations of other IVs that could precede \(\alpha \) given the agnosticism of the \(R^2\) statistic related to inclusion precedence. As compared to complete dominance, conditional dominance incorporates all inclusion precedence sequences of IVs more directly as the value of \(B^P\). This is because \(B^P\) is composed of all possible sequences of IVs given \(\alpha \) is at a fixed inclusion precedence position (i.e., the numerator value ‘\((|\mathbf {X}|-1)!\)’) adjusting for redundant \(R^2\) values given the number of IVs that would precede \(\alpha \) (i.e., the ‘\((P-1)!\)’ component of the denominator) and the number of IVs that would succeed \(\alpha \) (i.e., the ‘\(([|\mathbf {X}|-1] - [P-1])!\)’ component of the denominator). Moreover, \(B^P = |(\mathcal {P}(\textbf {X} \setminus \alpha ): |\textbf {X}^{\setminus \alpha :P}_i| = P-1)|\). The value of \(B^P\) is then equivalent to the number of combinations of the IVs in \(\textbf {X} \setminus \alpha \) such that they have \(P - 1\) members.
The \(V_{i}^{\beta }\) values are computed using \(C^P_{\beta }\) which substitutes \(\beta \) for \(\alpha \) in Equation 2 and conditional dominance is determined by comparing \(V_{i}^{\alpha }\) and \(V_{i}^{\beta }\) across all \(K = |\mathbf {X}|\) inclusion precedence positions.
Conditional dominance is a more compensatory relative importance determination than complete dominance in that \(\Delta R^2\) values for \(\alpha \) at a specific inclusion precedence position need not always be larger than those for \(\beta \), they just need to be on average larger. Hence, at a specific inclusion precedence position, one or more \(\Delta R^2\) value(s) for \(\alpha \) can be smaller than those for \(\beta \) and yet \(\alpha \) could still be determined to conditionally dominate \(\beta \). When relative importance for a pair of IVs cannot be determined using complete or conditional dominance, it may still be possible to determine relative importance for the pair using general dominance.
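A minimal Python sketch of Equation 2 and the conditional dominance comparison follows. As before, `r2` is a hypothetical toy stand-in for fitting regressions, and every name and weight is an illustrative assumption rather than part of this manuscript.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
IVS = ("alpha", "beta", "xi", "zeta")


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def delta_r2(iv, preceding):
    """Increment to R^2 when `iv` joins the IVs in `preceding`."""
    return r2(set(preceding) | {iv}) - r2(preceding)


def conditional_stat(iv, position, ivs=IVS):
    """Equation 2: mean Delta R^2 of `iv` over all subsets of P - 1 other IVs."""
    others = [v for v in ivs if v != iv]
    subsets = list(combinations(others, position - 1))
    assert len(subsets) == comb(len(ivs) - 1, position - 1)  # B^P
    return sum(delta_r2(iv, s) for s in subsets) / len(subsets)


def conditional_dominance(a, b, ivs=IVS):
    """Proportion of the |X| positions at which a's statistic exceeds b's."""
    positions = range(1, len(ivs) + 1)
    wins = sum(1 for p in positions
               if conditional_stat(a, p, ivs) > conditional_stat(b, p, ivs))
    return Fraction(wins, len(ivs))


print([float(conditional_stat("beta", p)) for p in range(1, 5)])
print(conditional_dominance("alpha", "beta"), conditional_dominance("beta", "xi"))
```

In this toy setup \(\alpha \) conditionally dominates \(\beta \), while \(\beta \) exceeds \(\xi \) at only three of the four inclusion precedence positions and therefore does not conditionally dominate it.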
General dominance seeks to determine whether \(\alpha \) produces larger average conditional dominance statistics than does \(\beta \). The \(V_{i}^{\alpha }\) value used in Equation 1 for general dominance is known as the general dominance statistic and it is computed as in Equation 3: \begin {equation} C_{\alpha } = \sum _{P = 1}^{|\textbf {X}|} \frac {C^{P}_{\alpha }}{|\textbf {X}|}. \label {eq:gen} \end {equation} The \(V_{i}^{\beta }\) value is computed as \(C_{\beta }\) where \(\beta \) is substituted for \(\alpha \) in Equation 3. General dominance is determined by comparing each IV’s general dominance statistic which means that \(K = 1\) or that there is only one comparison necessary to determine general dominance.
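Equation 3 can be sketched in Python by averaging the conditional dominance statistics of Equation 2. The `r2` function is again a hypothetical toy stand-in for fitting regressions; it and all names and weights are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
IVS = ("alpha", "beta", "xi", "zeta")


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def conditional_stat(iv, position, ivs=IVS):
    """Equation 2: mean Delta R^2 of `iv` over all subsets of P - 1 other IVs."""
    others = [v for v in ivs if v != iv]
    subsets = list(combinations(others, position - 1))
    return sum(r2(set(s) | {iv}) - r2(s) for s in subsets) / len(subsets)


def general_stat(iv, ivs=IVS):
    """Equation 3: mean of the |X| conditional dominance statistics."""
    return sum(conditional_stat(iv, p, ivs) for p in range(1, len(ivs) + 1)) / len(ivs)


general = {iv: general_stat(iv) for iv in IVS}
print({iv: float(value) for iv, value in general.items()})
print(float(sum(general.values())), float(r2(IVS)))  # the statistics sum to the full R^2
```

In this toy setup the general dominance statistics are .16, .11, .09, and .03, so \(\beta \) generally dominates \(\xi \) even though it neither completely nor conditionally dominates it, and the four statistics sum to the full-model \(R^2\) of .39.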
General dominance statistics are equivalent to Shapley values that have been used as a method to produce an additive decomposition of the \(R^2\) which ascribes components of its value to the IVs included in the model (e.g., Grömping, 2007). In the section below, I show how general dominance statistics translate into Shapley values.
General dominance statistics for \(\alpha \) are the arithmetic average of \(\alpha \)’s conditional dominance statistics and, if I were to combine Equations 2 and 3 and expand on the \(B^P\) term, I would obtain: \begin {equation} C_{\alpha } = \sum _{P = 1}^{|\textbf {X}|} \sum _{i = 1}^{B^P} \frac {(P - 1)!([|\textbf {X}| - 1] - [P- 1])!}{|\textbf {X}|!} \Delta R^2_{\textbf {X}^{\setminus \alpha :P}_i\cup \alpha }. \label {eq:gen_shap} \end {equation} The result in Equation 4 is implied by taking the multiplicative inverse of \(B^P\) and considering that \((|\textbf {X}| - 1)! \cdot |\textbf {X}| = |\textbf {X}|!\). Equation 4 also shows that \(C_{\alpha }\) statistics are a weighted average of the \(\Delta R^2\) values associated with including \(\alpha \) beyond each possible combination of the other \(|\mathbf {X}|-1\) IVs. Additionally, Equation 4 shows that the weight applied to each \(\Delta R^2\) is proportional to the number of times \(\alpha \) appears at a specific inclusion precedence position given all possible inclusion precedence sequences, or permutations, of IVs in \(\mathbf {X}\). As was discussed in the section on conditional dominance, this is because there will be \((P-1)!\) ways that the IVs with a higher inclusion precedence (i.e., the IVs in \(\mathbf {X}_{i}^{\setminus \alpha : P}\)) will precede \(\alpha \). There will also be \(([|\textbf {X}|-1]-[P-1])!\) ways that IVs with a lower inclusion precedence (i.e., the IVs in \(\mathbf {X}\setminus \mathbf {X}_{i}^{\setminus \alpha : P}\)) will succeed \(\alpha \).
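The weighted form in Equation 4 can be checked numerically with a short Python sketch. The `r2` function is a hypothetical toy stand-in for fitting regressions, and the helper names (`subset_weight`, `general_stat_weighted`) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
IVS = ("alpha", "beta", "xi", "zeta")


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def subset_weight(size, n):
    """Equation 4 weight for a subset of `size` = P - 1 preceding IVs out of n IVs."""
    return Fraction(factorial(size) * factorial(n - 1 - size), factorial(n))


def general_stat_weighted(iv, ivs=IVS):
    """Equation 4: weighted sum of Delta R^2 over every subset of the other IVs."""
    n = len(ivs)
    others = [v for v in ivs if v != iv]
    total = Fraction(0)
    for size in range(n):
        for s in combinations(others, size):
            total += subset_weight(size, n) * (r2(set(s) | {iv}) - r2(s))
    return total


# Each size-(P - 1) weight is used B^P times, and together the weights sum to 1.
weight_total = sum(comb(len(IVS) - 1, s) * subset_weight(s, len(IVS))
                   for s in range(len(IVS)))
print(weight_total, [float(general_stat_weighted(iv)) for iv in IVS])
```

Because the weights sum to 1 across all subsets of the other IVs, the result is a weighted average, and in this toy setup it reproduces the same values as averaging the conditional dominance statistics.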
As opposed to expressing \(C_{\alpha }\) as a weighted average of all combinations of IVs in \(\mathbf {X}\), \(C_{\alpha }\) can also be expressed as an average of all inclusion precedence sequences of IVs in \(\mathbf {X}\) as in Equation 5: \begin {equation} C_{\alpha } = \sum _{i = 1}^{|\mathbf {X}|!} \frac {\Delta R^2_{\tilde {\textbf {X}}_i}} {|\mathbf {X}|!}, \label {eq:shap} \end {equation} where \(\tilde {\textbf {X}}_i = (\tilde {\mathbf {X}}_{>}^{\setminus \alpha }, \alpha , \tilde {\mathbf {X}}_{<}^{\setminus \alpha })_i \in \mathsf {Sym}(\mathbf {X})\) and \(\mathsf {Sym}()\) is the symmetric group function which creates all possible permutations of the elements of a set.
The term \(\tilde {\textbf {X}}_i\) is then a complete inclusion precedence sequence, \(i\), of all of the IVs in \(\mathbf {X}\) where \(\tilde {\mathbf {X}}_{>}^{\setminus \alpha }\) is an ordered set, or sub-sequence, of IVs that have higher inclusion precedence positions than, and thus precede, \(\alpha \) in sequence \(i\) and \(\tilde {\mathbf {X}}_{<}^{\setminus \alpha }\) is a sub-sequence of the IVs that have lower inclusion precedence positions than, and thus succeed, \(\alpha \) in sequence \(i\). In addition, the value of \(\Delta R^2_{\tilde {\textbf {X}}_i}\) is constructed as \(R^2_{(\tilde {\mathbf {X}}_{>}^{\setminus \alpha }, \alpha )_i} - R^2_{(\tilde {\mathbf {X}}_{>}^{\setminus \alpha })_i}\).
Equation 5 is also a simplified formulation for Shapley values that more clearly illustrates its conceptual origins in ordered sequences. Indeed, the primary difference between Equations 4 and 5 is that the numerator of the weight in Equation 4 is translated into additional \(\Delta R^2\) values to be averaged in Equation 5.
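The permutation form in Equation 5 can be sketched directly in Python by enumerating every inclusion precedence sequence. The `r2` function is a hypothetical toy stand-in for fitting regressions; all names and weights are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
IVS = ("alpha", "beta", "xi", "zeta")


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def general_stat_permutation(iv, ivs=IVS):
    """Equation 5: average Delta R^2 of `iv` over all |X|! inclusion sequences."""
    sequences = list(permutations(ivs))
    total = Fraction(0)
    for seq in sequences:
        preceding = seq[:seq.index(iv)]  # IVs with higher inclusion precedence
        total += r2(preceding + (iv,)) - r2(preceding)
    return total / len(sequences)


n_sequences = len(list(permutations(IVS)))
shapley = {iv: general_stat_permutation(iv) for iv in IVS}
print(n_sequences, {iv: float(value) for iv, value in shapley.items()})
```

Enumerating all \(|\mathbf {X}|!\) sequences is only practical for small models; the sketch is meant to show that, in this toy setup, the 24 sequences return the same values as the weighted-combination form.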
The Shapley value solution concept was developed in cooperative game theory and ascribes contributions to a payoff value earned by a set of players in a game to the individual players. As is implied by Equation 5, Shapley values ascribe values to players by computing the incremental contribution to the payoff each player makes in all possible permutations of sequences for including players in the game (Shapley, 1953). Specifically, players are added to the game sequentially and, each time a new player is added, the change to the payoff is computed and ascribed to that player. Across all permutations of player inclusion precedence sequences, the incremental payoffs for a player are averaged and this average constitutes the Shapley value for that player. General dominance statistics map to Shapley values by considering IVs as players, the linear regression as the game they play, and the \(\Delta R^2\) as the payoff. Equation 5 is also equivalent to the metric proposed by Lindeman, Merenda, and Gold (1980; as cited in Budescu, 1993) from which the general dominance statistic was originally derived.
As was mentioned in the introduction, one problem with the use of Shapley values and DA for relative importance is that these methods tend to require large numbers of \(\Delta R^2\) values corresponding with different subsets of IVs. This is because \(C_{\alpha }\) statistics require \(\Delta R^2\) values from all \(|\mathcal {P}(\mathbf {X})| = 2^{|\mathbf {X}|}\) combinations of IVs in \(\mathbf {X}\). I also mentioned in the introduction that a practical solution to this problem is to use the grouped DA method and put the IVs in \(\mathbf {X}\) into IV groups where members of an IV group are inseparable.
IVs can be put into inseparable groups prior to generating \(\Delta R^2\) values for use in Shapley values and DA. In some cases, IV groups are formed from collections of IVs with a conceptual similarity that the researcher believes are more valuable to discuss as a conceptual category than as individual IVs (see Gu, 2023, for an example). In other cases, IV groups are formed to reduce the number of IV combinations that are required to compute Shapley values or DA statistics (e.g., Bittmann, 2024; Luchman, 2021).
I define a set of IV groups as \(\textbf {G}\) which includes all IVs in \(\textbf {X}\) such that \(|\textbf {G}| > 1\) or there must be at least two IV groups. In addition, \(\alpha \in \Gamma _{\alpha } \in \textbf {G}\). This means that IV \(\alpha \) is a member of IV group \(\Gamma _{\alpha }\) which is also a member of the IV groups in \(\textbf {G}\). \(\textbf {G}\) can then be substituted for \(\textbf {X}\) and \(\Gamma _{\alpha }\) can be substituted for \(\alpha \) in all places in Equation 5 to compute general dominance statistics/Shapley values on the IV groups as in Equation 6: \begin {equation} C_{\Gamma _{\alpha }} = \sum _{j = 1}^{|\textbf {G}|!} \frac {\Delta R^2_{\tilde {\textbf {G}}_j}} {|\textbf {G}|!}, \label {eq:shap_grp} \end {equation} where \(\tilde {\textbf {G}}_j = (\tilde {\mathbf {G}}_{>}^{\setminus \Gamma _{\alpha }}, \Gamma _{\alpha }, \tilde {\mathbf {G}}_{<}^{\setminus \Gamma _{\alpha }})_j \in \mathsf {Sym}(\mathbf {G})\). The sub-sequence \(\tilde {\mathbf {G}}^{\setminus \Gamma _{\alpha }}_{>}\) is a permutation of the IV groups in \(\textbf {G} \setminus \Gamma _{\alpha }\) that have higher inclusion precedence positions than \(\Gamma _{\alpha }\) in sequence \(j\). Similarly, the sub-sequence \(\tilde {\mathbf {G}}^{\setminus \Gamma _{\alpha }}_{<}\) is a permutation of the IV groups in \(\textbf {G} \setminus \Gamma _{\alpha }\) that have lower inclusion precedence positions than \(\Gamma _{\alpha }\) in sequence \(j\). The value of \(\Delta R^2_{\tilde {\textbf {G}}_j}\) is computed in a way identical to that of \(\Delta R^2_{\tilde {\textbf {X}}_i}\) by focusing on the increment that \(\Gamma _{\alpha }\) makes when included in the model following inclusion of all IV groups in \(\tilde {\mathbf {G}}^{\setminus \Gamma _{\alpha }}_{>}\).
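Equation 6 can be sketched in Python by permuting IV groups instead of IVs. The `r2` function is a hypothetical toy stand-in for fitting regressions, and the two-group structure in `GROUPS` is an assumed example grouping, not one taken from this manuscript.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
GROUPS = {"G1": ("alpha", "beta"), "G2": ("xi", "zeta")}  # assumed grouping


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def members(group_names, groups=GROUPS):
    """All IVs belonging to the named IV groups."""
    return [iv for g in group_names for iv in groups[g]]


def grouped_general_stat(group, groups=GROUPS):
    """Equation 6: average Delta R^2 of a whole IV group over all |G|! group sequences."""
    sequences = list(permutations(groups))
    total = Fraction(0)
    for seq in sequences:
        preceding = members(seq[:seq.index(group)], groups)
        total += r2(preceding + list(groups[group])) - r2(preceding)
    return total / len(sequences)


grouped = {g: grouped_general_stat(g) for g in GROUPS}
print({g: float(value) for g, value in grouped.items()})
```

With two groups only \(2! = 2\) group sequences are needed, and in this toy setup the group-level statistics (.26 and .13) still sum to the full-model \(R^2\); the sketch returns nothing about the individual IVs inside each group.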
I noted in the introduction that the grouped DA methodology imposes the strong constraint that it cannot distinguish between individual IVs within IV groups. This is because each IV within an IV group has the same inclusion precedence position as all the other members of their IV group. As a result, distinctions between IVs within an IV group are not possible as individual IV contributions to the \(\Delta R^2\) are pooled. Although grouped DA statistics/Shapley values cannot distinguish IV contributions within an IV group, there exist other methods which can disentangle contributions an IV makes from their IV group.
Owen values are an extension of Shapley values in that they allow players to “unionize” and pool their impact in terms of how they are ascribed components of the payoff. Owen values then allow players to join together when being ascribed parts of a payoff across player unions yet still obtain individual, within-union payoff values (Owen, 1977). The method to ascribe components of the payoff to unions/groups, and then to individuals within those unions/groups, results in a two-step approach that extends on Equation 6.
Owen values begin by computing the Shapley values for all IV groups in \(\textbf {G}\) but add an additional pseudo-Shapley value-like approach within an IV group, \(\Gamma _{\alpha }\). This second pseudo-Shapley value step holds the inclusion precedence positions of other IV groups constant but considers all inclusion precedence positions of IVs within \(\Gamma _{\alpha }\). Using this two-step approach, Owen values are able to first ascribe a component of the \(R^2\) to all the IV groups in \(\mathbf {G}\) and then, subsequently, are able to ascribe a sub-component of the \(R^2\) associated with an IV group to each of the individual IVs within the IV group.
Owen values, \(W_{\alpha }\), are computed as in Equation 7: \begin {equation} W_{\alpha } = \sum _{i = 1}^{grp\_perm(\mathbf {G})} \frac {\Delta R^2_{\tilde {\textbf {S}}_i}} {grp\_perm(\mathbf {G})}, \label {owen} \end {equation} where \(\tilde {\textbf {S}}_i = (\tilde {\mathbf {G}}_{>}^{\setminus \Gamma _{\alpha }}, \, {\tilde {\Gamma _{\alpha }}}_{>}^{\setminus \alpha }, \, \alpha , \, {\tilde {\Gamma _{\alpha }}}_{<}^{\setminus \alpha }, \, \tilde {\mathbf {G}}_{<}^{\setminus \Gamma _{\alpha }})_i \in \mathsf {Sym}(\mathbf {X}): \mathbf {G}\). A sequence \(\tilde {\textbf {S}}_i\) will then include all the IVs in \(\mathbf {X}\) but will require that IVs in the same IV group are contiguous. The sub-sequence \(\tilde {\Gamma _{\alpha }}^{\setminus \alpha }_{>}\) includes the IVs in \(\Gamma _{\alpha }\setminus \alpha \) with higher inclusion precedence positions than \(\alpha \) in sequence \(i\). The sub-sequence \(\tilde {\Gamma _{\alpha }}^{\setminus \alpha }_{<}\) includes the IVs in \(\Gamma _{\alpha }\setminus \alpha \) with lower inclusion precedence positions than \(\alpha \) in sequence \(i\). In addition, \(|\mathsf {Sym}(\mathbf {X}): \mathbf {G}| = grp\_perm(\mathbf {G}) = |\mathbf {G}|! \cdot \prod _{l=1}^{|\mathbf {G}|} |\Gamma _l|!\) where \(\Gamma _{l} \in \mathbf {G}\). The \(grp\_perm()\) function then takes a set of IV groups and computes the number of possible permutations of the IVs in \(\mathbf {X}\) such that it respects the grouping structure of \(\mathbf {G}\). Hence, \(grp\_perm(\mathbf {G})\) will compute permutations based on the number of IV groups (i.e., \(|\mathbf {G}|!\)) and the composition of the individual IV groups (i.e., \(\prod _{l=1}^{|\mathbf {G}|} |\Gamma _l|!\)). 
Finally, the value of \(\Delta R^2_{\tilde {\textbf {S}}_i}\) represents the increment to the \(R^2\) that \(\alpha \) makes beyond the IV groups in \(\tilde {\mathbf {G}}_{>}^{\setminus \Gamma _{\alpha }}\) and the other IVs in \(\tilde {\Gamma _{\alpha }}_{>}^{\setminus \alpha }\).
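Equation 7 can be sketched in Python by enumerating only those sequences in which the members of each IV group are contiguous. The `r2` function is a hypothetical toy stand-in for fitting regressions and `GROUPS` is an assumed example grouping; `grp_perm` mirrors the counting function named in the text, while `grouped_sequences` and `owen_value` are illustrative helper names.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial, prod

# Hypothetical toy model (not from the manuscript): R^2 of a set of IVs is the
# total weight (in hundredths) of the variance components its members cover.
COVER = {"alpha": {1, 2, 7}, "beta": {2, 3, 6, 7}, "xi": {3, 4, 7}, "zeta": {5}}
WEIGHT = {1: 10, 2: 8, 3: 6, 4: 4, 5: 3, 6: 2, 7: 6}
GROUPS = {"G1": ("alpha", "beta"), "G2": ("xi", "zeta")}  # assumed grouping


def r2(ivs):
    covered = set().union(*(COVER[v] for v in ivs))
    return Fraction(sum(WEIGHT[c] for c in covered), 100)


def grp_perm(groups):
    """|G|! times the product of |Gamma_l|! over the IV groups."""
    return factorial(len(groups)) * prod(factorial(len(g)) for g in groups.values())


def grouped_sequences(groups):
    """Yield every IV sequence in which members of an IV group stay contiguous."""
    for group_order in permutations(groups):
        within = [permutations(groups[g]) for g in group_order]
        for blocks in product(*within):
            yield tuple(iv for block in blocks for iv in block)


def owen_value(iv, groups=GROUPS):
    """Equation 7: average Delta R^2 of `iv` over the group-respecting sequences."""
    sequences = list(grouped_sequences(groups))
    assert len(sequences) == grp_perm(groups)
    total = Fraction(0)
    for seq in sequences:
        preceding = seq[:seq.index(iv)]
        total += r2(preceding + (iv,)) - r2(preceding)
    return total / len(sequences)


owen = {iv: owen_value(iv) for g in GROUPS.values() for iv in g}
print(grp_perm(GROUPS), {iv: float(value) for iv, value in owen.items()})
```

In this toy setup only 8 of the 24 possible sequences are used. The Owen values of the IVs within a group sum to that group's grouped Shapley value (.155 + .105 = .26 for the first group), and they can differ from the ungrouped Shapley values (e.g., .155 versus .16 for \(\alpha \)) because the grouping restricts which sequences are averaged.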
Note the similarity in the formulation of Owen values in Equation 7 and Shapley values in Equation 5. Owen values differ from Shapley values only in that they focus on \(\mathsf {Sym}(\mathbf {X}): \mathbf {G}\) instead of \(\mathsf {Sym}(\mathbf {X})\) and thus include only a subset of the possible inclusion precedence sequences of IVs in \(\mathbf {X}\). The inclusion precedence sequences used by Owen values require that all members of an IV group are contiguous in terms of their inclusion precedence positions.
An implication of the contiguity requirement imposed by Owen values is that all members of one IV group must all be included before members of another IV group are included in the sequence. This leads directly to another implication of the grouping structure; that \(|\mathsf {Sym}(\mathbf {X})| > |\mathsf {Sym}(\mathbf {X}): \mathbf {G}| > |\mathsf {Sym}(\mathbf {G})|\) whenever at least one IV group contains more than one IV, which is another way of saying that the number of inclusion precedence sequences required for Owen values will lie between the number of inclusion precedence sequences required by traditional Shapley values and grouped Shapley values. Again, Owen values only use a subset of the possible inclusion precedence sequences used by Shapley values.
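The ordering of the three sequence counts can be checked with a short Python sketch. The function name `sequence_counts` and the example group sizes are illustrative assumptions, not groupings used in this manuscript.

```python
from math import factorial, prod


def sequence_counts(group_sizes):
    """Return (|Sym(X)|, grp_perm(G), |Sym(G)|) for IV groups of the given sizes."""
    n_ivs = sum(group_sizes)
    n_groups = len(group_sizes)
    traditional = factorial(n_ivs)
    within_group = factorial(n_groups) * prod(factorial(size) for size in group_sizes)
    grouped = factorial(n_groups)
    return traditional, within_group, grouped


for sizes in ([2, 2], [3, 2, 1], [1, 1, 1]):
    print(sizes, sequence_counts(sizes))
```

For the assumed groupings with sizes (2, 2) and (3, 2, 1), the within-group count falls strictly between the traditional and grouped counts. When every IV group contains a single IV, as with sizes (1, 1, 1), the three counts coincide because the grouping imposes no constraint.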
Owen values can translate back into DA statistics and this is the focus of the next section. I also note before moving on that an example of Owen decomposition is included in the analytic example section focusing on within-group DA statistics for interested readers.
The statistics and designations for the traditional DA method are derived in a way that extends conceptually from Shapley values and, similarly, the statistics and designations for the within-group DA method will be defined in a way that extends conceptually from Owen values. Therefore, a goal of the definition of the within-group DA method will be to preserve the relationships that the statistics and designations of traditional DA have with one another while ensuring that they are conceptually aligned with Owen values.
The intention of complete dominance is to compare two IVs, \(\alpha \) and \(\beta \), across all possible combinations of subsets of the other IVs to determine whether \(\alpha \) or \(\beta \) always produces a larger \(\Delta R^2\). I intend to ensure that a definition of within-group complete dominance derives from this requirement yet also respects the IV grouping structure imposed by \(\mathbf {G}\) and the contiguity in inclusion precedence sequences requirement that extends from it in Owen values. There is, however, no way to define a version of within-group complete dominance such that both of these constraints are met.
The reason that it is not possible to define a version of within-group complete dominance is that the IV grouping structure could result in two IVs, \(\alpha \) and \(\beta \), being in the same, or different, IV groups. IVs being in the same or in different IV groups results in strong constraints on inclusion precedence sequencing given the Owen value contiguity requirements.
When \(\alpha \) and \(\beta \) are in the same IV group, \(\alpha \) and \(\beta \) can precede or succeed the other \(|\mathbf {G}| - 1\) IV groups or the other \(|\Gamma _{\alpha }|-2\) IVs within their group in any given inclusion precedence sequence. It is then possible to compare \(\alpha \) and \(\beta \) across any subset of the other \(|\mathbf {G}| - 1\) IV groups or the other \(|\Gamma _{\alpha }|-2\) IVs in their IV group. Hence IVs in the same IV group could be compared using complete dominance in a way that derives from the traditional method yet respects the contiguity requirement of Owen values.
On the other hand, when \(\alpha \) and \(\beta \) are in different IV groups, \(\alpha \) and \(\beta \) are never to appear in an inclusion precedence sequence without one of them also appearing with all other members of their respective IV group. In such cases, \(\alpha \) and \(\beta \) could precede or succeed the other \(|\mathbf {G}| - 2\) IV groups, but there are no cases where \(\beta \) could precede or succeed subsets of members of \(\Gamma _{\alpha }\) as \(\beta \) is only allowed to appear in a sequence with all members of \(\Gamma _{\alpha }\), never a subset. Thus, the contiguity requirement of Owen values makes comparing IVs in different IV groups conceptually impossible in a way that derives from the traditional method.
As an illustration of the contiguity issue when comparing IVs in different IV groups, consider the four example IVs discussed above (i.e., \(\alpha \), \(\beta \), \(\xi \), and \(\zeta \)). In the traditional DA methodology, \(\alpha \) and \(\beta \) would be compared across the following four subsets of other IVs: \(\emptyset \) (i.e., the empty set; compared directly to one another), \(\xi \), \(\zeta \), and \(\{\xi ,\zeta \}\). Imagine I were to group the IVs such that \(\Gamma _{\{\alpha ,\xi \}} = \{\alpha ,\xi \}\) and \(\Gamma _{\{\beta ,\zeta \}} = \{\beta ,\zeta \}\). This grouping structure makes comparing \(\alpha \) and \(\beta \) across \(\xi \) impossible as \(\beta \) cannot appear with \(\xi \) without also including \(\alpha \). Similarly, \(\alpha \) and \(\beta \) cannot be compared across \(\zeta \) as \(\alpha \) cannot appear with \(\zeta \) without also including \(\beta \). This leaves only the following two subsets, \(\emptyset \) and \(\{\xi ,\zeta \}\), across which \(\alpha \) and \(\beta \) can be compared. By not being able to use \(\xi \) and \(\zeta \) separately, the comparisons between \(\alpha \) and \(\beta \) are confounded with \(\xi \) and \(\zeta \) which defies the idea underlying complete dominance that \(\alpha \) and \(\beta \) should be compared across all subsets of the other \(|\mathbf {X}| - 2\) IVs in the model.
Given that a within-group complete dominance method does not extend to IVs in different IV groups in a way that respects the contiguity requirement of Owen values, all IVs cannot be compared to one another using complete dominance. As such, there is no within-group complete dominance method.
The intention of conditional dominance is to compare averaged \(\Delta R^2\) values of \(\alpha \) and \(\beta \) across all inclusion precedence positions to determine whether \(\alpha \) or \(\beta \) always produces a larger averaged \(\Delta R^2\) value. Similar to complete dominance, I intend to ensure that a definition of within-group conditional dominance meets this requirement yet also respects the contiguity requirement of Owen values.
A critical first step in defining within-group conditional dominance is to determine how the concept of inclusion precedence position translates from the traditional method into the within-group method. For the traditional method, inclusion precedence is straightforward in that it corresponds with the inclusion precedence of each IV in \(\mathbf {X}\). When translating that idea into Owen values, the IV grouping structure complicates the translation process as the method could consider the inclusion precedence positions of IV groups in \(\mathbf {G}\)/values of \(Q\) or the inclusion precedence positions of individual IVs within an IV group \(\Gamma _{\alpha }\)/values of \(P^{\mathbf {G}}\).
I argue that inclusion precedence for within-group conditional dominance should be based on IV group inclusion precedence values, \(Q\), as they extend in a more natural way from the traditional DA method. IV group inclusion precedence is advantageous as all IVs will be in an IV group and the size of each IV group is irrelevant for IV group inclusion precedence positions as all \(\Delta R^2\) values within an IV group inclusion precedence position would be averaged. Thus, there will always be the same number of comparisons, \(|\mathbf {G}|\), for within-group conditional dominance.
By contrast, using inclusion precedence positions of IVs within an IV group would be exceedingly complicated due to the possibility that IV groups could be of different sizes. Moreover, it is not clear how the inclusion precedence of IV groups would be incorporated in a way that is conceptually reasonable and similar to the traditional DA method if the focus was on IVs within an IV group. I then again assert that it is advantageous to average over inclusion precedence positions of individual IVs within an IV group and to use IV group inclusion precedence positions as the basis for determining within-group conditional dominance.
Next, I consider how to average \(\Delta R^2\) values by IV group inclusion precedence position in a way that respects the contiguity requirement of Owen values. Doing so requires that I first define a new combination of IV groups and IVs within an IV group, \(\mathbf {T}_{(Q,j,P^{\mathbf {G}},i)}^{\setminus \alpha }\).
\(\mathbf {T}_{(Q,j,P^{\mathbf {G}},i)}^{\setminus \alpha }\) is one combination of \(Q-1\) IV groups, \(j\), from \(\mathbf {G} \setminus \Gamma _{\alpha }\) and one combination of \(P^{\mathbf {G}}-1\) IVs, \(i\), from \(\Gamma _{\alpha } \setminus \alpha \). More formally, \(\mathbf {T}_{(Q,j,P^{\mathbf {G}},i)}^{\setminus \alpha } = \{\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j,{\Gamma _{\alpha }}^{\setminus \alpha :P^{\mathbf {G}}}_i\}\) where \(\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j \in \mathcal {P}(\mathbf {G}\setminus \Gamma _{\alpha }): |\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j| = Q - 1\) and \({\Gamma _{\alpha }}_i^{\setminus \alpha :P^{\mathbf {G}}} \in \mathcal {P}(\Gamma _{\alpha }\setminus \alpha ): |{\Gamma _{\alpha }}_i^{\setminus \alpha :P^{\mathbf {G}}}| = P^{\mathbf {G}}-1\).
There will be a total of \(B^Q = \frac {(|\mathbf {G}|-1)!}{[Q-1]!([|\mathbf {G}|-1]-[Q-1])!}\) different ways in which other IV groups could precede \(\Gamma _{\alpha }\) at IV group inclusion precedence position \(Q\). In addition, there will be a total of \(|\Gamma _{\alpha }|\) IV inclusion precedence positions for IVs within \(\Gamma _{\alpha }\). Finally, there will be a total of \(B^{P^{\mathbf {G}}} = \frac {(|\mathbf {\Gamma _{\alpha }}|-1)!}{[P^{\mathbf {G}}-1]!([|\mathbf {\Gamma _{\alpha }}|-1]-[P^{\mathbf {G}}-1])!}\) different ways in which other IVs could precede \(\alpha \) at IV inclusion precedence position \(P^{\mathbf {G}}\). These three sets of combinations determine a specific number of required \(\Delta R^2\) summations for a conditional dominance statistic but not the number of times a specific \(\Delta R^2\) value is repeated at a summation value that can be incorporated into a weight similar to the traditional method.
The weight at any given summation for within-group conditional dominance is defined as in Equation 8: \begin {equation} wgt(Q,j,P^{\mathbf {G}},i) = \frac {(Q-1)! \cdot (P^{\mathbf {G}}-1)! \cdot (|\Gamma _{\alpha }|-P^{\mathbf {G}})! \cdot (|\mathbf {G}|-Q)!} {(|\mathbf {G}|-1)!\cdot |\Gamma _{\alpha }|!}. \label {eq:perms} \end {equation} The value of \(wgt(Q,j,P^{\mathbf {G}},i)\) reflects the number of identical \(\Delta R^2\) values obtained for different permutations of IV group members of \(\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j\) that precede \(\Gamma _{\alpha }\) (i.e., ‘\((Q-1)!\)’), different permutations of the IV group members of \(\mathbf {G} \setminus \mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j\) that succeed \(\Gamma _{\alpha }\) (i.e., ‘\((|\mathbf {G}|-Q)!\)’), different permutations of IV members of \({\Gamma _{\alpha }}_i^{\setminus \alpha :P^{\mathbf {G}}}\) that precede \(\alpha \) (i.e., ‘\((P^{\mathbf {G}}-1)!\)’), and different permutations of IV members of \(\Gamma _{\alpha } \setminus {\Gamma _{\alpha }}_i^{\setminus \alpha :P^{\mathbf {G}}}\) that succeed \(\alpha \) (i.e., ‘\((|\Gamma _{\alpha }|-P^{\mathbf {G}})!\)’).
In the section discussing Owen values, I mentioned that Owen values use \(grp\_perm(\mathbf {G}) = |\mathsf {Sym}(\mathbf {X}):\mathbf {G}|\) different inclusion precedence sequences total. It may come as a surprise then that the denominator of \(wgt(Q,j,P^{\mathbf {G}},i)\) includes only the number of permutations of members of \(\Gamma _{\alpha }\) times the number of permutations of the other \(|\mathbf {G}|-1\) IV groups. This result extends from the fact that, at a fixed IV group inclusion precedence position, \(Q\), the number of possible permutations (i.e., the value in the denominator) is \(|\mathsf {Sym}(\mathbf {X}): \{\mathbf {G}: {\Gamma _{\alpha }}_Q \}| = grp\_perm(\mathbf {G}\setminus \Gamma _{\alpha })\cdot |\Gamma _{\alpha }|!\) where \({\Gamma _{\alpha }}_Q\) indicates that IV group \(\Gamma _{\alpha }\) is at inclusion precedence position \(Q\). However, there would also be \(grp\_perm(\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j)\) permutations of IV groups and their members that precede \(\Gamma _{\alpha }\) and \(grp\_perm(\mathbf {G}\setminus \mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j)\) permutations of IV groups and their members that succeed \(\Gamma _{\alpha }\) in the numerator. The repeated products in the numerator and denominator (i.e., those related to \(|\Gamma _{l}|!\)) then cancel, leaving only \((|\mathbf {G}|-1)!\cdot |\Gamma _{\alpha }|!\) in the denominator as well as \(|\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}_j|!=(Q-1)!\) and, for the \(|\mathbf {G}|-Q\) IV groups that succeed \(\Gamma _{\alpha }\), \((|\mathbf {G}|-Q)!\) in the numerator.
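One way to check the weight in Equation 8 is to confirm that, at any fixed IV group position \(Q\), the weights sum to one across all \((j, P^{\mathbf {G}}, i)\) combinations, as an average should. The sketch below is an illustrative Python check under that reading of the equation; the function and argument names are my own and are not taken from any existing software.

```python
from math import comb, factorial


def wgt(Q, P, n_groups, group_size):
    """Equation 8 weight; it depends on the positions Q and P but not on j or i."""
    num = (factorial(Q - 1) * factorial(P - 1)
           * factorial(group_size - P) * factorial(n_groups - Q))
    return num / (factorial(n_groups - 1) * factorial(group_size))


def total_weight(Q, n_groups, group_size):
    """Sum the weight over every (j, P, i) combination at a fixed Q."""
    b_q = comb(n_groups - 1, Q - 1)            # B^Q: sets of preceding IV groups
    total = 0.0
    for P in range(1, group_size + 1):
        b_p = comb(group_size - 1, P - 1)      # B^{P^G}: sets of preceding IVs
        total += b_q * b_p * wgt(Q, P, n_groups, group_size)
    return total


# Two IV groups, focal group of size two: every increment gets weight 1/2.
print(wgt(1, 1, 2, 2), total_weight(1, 2, 2))  # 0.5 1.0
```

Because the weight depends only on the positions \(Q\) and \(P^{\mathbf {G}}\), the sums over \(j\) and \(i\) reduce to multiplying by the binomial counts \(B^Q\) and \(B^{P^{\mathbf {G}}}\).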
I can now define the within-group conditional dominance statistic, \(W_{\alpha }^Q\), as: \begin {equation} W_{\alpha }^{Q} = \sum _{j = 1}^{B^Q} \sum _{P^{\mathbf {G}}=1}^{|\Gamma _{\alpha }|} \sum _{i = 1}^{B^{P^{\mathbf {G}}}} wgt(Q,j,P^{\mathbf {G}},i)\cdot \Delta R^2_{\mathbf {T}_{(Q,j,P^{\mathbf {G}},i)}^{\setminus \alpha }\cup \alpha }. \label {eq:wg_cdl} \end {equation}
The within-group conditional dominance statistics determine conditional dominance with Equation 1 using \(K = |\mathbf {G}|\), \(V_{i}^{\alpha } = W_{\alpha }^Q\), and \(V_{i}^{\beta }= W_{\beta }^Q\).
The \(W^{Q}_{\alpha }\) values show one interesting property that is worth noting; each value can be summed within its respective IV group by inclusion precedence position to obtain the value of \(C_{\Gamma _{\alpha }}^Q\), or the grouped conditional dominance statistic value for \(\Gamma _{\alpha }\) at inclusion precedence \(Q\). This is because \(C_{\Gamma _{\alpha }}^Q = \sum _{j = 1}^{B^Q} \frac {\Delta R^2_{\mathbf {G}^{\setminus \Gamma _{\alpha }:Q}\cup \Gamma _{\alpha }}}{B^Q}\) which has a form similar to Equation 9. In fact, Equation 9 is an extension of these grouped conditional dominance statistics that includes additional averaging for IVs within an IV group (i.e., the two other summations and extensions given the \(wgt()\) function and \(\mathbf {T}\) combination). I show an example of this grouped to within-group DA decomposition property in the analytic example discussed in section six.
The intention of general dominance is to compare the averaged conditional dominance statistic values of \(\alpha \) and \(\beta \). I again intend to ensure that a definition of within-group general dominance meets this requirement yet also respects the contiguity requirement of Owen values.
Extending general dominance to adhere to Owen value computations is straightforward and, like the traditional method, involves merely averaging the within-group conditional dominance statistics. I then define the within-group general dominance statistic, \(W_{\alpha }\) as: \begin {equation} W_{\alpha } = \sum _{Q = 1}^{|\textbf {G}|} \frac {W^{Q}_{\alpha }}{|\textbf {G}|}. \label {eq:wg_gen} \end {equation} The within-group general dominance statistics determine general dominance with Equation 1 using \(K = 1\), \(V_{i}^{\alpha } = W_{\alpha }\), and \(V_{i}^{\beta }= W_{\beta }\).
The \(W_{\alpha }\) values also show an interesting property; each value can be summed within its respective IV group to obtain the value of \(C_{\Gamma _{\alpha }}\), or the grouped general dominance statistic value for \(\Gamma _{\alpha }\). This property of \(W_{\alpha }\) values extends directly from the \(W^Q_{\alpha }\) values on which they are based; the \(W^Q_{\alpha }\) values decompose \(C^Q_{\Gamma _{\alpha }}\) and, when averaged, the \(W_{\alpha }\) values decompose \(C_{\Gamma _{\alpha }}\).
Recall that Owen values also produce a value of the payoff ascribed to a union of players and then subsequently ascribe components of that player union’s value to individual players within the union. Hence, Owen values applied to the \(R^2\) in a linear regression also decompose the values \(C_{\Gamma _{\alpha }}\) and, as is suggested by their shared notation, the result in Equation 10 is equivalent to that of Equation 7. The within-group general dominance statistics are then tantamount to Owen values in the same way that traditional general dominance statistics are tantamount to Shapley values. I also show an example of this grouped to within-group DA decomposition property in the analytic example discussed in section six.
Within-group DA extends the traditional and grouped DA methodologies by combining aspects of both approaches, producing a set of relative importance determinations that focuses on individual IVs but incorporates information relevant to IV groupings.
This section provides an analytic example which applies the traditional, grouped, and within-group DA methodologies to data. The purpose of this section is to more concretely illustrate the differences between the methods in terms of the amount of information they provide about IVs and how the IV groups affect the information used by the different DA methods.
The data used in this example were derived from the ability.cov covariance matrix in the R package datasets (Antal, 2025). These data describe the relationships between a number of different ability and intelligence tests given the data from 112 different test takers. I used the variables picture, blocks, reading, and vocab in the example analyses. picture is described as a picture-completion test, blocks is described as a block design task, reading is a reading comprehension test, and vocab is a vocabulary test. The IVs were grouped such that picture and blocks formed one IV group, \(\Gamma _{spatial}\), and that reading and vocab formed another, \(\Gamma _{verbal}\). As the subscripts to the IV groups suggest, the spatial tests were grouped together and the verbal tests were also grouped together. Finally, the dependent variable was general which is described as a non-verbal measure of general intelligence using Cattell’s culture-fair test. The covariance matrix provided in the data was transformed into a correlation matrix and the intercorrelations between all study variables are reported below in Table 1.
| | general | picture | blocks | reading | vocab |
| general | 1.0000 | 0.4663 | 0.5517 | 0.5765 | 0.5144 |
| picture | 0.4663 | 1.0000 | 0.5724 | 0.2629 | 0.2393 |
| blocks | 0.5517 | 0.5724 | 1.0000 | 0.3540 | 0.3565 |
| reading | 0.5765 | 0.2629 | 0.3540 | 1.0000 | 0.7914 |
| vocab | 0.5144 | 0.2393 | 0.3565 | 0.7914 | 1.0000 |
Consistent with the idea that they might assess similar content within a group, the members of \(\Gamma _{spatial}\) correlated fairly strongly as did the members of \(\Gamma _{verbal}\).
The results from the linear regressions of the 16 IV subsets, which form all possible combinations of the four IVs, are reported in Table 2.
| Subset Number | IV Subset | \(R^2\) |
| 1 | picture blocks reading vocab | 0.4960 |
| 2 | blocks reading vocab | 0.4726 |
| 3 | picture reading vocab | 0.4448 |
| 4 | reading vocab | 0.3414 |
| 5 | picture blocks vocab | 0.4482 |
| 6 | blocks vocab | 0.4200 |
| 7 | picture vocab | 0.3895 |
| 8 | vocab | 0.2646 |
| 9 | picture blocks reading | 0.4935 |
| 10 | blocks reading | 0.4704 |
| 11 | picture reading | 0.4387 |
| 12 | reading | 0.3323 |
| 13 | picture blocks | 0.3380 |
| 14 | blocks | 0.3043 |
| 15 | picture | 0.2174 |
| 16 | \(\emptyset \) | 0.0000 |
Table 2 shows that the overall model \(R^2\) (i.e., subset 1) is \(.4960\). In addition, the \(R^2\) values associated with reading are among the highest and those associated with picture are among the lowest. This suggests the possibility that reading is the most and picture is the least important IV.
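The \(R^2\) values in Table 2 can be approximately reproduced from the correlations in Table 1. The following Python sketch assumes the standard result that, for standardized variables, \(R^2 = \mathbf {r}_{xy}^{\top }\mathbf {R}_{xx}^{-1}\mathbf {r}_{xy}\); the helper functions are illustrative and are not the software used for the reported analyses. Small discrepancies in the fourth decimal place are expected because Table 1 is rounded.

```python
from itertools import combinations

NAMES = ["picture", "blocks", "reading", "vocab"]
# Correlations of each IV with the dependent variable, general (Table 1).
R_Y = {"picture": 0.4663, "blocks": 0.5517, "reading": 0.5765, "vocab": 0.5144}
# Intercorrelations among the IVs (Table 1, upper triangle).
R_XX = {
    ("picture", "blocks"): 0.5724, ("picture", "reading"): 0.2629,
    ("picture", "vocab"): 0.2393, ("blocks", "reading"): 0.3540,
    ("blocks", "vocab"): 0.3565, ("reading", "vocab"): 0.7914,
}


def corr(a, b):
    """Correlation between two IVs, looked up in either order."""
    if a == b:
        return 1.0
    return R_XX[(a, b)] if (a, b) in R_XX else R_XX[(b, a)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def r_squared(subset):
    """R^2 for regressing general on the IVs in subset."""
    subset = list(subset)
    if not subset:
        return 0.0
    A = [[corr(a, b) for b in subset] for a in subset]
    v = [R_Y[a] for a in subset]
    beta = solve(A, v)                      # standardized coefficients
    return sum(bk * vk for bk, vk in zip(beta, v))


# All 16 subsets of the four IVs, from the empty set to the full model.
all_r2 = {s: r_squared(s) for k in range(5) for s in combinations(NAMES, k)}
for s, value in all_r2.items():
    print(" ".join(s) or "(empty)", round(value, 4))
```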
Because complete dominance is not achievable using the within-group DA methodology, I began evaluating the relative importance of the IVs using conditional dominance. As an illustration of making conditional dominance designations across the methods, I considered the comparison of picture and reading. This comparison was chosen to emphasize that the within-group DA method can compare IVs in different IV groups.
The four conditional dominance statistics for picture used the average \(\Delta R^2\) values where picture’s inclusion precedence position was first, second, third, and fourth in the model. There was only one subset where picture was included first, subset 15, and its conditional dominance statistic was the increment it made beyond subset 16. Applying Equation 2 and representing the \(P\) values with ordinal positions (i.e., as the value \(\mathit {1}^{st}\)), resulted in: \(C^{\mathit {1}^{st}}_{picture} = \frac {(.2174-.0000)}{1} = .2174\).
The next computation focused on subsets where picture was included second. This included using the increment of subsets 7 over 8, 11 over 12, and 13 over 14. When included in Equation 2 the result was: \(C^{\mathit {2}^{nd}}_{picture} = \frac {(.3895-.2646)}{3} + \frac {(.4387-.3323)}{3} + \frac {(.3380-.3043)}{3} = .0883\).
The third computation focused on subsets where picture was included third which incorporated the increments of subsets 3 over 4, 5 over 6, and 9 over 10. The conditional dominance computation resulted in: \(C^{\mathit {3}^{rd}}_{picture} = \frac {(.4448-.3414)}{3} + \frac {(.4482-.4200)}{3} + \frac {(.4935-.4704)}{3} = .0516\).
The fourth computation focused on subsets where picture was included last or fourth. Similar to when it was included first, there was only one increment of subset 1 over 2 which produced: \(C^{\mathit {4}^{th}}_{picture} = \frac {(.4960-.4726)}{1} = .0234\).
The four conditional dominance statistics for reading also used the average \(\Delta R^2\) values where its inclusion precedence position was first, second, third, and fourth in the model. When in the first inclusion precedence position, the relevant increment for reading was the increment of subset 12 over 16. The conditional dominance statistic result was: \(C^{\mathit {1}^{st}}_{reading} = \frac {(.3323-.0000)}{1} = .3323\).
When reading was included as the second IV, the relevant subsets included the increments of subsets 4 over 8, 10 over 14, and 11 over 15. Applying those three increments in Equation 2 produced: \(C^{\mathit {2}^{nd}}_{reading} = \frac {(.3414-.2646)}{3} + \frac {(.4704-.3043)}{3} + \frac {(.4387-.2174)}{3} = .1547\).
As the variable included third, the relevant increments for reading were subsets 2 over 6, 3 over 7, and 9 over 13. These three increments resulted in: \(C^{\mathit {3}^{rd}}_{reading} = \frac {(.4726-.4200)}{3} + \frac {(.4448-.3895)}{3} + \frac {(.4935-.3380)}{3} = .0878\).
Lastly, when reading was included last or fourth, the relevant increment was subset 1 over 5. The resulting conditional dominance statistic was: \(C^{\mathit {4}^{th}}_{reading} = \frac {(.4960-.4482)}{1} = .0477\).
With all eight conditional dominance statistics, I then applied Equation 1 to determine which IV conditionally dominated the other. The pattern of results showed that \(C^{\mathit {1}^{st}}_{reading} > C^{\mathit {1}^{st}}_{picture}\), \(C^{\mathit {2}^{nd}}_{reading} > C^{\mathit {2}^{nd}}_{picture}\), \(C^{\mathit {3}^{rd}}_{reading} > C^{\mathit {3}^{rd}}_{picture}\), and \(C^{\mathit {4}^{th}}_{reading} > C^{\mathit {4}^{th}}_{picture}\) which meant reading conditionally dominated picture.
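The traditional conditional dominance computations above can be sketched in a few lines of Python. The code below is an illustration rather than the software used for the reported results; it takes the \(R^2\) values from Table 2 as given and averages the increments an IV makes at each inclusion precedence position. Results can differ from the reported values in the fourth decimal place because Table 2 is rounded.

```python
from itertools import combinations

# R^2 values copied from Table 2, keyed by the IVs in each subset.
TABLE_2 = {
    "picture blocks reading vocab": 0.4960, "blocks reading vocab": 0.4726,
    "picture reading vocab": 0.4448, "reading vocab": 0.3414,
    "picture blocks vocab": 0.4482, "blocks vocab": 0.4200,
    "picture vocab": 0.3895, "vocab": 0.2646,
    "picture blocks reading": 0.4935, "blocks reading": 0.4704,
    "picture reading": 0.4387, "reading": 0.3323,
    "picture blocks": 0.3380, "blocks": 0.3043,
    "picture": 0.2174, "": 0.0000,
}
R2 = {frozenset(k.split()): v for k, v in TABLE_2.items()}
IVS = ("picture", "blocks", "reading", "vocab")


def conditional_dominance(iv):
    """Average increment of iv at each inclusion precedence position."""
    others = [x for x in IVS if x != iv]
    stats = []
    for n_before in range(len(IVS)):        # number of IVs already included
        incs = [R2[frozenset(s) | {iv}] - R2[frozenset(s)]
                for s in combinations(others, n_before)]
        stats.append(sum(incs) / len(incs))
    return stats


c_picture = conditional_dominance("picture")
c_reading = conditional_dominance("reading")
# reading conditionally dominates picture if it is larger at every position.
reading_dominates = all(r > p for r, p in zip(c_reading, c_picture))
print([round(v, 4) for v in c_picture])
print([round(v, 4) for v in c_reading])
print(reading_dominates)
```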
Notice that the computation of conditional dominance statistics for both IVs required the use of \(\Delta R^2\) values from all 16 IV subsets reported on in Table 2. This is the reason that the traditional method is computationally expensive as it uses the \(\Delta R^2\) values from all possible combinations of IVs. Because picture was in \(\Gamma _{spatial}\) and reading was in \(\Gamma _{verbal}\), they could also be compared indirectly using the grouped method. The grouped method illustrated next required the use of many fewer of the \(\Delta R^2\) values in Table 2 but did not allow me to distinguish the predictive utility of picture from blocks or the predictive utility of reading from vocab.
The two conditional dominance statistics for \(\Gamma _{spatial}\) used the average \(\Delta R^2\) values where \(\Gamma _{spatial}\)’s inclusion precedence position was first and second in the model. There was only one increment where \(\Gamma _{spatial}\) was included first in the model, the increment of subset 13 (i.e., \(\Gamma _{spatial}\) which includes \(picture\) and \(blocks\)) over 16 (i.e., no IV groups). Applying Equation 2 produced \(C^{\mathit {1}^{st}}_{\Gamma _{spatial}} = \frac {(.3380-.0000)}{1} = .3380\).
When \(\Gamma _{spatial}\) was included as the last IV group, there was also only one relevant increment of subset 1 (i.e., all IVs and hence both IV groups) over 4 (i.e., \(\Gamma _{verbal}\) which includes \(reading\) and \(vocab\)). This produced a conditional dominance statistic value of \(C^{\mathit {2}^{nd}}_{\Gamma _{spatial}} = \frac {(.4960-.3414)}{1} = .1546\).
The two conditional dominance statistics for \(\Gamma _{verbal}\) used the average \(\Delta R^2\) values where \(\Gamma _{verbal}\)’s inclusion precedence position was first and second in the model. When \(\Gamma _{verbal}\) was included as the first IV group, its conditional dominance statistic was comprised of the increment of subset 4 over 16 or \(C^{\mathit {1}^{st}}_{\Gamma _{verbal}} = \frac {(.3414-.0000)}{1} = .3414\).
When \(\Gamma _{verbal}\) was included as the last IV group, its conditional dominance statistic was comprised of the increment of subset 1 over 13 or \(C^{\mathit {2}^{nd}}_{\Gamma _{verbal}} = \frac {(.4960-.3380)}{1} = .1580\).
These four conditional dominance statistics show that \(C^{\mathit {1}^{st}}_{\Gamma _{verbal}} > C^{\mathit {1}^{st}}_{\Gamma _{spatial}}\) and \(C^{\mathit {2}^{nd}}_{\Gamma _{verbal}} > C^{\mathit {2}^{nd}}_{\Gamma _{spatial}}\) when applying Equation 1 which meant \(\Gamma _{verbal}\) conditionally dominated \(\Gamma _{spatial}\).
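The grouped computations treat each IV group as a single unit, so only the four \(R^2\) values for unions of whole IV groups are needed. A minimal Python sketch under that assumption follows; the dictionary and function names are illustrative.

```python
from itertools import combinations

GROUPS = {
    "spatial": frozenset({"picture", "blocks"}),
    "verbal": frozenset({"reading", "vocab"}),
}
# The only Table 2 values the grouped method needs (subsets 16, 13, 4, and 1).
R2 = {
    frozenset(): 0.0000,
    GROUPS["spatial"]: 0.3380,
    GROUPS["verbal"]: 0.3414,
    GROUPS["spatial"] | GROUPS["verbal"]: 0.4960,
}


def grouped_conditional(name):
    """Average increment of an IV group at each group inclusion position."""
    others = [g for g in GROUPS if g != name]
    stats = []
    for n_before in range(len(GROUPS)):
        incs = []
        for combo in combinations(others, n_before):
            base = frozenset().union(*(GROUPS[g] for g in combo))
            incs.append(R2[base | GROUPS[name]] - R2[base])
        stats.append(sum(incs) / len(incs))
    return stats


c_spatial = grouped_conditional("spatial")
c_verbal = grouped_conditional("verbal")
print([round(v, 4) for v in c_spatial])  # [0.338, 0.1546]
print([round(v, 4) for v in c_verbal])   # [0.3414, 0.158]
```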
The grouped DA methodology used substantially fewer IV subsets (i.e., \(2^2 = 4\)) than the traditional method but was not able to disentangle blocks from \(picture\) and vocab from \(reading\) in the comparisons. The comparison between picture and reading was then confounded with the other two IVs in the model. The within-group method illustrated next is a balance of the traditional and grouped DA methods that requires the use of fewer IV subsets than the traditional method yet allows a researcher to determine importance between all IVs in the model like the traditional method.
The two conditional dominance statistics for \(picture\) used the average \(\Delta R^2\) values where \(\Gamma _{spatial}\)’s inclusion precedence position was first and second in the model but also averaged over subsets where \(picture\) preceded and succeeded \(blocks\) within \(\Gamma _{spatial}\).
When \(\Gamma _{spatial}\) was included first, there were two relevant increments: subset 13 over 14 (i.e., \(picture\) succeeded \(blocks\)) and subset 15 over 16 (i.e., \(picture\) preceded \(blocks\)). Applying Equation 9 to those values produced \(W^{\mathit {1}^{st}}_{picture} = \frac {(.3380-.3043)}{2} + \frac {(.2174-.0000)}{2} = .1255\).
When \(\Gamma _{spatial}\) was included last, there were also two relevant increments: subset 1 over 2 (i.e., \(picture\) succeeded \(blocks\)) and subset 3 over 4 (i.e., \(picture\) preceded \(blocks\)). These two values produced a conditional dominance statistic of \(W^{\mathit {2}^{nd}}_{picture} = \frac {(.4960-.4726)}{2} + \frac {(.4448-.3414)}{2} = .0634\).
The two conditional dominance statistics for reading also included averages where \(\Gamma _{verbal}\) was included first and second in the model averaging over subsets where \(reading\) preceded or succeeded \(vocab\). When \(\Gamma _{verbal}\) was included first, there were two relevant increments: subset 4 over 8 (i.e., \(reading\) succeeded \(vocab\)) and subset 12 over 16 (i.e., \(reading\) preceded \(vocab\)). The conditional dominance statistic produced by these increments was \(W^{\mathit {1}^{st}}_{reading} = \frac {(.3414-.2646)}{2} + \frac {(.3323-.0000)}{2} = .2046\).
When \(\Gamma _{verbal}\) was included last, there were also two relevant increments: subset 1 over 5 (i.e., \(reading\) succeeded \(vocab\)) and subset 9 over 13 (i.e., \(reading\) preceded \(vocab\)). These final increments resulted in a value of \(W^{\mathit {2}^{nd}}_{reading} = \frac {(.4960-.4482)}{2} + \frac {(.4935-.3380)}{2} = .1016\).
The four conditional dominance statistics, when applying Equation 1, resulted in \(W^{\mathit {1}^{st}}_{reading} > W^{\mathit {1}^{st}}_{picture}\) and \(W^{\mathit {2}^{nd}}_{reading} > W^{\mathit {2}^{nd}}_{picture}\) which meant that reading conditionally dominated picture.
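The within-group conditional dominance statistics can be computed generically from Equation 9 and the weight in Equation 8. The Python sketch below is one possible implementation under the definitions given earlier; the function and variable names are my own, and the \(R^2\) values are copied from Table 2. It also checks the property that the within-group statistics of the members of an IV group sum to the grouped statistic for that IV group.

```python
from itertools import combinations
from math import factorial

# R^2 values copied from Table 2, keyed by the IVs in each subset.
TABLE_2 = {
    "picture blocks reading vocab": 0.4960, "blocks reading vocab": 0.4726,
    "picture reading vocab": 0.4448, "reading vocab": 0.3414,
    "picture blocks vocab": 0.4482, "blocks vocab": 0.4200,
    "picture vocab": 0.3895, "vocab": 0.2646,
    "picture blocks reading": 0.4935, "blocks reading": 0.4704,
    "picture reading": 0.4387, "reading": 0.3323,
    "picture blocks": 0.3380, "blocks": 0.3043,
    "picture": 0.2174, "": 0.0000,
}
R2 = {frozenset(k.split()): v for k, v in TABLE_2.items()}
GROUPS = {"spatial": ("picture", "blocks"), "verbal": ("reading", "vocab")}


def within_group_conditional(iv):
    """W^Q for iv at each IV group inclusion precedence position Q."""
    own = next(g for g, members in GROUPS.items() if iv in members)
    mates = [x for x in GROUPS[own] if x != iv]
    other_groups = [g for g in GROUPS if g != own]
    n_g, n_own = len(GROUPS), len(GROUPS[own])
    stats = []
    for Q in range(1, n_g + 1):
        total = 0.0
        for grp_combo in combinations(other_groups, Q - 1):
            # Whole IV groups that precede the focal IV's group.
            before = frozenset().union(*(GROUPS[g] for g in grp_combo))
            for P in range(1, n_own + 1):
                w = (factorial(Q - 1) * factorial(P - 1)
                     * factorial(n_own - P) * factorial(n_g - Q)
                     ) / (factorial(n_g - 1) * factorial(n_own))
                for mate_combo in combinations(mates, P - 1):
                    base = before | frozenset(mate_combo)
                    total += w * (R2[base | {iv}] - R2[base])
        stats.append(total)
    return stats


W = {iv: within_group_conditional(iv) for g in GROUPS.values() for iv in g}
for iv, stats in W.items():
    print(iv, [round(v, 4) for v in stats])
```

Only subsets in which every included IV group other than the focal one is complete are ever looked up, which is how the contiguity requirement reduces the number of regressions.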
The within-group method used more IV subsets than the grouped method but fewer than the traditional method as it required only \(2^2 + (2^{(2-1)})*(2^2 - 2) + (2^{(2-1)})*(2^2 - 2) = 12\) IV subsets to compute conditional dominance statistics. Again, the within-group DA serves as a balance between the traditional and grouped methods that allows determinations like those possible with the traditional method but reduces the number of required IV subsets like the grouped method. Ultimately in the case of this example, the within-group DA methodology resulted in a materially similar conclusion as the traditional method but used 25% fewer IV subsets.
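The count of 12 required IV subsets can be verified by enumeration. The following Python sketch lists every subset whose \(R^2\) is needed by the within-group method, assuming that the needed subsets are unions of whole IV groups combined with any subset of one further IV group; the generalization beyond the two-group example is my own reading of the count above rather than a formula stated in the text.

```python
from itertools import combinations


def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def required_subsets(groups):
    """IV subsets whose R^2 is needed by the within-group method."""
    needed = set()
    for own, members in groups.items():
        others = [g for g in groups if g != own]
        for grp_combo in powerset(others):
            # Whole IV groups entered before the focal group.
            before = frozenset().union(*(groups[g] for g in grp_combo))
            # Any subset of the focal group may be layered on top.
            for part in powerset(members):
                needed.add(before | part)
    return needed


example = {"spatial": ("picture", "blocks"), "verbal": ("reading", "vocab")}
n_within = len(required_subsets(example))
n_traditional = 2 ** sum(len(m) for m in example.values())
print(n_within, n_traditional, 1 - n_within / n_traditional)  # 12 16 0.25
```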
The conditional dominance statistics for all IVs and IV groups across the traditional, grouped, and within-group methods are reported in Table 3.
| | \(\mathit {1}^{st}\) | \(\mathit {2}^{nd}\) | \(\mathit {3}^{rd}\) | \(\mathit {4}^{th}\) |
| Traditional | | | | |
| \(\textit {picture}\) | .2174 | .0883 | .0516 | .0234 |
| \(\textit {blocks}\) | .3043 | .1380 | .0816 | .0512 |
| \(\textit {reading}\) | .3323 | .1547 | .0878 | .0477 |
| \(\textit {vocab}\) | .2646 | .0990 | .0395 | .0024 |
| Grouped | | | | |
| \(\Gamma _{spatial}\) | .3380 | .1546 | | |
| \(\Gamma _{verbal}\) | .3414 | .1580 | | |
| Within-group | | | | |
| \(\textit {picture}\) | .1255 | .0634 | | |
| \(\textit {blocks}\) | .2125 | .0912 | | |
| \(\textit {reading}\) | .2046 | .1016 | | |
| \(\textit {vocab}\) | .1368 | .0563 | | |
Note. \(\Gamma _{spatial} = \{picture,blocks\}\); \(\Gamma _{verbal} = \{reading,vocab\}\).
The results in Table 3 are reported such that each IV’s results appear in the rows and the inclusion precedence positions of the conditional dominance statistics appear in the columns. The conditional dominance results in Table 3 showed that reading conditionally dominated picture and vocab, but not blocks, for both the traditional and within-group methods. The results also showed that blocks conditionally dominated both picture and vocab for the traditional and within-group methods.
Recall I mentioned that a property of the within-group conditional dominance statistics is that they would sum to the grouped conditional dominance statistics for their IV group. This property was true of the results in Table 3 as \(W^{\mathit {1}^{st}}_{picture} + W^{\mathit {1}^{st}}_{blocks} = C^{\mathit {1}^{st}}_{\Gamma _{spatial}}\) or \(.1255 + .2125 = .3380\) and \(W^{\mathit {2}^{nd}}_{picture} + W^{\mathit {2}^{nd}}_{blocks} = C^{\mathit {2}^{nd}}_{\Gamma _{spatial}}\) or \(.0634 + .0912 = .1546\).
Conditional dominance could not be determined for two of the between IV group comparisons: reading versus blocks and vocab versus picture. I then proceeded to compare the IVs using general dominance. The focus of the example computations below was on comparing reading and blocks.
General dominance statistics are always computed as the average of the conditional dominance statistics as is shown in Equation 3. The value for blocks was then computed as \(C_{blocks} = \frac {C_{blocks}^{\mathit {1}^{st}}}{4} + \frac {C_{blocks}^{\mathit {2}^{nd}}}{4} + \frac {C_{blocks}^{\mathit {3}^{rd}}}{4} + \frac {C_{blocks}^{\mathit {4}^{th}}}{4} = \frac {.3043}{4} + \frac {.1380}{4} + \frac {.0816}{4} + \frac {.0512}{4} = .1438\).
In addition, the value for reading was computed as \(C_{reading} = \frac {C_{reading}^{\mathit {1}^{st}}}{4} + \frac {C_{reading}^{\mathit {2}^{nd}}}{4} + \frac {C_{reading}^{\mathit {3}^{rd}}}{4} + \frac {C_{reading}^{\mathit {4}^{th}}}{4} = \frac {.3323}{4} + \frac {.1547}{4} + \frac {.0878}{4} + \frac {.0477}{4} = .1556\).
The comparison between the two IVs resulted in \(C_{reading} > C_{blocks}\) and thus reading generally dominated blocks.
I already know that \(\Gamma _{verbal}\), which contained reading, conditionally dominated \(\Gamma _{spatial}\), which contained blocks. It must then also have been the case that \(\Gamma _{verbal}\) would generally dominate \(\Gamma _{spatial}\). Indeed, when computed, I found that this was the case as \(C_{\Gamma _{spatial}} = \frac {C_{\Gamma _{spatial}}^{\mathit {1}^{st}}}{2} + \frac {C_{\Gamma _{spatial}}^{\mathit {2}^{nd}}}{2} = \frac {.3380}{2} + \frac {.1546}{2} = .2463\) and \(C_{\Gamma _{verbal}} = \frac {C_{\Gamma _{verbal}}^{\mathit {1}^{st}}}{2} + \frac {C_{\Gamma _{verbal}}^{\mathit {2}^{nd}}}{2} = \frac {.3414}{2} + \frac {.1580}{2} = .2497\). Thus, \(C_{verbal} > C_{spatial}\) which demonstrated that the expected relationship held.
Equation 10 focuses on the computation of within-group general dominance statistics but is, like the traditional method, also a simple average of conditional dominance statistics. Thus, the value for blocks was \(W_{blocks} = \frac {W_{blocks}^{\mathit {1}^{st}}}{2} + \frac {W_{blocks}^{\mathit {2}^{nd}}}{2} = \frac {.2125}{2} + \frac {.0912}{2} = .1518\) and the value for reading was \(W_{reading} = \frac {W_{reading}^{\mathit {1}^{st}}}{2} + \frac {W_{reading}^{\mathit {2}^{nd}}}{2} = \frac {.2046}{2} + \frac {.1016}{2} = .1531\). This resulted in \(W_{reading} > W_{blocks}\), indicating that reading generally dominated blocks.
The general dominance statistics computed for the traditional, grouped, and within-group methods are reported in Table 4.
| IV group | IV | Traditional | Grouped | Within-group |
|---|---|---|---|---|
| \(\Gamma _{spatial}\) | picture | .0952 | .2463 | .0945 |
| | blocks | .1438 | | .1518 |
| \(\Gamma _{verbal}\) | reading | .1556 | .2497 | .1531 |
| | vocab | .1014 | | .0966 |
The results in Table 4 showed that, in addition to reading having generally dominated blocks, vocab generally dominated picture. Hence, all pairs of IVs were ranked with their respective dominance designations indicating differences in strength of evidence. Specifically, these results indicated that \(reading\) was the most important IV, followed by \(blocks\), then \(vocab\), and finally \(picture\).
Recall again that the within-group general dominance statistics are tantamount to Owen values. Thus, the within-group general dominance statistics for members of an IV group summed to the grouped general dominance statistic for their IV group. The results in Table 4 also illustrated this idea as, for instance, \(C_{\Gamma _{spatial}} = W_{picture} + W_{blocks}\) or \(.2463 = .0945 + .1518\).
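The additivity property can be checked mechanically against the reported values. The following is a minimal sketch in Python (not the domir implementation); the helper `sums_to_group` and the tolerance are assumptions introduced here, with the tolerance chosen to allow for the four-decimal rounding of the published statistics.

```python
import math

# IV group membership from the analytic example.
groups = {"spatial": ["picture", "blocks"], "verbal": ["reading", "vocab"]}

# Within-group and grouped general dominance statistics (Table 4).
within_general = {"picture": 0.0945, "blocks": 0.1518,
                  "reading": 0.1531, "vocab": 0.0966}
grouped_general = {"spatial": 0.2463, "verbal": 0.2497}

# Within-group and grouped conditional dominance statistics for the
# spatial group (1st and 2nd positions), as quoted from Table 3.
within_conditional = {"picture": [0.1255, 0.0634], "blocks": [0.2125, 0.0912]}
grouped_conditional_spatial = [0.3380, 0.1546]

TOL = 5e-4  # assumed tolerance for values rounded to four decimals


def sums_to_group(member_values, group_value, tol=TOL):
    """True when the member statistics add up to the group statistic."""
    return math.isclose(sum(member_values), group_value, abs_tol=tol)


# General dominance: members of each IV group sum to the group value.
general_ok = {
    name: sums_to_group([within_general[iv] for iv in members],
                        grouped_general[name])
    for name, members in groups.items()
}

# Conditional dominance: the same property holds position by position.
conditional_ok = [
    sums_to_group([within_conditional[iv][pos] for iv in groups["spatial"]],
                  grouped_conditional_spatial[pos])
    for pos in range(2)
]

print(general_ok, conditional_ok)
```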
In this work, I discussed the link between DA and Shapley values focusing specifically on how DA/Shapley values allow for determining importance with IVs in a statistical model like linear regression. I also outlined how the traditional DA/Shapley value methodology can result in a large number of combinations of subsets of IVs to produce dominance determinations and how the grouped IV DA methodology can reduce the number of subsets of IVs. I then introduced Owen values as a method for decomposing the grouped DA/Shapley values. Extending from Owen values, I devised within-group conditional and within-group general dominance statistics that use IV grouping information to eliminate IV subset combinations yet allow for importance determinations between individual IVs.
Following the definition of the within-group DA methodology, I also provided an analytic example based on the ability.cov dataset in the R statistical computing environment that illustrates how the within-group DA methodology compares to the traditional and grouped methodologies. The proposed within-group DA methodology is intended to be a useful tool for practicing researchers who are seeking importance determinations for IVs when there are IV groups that could be formed from the IVs. The sections below elaborate further on within-group DA, recommendations on how to apply the methodology, and discuss future directions for this line of research.
The primary advantage of using the within-group DA methodology is that it can improve the efficiency of DA by obtaining relative importance determinations between IVs while eliminating a, sometimes substantial, number of IV subsets. Recall that for the analytic example with four IVs, the required number of subsets of IVs to estimate all dominance statistics was \(16\) using traditional DA. The within-group version of DA needed fewer IV subsets because any IV subset in which more than one IV group was incomplete (i.e., only partially included) was eliminated. This resulted in the need for only \(12\) subsets of IVs for the within-group method, eliminating 25% of the IV subsets needed for the traditional approach.
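The elimination rule can be illustrated by brute-force enumeration. The sketch below is written in Python for illustration only and is not the domir implementation; the function name `valid_within_group_subsets` is hypothetical, and it assumes that an "incomplete" IV group is one with some, but not all, of its IVs in the subset.

```python
from itertools import combinations


def valid_within_group_subsets(groups):
    """Return the IV subsets in which at most one IV group is partially included."""
    ivs = [iv for members in groups.values() for iv in members]
    valid = []
    for size in range(len(ivs) + 1):
        for subset in combinations(ivs, size):
            chosen = set(subset)
            # Count groups that are neither fully absent nor fully present.
            n_partial = sum(
                1 for members in groups.values()
                if 0 < len(chosen & set(members)) < len(members)
            )
            if n_partial <= 1:
                valid.append(subset)
    return valid


groups = {"spatial": ["picture", "blocks"], "verbal": ["reading", "vocab"]}
subsets = valid_within_group_subsets(groups)

print(len(subsets), "of", 2 ** 4, "subsets retained")
for subset in subsets:
    print(subset)
```

Under this rule, the four eliminated subsets are exactly those that pair a single spatial IV with a single verbal IV, such as picture with reading.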
To determine the number of IV subsets that will be required using the within-group DA methodology, first recall that, for traditional DA, the number of IV subsets required is \(2^{|\mathbf {X}|}\) or all combinations of the IVs. Within-group DA is, however, more similar to grouped DA for its IV subset requirements. Grouped DA requires \(2^{|\mathbf {G}|}\) IV group subsets. Within-group DA expands on the IV group subsets by including all combinations of IV subsets within each IV group. The number of required IV subsets for within-group DA is given in Equation 11:
\begin{equation} 2^{|\mathbf{G}|} + \sum^{|\mathbf{G}|}_{l=1} 2^{|\mathbf{G}| - 1} \cdot (2^{|\Gamma_{l}|} - 2) \label{submod_num} \end{equation}
Note that the \(2^{|\mathbf {G}| - 1}\) term will include all combinations of the \(\mathbf {G}\) IV groups not including IV group \(\Gamma _l\). In addition, the \(2^{|\Gamma _{l}|} - 2\) term subtracts the two IV subsets where all and none of the IVs in IV group \(\Gamma _l\) are included as those IV subsets are included in the \(2^{|\mathbf {G}|}\) term.
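Equation 11 translates directly into a short function. The following Python sketch is illustrative rather than the domir implementation; the function names are hypothetical, and each IV group is represented only by its size \(|\Gamma _l|\).

```python
def count_traditional_subsets(n_ivs):
    """All combinations of the IVs: 2^|X|."""
    return 2 ** n_ivs


def count_within_group_subsets(group_sizes):
    """Equation 11: 2^|G| + sum over groups of 2^(|G|-1) * (2^|Gamma_l| - 2)."""
    n_groups = len(group_sizes)
    group_level = 2 ** n_groups
    within_level = sum(
        2 ** (n_groups - 1) * (2 ** size - 2) for size in group_sizes
    )
    return group_level + within_level


# Analytic example: four IVs in two groups of two.
example_sizes = [2, 2]
n_traditional = count_traditional_subsets(sum(example_sizes))
n_within = count_within_group_subsets(example_sizes)
print(n_traditional, n_within, 1 - n_within / n_traditional)
```

For the analytic example this gives 16 traditional subsets and 12 within-group subsets, a reduction of 25%.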
Excluding IV subsets by applying within-group DA could allow for conducting DA with a much larger set of IVs than would be computationally feasible with the traditional methodology. Traditional DA, even with modern computing power, can require a considerable time investment and computational resources to analyze 20 IVs or more in a model. Twenty IVs in a traditional DA would require \(2^{20} = 1,\!048,\!576\) IV subsets to get dominance statistics and designations. Grouping reduces this requirement sharply. For instance, 10 IVs arranged as two groups of three and one group of four require \(2^3 + 2^{3-1} \cdot (2^{3} - 2) + 2^{3-1} \cdot (2^{3} - 2) + 2^{3-1} \cdot (2^{4} - 2) = 112\) IV subsets, which is about 11% of the \(2^{10} = 1,\!024\) IV subsets required by the traditional methodology. Thus, within-group DA can be structured such that the number of required IV subsets is far lower than that of traditional DA.
That within-group DA can substantially reduce the number of IV subsets required for determining the relative importance of IVs is an important practical advantage of this methodology and could help to buffer against one of the biggest limitations of the DA methodology: that the method tends to become computationally infeasible with more IVs (e.g., Johnson & LeBreton, 2004).
A complication for practicing researchers in applying the within-group DA method could be in ascertaining how to group IVs. My recommendation for grouping IVs is to do so using conceptual categories when possible. The use of conceptual categories is advantageous for grouping IVs as they will ensure that IVs which are more strongly related conceptually are nearer one another in terms of inclusion precedence sequences and IV subsets which more strongly affects how the DA statistics separate out the variance they explain in the dependent variable. Indeed, this was the approach used for creating the \(\Gamma _{spatial}\) and \(\Gamma _{verbal}\) IV groups in the analytic example.
When it is not possible to group IVs using conceptual categories, grouping IVs based on their shared variance is a useful and practical alternative. I recommend this approach as IVs that are more strongly correlated affect variance partitioning for one another more than less correlated IVs. Moreover, IVs in the same IV group will be nearer one another in inclusion precedence sequences and IV subsets, which more strongly affects how the variance they explain in the dependent variable is ascribed. Hence, putting IVs that are more strongly correlated into the same group uses the most critical information about IV overlap to partition the \(R^2\). I therefore recommend that, in the absence of conceptual categories, IVs that are more strongly correlated with one another be placed into an IV group together. It is also worth noting that the correlation method would have led to grouping picture with blocks and reading with vocab even if they were not as strongly aligned conceptually (see the results in Table 1).
An interesting future direction for research on within-group DA would be to offer researchers alternatives when no conceptual or correlation-based IV grouping is reasonable. One possible direction for exploration is to examine how randomly assigning IVs to IV groups might affect the conclusions reached by within-group DA compared to traditional DA. A random assignment strategy might be useful in cases where there are many IVs and no clear patterns of interrelationships between the IVs. In such cases, it would be sensible to evaluate more than one random assignment to ensure that some IV assignments, by chance, do not eliminate IV subsets that mask crucial importance results.
As an example, consider a model with 30 IVs and no good conceptual groupings for the IVs. This model produces an astronomical 1.07 billion subsets for the traditional DA methodology. By contrast, when grouping the IVs into six groups of five IVs, this set of 30 IVs produces a much more reasonable 5,824 subsets. Given the smaller number of subsets, the researcher could choose multiple random assignments of IVs to IV groups and evaluate how the different random IV group assignments affect dominance designation results. For instance, the researcher could choose 30 different random groupings of the IVs as a test to ensure that the way in which the IVs are put into groups does not affect the conclusions. This set of 30 IV groupings would result in \(5,\!824 \cdot 30 = 174,\!720\) subsets which is still far fewer than would be needed for the traditional DA methodology yet would be similar to it in that no predetermined IV conceptual groupings would be necessary.
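The random assignment strategy could be sketched as follows. This is a Python illustration under stated assumptions, not a procedure from the domir package: the IV names `x1` to `x30`, the helper `random_grouping`, and the seed are all hypothetical, and the subset count reuses Equation 11.

```python
import random


def count_within_group_subsets(group_sizes):
    """Equation 11: number of IV subsets needed by within-group DA."""
    n_groups = len(group_sizes)
    return 2 ** n_groups + sum(
        2 ** (n_groups - 1) * (2 ** size - 2) for size in group_sizes
    )


def random_grouping(ivs, n_groups, rng):
    """Shuffle the IVs and deal them into n_groups groups of near-equal size."""
    shuffled = list(ivs)
    rng.shuffle(shuffled)
    return [shuffled[start::n_groups] for start in range(n_groups)]


ivs = [f"x{i}" for i in range(1, 31)]  # 30 hypothetical IVs
rng = random.Random(2024)              # arbitrary seed for reproducibility

# Thirty random assignments of the IVs to six groups of five.
groupings = [random_grouping(ivs, 6, rng) for _ in range(30)]

total_subsets = sum(
    count_within_group_subsets([len(group) for group in grouping])
    for grouping in groupings
)
print(total_subsets, "subsets across", len(groupings), "random groupings")
print(2 ** len(ivs), "subsets for traditional DA")
```

Each grouping would then be analyzed separately, and the dominance designations compared across groupings to see whether any conclusion depends on a particular assignment.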
At present, the within-group DA methodology allows for IVs to be grouped together, which affects the number of valid IV subsets. It is conceivable that further grouping would be possible such that there are IV subgroups within an IV group that function in a way similar to how the IV group works in the context of the other IV groups. IV subgroups within an IV group would also eliminate IV subsets among the members of an IV group in a way similar to how IV groups eliminate IV subsets overall.
For example, a researcher with eight IVs would need 256 (i.e., \(2^8 = 256\)) IV subsets for the traditional DA method. If this researcher grouped the eight IVs into two groups of four, the within-group method would require 60 (i.e., \(2^2 + 2 \cdot 2^{2-1} \cdot (2^4 - 2)\)) IV subsets. Consider now that this researcher further grouped their IVs into subgroups of size two within each IV group. This would require 44 (i.e., \(2^2 + 2 \cdot 2^{2-1} \cdot (2^2 - 2) + 4 \cdot 2^{2-1} \cdot 2^{2-1} \cdot (2^2 - 2)\)) IV subsets. The IV subgrouping would then require around 6 times fewer (i.e., \(\frac {256}{44}\)) IV subsets than the traditional methodology. The IV subgrouping would also reduce the number of subsets required by around 27% (i.e., \(1 - \frac {44}{60}\)) compared to the single level of IV grouping.
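These counts can be checked by enumeration. Because IV subgrouping is only a proposed extension, the validity rule in the Python sketch below is an assumption made for illustration: at most one IV group may be partially included, and within that group at most one subgroup may be partially included. The function names and IV labels are hypothetical.

```python
from itertools import combinations


def inclusion_state(chosen, members):
    """Classify a set of IVs as absent from, partially in, or complete in a subset."""
    n_in = len(chosen & members)
    if n_in == 0:
        return "absent"
    return "complete" if n_in == len(members) else "partial"


def is_valid(chosen, groups):
    """Assumed rule: one partial group at most, with one partial subgroup at most."""
    n_partial_groups = 0
    for subgroups in groups:
        members = set().union(*subgroups)
        if inclusion_state(chosen, members) == "partial":
            n_partial_groups += 1
            n_partial_subgroups = sum(
                inclusion_state(chosen, subgroup) == "partial"
                for subgroup in subgroups
            )
            if n_partial_subgroups > 1:
                return False
    return n_partial_groups <= 1


def count_valid_subsets(groups):
    """Count the IV subsets that satisfy the assumed nesting rule."""
    ivs = sorted(iv for subgroups in groups for subgroup in subgroups for iv in subgroup)
    return sum(
        is_valid(set(subset), groups)
        for size in range(len(ivs) + 1)
        for subset in combinations(ivs, size)
    )


# Two IV groups of four, each split into two subgroups of two.
nested = [[{"a1", "a2"}, {"a3", "a4"}], [{"b1", "b2"}, {"b3", "b4"}]]
# The same IV groups without subgrouping (one subgroup spanning each group).
single_level = [[{"a1", "a2", "a3", "a4"}], [{"b1", "b2", "b3", "b4"}]]

n_nested = count_valid_subsets(nested)
n_single = count_valid_subsets(single_level)
print(n_nested, n_single, 2 ** 8)
```

Under this rule the enumeration agrees with the closed-form counts given above for both the single-level and the subgrouped designs.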
It is also conceivable that a researcher would want an IV or IV group to be constrained such that it always precedes or succeeds one or more counterpart IVs or IV groups. It would not be necessary in these cases that the focal IV or IV group immediately precede or succeed the counterparts in the sense that they must be contiguous. Rather, the constraint described here would merely eliminate all IV subsets which imply that the focal IV or IV group is somewhere before (when it must succeed) or somewhere after (when it must precede) the counterparts it is constrained to, or not to, follow. For example, a researcher might want to require that an interaction term always succeeds its constituent IVs to produce a valid result. A methodology such as this could be a useful alternative to the current best practice in the DA literature for the relative importance analysis of interactions, which involves the residualization of the constituent IVs (LeBreton, Tonidandel, & Krasikova, 2013).
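One possible reading of such a precedence constraint, in terms of IV subsets, is that a subset is retained only if every IV the focal IV must succeed is already present whenever the focal IV is present. The Python sketch below illustrates that reading only; it is an assumption about a proposed future extension, and the function name, the `must_follow` argument, and the IV labels are hypothetical.

```python
from itertools import combinations


def constrained_subsets(ivs, must_follow):
    """Keep subsets where each focal IV appears only alongside the IVs it must succeed.

    must_follow maps a focal IV to the list of IVs that must precede it.
    """
    valid = []
    for size in range(len(ivs) + 1):
        for subset in combinations(ivs, size):
            chosen = set(subset)
            # A subset holding the focal IV without a required IV implies
            # the focal IV entered the model first, so it is eliminated.
            satisfied = all(
                set(required) <= chosen
                for focal, required in must_follow.items()
                if focal in chosen
            )
            if satisfied:
                valid.append(subset)
    return valid


ivs = ["x1", "x2", "x1:x2"]
retained = constrained_subsets(ivs, {"x1:x2": ["x1", "x2"]})

print(len(retained), "of", 2 ** len(ivs), "subsets retained")
for subset in retained:
    print(subset)
```

In this small case the interaction term appears only in the full model, so its contribution is always evaluated after both constituent IVs have been included.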
Best practices in the literature suggest bootstrapping dominance designations to understand the impact of sampling variability on their reproducibility (Azen & Budescu, 2003). An additional advantage of using the within-group methodology is the improved computational feasibility for obtaining bootstrapped estimates of DA designation reproducibility for general and conditional dominance designations.
For example, estimating reproducibility from a model with six IVs and 100 bootstrap replications for a traditional DA would require \(2^6 \cdot 100 = 6,\!400\) or 64 subsets over 100 bootstrap samples to be estimated in total. If the six IVs were grouped into two groups of three, the number of models is reduced to \((2^2 + 2 \cdot 2^{2-1} \cdot [2^3 - 2]) \cdot 100 = 2,\!800\) or 28 subsets with 100 bootstrap samples—less than half of the number required for the traditional methodology.
The within-group DA methodology developed in this work extends traditional DA by discussing its foundation in Shapley values and by devising the within-group method such that it derives from Owen values, a similar methodology that accommodates player unions yet still produces payoff estimates for individual players. The within-group DA method is valuable to research practice as it improves the computational feasibility of DA as the number of IVs in a model increases and only requires the researcher to generate mutually exclusive groups of IVs in their model.
Before concluding, I note that traditional DA remains a valuable tool for the evaluation of relative importance with statistical models where the number of IVs is relatively small or the researcher cannot group IVs. I also note that the methodology discussed in this manuscript will be implemented using the domir function in the package domir (Luchman, 2024) available in the R statistical computing environment which also includes methods to compute the traditional and grouped methodologies.
Antal, D. (2025). dataset: Create data frames for exchange and reuse [Computer software manual]. (R package version 0.4.1) doi: https://doi.org/10.32614/CRAN.package.dataset
Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8(2), 129–148. doi: https://doi.org/10.1037/1082-989X.8.2.129
Bittmann, F. (2024). A primer on dominance analysis. doi: https://doi.org/10.20944/preprints202404.1606.v1
Budescu, D. V. (1993). Dominance analysis: A new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin, 114(3), 542–551. doi: https://doi.org/10.1037/0033-2909.114.3.542
Budescu, D. V., & Azen, R. (2004). Beyond global measures of relative importance: Some insights from dominance analysis. Organizational Research Methods, 7(3), 341–350. doi: https://doi.org/10.1177/1094428104267049
Grömping, U. (2007). Estimators of relative importance in linear regression based on variance decomposition. The American Statistician, 61(2), 139–147. doi: https://doi.org/10.1198/000313007X188252
Gu, X. (2023). Evaluating predictors’ relative importance using Bayes factors in regression models. Psychological Methods, 28(4), 825–842. doi: https://doi.org/10.1037/met0000431
Johnson, J. W., & LeBreton, J. M. (2004). History and use of relative importance indices in organizational research. Organizational Research Methods, 7(3), 238–257. doi: https://doi.org/10.1177/1094428104266510
Kruskal, W. (1987). Relative importance by averaging over orderings. The American Statistician, 41(1), 6–10. doi: https://doi.org/10.1080/00031305.1987.10475432
LeBreton, J. M., Tonidandel, S., & Krasikova, D. V. (2013). Residualized relative importance analysis: A technique for the comprehensive decomposition of variance in higher order regression models. Organizational Research Methods, 16(3), 449–473. doi: https://doi.org/10.1177/1094428113481065
Luchman, J. N. (2021). Determining relative importance in Stata using dominance analysis: domin and domme. The Stata Journal, 21(2), 510–538. doi: https://doi.org/10.1177/1536867X211025837
Luchman, J. N. (2024). domir: Tools to support relative importance analysis [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=domir (R package version 1.2.0, https://jluchman.github.io/domir/)
McLaurin, F. A., West, S. J., & Thomson, N. D. (2025). Exploring the relationship between facets of childhood trauma and violent injury risk during adulthood: A dominance analysis study. Child Abuse & Neglect, 161, 107307. doi: https://doi.org/10.1016/j.chiabu.2025.107307
Miller, B. K., Kirby, E. G., & Stevens, K. B. (2025). Dominance analysis of bright and dark dispositional predictors of socially desirable responding. Psychological Reports, 128(6), 4799–4819. doi: https://doi.org/10.1177/00332941241226908
Owen, G. (1977). Values of games with a priori unions. In Essays in mathematical economics and game theory (pp. 76–88). Springer.
Shapley, L. S. (1953). A value for n-person games. In Contributions to the theory of games II (pp. 307–317). Princeton University Press.
Thomas, D. R., Zumbo, B. D., Kwan, E., & Schweitzer, L. (2014). On Johnson’s (2000) relative weights method for assessing variable importance: A reanalysis. Multivariate Behavioral Research, 49(4), 329–338. doi: https://doi.org/10.1080/00273171.2014.905766
Tonidandel, S., & LeBreton, J. M. (2011). Relative importance analysis: A useful supplement to regression analysis. Journal of Business and Psychology, 26(1), 1–9.
Yin, K., & Zhou, L. (2025). The relative importance of peace of mind, grit, and classroom environment in predicting willingness to communicate among learners in multi-ethnic regions: a latent dominance analysis. BMC Psychology, 13(1), 1–17. doi: https://doi.org/10.1186/s40359-025-02676-2