Phi Correlation Coefficient of Association Between Paired Binary Variables
phi_coef.Rd
The phi correlation coefficient (also called the mean square contingency coefficient, and denoted by \(\phi\) or \(r_\phi\)) is a measure of association between two naturally dichotomous variables.
Details
For a two-by-two contingency table
$$\begin{array}{cc} n_{11} & n_{12} \\ n_{21} & n_{22} \end{array}$$
the \(\phi\) correlation coefficient is given by:
$$\displaystyle \phi = \frac{n_{11}n_{22} - n_{12}n_{21}} {\sqrt{(n_{11} + n_{21})(n_{12} + n_{22})(n_{11} + n_{12})(n_{21} + n_{22})}}$$
or equivalently, the determinant of the table (viewed as a 2 × 2 matrix) divided by the (principal) square root of the product of its four marginal sums, that is, its two row totals and two column totals.
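The formula can be checked by hand in base R. The following is a minimal sketch, not the package's implementation: `phi_manual()` is a hypothetical helper introduced here for illustration only, and the input is assumed to be a numeric 2 × 2 matrix of counts with non-zero marginal sums.

```r
# Hypothetical helper for illustration; NOT the package's phi_coef()
phi_manual <- function(tab) {
  stopifnot(is.matrix(tab), all(dim(tab) == c(2, 2)))
  # Numerator: n11 * n22 - n12 * n21 (the determinant of the table)
  numerator <- tab[1, 1] * tab[2, 2] - tab[1, 2] * tab[2, 1]
  # Denominator: square root of the product of the four marginal sums
  denominator <- sqrt(prod(rowSums(tab), colSums(tab)))
  numerator / denominator
}

# Counts filled column-wise: n11 = 6, n21 = 1, n12 = 2, n22 = 3
tab <- matrix(c(6, 1, 2, 3), nrow = 2)
phi_manual(tab)
#> [1] 0.4780914

# The determinant form gives the same value
all.equal(phi_manual(tab), det(tab) / sqrt(prod(rowSums(tab), colSums(tab))))
#> [1] TRUE
```

Here the numerator is 6 × 3 − 2 × 1 = 16 and the marginal sums are 8, 4, 7 and 5, so the result is 16 / √1120.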
References
Yule, G.U. (1912). On the Methods of Measuring Association Between Two Attributes. Journal of the Royal Statistical Society, 75(6): 579–652. doi:10.2307/2340126.
See also
Other correl_coef: cor_coef.test(), phi_coef.test()
Examples
## Example from Wikipedia
twobytwo <- matrix(c(6, 1, 2, 3), nrow = 2,
  dimnames = rep(list(c("Cat", "Dog")), 2) |> setNames(c("Actual", "Predicted")))
addmargins(twobytwo)
#>       Predicted
#> Actual Cat Dog Sum
#>   Cat    6   2   8
#>   Dog    1   3   4
#>   Sum    7   5  12
phi_coef(twobytwo)
#> [1] 0.4780914
## Example from Statology
twobytwo <- matrix(c(4, 8, 9, 4), nrow = 2,
  dimnames = list(Gender = c("Male", "Female"), Party = c("Dem", "Rep")))
addmargins(twobytwo)
#>         Party
#> Gender   Dem Rep Sum
#>   Male     4   9  13
#>   Female   8   4  12
#>   Sum     12  13  25
phi_coef(twobytwo)
#> [1] -0.3589744
rm(twobytwo)
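As a cross-check, \(\phi\) for a 2 × 2 table equals the Pearson correlation of the two variables coded as 0/1. The sketch below uses only base R and rebuilds the Statology counts under the illustrative name `counts`; the 0/1 coding (first row and first column coded 0) is an assumption made here, and reversing the coding of one variable flips the sign.

```r
# Same counts as the Statology example, rebuilt so the block is self-contained
counts <- matrix(c(4, 8, 9, 4), nrow = 2,
  dimnames = list(Gender = c("Male", "Female"), Party = c("Dem", "Rep")))

# Expand the table to one 0/1 pair per observation
# (row 1 -> 0, row 2 -> 1; column 1 -> 0, column 2 -> 1)
row_code <- rep(as.vector(row(counts)) - 1, times = as.vector(counts))
col_code <- rep(as.vector(col(counts)) - 1, times = as.vector(counts))

# Pearson correlation of the coded pairs
cor(row_code, col_code)
#> [1] -0.3589744

rm(counts, row_code, col_code)
```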