Explanatory Factors in Data as List of Expressions
expl_fcts.Rd
Create a list of defused expressions representing the names of all or a selection of explanatory factors or character vectors in a dataset.
Usage
expl_fcts(
  .data,
  ...,
  .named = FALSE,
  .val = c("syms", "data_syms", "character")
)
Arguments
- .data
a data frame, or a data frame extension (e.g. a tibble).
- ...
<tidy-select> quoted name(s) of one or more factors or character vectors in .data, to be included in (or excluded from) the output.
- .named
logical, whether to name the elements of the list. If TRUE, unnamed inputs are automatically named with set_names(); default FALSE.
- .val
the type of output required. The default "syms" returns a list of symbols; the alternative "data_syms" returns a list of symbols prefixed with the .data pronoun. The "character" option returns a character vector.
Value
A list of symbols representing the names of selected explanatory factors or character vectors in .data; unless .val = "data_syms", in which case the symbols are prefixed with the .data pronoun, or .val = "character", whereupon the selected names are returned as a character vector instead.
Details
By default, expl_fcts() creates a list of symbols, i.e., defused R expressions, representing the names of all or a selection of explanatory factors (or character vectors) in .data, using syms() from package rlang. Alternatively, if .val = "data_syms", a list of symbols prefixed with the .data pronoun is returned instead. Finally, if .val = "character", expl_fcts() returns a character vector of the names of the explanatory factors (or character vectors) in .data.
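The three output types can be illustrated with rlang alone. The following is a minimal sketch rather than the internals of expl_fcts(): the column names are hypothetical, and the use of data_syms() for the .data-prefixed form is an assumption about how such output could be produced.

```r
library(rlang)

# Hypothetical column names, standing in for the factors found in a dataset
nms <- c("iv", "iv2")

# .val = "syms": a list of symbols (defused expressions), here iv and iv2
as_syms <- syms(nms)

# .val = "data_syms": a list of calls of the form .data$iv and .data$iv2
as_data_syms <- data_syms(nms)

# .val = "character": just the names themselves
as_chr <- nms

# .named = TRUE: the same elements, named after the variables they represent
named_syms <- set_names(as_syms, nms)
```

A symbol is only a name; nothing is looked up in the data until the symbol is injected into a function call and evaluated there.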
Variables in .data may be selected for inclusion or exclusion using the ... argument and the <tidy-select> syntax from package dplyr, including use of “selection helpers”. If no ... arguments are supplied, all categorical variables in .data will be included in the list.
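The selection behaviour described above can be approximated with dplyr directly. This is a sketch under stated assumptions: the data frame df and the helper is_categorical() are hypothetical, and treating "categorical" as "factor or character" follows the description here rather than the package source.

```r
library(dplyr)

# Hypothetical toy data: two factors, one character vector, one integer response
df <- data.frame(
  f1 = factor(c("a", "b", "a")),
  f2 = factor(c("x", "x", "y")),
  ch = c("p", "q", "r"),
  y  = c(0L, 1L, 1L)
)

# Hypothetical predicate: a variable is categorical if factor or character
is_categorical <- function(x) is.factor(x) || is.character(x)

# Keep only the categorical columns
cats <- df |> select(where(is_categorical))

# No selection: all categorical variables
all_cat <- names(cats)

# Inclusion by name, exclusion by negation, and a selection helper
included <- cats |> select(f1, ch) |> names()
excluded <- cats |> select(!c(f1, ch)) |> names()
helper   <- cats |> select(starts_with("f")) |> names()
```

Restricting to categorical columns first means that a negated selection such as !c(f1, ch) cannot bring the numeric response y back in.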
A list of symbols returned by expl_fcts() may be “injected” into the ... arguments of contingency_table(), xcontingency_table(), binom_contingency() and other similar functions, using the splice-operator !!!. If .val = "character", the functions all_of() or any_of() should be used to wrap the resulting character vector of names instead of using !!!. A list of symbols returned by expl_fcts() may also be used to provide a list argument with injection support to lapply() (or purrr package map() functions), using the injection-operator !! (see examples).
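The injection patterns described above are not specific to this package. As a rough illustration, the sketch below uses dplyr's select() and count() as stand-ins for the contingency table functions; the data frame df and the names fct_chr and fct_syms are hypothetical.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data with two factors and an integer response
df <- data.frame(
  f1 = factor(c("a", "b", "a", "b")),
  f2 = factor(c("x", "x", "y", "y")),
  y  = c(0L, 1L, 1L, 0L)
)

fct_chr  <- c("f1", "f2")  # names as a character vector
fct_syms <- syms(fct_chr)  # the same names as a list of symbols

# Splice the whole list of symbols into ... with !!!
by_splice <- df |> select(!!!fct_syms)

# With a character vector, wrap in all_of() instead of splicing
by_all_of <- df |> select(all_of(fct_chr))

# Inject one symbol at a time with !! inside lapply(); naming the list
# first gives a named list of results, one per factor
per_factor <- set_names(fct_syms, fct_chr) |>
  lapply(\(x) count(df, !!x))
```

Both select() calls should give the same two columns as df |> select(f1, f2), and per_factor holds one count table per factor.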
See also
!!, !!!, all_of, any_of, set_names(), defused R expressions, map() and symbol.
Other contingency_table: binom_contingency(), contingency_table()
Examples
(d <- list(
    iv2 = list(g = c("a", "c", "e"), h = c("b", "d", "f")),
    iv3 = list(i = c("a", "b", "c"), j = c("d", "e", "f")),
    iv4 = list(k = c("a", "b"), l = c("c", "d"), m = c("e", "f"))
) |> add_grps(bernoulli_data(levels = 6), iv, .key = _))
#> ___________________________
#> Simulated Bernoulli Data: -
#>
#> # A tibble: 396 × 5
#> iv iv2 iv3 iv4 dv
#> <fct> <fct> <fct> <fct> <int>
#> 1 a g i k 0
#> 2 a g i k 0
#> 3 a g i k 0
#> 4 a g i k 0
#> 5 a g i k 0
#> 6 a g i k 0
#> 7 a g i k 1
#> 8 a g i k 1
#> 9 a g i k 0
#> 10 a g i k 1
#> # ℹ 386 more rows
d |> expl_fcts()
#> [[1]]
#> iv
#>
#> [[2]]
#> iv2
#>
#> [[3]]
#> iv3
#>
#> [[4]]
#> iv4
#>
d |> expl_fcts(.named = TRUE)
#> $iv
#> iv
#>
#> $iv2
#> iv2
#>
#> $iv3
#> iv3
#>
#> $iv4
#> iv4
#>
d |> expl_fcts(.val = "data_syms")
#> [[1]]
#> .data$iv
#>
#> [[2]]
#> .data$iv2
#>
#> [[3]]
#> .data$iv3
#>
#> [[4]]
#> .data$iv4
#>
d |> expl_fcts(.named = TRUE, .val = "data_syms")
#> $iv
#> .data$iv
#>
#> $iv2
#> .data$iv2
#>
#> $iv3
#> .data$iv3
#>
#> $iv4
#> .data$iv4
#>
d |> expl_fcts(.val = "character")
#> [1] "iv" "iv2" "iv3" "iv4"
d |> expl_fcts(.named = TRUE, .val = "character")
#> iv iv2 iv3 iv4
#> "iv" "iv2" "iv3" "iv4"
## Select or exclude factors
d |> expl_fcts(iv, iv3)
#> [[1]]
#> iv
#>
#> [[2]]
#> iv3
#>
d |> expl_fcts(!c(iv, iv3))
#> [[1]]
#> iv2
#>
#> [[2]]
#> iv4
#>
## Use {dplyr} selection helpers e.g., last_col(), num_range() and starts_with()
d |> expl_fcts(last_col(1L)) ## Offset of 1L used, since last column of d is dv
#> [[1]]
#> iv4
#>
d |> expl_fcts(!last_col())
#> [[1]]
#> iv
#>
#> [[2]]
#> iv2
#>
#> [[3]]
#> iv3
#>
#> [[4]]
#> iv4
#>
d |> expl_fcts(num_range("iv", 2:3))
#> [[1]]
#> iv2
#>
#> [[2]]
#> iv3
#>
d |> expl_fcts(!num_range("iv", 2:3))
#> [[1]]
#> iv
#>
#> [[2]]
#> iv4
#>
d |> expl_fcts(starts_with("iv"))
#> [[1]]
#> iv
#>
#> [[2]]
#> iv2
#>
#> [[3]]
#> iv3
#>
#> [[4]]
#> iv4
#>
## Negation of selection helper excludes all explanatory factors
d |> expl_fcts(!starts_with("iv"))
#> list()
## In each of the following three examples, the three calls should give identical results
## Include all explanatory factors
d |> binom_contingency(dv)
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 6
#> iv iv2 iv3 iv4 pn qn
#> * <fct> <fct> <fct> <fct> <int> <int>
#> 1 a g i k 30 36
#> 2 b h i k 29 37
#> 3 c g i l 24 42
#> 4 d h j l 18 48
#> 5 e g j m 10 56
#> 6 f h j m 5 61
d |> binom_contingency(dv, !!!expl_fcts(d))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 6
#> iv iv2 iv3 iv4 pn qn
#> * <fct> <fct> <fct> <fct> <int> <int>
#> 1 a g i k 30 36
#> 2 b h i k 29 37
#> 3 c g i l 24 42
#> 4 d h j l 18 48
#> 5 e g j m 10 56
#> 6 f h j m 5 61
d |> binom_contingency(dv, all_of(expl_fcts(d, .val = "character")))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 6
#> iv iv2 iv3 iv4 pn qn
#> * <fct> <fct> <fct> <fct> <int> <int>
#> 1 a g i k 30 36
#> 2 b h i k 29 37
#> 3 c g i l 24 42
#> 4 d h j l 18 48
#> 5 e g j m 10 56
#> 6 f h j m 5 61
## Include only iv and iv3
d |> binom_contingency(dv, iv, iv3)
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 4
#> iv iv3 pn qn
#> * <fct> <fct> <int> <int>
#> 1 a i 30 36
#> 2 b i 29 37
#> 3 c i 24 42
#> 4 d j 18 48
#> 5 e j 10 56
#> 6 f j 5 61
d |> binom_contingency(dv, !!!expl_fcts(d, iv, iv3))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 4
#> iv iv3 pn qn
#> * <fct> <fct> <int> <int>
#> 1 a i 30 36
#> 2 b i 29 37
#> 3 c i 24 42
#> 4 d j 18 48
#> 5 e j 10 56
#> 6 f j 5 61
d |> binom_contingency(dv, all_of(expl_fcts(d, iv, iv3, .val = "character")))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 4
#> iv iv3 pn qn
#> * <fct> <fct> <int> <int>
#> 1 a i 30 36
#> 2 b i 29 37
#> 3 c i 24 42
#> 4 d j 18 48
#> 5 e j 10 56
#> 6 f j 5 61
## Exclude iv and iv3
d |> binom_contingency(dv, !c(iv, iv3))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 4
#> iv2 iv4 pn qn
#> * <fct> <fct> <int> <int>
#> 1 g k 30 36
#> 2 h k 29 37
#> 3 g l 24 42
#> 4 h l 18 48
#> 5 g m 10 56
#> 6 h m 5 61
d |> binom_contingency(dv, !!!expl_fcts(d, !c(iv, iv3)))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 4
#> iv2 iv4 pn qn
#> * <fct> <fct> <int> <int>
#> 1 g k 30 36
#> 2 h k 29 37
#> 3 g l 24 42
#> 4 h l 18 48
#> 5 g m 10 56
#> 6 h m 5 61
d |> binom_contingency(dv, all_of(expl_fcts(d, !c(iv, iv3), .val = "character")))
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 4
#> iv2 iv4 pn qn
#> * <fct> <fct> <int> <int>
#> 1 g k 30 36
#> 2 h k 29 37
#> 3 g l 24 42
#> 4 h l 18 48
#> 5 g m 10 56
#> 6 h m 5 61
## Use with lapply, binom_contingency(), glm() and odds_ratio()
expl_fcts(d, .named = TRUE) |>
    lapply(\(x) binom_contingency(d, dv, !!x))
#> $iv
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 6 × 3
#> iv pn qn
#> * <fct> <int> <int>
#> 1 a 30 36
#> 2 b 29 37
#> 3 c 24 42
#> 4 d 18 48
#> 5 e 10 56
#> 6 f 5 61
#>
#> $iv2
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 2 × 3
#> iv2 pn qn
#> * <fct> <int> <int>
#> 1 g 64 134
#> 2 h 52 146
#>
#> $iv3
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 2 × 3
#> iv3 pn qn
#> * <fct> <int> <int>
#> 1 i 83 115
#> 2 j 33 165
#>
#> $iv4
#> _____________________________
#> Binomial Contingency Table: -
#>
#> # A tibble: 3 × 3
#> iv4 pn qn
#> * <fct> <int> <int>
#> 1 k 59 73
#> 2 l 42 90
#> 3 m 15 117
#>
expl_fcts(d, .named = TRUE) |>
    lapply(\(x)
        binom_contingency(d, dv, !!x) |>
            glm(cbind(pn, qn) ~ ., binomial, data = _)
    )
#> $iv
#>
#> Call: glm(formula = cbind(pn, qn) ~ ., family = binomial, data = binom_contingency(d,
#> dv, !!x))
#>
#> Coefficients:
#> (Intercept) ivb ivc ivd ive ivf
#> -0.1823 -0.0613 -0.3773 -0.7985 -1.5404 -2.3191
#>
#> Degrees of Freedom: 5 Total (i.e. Null); 0 Residual
#> Null Deviance: 42.07
#> Residual Deviance: 2.531e-14 AIC: 37.66
#>
#> $iv2
#>
#> Call: glm(formula = cbind(pn, qn) ~ ., family = binomial, data = binom_contingency(d,
#> dv, !!x))
#>
#> Coefficients:
#> (Intercept) iv2h
#> -0.7390 -0.2934
#>
#> Degrees of Freedom: 1 Total (i.e. Null); 0 Residual
#> Null Deviance: 1.758
#> Residual Deviance: -4.174e-14 AIC: 15.1
#>
#> $iv3
#>
#> Call: glm(formula = cbind(pn, qn) ~ ., family = binomial, data = binom_contingency(d,
#> dv, !!x))
#>
#> Coefficients:
#> (Intercept) iv3j
#> -0.3261 -1.2833
#>
#> Degrees of Freedom: 1 Total (i.e. Null); 0 Residual
#> Null Deviance: 31.25
#> Residual Deviance: 9.859e-14 AIC: 14.87
#>
#> $iv4
#>
#> Call: glm(formula = cbind(pn, qn) ~ ., family = binomial, data = binom_contingency(d,
#> dv, !!x))
#>
#> Coefficients:
#> (Intercept) iv4l iv4m
#> -0.2129 -0.5492 -1.8412
#>
#> Degrees of Freedom: 2 Total (i.e. Null); 0 Residual
#> Null Deviance: 38.86
#> Residual Deviance: -4.441e-15 AIC: 20.96
#>
expl_fcts(d, .named = TRUE) |>
    lapply(\(x)
        binom_contingency(d, dv, !!x, .drop_zero = TRUE) |>
            odds_ratio(.ind_var = !!x)
    )
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> $iv
#> ____________________________
#> Estimates and Odds Ratios: -
#>
#> # A tibble: 6 × 7
#> parameter estimate se p_val odds_ratio ci[,"2.5%"] [,"97.5%"] sig
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 (Intercept) -0.182 0.247 0.461 1 NA NA NS
#> 2 ivb -0.0613 0.350 0.861 0.941 0.472 1.87 NS
#> 3 ivc -0.377 0.356 0.289 0.686 0.339 1.37 NS
#> 4 ivd -0.799 0.371 0.0313 0.45 0.215 0.923 *
#> 5 ive -1.54 0.423 0.000271 0.214 0.0898 0.478 ***
#> 6 ivf -2.32 0.527 0.0000107 0.0984 0.0313 0.256 ***
#>
#> $iv2
#> ____________________________
#> Estimates and Odds Ratios: -
#>
#> # A tibble: 2 × 7
#> parameter estimate se p_val odds_ratio ci[,"2.5%"] [,"97.5%"] sig
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 (Intercept) -0.739 0.152 0.0000012 1 NA NA ***
#> 2 iv2h -0.293 0.222 0.186 0.746 0.482 1.15 NS
#>
#> $iv3
#> ____________________________
#> Estimates and Odds Ratios: -
#>
#> # A tibble: 2 × 7
#> parameter estimate se p_val odds_ratio ci[,"2.5%"] [,"97.5%"] sig
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 (Intercept) -0.326 0.144 0.0236 1 NA NA *
#> 2 iv3j -1.28 0.239 0.0000001 0.277 0.172 0.439 ***
#>
#> $iv4
#> ____________________________
#> Estimates and Odds Ratios: -
#>
#> # A tibble: 3 × 7
#> parameter estimate se p_val odds_ratio ci[,"2.5%"] [,"97.5%"] sig
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 (Intercept) -0.213 0.175 0.224 1 NA NA NS
#> 2 iv4l -0.549 0.256 0.0320 0.577 0.348 0.951 *
#> 3 iv4m -1.84 0.325 0 0.159 0.0814 0.293 ***
#>
rm(d)