library(psych)
library(tidyverse)
library(modelsummary) # for summarizing data
Coefficient \(\alpha\)
We’ll once again use the bfi
data set, this time with the Openness items
data(bfi, package = "psych")
head(bfi[c("O1", "O2", "O3", "O4", "O5")])
Descriptive Statistics
# Get means and variances
<- datasummary(O1 + O2 + O3 + O4 + O5 ~ Mean + Var, data = bfi,
mv output = "data.frame")
# Correlation, adding means and variances
datasummary_correlation(bfi[c("O1", "O2", "O3", "O4", "O5")],
add_columns = mv[-1])
O1 | O2 | O3 | O4 | O5 | Mean | Var | |
---|---|---|---|---|---|---|---|
O1 | 1 | . | . | . | . | 4.82 | 1.28 |
O2 | −.21 | 1 | . | . | . | 2.71 | 2.45 |
O3 | .40 | −.26 | 1 | . | . | 4.44 | 1.49 |
O4 | .18 | −.07 | .19 | 1 | . | 4.89 | 1.49 |
O5 | −.24 | .32 | −.31 | −.18 | 1 | 2.49 | 1.76 |
You can find that O2 and O5 have negative correlations with other items
\(\alpha\) Without Recoding
<- cov(bfi[c("O1", "O2", "O3", "O4", "O5")], use = "complete")
cov1 # Compute alpha by hand using formula
5 / (5 - 1) * (1 - sum(diag(cov1)) / sum(cov1))
[1] -0.1568749
# Using `psych::alpha()`; the estimate is labelled "raw_alpha"
::alpha(bfi[c("O1", "O2", "O3", "O4", "O5")]) psych
Warning in psych::alpha(bfi[c("O1", "O2", "O3", "O4", "O5")]): Some items were negatively correlated with the total scale and probably
should be reversed.
To do this, run the function again with the 'check.keys=TRUE' option
Some items ( O2 O5 ) were negatively correlated with the total scale and
probably should be reversed.
To do this, run the function again with the 'check.keys=TRUE' option
Reliability analysis
Call: psych::alpha(x = bfi[c("O1", "O2", "O3", "O4", "O5")])
raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
-0.13 -0.096 0.096 -0.018 -0.087 0.035 3.9 0.56 -0.12
95% confidence boundaries
lower alpha upper
Feldt -0.2 -0.13 -0.07
Duhachek -0.2 -0.13 -0.06
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
O1 -0.194 -0.234 -0.0273 -0.0498 -0.190 0.037 0.066 -0.1233
O2 -0.031 0.027 0.1619 0.0069 0.028 0.032 0.082 0.0001
O3 -0.096 -0.145 0.0093 -0.0326 -0.126 0.033 0.054 -0.1233
O4 -0.232 -0.239 0.0298 -0.0506 -0.193 0.039 0.103 -0.2261
O5 0.028 0.134 0.2166 0.0374 0.155 0.031 0.067 0.0551
Item statistics
n raw.r std.r r.cor r.drop mean sd
O1 2778 0.42 0.52 0.46 0.020 4.8 1.1
O2 2800 0.49 0.36 -0.11 -0.088 2.7 1.6
O3 2772 0.39 0.47 0.37 -0.054 4.4 1.2
O4 2786 0.48 0.52 0.28 0.039 4.9 1.2
O5 2780 0.36 0.28 -0.33 -0.136 2.5 1.3
Non missing response frequency for each item
1 2 3 4 5 6 miss
O1 0.01 0.04 0.08 0.22 0.33 0.33 0.01
O2 0.29 0.26 0.14 0.16 0.10 0.06 0.00
O3 0.03 0.05 0.11 0.28 0.34 0.20 0.01
O4 0.02 0.04 0.06 0.17 0.32 0.39 0.01
O5 0.27 0.32 0.19 0.13 0.07 0.03 0.01
You can see that \(\alpha\) is negative.
Recode
$O2r <- 7 - bfi$O2
bfi$O5r <- 7 - bfi$O5 bfi
\(\alpha\) After Recoding
<- cov(bfi[c("O1", "O2r", "O3", "O4", "O5r")], use = "complete")
cov2 # Compute alpha by hand using formula
5 / (5 - 1) * (1 - sum(diag(cov2)) / sum(cov2))
[1] 0.6025464
# Using `psych::alpha()`; the estimate is labelled "raw_alpha"
::alpha(bfi[c("O1", "O2r", "O3", "O4", "O5r")]) psych
Reliability analysis
Call: psych::alpha(x = bfi[c("O1", "O2r", "O3", "O4", "O5r")])
raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
0.6 0.61 0.57 0.24 1.5 0.012 4.6 0.81 0.23
95% confidence boundaries
lower alpha upper
Feldt 0.58 0.6 0.62
Duhachek 0.58 0.6 0.62
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
O1 0.53 0.53 0.48 0.22 1.1 0.014 0.0092 0.23
O2r 0.57 0.57 0.51 0.25 1.3 0.013 0.0076 0.22
O3 0.50 0.50 0.44 0.20 1.0 0.015 0.0071 0.20
O4 0.61 0.62 0.56 0.29 1.6 0.012 0.0044 0.29
O5r 0.51 0.53 0.47 0.22 1.1 0.015 0.0115 0.20
Item statistics
n raw.r std.r r.cor r.drop mean sd
O1 2778 0.62 0.65 0.52 0.39 4.8 1.1
O2r 2800 0.65 0.60 0.43 0.33 4.3 1.6
O3 2772 0.67 0.69 0.59 0.45 4.4 1.2
O4 2786 0.50 0.52 0.29 0.22 4.9 1.2
O5r 2780 0.67 0.66 0.52 0.42 4.5 1.3
Non missing response frequency for each item
1 2 3 4 5 6 miss
O1 0.01 0.04 0.08 0.22 0.33 0.33 0.01
O2r 0.06 0.10 0.16 0.14 0.26 0.29 0.00
O3 0.03 0.05 0.11 0.28 0.34 0.20 0.01
O4 0.02 0.04 0.06 0.17 0.32 0.39 0.01
O5r 0.03 0.07 0.13 0.19 0.32 0.27 0.01
Here is an example for reporting \(\alpha\) in a manuscript:
Tip
The reliability coefficient of the composite scores of the five Openness items for our sample was .60, 95% CI [.58, .62].
Multilevel Reliability
# Example data from Bolger & Laurenceau (2013) chapter 7
<- read.csv(
dat "https://quantdev.ssri.psu.edu/sites/qdev/files/psychometrics.csv",
header = TRUE
)# Reshape the data to have items in multiple columns, which is more common
<- dat |>
dat2 pivot_wider(names_from = "item", names_prefix = "x",
values_from = "y")
head(dat2, 20)
As can be seen, each person has 10 time points, each time with 4 items. We can get three types of composites:
# Raw composite
<- rowSums(dat2[c("x1", "x2", "x3", "x4")])
xsum var(xsum, na.rm = TRUE) # total variance
[1] 11.88269
# Between composite (one for each person)
<- tapply(xsum, INDEX = dat2$person, FUN = mean)
xsumb var(xsumb, na.rm = TRUE) # between variance
[1] 7.962253
# Within composite
<- xsum - ave(xsum, dat2$person)
xsumw var(xsumw, na.rm = TRUE) # within variance
[1] 3.91179
And you can compute reliability for each composite using the following code:
source("https://github.com/marklhc/mcfa_reliability_supp/raw/master/multilevel_alpha.R")
multilevel_alpha(dat2[c("x1", "x2", "x3", "x4")],
id = dat2$person)
Loading required package: lavaan
This is lavaan 0.6-15
lavaan is FREE software! Please report any bugs.
Attaching package: 'lavaan'
The following object is masked from 'package:psych':
cor2cov
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING:
Level-1 variable "y2" has no variance within some clusters. The
cluster ids with zero within variance are: 409 431 501
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING:
Level-1 variable "y3" has no variance within some clusters. The
cluster ids with zero within variance are: 351 451 463
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING:
Level-1 variable "y4" has no variance within some clusters. The
cluster ids with zero within variance are: 351 423 447 451 471
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING:
Level-1 variable "y2" has no variance within some clusters. The
cluster ids with zero within variance are: 409 431 501
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING:
Level-1 variable "y3" has no variance within some clusters. The
cluster ids with zero within variance are: 351 451 463
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING:
Level-1 variable "y4" has no variance within some clusters. The
cluster ids with zero within variance are: 351 423 447 451 471
Parallel analysis suggests that the number of factors = NA and the number of components = 1
Parallel analysis suggests that the number of factors = NA and the number of components = 1
$alpha
alpha2l alphab alphaw
0.8342556 0.8218115 0.7679396
$alpha_ci
2.5% 97.5%
alpha2l 0.7667533 0.8837675
alphab 0.6966022 0.8881618
alphaw 0.7115038 0.8117901
$omega
omega2l omegab omegaw
0.8326827 0.8158880 0.7717208
$omega_ci
2.5% 97.5%
omega2l 0.7583216 0.8845281
omegab 0.6731932 0.8888178
omegaw 0.7171751 0.8129809
$ncomp
within between
1 1
The values are labelled “alpha2l”, “alphab”, and “alphaw”, followed by the corresponding 95% CIs.