# Define a function for formatting numbers
comma <- function(x, d = 2) format(x, digits = d, big.mark = ",")
Interrater Reliability
library(psych)
library(lme4)
library(tidyverse)
Data analyzed in this paper: https://www.mdpi.com/2076-3425/12/8/1011/
slp_dat <- read.csv("https://osf.io/download/p9gqk/")
slp_vas_wide <- slp_dat |>
    select(slpID, Speaker, slp_VAS) |>
    pivot_wider(names_from = slpID, values_from = slp_VAS)
head(slp_vas_wide)
For Two Raters
We’ll first select just the first two raters.
slp_2rater <- slp_vas_wide |>
    select(slp14, slp15)
Nominal Agreement
To compute nominal agreement, we count an observation as an agreement only when the two raters give exactly the same rating. By that definition, we have
with(slp_2rater, mean(slp14 == slp15))
[1] 0.1
If we instead relax the definition a bit and say that agreement is reached when the ratings match after rounding (here, dividing the ratings by 10 and rounding to an 11-point scale),
slp_2round <- round(slp_2rater / 10)
slp_2round <- lapply(slp_2round, FUN = factor, levels = 0:10)
# Contingency table
table(slp_2round)
slp15
slp14 0 1 2 3 4 5 6 7 8 9 10
0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 1 0 0 0 0
6 0 0 0 0 0 1 0 0 0 0 2
7 0 0 0 0 0 0 0 1 0 0 1
8 0 0 0 0 0 0 0 0 0 2 0
9 0 0 0 0 0 0 0 0 1 0 2
10 0 0 0 0 0 0 0 1 0 0 7
p0 <- with(slp_2round, mean(slp14 == slp15))
p0
[1] 0.4
Cohen’s Kappa
\[\kappa = \frac{P_0 - P_c}{1 - P_c}\]
\[P_c = \frac{1}{N^2} \sum_{i = 1}^c (n_{i+}) (n_{+i})\]
where \(n_{i+}\) and \(n_{+i}\) are the row and column totals of the contingency table, \(c\) is the number of categories, and \(N\) is the total number of ratings.
slp_2tab <- table(slp_2round)
pc <- sum(colSums(slp_2tab) * rowSums(slp_2tab)) / sum(slp_2tab)^2
(kappa <- (p0 - pc) / (1 - pc))
[1] 0.1666667
Drawback: kappa tends to be small when scores are unevenly distributed (e.g., when most scores fall in a few categories), which is certainly the case here.
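As a quick check (this call is not part of the original notes), the psych package also provides cohen.kappa(), which should reproduce the unweighted kappa computed by hand above (and additionally reports a weighted kappa):
# Optional check: psych::cohen.kappa() accepts a contingency table and
# should reproduce the unweighted kappa of about .17 computed above
psych::cohen.kappa(slp_2tab)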
Multiple Raters
Coefficient \(\alpha\)
Treat each rater as an “item.”
psych::alpha(slp_vas_wide[-1])
Number of categories should be increased in order to count frequencies.
Warning in cor.smooth(r): Matrix was not positive definite, smoothing was done
In smc, smcs > 1 were set to 1.0
In smc, smcs < 0 were set to .0
Reliability analysis
Call: psych::alpha(x = slp_vas_wide[-1])
raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
0.98 0.98 0.99 0.68 45 0.0076 62 25 0.7
95% confidence boundaries
lower alpha upper
Feldt 0.96 0.98 0.99
Duhachek 0.96 0.98 0.99
Reliability if an item is dropped:
raw_alpha std.alpha G6(smc) average_r S/N var.r med.r
slp10 0.97 0.98 0.98 0.68 42 0.019 0.70
slp11 0.97 0.98 1.00 0.68 42 0.018 0.70
slp13 0.97 0.98 0.99 0.68 42 0.019 0.70
slp14 0.97 0.98 0.99 0.68 43 0.019 0.71
slp15 0.97 0.98 1.00 0.68 43 0.019 0.71
slp16 0.98 0.98 0.99 0.69 45 0.018 0.71
slp17 0.97 0.98 0.99 0.68 42 0.019 0.69
slp18 0.97 0.98 1.00 0.69 44 0.017 0.71
slp19 0.97 0.98 0.99 0.68 43 0.019 0.71
slp2 0.97 0.98 0.99 0.68 43 0.018 0.71
slp20 0.97 0.98 0.99 0.68 42 0.019 0.69
slp21 0.97 0.98 0.99 0.68 42 0.018 0.69
slp22 0.97 0.98 0.99 0.68 43 0.020 0.69
slp23 0.97 0.98 0.99 0.68 42 0.019 0.69
slp25 0.97 0.98 0.99 0.69 44 0.018 0.71
slp3 0.98 0.98 0.99 0.69 44 0.019 0.71
slp4 0.97 0.98 0.99 0.69 44 0.018 0.71
slp5 0.97 0.98 0.99 0.68 43 0.019 0.70
slp6 0.98 0.98 0.99 0.71 48 0.011 0.71
slp8 0.97 0.98 0.99 0.68 42 0.020 0.70
slp9 0.97 0.98 0.99 0.68 42 0.018 0.69
Item statistics
n raw.r std.r r.cor r.drop mean sd
slp10 18 0.89 0.88 0.88 0.86 81 28
slp11 18 0.89 0.91 0.92 0.90 69 33
slp13 19 0.86 0.88 0.88 0.86 54 33
slp14 20 0.82 0.84 0.84 0.81 80 25
slp15 20 0.81 0.82 0.81 0.80 85 22
slp16 20 0.77 0.76 0.76 0.73 50 42
slp17 20 0.90 0.91 0.91 0.90 63 34
slp18 20 0.82 0.81 0.78 0.79 70 30
slp19 19 0.83 0.83 0.83 0.81 56 25
slp2 20 0.83 0.82 0.82 0.80 64 32
slp20 20 0.90 0.91 0.91 0.89 49 25
slp21 19 0.91 0.91 0.91 0.91 59 25
slp22 20 0.86 0.86 0.85 0.84 50 27
slp23 20 0.89 0.88 0.88 0.86 71 32
slp25 20 0.79 0.79 0.79 0.77 79 28
slp3 19 0.76 0.76 0.76 0.74 52 40
slp4 20 0.79 0.80 0.78 0.77 47 27
slp5 19 0.82 0.82 0.81 0.81 43 23
slp6 13 0.80 0.57 0.57 0.54 76 23
slp8 20 0.86 0.88 0.88 0.86 53 31
slp9 19 0.89 0.91 0.90 0.90 58 23
Intraclass Correlation
Based on the ANOVA/mixed-effect model framework.
- Independent variables: Raters and ratees
- Dependent variable: scores
Sources of variance:
- Ratees (variance of interest)
- Raters
- Ratees \(\times\) raters interaction*
- Random error*
The variances due to ratees \(\times\) raters interaction and random error can only be separated when each rater rates each ratee more than once, which is uncommon.
Reliability = \(\sigma^2_\text{ratee} / (\sigma^2_\text{ratee} + \sigma^2_\text{error})\)
However, what is considered error depends on multiple considerations, including
- whether the design is crossed (i.e., all raters rate each ratee) or nested (different sets of raters rate each ratee);
- whether raters are considered fixed (not generalizable) or random (generalizable to a “population of raters”); this is only relevant when rater \(\times\) ratee interaction can be estimated;
- consistency (relative decision; only rank-ordering is of interest) vs. absolute agreement (absolute decision; actual score is of interest).
One-way ANOVA for nested design
Let’s say we have data where each person is rated by two different raters:
slp_vas_nested <- slp_dat |>
    mutate(SpeakerID = as.numeric(as.factor(Speaker))) |>
    # Select only 10 speakers
    filter(SpeakerID <= 10) |>
    group_by(Speaker) |>
    # Filter specific raters
    filter(row_number() %in% (SpeakerID[1] * 2 - (1:0)))
We now have a design with raters nested within ratees. With this design, we cannot distinguish the rater effect from random error. We can run a one-way random-effect ANOVA, which is the same as a random-intercept multilevel model:
m1 <- lmer(slp_VAS ~ 1 + (1 | Speaker), data = slp_vas_nested)
summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: slp_VAS ~ 1 + (1 | Speaker)
Data: slp_vas_nested
REML criterion at convergence: 179.6
Scaled residuals:
Min 1Q Median 3Q Max
-1.50049 -0.77609 -0.05948 0.87584 1.35589
Random effects:
Groups Name Variance Std.Dev.
Speaker (Intercept) 308.9 17.58
Residual 816.8 28.58
Number of obs: 19, groups: Speaker, 10
Fixed effects:
Estimate Std. Error t value
(Intercept) 53.613 8.627 6.214
# Extract variance components (ratee, error)
vc_m1 <- as.data.frame(VarCorr(m1))
ICC for single rating: \(\frac{\sigma^2_\text{ratee}}{\sigma^2_\text{ratee} + \sigma^2_E}\)
vc_m1$vcov[1] / (vc_m1$vcov[1] + vc_m1$vcov[2])
[1] 0.2744076
ICC for average rating: \(\frac{\sigma^2_\text{ratee}}{\sigma^2_\text{ratee} + \sigma^2_E / k}\), where \(k\) is the number of raters per ratee
vc_m1$vcov[1] / (vc_m1$vcov[1] + vc_m1$vcov[2] / 2)
[1] 0.4306434
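As an optional cross-check (assuming the performance package is installed; this call is not in the original notes), performance::icc() computes the ICC directly from the fitted lmer model:
# Optional cross-check (assumes the performance package is installed):
# the ICC reported by performance::icc() should match the
# single-rating ICC of about .27 computed above
performance::icc(m1)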
Two-way ANOVA
For consistency (relative decisions), the rater effect is not error, because rater bias applies to everyone and does not change the rank order.
For agreement (absolute decisions), the rater effect is error, as it changes the absolute scores.
m2 <- lmer(slp_VAS ~ 1 + (1 | Speaker) + (1 | slpID), data = slp_dat)
summary(m2)
Linear mixed model fit by REML ['lmerMod']
Formula: slp_VAS ~ 1 + (1 | Speaker) + (1 | slpID)
Data: slp_dat
REML criterion at convergence: 3544.1
Scaled residuals:
Min 1Q Median 3Q Max
-3.3281 -0.5813 0.0862 0.6611 2.5747
Random effects:
Groups Name Variance Std.Dev.
slpID (Intercept) 132.8 11.53
Speaker (Intercept) 585.7 24.20
Residual 291.2 17.07
Number of obs: 403, groups: slpID, 21; Speaker, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 61.882 6.028 10.27
vc_m2 <- as.data.frame(VarCorr(m2))
ICC for single rating:
- Agreement: \(\frac{\sigma^2_\text{ratee}}{\sigma^2_\text{ratee} + \sigma^2_\text{rater} + \sigma^2_E}\)
# Note: ratee is in position 2, but may be different in different data sets
vc_m2$vcov[2] / (vc_m2$vcov[1] + vc_m2$vcov[2] + vc_m2$vcov[3])
[1] 0.5800175
- Consistency: \(\frac{\sigma^2_\text{ratee}}{\sigma^2_\text{ratee} + \sigma^2_E}\)
# Note: ratee is in position 2, but may be different in different data sets
vc_m2$vcov[2] / (vc_m2$vcov[2] + vc_m2$vcov[3])
[1] 0.6678742
ICC for average rating:
- Agreement: \(\frac{\sigma^2_\text{ratee}}{\sigma^2_\text{ratee} + (\sigma^2_\text{rater} + \sigma^2_E) / k}\)
# Note: ratee is in position 2, but may be different in different data sets
vc_m2$vcov[2] / (vc_m2$vcov[2] + (vc_m2$vcov[1] + vc_m2$vcov[3]) / 21)
[1] 0.966669
- Consistency: \(\frac{\sigma^2_\text{ratee}}{\sigma^2_\text{ratee} + \sigma^2_E / k}\)
vc_m2$vcov[2] / (vc_m2$vcov[2] + vc_m2$vcov[3] / 21)
[1] 0.9768674
You can use the psych::ICC() function for these calculations; it requires wide-format data.
psych::ICC(slp_vas_wide[-1])
Call: psych::ICC(x = slp_vas_wide[-1])
Intraclass correlation coefficients
type ICC F df1 df2 p lower bound upper bound
Single_raters_absolute ICC1 0.58 30 19 400 1.6e-64 0.43 0.75
Single_random_raters ICC2 0.58 43 19 380 1.0e-82 0.43 0.75
Single_fixed_raters ICC3 0.67 43 19 380 1.0e-82 0.53 0.81
Average_raters_absolute ICC1k 0.97 30 19 400 1.6e-64 0.94 0.98
Average_random_raters ICC2k 0.97 43 19 380 1.0e-82 0.94 0.98
Average_fixed_raters ICC3k 0.98 43 19 380 1.0e-82 0.96 0.99
Number of subjects = 20 Number of Judges = 21
See the help file for a discussion of the other 4 McGraw and Wong estimates,
Be aware that the terminology in the above output may cause confusion. Here is the translation:

- Single_raters_absolute, ICC(1, 1): ICC for a single rating under one-way ANOVA (assuming each ratee is rated by a different set of raters)
- Average_raters_absolute, ICC(1, k): ICC for the average rating under one-way ANOVA (assuming each ratee is rated by a different set of raters)
- Single_random_raters, ICC(2, 1): ICC for interrater agreement for a single rating under two-way ANOVA ("random" here really means agreement)
- Average_random_raters, ICC(2, k): ICC for interrater agreement for the average rating under two-way ANOVA ("random" here really means agreement)
- Single_fixed_raters, ICC(3, 1): ICC for interrater consistency for a single rating under two-way ANOVA ("fixed" here really means consistency)
- Average_fixed_raters, ICC(3, k): ICC for interrater consistency for the average rating under two-way ANOVA ("fixed" here really means consistency)
For this example, ICC(1, 1) and ICC(1, k) are not relevant, as it is a crossed (not nested) design.
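As a sanity check (this snippet is not in the original notes), the ICC2 row of the psych::ICC() output should match the model-based agreement ICC for a single rating computed from m2 earlier:
# Optional check: the ICC2 row from psych::ICC() should match the
# model-based single-rating agreement ICC (~ .58) from m2
icc_res <- psych::ICC(slp_vas_wide[-1])
icc_res$results[icc_res$results$type == "ICC2", "ICC"]
vc_m2$vcov[2] / sum(vc_m2$vcov)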
Derivation
\[ \newcommand{\sigr}{\sigma_\text{ratee}} \newcommand{\sige}{\sigma_E} \newcommand{\ssb}{\mathit{SS}_\text{ratee}} \newcommand{\sse}{\mathit{SS}_E} \newcommand{\msb}{\mathit{MS}_\text{ratee}} \newcommand{\mse}{\mathit{MS}_E} \newcommand{\flo}{F_{\frac{\alpha}{2}, N - 1, N(k - 1)}} \newcommand{\fhi}{F_{1 - \frac{\alpha}{2}, N - 1, N(k - 1)}} \]
Each score consists of grand mean + ratee effect + error. For \(i = 1, \ldots, k\) and \(j = 1, \ldots, N\),
\[Y_{ij} = \beta_j + e_{ij}\]
with \(\beta_j \sim N(\mu, \sigr^2)\) and \(e_{ij} \sim N(0, \sige^2)\)
With \(N\) ratees, the mean rating for ratee \(j\) is
\[\bar Y_{.j} = \frac{\sum_i Y_{ij}}{k} = \beta_j + \frac{\sum_i e_{ij}}{k}\]
The grand mean is
\[\bar Y_{..} = \frac{\sum_j \sum_i Y_{ij}}{kN} = \frac{\sum_j \beta_j}{N} + \frac{\sum_j \sum_i e_{ij}}{kN}\]
The sum of squares for the ratee effect is
\[ \begin{aligned} \ssb & = \sum_j k (\bar Y_{.j} - \bar Y_{..})^2 \\ & = \sum_j k \left(\beta_j + \frac{\sum_i e_{ij}}{k} - \frac{\sum_j \beta_j}{N} - \frac{\sum_j \sum_i e_{ij}}{kN}\right)^2 \\ & = \sum_j k [(\beta_j - \bar \beta) + (\bar e_{.j} - \bar e_{..})]^2 \end{aligned} \]
Taking the expected value, we have
\[ \begin{aligned} E[\sum_j (\beta_j - \bar \beta)^2] & = E[\sum_j (\beta_j - \mu + \mu - \bar \beta)^2] \\ & = E[\sum_j (\beta_j - \mu)^2] + 2 E[(\mu - \bar \beta) \sum_j (\beta_j - \mu)] + E[\sum_j (\mu - \bar \beta)^2] \\ & = N \sigr^2 - 2 E[N (\bar \beta - \mu)^2] + E[N (\bar \beta - \mu)^2] \\ & = N \sigr^2 - N \sigr^2 / N \\ & = (N - 1) \sigr^2 \end{aligned} \]
Similarly,
\[ E[\sum_j (\bar e_{.j} - \bar e_{..})^2] = (N - 1) \sige^2 / k \]
and given \(\beta_j\) and \(e_{ij}\) are independent, we have
\[ E(\ssb) = k (N - 1) \sigr^2 + (N - 1) \sige^2. \]
The mean square is the sum of squares divided by the degrees of freedom, which is \(N - 1\) for the ratee effect. Therefore, the expected mean square is
\[ E(\msb) = k \sigr^2 + \sige^2. \]
For sum of squares error, we have
\[ \begin{aligned} \sse & = \sum_j \sum_i (Y_{ij} - \bar Y_{.j})^2 \\ & = \sum_j \sum_i \left(\beta_j + e_{ij} - \beta_j - \frac{\sum_i e_{ij}}{k}\right)^2 \\ & = \sum_j \sum_i (e_{ij} - \bar e_{.j})^2 \end{aligned} \]
The expected sum of squares (and mean square) error can be shown to be
\[ \begin{aligned} E(\sse) & = N (k - 1) \sige^2 \\ E(\mse) & = \sige^2 \end{aligned} \]
Therefore, using the method of moments, the estimators for \(\sige^2\) and \(\sigr^2\) are
\[ \begin{aligned} \tilde \sige^2 & = \mse \\ \tilde \sigr^2 & = (\msb - \mse) / k \\ \end{aligned} \]
Using the above, you can obtain the ICC formulas for “one-way random” of Table 9.5 of your text.
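To build intuition, here is a small simulation sketch (not part of the original notes; all names and parameter values below are made up) that checks the expected mean squares derived above:
# Simulation sketch: check E(MSB) = k * sigma_r^2 + sigma_e^2 and
# E(MSE) = sigma_e^2 under the one-way random-effect model
set.seed(123)
n_sim <- 50   # number of ratees (N)
k_sim <- 4    # raters per ratee (k)
sig_r <- 3    # SD of ratee effects
sig_e <- 2    # error SD
one_rep <- function() {
    beta <- rnorm(n_sim, mean = 10, sd = sig_r)  # ratee effects
    y <- rep(beta, each = k_sim) + rnorm(n_sim * k_sim, sd = sig_e)
    id <- rep(seq_len(n_sim), each = k_sim)
    ybar_j <- tapply(y, id, mean)                # ratee means
    msb <- k_sim * sum((ybar_j - mean(y))^2) / (n_sim - 1)
    mse <- sum((y - ybar_j[id])^2) / (n_sim * (k_sim - 1))
    c(msb = msb, mse = mse)
}
rowMeans(replicate(2000, one_rep()))
# msb should be close to k * sig_r^2 + sig_e^2 = 4 * 9 + 4 = 40;
# mse should be close to sig_e^2 = 4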
Confidence intervals
From standard ANOVA results (Cochran's theorem), with a balanced design, one gets
\[ \begin{aligned} \sse & \sim \sige^2 \chi^2_{N (k - 1)} \\ \ssb & \sim (k\sigr^2 + \sige^2) \chi^2_{N - 1} \end{aligned} \]
The two sums of squares are independently distributed, so
\[ \frac{\sige^2}{k\sigr^2 + \sige^2}\frac{\msb}{\mse} \sim F_{N - 1, N (k - 1)} \]
Note that the distribution of \(\frac{\sige^2}{k\sigr^2 + \sige^2}\frac{\msb}{\mse}\) does not depend on the parameters, \(\sigr^2\) and \(\sige^2\). Therefore, we can invert a two-tailed \(F\) test to consider values that are “non-significant” to obtain a confidence interval. Let \(\flo\) and \(\fhi\) be the \(\alpha / 2\) and \(1 - \alpha / 2\) quantiles of a central \(F\) distribution with df1 = \(N - 1\) and df2 = \(N (k - 1)\), and \(F_0 = \msb / \mse\), then because ICC(1, \(k\)) for average rating is
\[ \begin{aligned} \frac{\sigr^2}{\sigr^2 + \sige^2 / k} & = \frac{k \sigr^2}{k \sigr^2 + \sige^2} \\ & = 1 - \frac{\sige^2}{k \sigr^2 + \sige^2}, \end{aligned} \]
we have
\[ \frac{\sige^2}{k\sigr^2 + \sige^2}\frac{\msb}{\mse} = [1 - \rho(1, k)]F_0 \sim F_{N - 1, N (k - 1)} \]
Define \(F = [1 - \rho(1, k)]F_0\), which follows an \(F\) distribution. Because \(\rho(F) = 1 - F / F_0\) is monotonically decreasing in \(F\) with known \(F_0\), we can obtain a 95% CI by transforming \(\flo\) and \(\fhi\), so an analytic 95% CI for ICC(1, \(k\)) is
\[ \left\{1 - \frac{\fhi}{F_0} \leq \rho(1, k) \leq 1 - \frac{\flo}{F_0}\right\} \]
Similarly, for ICC(1, 1) = \(\sigr^2 / (\sigr^2 + \sige^2)\), as \(F_0 / F - 1\) = \(k \sigr^2 / \sige^2\) = \(k \rho(1, 1) / [1 - \rho(1, 1)]\), we have
\[ \rho(F) = \frac{F_0 / F - 1}{F_0 / F - 1 + k}, \]
which is monotonically decreasing in \(F\). Therefore, we can again obtain a 95% CI by transforming \(\flo\) and \(\fhi\):
\[ \left\{\frac{\frac{F_0}{\fhi} - 1}{\frac{F_0}{\fhi} - 1 + k} \leq \rho(1, 1) \leq \frac{\frac{F_0}{\flo} - 1}{\frac{F_0}{\flo} - 1 + k}\right\} \]
For 95% CIs with two-way crossed designs, see the following papers:
Fleiss, J. L., & Shrout, P. E. (1978). Approximate interval estimation for a certain intraclass correlation coefficient. Psychometrika, 43, 259–262. https://doi.org/10.1007/BF02293867
McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30–46. https://doi.org/10.1037/1082-989X.1.1.30
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420–428. https://doi.org/10.1037/0033-2909.86.2.420
See also this paper for updated guidelines and recent developments on the ICC:
ten Hove, D., Jorgensen, T. D., & van der Ark, L. A. (2022). Updated guidelines on selecting an intraclass correlation coefficient for interrater reliability, with applications to incomplete observational designs. Psychological Methods. Advance online publication. https://doi.org/10.1037/met0000516
Demonstrating Analytic CIs in R
Let’s look at the results by treating the raters as different for each ratee (which is incorrect, but just to show the calculation for CIs):
m1_full <- lmer(slp_VAS ~ 1 + (1 | Speaker), data = slp_dat)
vc_m1f <- as.data.frame(VarCorr(m1_full))
# Obtain mean squares from the variance components
k <- 21
msr <- k * vc_m1f$vcov[1] + vc_m1f$vcov[2]
mse <- vc_m1f$vcov[2]
f0 <- msr / mse
# 95% CI (df1 = 19, df2 = 20 * 20 = 400)
# For ICC(1, k)
c("LL" = 1 - 1 / f0 * qf(.975, 19, 400),
"UL" = 1 - 1 / f0 * qf(.025, 19, 400))
LL UL
0.9410646 0.9844892
# For ICC(1, 1)
c("LL" = (f0 / qf(.975, 19, 400) - 1) / (f0 / qf(.975, 19, 400) - 1 + k),
"UL" = (f0 / qf(.025, 19, 400) - 1) / (f0 / qf(.025, 19, 400) - 1 + k))
LL UL
0.4319368 0.7513950
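Note that, after rounding, these analytic bounds match the ICC1k (0.94, 0.98) and ICC1 (0.43, 0.75) confidence limits reported by psych::ICC() earlier.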
Bootstrap Confidence Intervals
Because the models we fitted are random-effect or mixed-effect models (i.e., multilevel models), we need a multilevel bootstrap. We could use the bootMer() function available in lme4 for parametric bootstrap; the bootmlm package provides alternative bootstrap methods that are useful when model assumptions are violated.
From my observation, the bootstrap CIs seem wider than the analytic CIs for ICCs, at least for this example; this pattern is not unique to this data set.
One-Way Random
# One-way random-effect model
# Function for ICC(1, 1)
icc_11 <- function(x) {
    vc_x <- as.data.frame(VarCorr(x))
    vc_x$vcov[1] / sum(vc_x$vcov)
}
boo <- bootMer(m1_full, icc_11, nsim = 1999,
               # change to .progress = "txt" for progress bar
               .progress = "none")
boot::boot.ci(boo, index = 1, type = c("norm", "basic", "perc"))
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL :
boot::boot.ci(boot.out = boo, type = c("norm", "basic", "perc"),
index = 1)
Intervals :
Level Normal Basic Percentile
95% ( 0.4310, 0.7592 ) ( 0.4486, 0.7729 ) ( 0.3857, 0.7101 )
Calculations and Intervals on Original Scale
Two-Way Random
# Two-way random-effect model
# Function for ICC(2, k), which is for agreement
icc_2k <- function(x) {
    vc_x <- as.data.frame(VarCorr(x))
    vc_x$vcov[2] / (vc_x$vcov[2] + (vc_x$vcov[1] + vc_x$vcov[3]) / 21)
}
boo2 <- bootMer(m2, icc_2k, nsim = 1999,
                # change to .progress = "txt" for progress bar
                .progress = "none")
boot::boot.ci(boo2, index = 1, type = c("norm", "basic", "perc"))
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1999 bootstrap replicates
CALL :
boot::boot.ci(boot.out = boo2, type = c("norm", "basic", "perc"),
index = 1)
Intervals :
Level Normal Basic Percentile
95% ( 0.9427, 0.9989 ) ( 0.9518, 1.0054 ) ( 0.9279, 0.9816 )
Calculations and Intervals on Original Scale