Scaling and Standard Errors in SEM
In this post, I demonstrate why rescaling a coefficient (i.e., multiplied/divided by a constant) is different from “standardizing” a coefficient (i.e., multiplied/divided by the sample standard deviation, which is a random variable) in SEM.
See also this discussion on StackExchange and the article When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling.
Load Packages
library(lavaan)
## This is lavaan 0.6-12
## lavaan is FREE software! Please report any bugs.
# library(performance)
# library(parameters)
library(modelsummary)
Example
Different Scaling Choices
In the following, I manipulate the scaling constraints on (1) the average loading, (2) the first loading, (3) the second loading, and (4) the latent variance, so that they all have the same sample unit for the latent variable.
1. Effect coding (average loading = 1)
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
# regression
dem60 ~ ind60
'
fit <- sem(model, data = PoliticalDemocracy, effect.coding = "loadings")
## Warning in lav_partable_add_bounds(partable = lavpartable, lavpta = lavpta, :
## lavaan WARNING: automatic bounds not available (yet) if effect.coding is used
## Warning in lav_partable_add_bounds(partable = lavpartable, lavh1 = lavh1, :
## lavaan WARNING: automatic bounds not available (yet) if effect.coding is used
parameterEstimates(fit)
## lhs op rhs est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 0.600 0.030 19.833 0.000 0.541 0.660
## 2 ind60 =~ x2 1.308 0.044 29.622 0.000 1.222 1.395
## 3 ind60 =~ x3 1.091 0.050 21.634 0.000 0.993 1.190
## 4 dem60 =~ y1 0.809 0.072 11.309 0.000 0.669 0.950
## 5 dem60 =~ y2 1.148 0.099 11.575 0.000 0.954 1.343
## 6 dem60 =~ y3 0.887 0.094 9.444 0.000 0.703 1.071
## 7 dem60 =~ y4 1.156 0.080 14.400 0.000 0.998 1.313
## 8 dem60 ~ ind60 1.067 0.263 4.056 0.000 0.552 1.583
## 9 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 10 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 11 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 12 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 13 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 14 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 15 y4 ~~ y4 2.137 0.719 2.973 0.003 0.728 3.546
## 16 ind60 ~~ ind60 1.245 0.215 5.778 0.000 0.823 1.667
## 17 dem60 ~~ dem60 5.271 1.031 5.111 0.000 3.249 7.292
2. Fix first loading
# Get estimates from first model
coef(fit)[c("ind60=~x1", "dem60=~y1")]
## ind60=~x1 dem60=~y1
## 0.6002121 0.8094457
model <- '
# latent variable definitions
ind60 =~ 0.6002121 * x1 + x2 + x3
dem60 =~ 0.8094457 * y1 + y2 + y3 + y4
# regression
dem60 ~ ind60
'
fit1 <- sem(model, data = PoliticalDemocracy)
parameterEstimates(fit1)
## lhs op rhs est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 0.600 0.000 NA NA 0.600 0.600
## 2 ind60 =~ x2 1.308 0.084 15.633 0.000 1.144 1.472
## 3 ind60 =~ x3 1.091 0.091 11.966 0.000 0.913 1.270
## 4 dem60 =~ y1 0.809 0.000 NA NA 0.809 0.809
## 5 dem60 =~ y2 1.148 0.164 7.005 0.000 0.827 1.470
## 6 dem60 =~ y3 0.887 0.139 6.396 0.000 0.615 1.158
## 7 dem60 =~ y4 1.156 0.139 8.341 0.000 0.884 1.427
## 8 dem60 ~ ind60 1.067 0.281 3.801 0.000 0.517 1.618
## 9 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 10 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 11 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 12 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 13 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 14 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 15 y4 ~~ y4 2.137 0.719 2.973 0.003 0.728 3.546
## 16 ind60 ~~ ind60 1.245 0.241 5.170 0.000 0.773 1.717
## 17 dem60 ~~ dem60 5.271 1.335 3.947 0.000 2.653 7.888
3. Fix second loading
# Get estimates from first model
coef(fit)[c("ind60=~x2", "dem60=~y2")]
## ind60=~x2 dem60=~y2
## 1.308404 1.148334
model2 <- '
# latent variable definitions
ind60 =~ NA * x1 + 1.308404 * x2 + x3
dem60 =~ NA * y1 + 1.148334 * y2 + y3 + y4
# regression
dem60 ~ ind60
'
fit2 <- sem(model2, data = PoliticalDemocracy)
parameterEstimates(fit2) # SE changed!
## lhs op rhs est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 0.600 0.038 15.633 0.000 0.525 0.675
## 2 ind60 =~ x2 1.308 0.000 NA NA 1.308 1.308
## 3 ind60 =~ x3 1.091 0.083 13.149 0.000 0.929 1.254
## 4 dem60 =~ y1 0.809 0.116 7.005 0.000 0.583 1.036
## 5 dem60 =~ y2 1.148 0.000 NA NA 1.148 1.148
## 6 dem60 =~ y3 0.887 0.146 6.059 0.000 0.600 1.173
## 7 dem60 =~ y4 1.156 0.151 7.674 0.000 0.860 1.451
## 8 dem60 ~ ind60 1.067 0.284 3.761 0.000 0.511 1.623
## 9 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 10 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 11 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 12 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 13 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 14 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 15 y4 ~~ y4 2.137 0.719 2.973 0.003 0.728 3.546
## 16 ind60 ~~ ind60 1.245 0.218 5.705 0.000 0.817 1.673
## 17 dem60 ~~ dem60 5.271 1.447 3.642 0.000 2.434 8.107
4. Fix latent variances
# Get estimates from first model
coef(fit)[c("ind60~~ind60", "dem60~~dem60")]
## ind60~~ind60 dem60~~dem60
## 1.245068 5.270542
model3 <- '
# latent variable definitions
ind60 =~ NA * x1 + x2 + x3
dem60 =~ NA * y1 + y2 + y3 + y4
# latent variances
ind60 ~~ 1.245068 * ind60
dem60 ~~ 5.270542 * dem60
# regression
dem60 ~ ind60
'
fit3 <- sem(model3, data = PoliticalDemocracy)
parameterEstimates(fit3) # SE changed again!
## lhs op rhs est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 0.600 0.058 10.340 0.000 0.486 0.714
## 2 ind60 =~ x2 1.308 0.115 11.410 0.000 1.084 1.533
## 3 ind60 =~ x3 1.091 0.115 9.477 0.000 0.866 1.317
## 4 dem60 =~ y1 0.809 0.103 7.893 0.000 0.608 1.010
## 5 dem60 =~ y2 1.148 0.158 7.285 0.000 0.839 1.457
## 6 dem60 =~ y3 0.887 0.134 6.606 0.000 0.624 1.150
## 7 dem60 =~ y4 1.156 0.126 9.169 0.000 0.909 1.403
## 8 ind60 ~~ ind60 1.245 0.000 NA NA 1.245 1.245
## 9 dem60 ~~ dem60 5.271 0.000 NA NA 5.271 5.271
## 10 dem60 ~ ind60 1.067 0.295 3.619 0.000 0.489 1.645
## 11 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 12 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 13 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 14 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 15 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 16 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 17 y4 ~~ y4 2.137 0.719 2.973 0.003 0.728 3.546
msummary(list(`Effect coding` = fit,
`Fix item 1` = fit1,
`Fix item 2` = fit2,
`Fix variance` = fit3))
Effect coding | Fix item 1 | Fix item 2 | Fix variance | |
---|---|---|---|---|
ind60 =~ x1 | 0.600 | 0.600 | 0.600 | 0.600 |
(0.030) | (0.000) | (0.038) | (0.058) | |
ind60 =~ x2 | 1.308 | 1.308 | 1.308 | 1.308 |
(0.044) | (0.084) | (0.000) | (0.115) | |
ind60 =~ x3 | 1.091 | 1.091 | 1.091 | 1.091 |
(0.050) | (0.091) | (0.083) | (0.115) | |
dem60 =~ y1 | 0.809 | 0.809 | 0.809 | 0.809 |
(0.072) | (0.000) | (0.116) | (0.103) | |
dem60 =~ y2 | 1.148 | 1.148 | 1.148 | 1.148 |
(0.099) | (0.164) | (0.000) | (0.158) | |
dem60 =~ y3 | 0.887 | 0.887 | 0.887 | 0.887 |
(0.094) | (0.139) | (0.146) | (0.134) | |
dem60 =~ y4 | 1.156 | 1.156 | 1.156 | 1.156 |
(0.080) | (0.139) | (0.151) | (0.126) | |
dem60 ~ ind60 | 1.067 | 1.067 | 1.067 | 1.067 |
(0.263) | (0.281) | (0.284) | (0.295) | |
Num.Obs. | 75 | 75 | 75 | 75 |
AIC | 1906.2 | 1906.2 | 1906.2 | 1906.2 |
BIC | 1940.9 | 1940.9 | 1940.9 | 1940.9 |
Note that the parameter estimates are all the same, but the standard errors are all different. This has implications for statistical power when doing latent variable analysis.
Standardized Coefficients
Delta method
Directly in lavaan
msummary(list(`Fix item 1` = fit,
`Fix item 2` = fit2,
`Fix variance` = fit3),
standardize = TRUE)
Fix item 1 | Fix item 2 | Fix variance | |
---|---|---|---|
ind60 =~ x1 | 0.920 | 0.920 | 0.920 |
(0.023) | (0.023) | (0.023) | |
ind60 =~ x2 | 0.973 | 0.973 | 0.973 |
(0.017) | (0.017) | (0.017) | |
ind60 =~ x3 | 0.872 | 0.872 | 0.872 |
(0.031) | (0.031) | (0.031) | |
dem60 =~ y1 | 0.804 | 0.804 | 0.804 |
(0.051) | (0.051) | (0.051) | |
dem60 =~ y2 | 0.757 | 0.757 | 0.757 |
(0.058) | (0.058) | (0.058) | |
dem60 =~ y3 | 0.704 | 0.704 | 0.704 |
(0.066) | (0.066) | (0.066) | |
dem60 =~ y4 | 0.898 | 0.898 | 0.898 |
(0.039) | (0.039) | (0.039) | |
dem60 ~ ind60 | 0.460 | 0.460 | 0.460 |
(0.100) | (0.100) | (0.100) | |
Num.Obs. | 75 | 75 | 75 |
AIC | 1906.2 | 1906.2 | 1906.2 |
BIC | 1940.9 | 1940.9 | 1940.9 |
1. Define new parameters in the model
model_std <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
# latent variances
ind60 ~~ v1 * ind60
dem60 ~~ ev2 * dem60
# regression
dem60 ~ b * ind60
# define standardized coefficients
v2 := b^2 * v1 + ev2
beta := b * sqrt(v1 / v2)
'
fit_std <- sem(model_std, data = PoliticalDemocracy)
parameterEstimates(fit_std) # same standardized coefficient
## lhs op rhs label est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 1.000 0.000 NA NA 1.000 1.000
## 2 ind60 =~ x2 2.180 0.139 15.633 0.000 1.907 2.453
## 3 ind60 =~ x3 1.818 0.152 11.966 0.000 1.520 2.116
## 4 dem60 =~ y1 1.000 0.000 NA NA 1.000 1.000
## 5 dem60 =~ y2 1.419 0.203 7.005 0.000 1.022 1.816
## 6 dem60 =~ y3 1.095 0.171 6.396 0.000 0.760 1.431
## 7 dem60 =~ y4 1.428 0.171 8.341 0.000 1.092 1.763
## 8 ind60 ~~ ind60 v1 0.449 0.087 5.170 0.000 0.279 0.619
## 9 dem60 ~~ dem60 ev2 3.453 0.875 3.947 0.000 1.738 5.168
## 10 dem60 ~ ind60 b 1.439 0.379 3.801 0.000 0.697 2.181
## 11 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 12 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 13 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 14 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 15 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 16 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 17 y4 ~~ y4 2.137 0.719 2.973 0.003 0.728 3.546
## 18 v2 := b^2*v1+ev2 v2 4.382 1.089 4.024 0.000 2.248 6.517
## 19 beta := b*sqrt(v1/v2) beta 0.460 0.100 4.593 0.000 0.264 0.657
2. Use constraints to fix variances
model_con <- '
# latent variable definitions
ind60 =~ NA * x1 + x2 + x3
dem60 =~ NA * y1 + y2 + y3 + y4
# latent variances
ind60 ~~ 1 * ind60
dem60 ~~ ev2 * dem60
# regression
dem60 ~ beta * ind60
# constraints
ev2 == 1 - beta^2
'
fit_con <- sem(model_con, data = PoliticalDemocracy)
parameterEstimates(fit_con) # same standardized coefficient
## lhs op rhs label est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 0.670 0.065 10.340 0.000 0.543 0.797
## 2 ind60 =~ x2 1.460 0.128 11.410 0.000 1.209 1.711
## 3 ind60 =~ x3 1.218 0.129 9.477 0.000 0.966 1.470
## 4 dem60 =~ y1 2.093 0.260 8.049 0.000 1.584 2.603
## 5 dem60 =~ y2 2.970 0.401 7.406 0.000 2.184 3.756
## 6 dem60 =~ y3 2.293 0.342 6.696 0.000 1.622 2.964
## 7 dem60 =~ y4 2.989 0.315 9.494 0.000 2.372 3.606
## 8 ind60 ~~ ind60 1.000 0.000 NA NA 1.000 1.000
## 9 dem60 ~~ dem60 ev2 0.788 0.092 8.535 0.000 0.607 0.969
## 10 dem60 ~ ind60 beta 0.460 0.100 4.593 0.000 0.264 0.657
## 11 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 12 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 13 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 14 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 15 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 16 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 17 y4 ~~ y4 2.137 0.719 2.973 0.003 0.729 3.546
3. (Incorrect) constraining the first indicators
coef(fit_con)[c("ind60=~x1", "dem60=~y1")]
## ind60=~x1 dem60=~y1
## 0.6697334 2.0934489
model_conl <- '
# latent variable definitions
ind60 =~ 0.6697334 * x1 + x2 + x3
dem60 =~ 2.0934489 * y1 + y2 + y3 + y4
# regression
dem60 ~ beta * ind60
'
fit_conl <- sem(model_conl, data = PoliticalDemocracy)
parameterEstimates(fit_conl) # same standardized coefficient
## lhs op rhs label est se z pvalue ci.lower ci.upper
## 1 ind60 =~ x1 0.670 0.000 NA NA 0.670 0.670
## 2 ind60 =~ x2 1.460 0.093 15.633 0.000 1.277 1.643
## 3 ind60 =~ x3 1.218 0.102 11.966 0.000 1.018 1.417
## 4 dem60 =~ y1 2.093 0.000 NA NA 2.093 2.093
## 5 dem60 =~ y2 2.970 0.424 7.005 0.000 2.139 3.801
## 6 dem60 =~ y3 2.293 0.359 6.396 0.000 1.590 2.996
## 7 dem60 =~ y4 2.989 0.358 8.341 0.000 2.286 3.691
## 8 dem60 ~ ind60 beta 0.460 0.121 3.801 0.000 0.223 0.698
## 9 x1 ~~ x1 0.081 0.020 4.123 0.000 0.043 0.120
## 10 x2 ~~ x2 0.120 0.072 1.676 0.094 -0.020 0.261
## 11 x3 ~~ x3 0.467 0.091 5.153 0.000 0.289 0.644
## 12 y1 ~~ y1 2.404 0.516 4.663 0.000 1.394 3.415
## 13 y2 ~~ y2 6.552 1.289 5.082 0.000 4.025 9.078
## 14 y3 ~~ y3 5.363 0.996 5.384 0.000 3.410 7.315
## 15 y4 ~~ y4 2.137 0.719 2.973 0.003 0.728 3.546
## 16 ind60 ~~ ind60 1.000 0.193 5.170 0.000 0.621 1.379
## 17 dem60 ~~ dem60 0.788 0.200 3.947 0.000 0.397 1.179
msummary(list(`Define new par` = fit_std,
`Constraints` = fit_con,
`Fix loading` = fit_conl))
Define new par | Constraints | Fix loading | |
---|---|---|---|
ind60 =~ x1 | 1.000 | 0.670 | 0.670 |
(0.000) | (0.065) | (0.000) | |
ind60 =~ x2 | 2.180 | 1.460 | 1.460 |
(0.139) | (0.128) | (0.093) | |
ind60 =~ x3 | 1.818 | 1.218 | 1.218 |
(0.152) | (0.129) | (0.102) | |
dem60 =~ y1 | 1.000 | 2.093 | 2.093 |
(0.000) | (0.260) | (0.000) | |
dem60 =~ y2 | 1.419 | 2.970 | 2.970 |
(0.203) | (0.401) | (0.424) | |
dem60 =~ y3 | 1.095 | 2.293 | 2.293 |
(0.171) | (0.342) | (0.359) | |
dem60 =~ y4 | 1.428 | 2.989 | 2.989 |
(0.171) | (0.315) | (0.358) | |
dem60 ~ ind60 | 1.439 | 0.460 | 0.460 |
(0.379) | (0.100) | (0.121) | |
v2 × = b^2*v1+ev2 | 4.382 | ||
(1.089) | |||
beta × = b*sqrt(v1/v2) | 0.460 | ||
(0.100) | |||
Num.Obs. | 75 | 75 | 75 |
AIC | 1906.2 | 1906.2 | 1906.2 |
BIC | 1940.9 | 1940.9 | 1940.9 |
As can be shown, while the first two methods give the same standardized beta in terms of estimates and standard errors, the third does not as it does not account for the uncertainty in the standard deviation.