
What is the difference between discriminant validity results from AMOS and from SPSS?

#1 (original poster), posted 2013-3-28 08:56:31
Dear Kenny and everyone:

When testing a scale's discriminant validity, I have seen some people use SPSS, mainly comparing each factor's average variance extracted (AVE) against the highest shared variance between that factor and any other factor; others use AMOS, testing the significance of modification indices (MI) and relying on goodness-of-fit measures such as χ²/df, NNFI, GFI, AGFI, CFI, and RMSEA.

What are the preconditions for each of these two methods? Do they yield different results? Under what circumstances is SPSS or AMOS the more suitable choice?
The question may be a bit naive, but I would be grateful if you and the other experts could spare the time to answer. Thank you!
#2 mnczj, posted 2013-3-28 11:00:50
To me, SPSS and AMOS are just two different statistical packages. I think what you really care about is which indicators to use to judge discriminant validity. Here are some commonly used ones.
Noncentrality χ² (NCP) "serves as a natural measure of badness-of-fit of a covariance structure model" (Steiger, 1990: 177) (reasonable fit if NCP < .10); also χ²/df (Wheaton et al., 1977) (good fit if smaller than 5).
RMSEA (Root Mean Square Error of Approximation) asks "How well would the model, with unknown but optimally chosen parameter values, fit the population covariance matrix if it were available?" (Browne & Cudeck, 1993: 137-138) (good fit if RMSEA < .05; reasonable fit if RMSEA < .08).

The p-value tests whether "the error of approximation has an associated probability of less than .05" (Byrne, 1998: 113) (the p-value should be > .50) (Jöreskog & Sörbom, 1996).
GFI (Goodness-of-Fit Index) measures the relative amount of variance and covariance in the unrestricted sample. The Adjusted GFI adjusts for the number of degrees of freedom. Both represent "absolute indices of fit because they basically compare the hypothesized model with no model at all" (Byrne, 1998: 116) (better fit if closer to 1).

PGFI (Parsimony Goodness-of-Fit Index) "takes into account the complexity (i.e., number of estimated parameters) of the hypothesized model in the assessment of overall model fit" (Byrne, 1998: 116). It usually has lower values than the threshold levels for other fit indices; values even around .50 can indicate reasonable fit (Mulaik et al., 1989).

NFI (Normed Fit Index) (Bentler & Bonett, 1980) and CFI (Comparative Fit Index) compare the hypothesized model with the independence model (the null model, in which all correlations among variables are zero) (NFI & CFI > .90 for reasonable fit) (Bentler, 1992). NFI has shown a tendency to underestimate fit in small samples (Byrne, 1992).

NNFI takes into account the complexity of the hypothesized model in comparison with the independence model. Because NNFI is not normed, its value can be greater than 1 and becomes difficult to interpret (Byrne, 1998). Thus, Bentler (1990) suggested that CFI "should be the index of choice" (Byrne, 1998: 117).

IFI (Incremental Index of Fit) (Bollen, 1989) addresses the issues of parsimony and sample size associated with NFI; its computation and interpretation are similar to NFI's.

RFI (Relative Fit Index) or RNI (Relative Noncentrality Index) is algebraically equivalent to CFI.
9 x* f. Y- i1 RDavid Kenny有一个网页,专门在讲model fit。以下是他2012年修改后的内容。我想看完后,应该就基本可以解决你的问题了。
* a- i7 }7 S1 o' F, S! J# F
$ d& [# d0 Y3 q0 O; DMeasuring Model Fit3 O+ X, X0 q# V! ?4 P

Fit refers to the ability of a model to reproduce the data (i.e., usually the variance-covariance matrix). A good-fitting model is one that is reasonably consistent with the data and so does not require respecification. Also, a good-fitting measurement model is required before interpreting the causal paths of the structural model.

It should be noted that a good-fitting model is not necessarily a valid model. For instance, a model all of whose parameters are zero is a "good-fitting" model. Additionally, models with nonsensical results (e.g., paths that are clearly the wrong sign) and models with poor discriminant validity or Heywood cases can be "good-fitting" models. Parameter estimates must be carefully examined to determine whether one has a good model as well as a good-fitting model. Also, it is important to realize that one might obtain a good-fitting model yet still be able to improve the model and remove specification error. Finally, having a good-fitting model does not prove that the model is correctly specified. Conversely, a model all of whose parameters are statistically significant can still be a poor-fitting model.
How Large a Sample Size Do I Need?

Rules of Thumb
          Ratio of Sample Size to the Number of Free Parameters
                    Tanaka (1987): 20 to 1, but that is unrealistically high.
                    Goal: Bentler & Chou (1987): 5 to 1
                    Several published studies do not meet this goal.
          Sample Size
                    200 is seen as a goal for SEM research
                    Lower sample sizes can be used for
                              Models with no latent variables
                              Models where all loadings are fixed (usually to one)
                              Models with strong correlations
                              Simpler models
                    Models for which there is an upper limit on N (e.g., countries or years as the unit): 200 might be an unrealistic standard.
Power Analysis
          The best way to determine whether you have a large enough sample is to conduct a power analysis.
          Either use the Satorra and Saris (1985) method or conduct a simulation.

The Chi Square Test: χ²

For models with about 75 to 200 cases, the chi square test is a reasonable measure of fit. But for models with more cases (400 or more), the chi square is almost always statistically significant. Chi square is also affected by the size of the correlations in the model: the larger the correlations, the poorer the fit. For these reasons, alternative measures of fit have been developed. (Go to a website for computing p values for a given chi square value and df.)

Sometimes chi square is more interpretable if it is transformed into a Z value. The following approximation can be used:

Z = √(2χ²) − √(2df − 1)
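As a quick numerical check of this approximation, here is a minimal Python sketch (the function name is my own, not from Kenny's page):

```python
import math

def z_from_chi_square(chi_square, df):
    """Approximate Z transform of a chi-square statistic:
    Z = sqrt(2*chi2) - sqrt(2*df - 1)."""
    return math.sqrt(2 * chi_square) - math.sqrt(2 * df - 1)

# Example: a chi-square of 50 on 30 df is roughly 2.3 standard errors
# above what would be expected under the model.
```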
An old measure of fit is the chi square to df ratio, χ²/df. A problem with this fit index is that there is no universally agreed-upon standard for what counts as a good-fitting versus a bad-fitting model. Note, however, that two very popular fit indices, TLI and RMSEA, are largely based on this old-fashioned ratio.

The chi square test is too liberal (i.e., makes too many Type I errors) when variables have non-normal distributions, especially distributions with kurtosis. Moreover, with small sample sizes, there are too many Type I errors.
Introduction to Fit Indices

The terms used in the literature to describe fit indices are confusing, and I think confused. I prefer the following terms: incremental, absolute, and comparative, which are used in what follows.

Incremental Fit Index

An incremental (sometimes called relative) fit index is analogous to R²: a value of zero indicates the worst possible model and a value of one the best possible. So my model is placed on a continuum:
In terms of a formula, it is

     Fit of the Worst Possible Model − Fit of My Model
_____________________________________________________________
Fit of the Worst Possible Model − Fit of the Best Possible Model

The worst possible model is called the null or independence model, and the usual convention is to allow all the variables in the model to have variation but no correlation. (The usual null model allows the means to equal their actual values. However, for growth curve models, the null model should set the means as equal, i.e., no growth.) The degrees of freedom of the null model are k(k − 1)/2, where k is the number of variables in the model. Amos refers to the null model as the independence model.

Alternative null models might be considered (but almost never are). One alternative null model is that all latent variable correlations are zero; another is that all exogenous variables are correlated but the endogenous variables are uncorrelated with each other and with the exogenous variables. O'Boyle and Williams (2011) suggest two different null models for the measurement and structural models.
Absolute Fit Index

An absolute measure of fit presumes that the best-fitting model has a fit of zero. The measure of fit then determines how far the model is from perfect fit. These measures are really "badness" measures of fit, in that bigger is worse.

Comparative Fit Index

A comparative measure of fit is only interpretable when comparing two different models. This term is unique to this website; such measures are more commonly called absolute fit indices. However, it is helpful to distinguish them from absolute indices that do not require a comparison between two models.
Controversy about Fit Indices

Recently, considerable controversy has flared up concerning fit indices. Some researchers do not believe that fit indices add anything to the analysis (e.g., Barrett, 2007) and hold that only the chi square should be interpreted. The worry is that fit indices allow researchers to claim that a misspecified model is not a bad model. Others (e.g., Hayduk, Cummings, Boadu, Pazderka-Robinson, & Boulianne, 2007) argue that cutoffs for a fit index can be misleading and subject to misuse. Most analysts believe in the value of fit indices but caution against strict reliance on cutoffs.

Also problematic is "cherry picking" a fit index: computing many fit indices and picking the one that allows you to make the point you want to make. If you decide not to report a popular index (e.g., the TLI or the RMSEA), you need to give a good reason why.

Finally, Kenny, Kaniskan, and McCoach (2011) have argued that fit indices should not even be computed for models with small degrees of freedom. Rather, for these models, the researcher should locate the source of specification error.
Catalogue of Fit Indices

There are now literally hundreds of measures of fit. This page includes some of the major ones currently used in the literature, but does not pretend to include all of them. Though a bit dated, the book edited by Bollen and Long (1993) explains these indices and others. Also, a 2007 special issue of Personality and Individual Differences is entirely devoted to the topic of fit indices.

A key consideration in the choice of a fit index is the penalty it places on complexity. That penalty is measured by how much chi square needs to change for the fit index not to change.
Bentler-Bonett Index or Normed Fit Index (NFI)

This is the very first measure of fit proposed in the literature (Bentler & Bonett, 1980), and it is an incremental measure of fit. The best model is defined as a model with a χ² of zero, and the worst model by the χ² of the null model. Its formula is:

χ²(Null Model) − χ²(Proposed Model)
___________________________________
          χ²(Null Model)

A value between .90 and .95 is now considered marginal, above .95 is good, and below .90 is considered a poor-fitting model. A major disadvantage of this measure is that it cannot get smaller when more parameters are added to the model; its "penalty" for complexity is zero. Thus, the more parameters added to the model, the larger the index. For this reason this measure is not recommended; rather, one of the next two is used.
Tucker Lewis Index or Non-normed Fit Index (NNFI)

A problem with the Bentler-Bonett index is that there is no penalty for adding parameters. The Tucker-Lewis index, another incremental fit index, does have such a penalty. Let χ²/df be the ratio of chi square to its degrees of freedom; the TLI is computed as follows:

χ²/df(Null Model) − χ²/df(Proposed Model)
_________________________________________
          χ²/df(Null Model) − 1

If the index is greater than one, it is set at one. It is interpreted like the Bentler-Bonett index. Note that for a given model, a lower chi square to df ratio (as long as it is not less than one) implies a better-fitting model. Its penalty for complexity is χ²/df: if the chi square to df ratio does not change, the TLI does not change.
Note that the TLI (and the CFI, which follows) depends on the average size of the correlations in the data. If the average correlation between variables is not high, the TLI will not be very high. Consider a simple example. You have a 5-item scale that you think measures one latent variable. You also have 3 dichotomous experimental variables that you manipulate and that cause that latent variable. These three experimental variables create 7 variables when you allow for all possible interactions. You have equal N in the conditions, and so all their correlations are zero. If you run the CFA on just the 5 indicators, you might have a nice TLI of .95. However, if you add in the 7 experimental variables, your TLI might sink below .90, because now the null model will not be so "bad": you have added to the model 7 variables that have zero correlations with each other.

A reasonable rule of thumb is to examine the RMSEA for the null model and make sure it is no smaller than 0.158. An RMSEA for the model of 0.05 and a TLI of .90 imply that the RMSEA of the null model is 0.158. If the RMSEA for the null model is less than 0.158, an incremental measure of fit may not be very informative.
Comparative Fit Index (CFI)

This incremental measure of fit is directly based on the noncentrality measure. Let d = χ² − df, where df are the degrees of freedom of the model. The Comparative Fit Index, or CFI, equals

d(Null Model) − d(Proposed Model)
_________________________________
          d(Null Model)

If the index is greater than one, it is set at one, and if less than zero, it is set at zero. It is interpreted like the previous incremental indices.

If the CFI is less than one, then the CFI is always greater than the TLI. The CFI pays a penalty of one for every parameter estimated. Because the TLI and CFI are highly correlated, only one of the two should be reported. The CFI is reported more often than the TLI, but I think the CFI's penalty for complexity of just 1 is too low, and so I prefer the TLI.

Again, the CFI should not be computed if the RMSEA of the null model is less than 0.158; otherwise one will obtain too small a value of the CFI.
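The incremental indices above, and the 0.158 rule of thumb from the TLI section, can be sketched in Python. This is a minimal illustration with function names of my own choosing, not code from Kenny's page; the last function uses the identity χ²/df = 1 + (N − 1)·RMSEA², which follows from the RMSEA formula given below.

```python
def nfi(chi_null, chi_model):
    """Bentler-Bonett Normed Fit Index."""
    return (chi_null - chi_model) / chi_null

def tli(chi_null, df_null, chi_model, df_model):
    """Tucker-Lewis Index (NNFI); values above one are set to one."""
    ratio_null = chi_null / df_null
    ratio_model = chi_model / df_model
    return min((ratio_null - ratio_model) / (ratio_null - 1), 1.0)

def cfi(chi_null, df_null, chi_model, df_model):
    """Comparative Fit Index, based on the noncentrality d = chi2 - df."""
    d_null = max(chi_null - df_null, 0.0)
    d_model = max(chi_model - df_model, 0.0)
    return min(max((d_null - d_model) / d_null, 0.0), 1.0)

def tli_from_rmseas(rmsea_model, rmsea_null, n):
    """Check of the rule of thumb: a model RMSEA of .05 against a null-model
    RMSEA of .158 yields a TLI of about .90, for any sample size."""
    ratio_null = 1 + (n - 1) * rmsea_null ** 2
    ratio_model = 1 + (n - 1) * rmsea_model ** 2
    return (ratio_null - ratio_model) / (ratio_null - 1)
```

For example, with a null model χ² of 1000 on 45 df and a proposed model χ² of 100 on 40 df, these give NFI = .90, TLI ≈ .93, and CFI ≈ .94, with the CFI slightly above the TLI as the text predicts.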
Root Mean Square Error of Approximation (RMSEA)

This absolute measure of fit is based on the noncentrality parameter. Its computational formula is:

 √(χ² − df)
____________
√[df(N − 1)]

where N is the sample size and df are the degrees of freedom of the model. If χ² is less than df, then the RMSEA is set to zero. Like the TLI, its penalty for complexity is the chi square to df ratio. The measure is positively biased (i.e., tends to be too large), and the amount of bias depends on the smallness of the sample size and df, primarily the latter. The RMSEA is currently the most popular measure of model fit; it is now reported in virtually all papers that use CFA or SEM, and some refer to the measure as the "Ramsey."
MacCallum, Browne and Sugawara (1996) have used 0.01, 0.05, and 0.08 to indicate excellent, good, and mediocre fit, respectively. However, others have suggested 0.10 as the cutoff for poor-fitting models. These are definitions for the population: a given model may have a population value of 0.05 (which would not be known), but in the sample it might be greater than 0.10. Confidence intervals and tests of PCLOSE can help in understanding the sampling error in the RMSEA. There is greater sampling error for small-df and low-N models, especially the former. Thus, models with small df and low N can have artificially large values of the RMSEA. For instance, a chi square of 2.098 (a value that is not statistically significant), with a df of 1 and N of 70, yields an RMSEA of 0.126. For this reason, Kenny, Kaniskan, and McCoach (2011) argue that the RMSEA should not even be computed for low-df models.
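The low-df example above is easy to reproduce. A minimal Python sketch of the RMSEA formula (my own function, not code from Kenny's page):

```python
import math

def rmsea(chi_square, df, n):
    """RMSEA = sqrt(chi2 - df) / sqrt(df * (N - 1)); set to zero when chi2 <= df."""
    if chi_square <= df:
        return 0.0
    return math.sqrt((chi_square - df) / (df * (n - 1)))

# Kenny's example: a non-significant chi-square of 2.098 with df = 1 and
# N = 70 still gives an RMSEA of about 0.126, well past the .10 cutoff.
```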
A confidence interval can be computed for the RMSEA. Ideally, the lower value of the 90% confidence interval includes or is very near zero (or at worst 0.05) and the upper value is not very large, i.e., less than .08. The width of the confidence interval is very informative about the precision of the estimate of the RMSEA.
p of Close Fit (PCLOSE)

This measure provides a one-sided test of the null hypothesis that the RMSEA equals .05, what is called a close-fitting model. Such a model has specification error, but "not very much" specification error. The alternative, one-sided hypothesis is that the RMSEA is greater than .05. So if p is greater than .05 (i.e., not statistically significant), it is concluded that the fit of the model is "close." If p is less than .05, it is concluded that the model's fit is worse than close-fitting (i.e., the RMSEA is greater than .05). As with any significance test, sample size is a critical factor, but so too is the model df: with lower df there is less power in this test.

You can use a Preacher and Coffman web page to test any null hypothesis about the RMSEA. Note that the standard chi square test takes as its null hypothesis that the RMSEA equals zero.

Standardized Root Mean Square Residual (SRMR)

The SRMR is an absolute measure of fit, defined as the standardized difference between the observed and predicted correlations; a value of zero indicates perfect fit. It is positively biased, and the bias is greater for small-N and low-df studies. The measure tends to be smaller as sample size increases and as the number of parameters in the model increases. The SRMR has no penalty for model complexity. A value less than .08 is generally considered a good fit (Hu & Bentler, 1999).
Akaike Information Criterion (AIC)

The AIC is a comparative measure of fit, so it is meaningful only when two different models are estimated. Lower values indicate a better fit, and the model with the lowest AIC is the best-fitting model. Somewhat different formulas for the AIC appear in the literature, but those differences are not really meaningful, as it is the difference in AIC that matters:

χ² + k(k + 1) − 2df

where k is the number of variables in the model and df are the degrees of freedom of the model. Note that k(k + 1) − 2df equals twice the number of free parameters in the model. The AIC makes the researcher pay a penalty of two for every parameter that is estimated.
Bayesian Information Criterion (BIC)

Two other comparative fit indices are the BIC and the SABIC. Whereas the AIC has a penalty of 2 for every parameter estimated, the BIC increases the penalty as sample size increases:

χ² + ln(N)[k(k + 1)/2 − df]

where ln(N) is the natural logarithm of the number of cases in the sample. (If means are included in the model, replace k(k + 1)/2 with k(k + 3)/2.) The BIC places a high value on parsimony (perhaps too high).

The Sample-Size Adjusted BIC (SABIC)

The sample-size adjusted BIC, or SABIC, like the BIC, places a penalty for adding parameters based on sample size, but not as high a penalty as the BIC. The SABIC is not given in Amos, but is given in Mplus. Several recent simulation studies (Enders & Tofighi, 2008; Tofighi & Enders, 2007) have suggested that the SABIC is a useful tool for comparing models. Its formula is

χ² + ln[(N + 2)/24][k(k + 1)/2 − df]
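The three information criteria can be sketched together in Python, taking the number of free parameters as k(k + 1)/2 − df (i.e., assuming no mean structure); the function names are my own, not from Kenny's page:

```python
import math

def free_parameters(k, df):
    """k(k + 1)/2 unique variances/covariances minus the model df
    (assumes no mean structure)."""
    return k * (k + 1) // 2 - df

def aic(chi_square, k, df):
    """Penalty of 2 per estimated parameter."""
    return chi_square + 2 * free_parameters(k, df)

def bic(chi_square, k, df, n):
    """Penalty of ln(N) per estimated parameter."""
    return chi_square + math.log(n) * free_parameters(k, df)

def sabic(chi_square, k, df, n):
    """Penalty of ln((N + 2)/24) per estimated parameter."""
    return chi_square + math.log((n + 2) / 24) * free_parameters(k, df)
```

For example, with k = 5 variables, df = 5, N = 200, and χ² = 10 there are 10 free parameters, and the per-parameter penalties (2, ln 200 ≈ 5.30, ln(202/24) ≈ 2.13) order the three criteria as AIC < SABIC < BIC.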

GFI and AGFI (LISREL measures)

These measures are affected by sample size. The current consensus is not to use them (Sharma, Mukherjee, Kumar, & Dillon, 2005).
Hoelter Index

This index states the sample size at which chi square would not be significant (alpha = .05), i.e., how small one's sample size would have to be for the result to no longer be significant. The index should only be computed if the chi square is statistically significant. Its formula is:

[(N − 1)χ²(crit)/χ²] + 1

where N is the sample size, χ² is the chi square for the model, and χ²(crit) is the critical value for the chi square. If the critical value is unknown, the following approximation can be used:

[1.645 + √(2df − 1)]²
_____________________  + 1
     2χ²/(N − 1)

where df are the degrees of freedom of the model. For both of these formulas, one rounds down to the nearest integer. Hoelter recommends values of at least 200; values of less than 75 indicate very poor model fit.

The Hoelter only makes sense to interpret if N > 200 and the chi square is statistically significant. It should be noted that Hu and Bentler (1998) do not recommend this measure.
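Both forms of the index are straightforward to compute; a minimal Python sketch (my own function names), rounding down as the text prescribes:

```python
import math

def hoelter(chi_square, chi_crit, n):
    """Hoelter critical N given the exact chi-square critical value."""
    return int((n - 1) * chi_crit / chi_square + 1)

def hoelter_approx(chi_square, df, n):
    """Approximate Hoelter critical N (alpha = .05) when the exact
    chi-square critical value is unavailable."""
    numerator = (1.645 + math.sqrt(2 * df - 1)) ** 2
    return int(numerator / (2 * chi_square / (n - 1)) + 1)
```

For a model χ² of 100 on 50 df with N = 200 (critical value ≈ 67.50), the exact formula gives 135 and the approximation 134, both far below Hoelter's recommended 200.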
Factors that Affect Fit Indices

Number of Variables
          There is anecdotal evidence that models with many variables do not fit.
          Kenny and McCoach (2003) show that
                    the RMSEA improves as more variables are added to the model;
                    the TLI and CFI are relatively stable, but tend to decline slightly.
          We still do not understand why models with more variables tend to have poor fit.
Model Complexity
          How much chi square needs to change per df for the fit index not to change:

                                Theoretical Value     A&M*     Reis**
          Bentler and Bonett                    0        0        0
          CFI                                   1        1        1
          AIC                                   2        2        2
          Adjusted BIC              ln[(N + 2)/24]     1.96     1.76
          Tucker Lewis                      χ²/df      2.01     1.46
          RMSEA                             χ²/df      2.01     1.46
          BIC                               ln(N)      5.13     4.93
*A&M: Ajzen, I., & Madden, T. J. (1986). Prediction of goal-directed behavior: Attitudes, intentions, and perceived behavioral control. Journal of Experimental Social Psychology, 22, 453-474.
**Reis: Reisenzein, R. (1986). A structural equation analysis of Weiner's attribution-affect model of helping behavior. Journal of Personality and Social Psychology, 50, 1123-1133.
Note that
          each changes by a constant amount, regardless of the df change;
          larger values reward parsimony and smaller values reward complexity. For A&M, the BIC rewards parsimony the most, and the CFI (after the Bentler and Bonett) the least.

Sample Size

The Bentler-Bonett index fails to adjust for sample size: models with larger sample sizes have smaller values. The TLI and CFI do not vary much with sample size; however, these measures are less variable with larger sample sizes.

The RMSEA and the SRMR are larger with smaller sample sizes.
Normality

Non-normal data (especially data with high kurtosis) inflate chi square and absolute measures of fit. Presumably, incremental and comparative measures of fit are less affected.

References

Barrett, P. (2007). Structural equation modelling: Adjudging model fit. Personality and Individual Differences, 42, 815-824.

Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness-of-fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-600.

Bentler, P. M., & Chou, C. P. (1987). Practical issues in structural modeling. Sociological Methods & Research, 16, 78-117.

Bollen, K. A., & Long, J. S. (Eds.) (1993). Testing structural equation models. Newbury Park, CA: Sage.

Enders, C. K., & Tofighi, D. (2008). The impact of misspecifying class-specific residual variances in growth mixture models. Structural Equation Modeling: A Multidisciplinary Journal, 15, 75-95.

Hayduk, L., Cummings, G. G., Boadu, K., Pazderka-Robinson, H., & Boulianne, S. (2007). Testing! Testing! One, two, three – Testing the theory in structural equation models! Personality and Individual Differences, 42, 841-850.

Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424-453.

Kenny, D. A., Kaniskan, B., & McCoach, D. B. (2011). The performance of RMSEA in models with small degrees of freedom. Unpublished paper, University of Connecticut.

Kenny, D. A., & McCoach, D. B. (2003). Effect of the number of variables on measures of fit in structural equation modeling. Structural Equation Modeling, 10, 333-351.

MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1, 130-149.

O'Boyle, E. H., Jr., & Williams, L. J. (2011). Decomposing model fit: Measurement vs. theory in organizational research using latent variables. Journal of Applied Psychology, 96, 1-12.

Satorra, A., & Saris, W. E. (1985). The power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83-90.

Sharma, S., Mukherjee, S., Kumar, A., & Dillon, W. R. (2005). A simulation study to investigate the use of cutoff values for assessing model fit in covariance structure models. Journal of Business Research, 58, 935-943.

Tanaka, J. S. (1987). "How big is big enough?": Sample size and goodness of fit in structural equation models with latent variables. Child Development, 58, 134-146.

Tofighi, D., & Enders, C. K. (2007). Identifying the correct number of classes in growth mixture models. In G. R. Hancock & K. M. Samuelsen (Eds.), Advances in latent variable mixture models (pp. 317-341). Greenwich, CT: Information Age.
#3 Kenneth, posted 2013-3-28 14:13:16
zj, impressive!!

I have recently been revising the methods book I am writing, and I wrote a passage on convergent and discriminant validity. Sharing it with everyone.

Convergent validity and discriminant validity
These two concepts may be the two most misused of all. We often see students and scholars who, when analyzing their data, put the measures of several constructs together into one confirmatory factor analysis, obtain good results, and then claim that the measures of these constructs have discriminant validity. To show why this inference is wrong, consider the following example. Suppose we are measuring two constructs, employee satisfaction and organizational commitment, using four items:

1. I am satisfied with my colleagues.
2. I am satisfied with my supervisor.
3. I am not like a cat.
4. I am not like a dog.

A factor analysis of the answers to these four items will inevitably yield two factors. I then call the first factor employee satisfaction and the second organizational commitment, and I claim that the data provide evidence of convergent and discriminant validity. The example is of course greatly exaggerated, but it shows that the result of a factor analysis alone cannot establish convergent or discriminant validity, because validity is defined as whether a measure correctly represents the construct behind it. Using the terms as in the example above shows that we have not really understood what these concepts originally mean. In situations like the one above, we suggest saying that the scales can distinguish the different constructs, rather than using the term discriminant validity. In fact, if the factor analysis shows that the indicators relate to their underlying constructs as designed, it would be more appropriate to call this internal structure validity. So what, then, are convergent and discriminant validity? Simply put, convergent validity means that a measure correlates highly with other measures of the same construct; discriminant validity means that a measure does not correlate highly with measures of other constructs.
#4 jkliang, posted 2013-3-28 21:19:12
Last edited by jkliang on 2013-3-28 21:22

"Convergent validity means a measure correlates highly with measures of the same construct; discriminant validity means a measure does not correlate highly with measures of other constructs" — that is still not entirely clear. Could Kenny explain a bit more? Thanks!

Regarding construct validity, I think the most important thing is the theoretical perspective and definition behind the construct's design. Statistical indices are only used for verification; without a clear theoretical perspective and definition, one is just playing a statistics game.

On the statistical side, I recently used PLS-based SEM to do CFA. Its criteria are: for convergent validity, factor loadings > 0.70, AVE > 0.50, and communality > 0.50; for discriminant validity, cross-loadings > 0.70 and the square root of AVE greater than the inter-construct correlations. The Journal of Marketing Theory and Practice (JMTP) recently published a special issue on the application of PLS-based SEM. For reference: http://www.metapress.com/content ... 7989a5f9b72ba8&pi=0. Hair et al. also have some relevant articles: Hair, Joe F., Marko Sarstedt, Christian M. Ringle, and Jeannette A. Mena. 2012. "An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research." Journal of the Academy of Marketing Science 40 (3): 414-433. http://pls-institute.org/uploads ... ingle_Mena_2012.pdf, or Hair, Joseph F., Christian M. Ringle, and Marko Sarstedt. 2011. "PLS-SEM: Indeed a Silver Bullet." Journal of Marketing Theory and Practice 19 (2): 139-151. http://www.mesharpe.com/JMTP/01%20hair.pdf
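The √AVE-versus-correlation check mentioned above (the Fornell-Larcker criterion) is easy to compute by hand; here is a minimal Python sketch, where the loadings and the correlation are made-up illustrative numbers, not data from any study:

```python
import math

def ave(loadings):
    """Average variance extracted from standardized factor loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def fornell_larcker_ok(loadings_a, loadings_b, corr_ab):
    """Discriminant validity holds when sqrt(AVE) of each construct
    exceeds the correlation between the two constructs."""
    return min(math.sqrt(ave(loadings_a)),
               math.sqrt(ave(loadings_b))) > abs(corr_ab)
```

For instance, loadings of (.80, .70, .75) give an AVE of about .56 (√AVE ≈ .75), so the check passes against an inter-construct correlation of .60 but fails against one of .80.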

3

主题

4

听众

989

积分

秀才

Rank: 5Rank: 5

注册时间
2005-11-22
最后登录
2014-6-24
积分
989
精华
0
主题
3
帖子
73
#5 (original poster), posted 2013-3-28 23:40:36
First of all, a big thank-you to mnczj, Kenny, and jkliang for such rich and generous answers!
I will study and digest all this, and then come back to discuss!
Thanks again!
#6 rwxld, posted 2013-3-29 10:23:41
Replying to Kenneth (posted 2013-3-28 14:13): "zj, impressive!! ... I have recently been revising the methods book I am writing, and I wrote a passage on convergent and discriminant validity. Sharing it with everyone."
4 Y9 I& H! s& Z; V7 j5 S$ W这就是问题的本质!很多书和论文将区别效度定义在“两个不同的概念”这个层次,结果只做一个CFA或者只要检验两个概念是否完全相关,就认为具有区别效度。其实“此区别效度”非“彼区别效度”,二者对区别效度的定义根本就不一样。