Chapter 7

Categorization and measurement quality. The choice
between Pearson and Polychoric correlations

Germà Coenders and Willem E. Saris*

ABSTRACT This chapter first reviews the major consequences of ordinal
measurement in Confirmatory Factor Analysis (CFA) models and introduces the
reader to the Polychoric correlation coefficient, a measure of association designed to
avoid these consequences. Next, it presents the results of a Monte Carlo experiment
in which the performance of the estimates of a CFA model based on Pearson and
Polychoric correlations is compared under different distributional settings. The
chapter concludes that, in general, it does not matter too much which measure of
association one uses, as long as one is aware that factor loadings should be
interpreted differently, depending on whether Pearson or Polychoric correlations are
analysed.

Introduction
         In Chapter 6 it was pointed out that categorization errors deriving from
crude ordinal measurement can distort the estimates of Multitrait-Multimethod
(MTMM) models, which constitute particular cases of Confirmatory Factor
Analysis (CFA) models. These categorization errors are likely to be of some
magnitude if the data are collected using methods offering only a few response
options, like the 5-point response scales used in some studies reported in this book.

The population study in Chapter 6 suggests that Polychoric correlations are not
affected by categorization errors and may be preferred to Pearson correlations. That
simulation, though, only considered normally distributed latent factors, whereas it is
known that Polychoric correlations are biased under non-normality (O'Brien &
Homer, 1987; Quiroga, 1992). This fact has not prevented other studies carried out
in the literature, also using normally distributed artificial data, from recommending

*
        The work of the first author was partly supported by the grant CIRIT BE94/I-151 from the Catalan
        Autonomous Government.

the use of Polychoric correlations whenever the data are categorized, which has led
to a widespread use of this measure of association.

This chapter reports a Monte Carlo experiment in which both normal and non-
normal underlying factors are considered. In order to interpret the outcome of the
experiment, the different types of categorization errors and their consequences have
first to be reviewed, as well as the assumptions and performance of the Polychoric
correlation coefficient.

The categorization problem
         Linear statistical models constitute a convenient and parsimonious tool to
analyse relationships among variables. In particular, Structural Equation Models
(SEM) with Latent Variables (see Jöreskog & Sörbom, 1989; or Bollen, 1989) allow
the researcher to fit simultaneous regression equations taking measurement errors
into account and are becoming increasingly popular. They have been widely applied
to analyse survey and other types of data in many different research fields. The True
Score MTMM model used in this book belongs to the family of CFA models, which
constitute particular cases of SEM with latent variables.

If SEM are estimated using covariances or Pearson correlations, which is usual,
they assume that the data are continuous and have an interval level of measurement.
While the social sciences are often interested in measuring and relating variables
which are conceptually continuous, the measurement instruments used in these
disciplines often fail to yield continuous and interval-level measures. Many
categorical response scales which are frequently used in questionnaires can at most
be assumed to be ordinal. This leads to many problems, which arise from different
sorts of categorization errors.

When modeling ordinally measured continuous variables, it is usually assumed that
the range of the continuous variables is divided into as many regions as scale points
the ordinal measurements have, and that the ordinal measurements y are related to
the continuous underlying variables y* through the non-linear step function:

                         y = k   iff   τk-1 < y* ≤ τk,   k = 1,...,m            (1)

where m is the number of scale points, τ0 = -∞ and τm = +∞. This measurement model is
assumed throughout this chapter.
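The step function in Equation (1) can be sketched in a few lines of Python (an illustrative sketch; the threshold values are assumptions, not taken from the chapter):

```python
import numpy as np

def categorize(y_star, thresholds):
    """Apply the step function of Equation (1): y = k iff
    tau_{k-1} < y* <= tau_k, with tau_0 = -inf and tau_m = +inf."""
    # searchsorted counts how many thresholds lie at or below each value,
    # so adding 1 yields ordinal categories 1..m.
    return np.searchsorted(thresholds, y_star) + 1

# Equally spaced thresholds for a 5-point scale on a roughly N(0,1) variable
tau = np.array([-1.5, -0.5, 0.5, 1.5])
y_star = np.array([-2.0, -0.25, 0.3, 3.0])
print(categorize(y_star, tau))  # [1 3 3 5]
```

Values below the lowest threshold fall into category 1 and values above the highest into category m, mirroring the infinite outer thresholds.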

An example of such step function can be seen in Figure 1a. Figure 1b shows a
sample histogram of a continuous y* variable and Figure 1c the resulting sample bar
chart of the 5-category y variable resulting from the step function in Figure 1a.
Figure 1a also shows the linear regression line of y on y* which is implied in linear
statistical models. The vertical distance between the step and linear functions can be
interpreted as categorization error.

This discrepancy between the linear and step functions involves two types of
categorization errors, which are discussed in Johnson and Creech (1983). Grouping
errors derive from discrete measurement, i.e. from collapsing several values of y*
into the same y value. Transformation errors arise from non interval measurement.
The arbitrary values 1,...,m may not be linearly related to the within-category
expectation of y*, especially if the thresholds τi are not equally spaced. In this case,
there are transformation errors (O'Brien, 1985) and the shapes of the distributions of
y and y* differ strongly.

Figure 2a shows a step function leading to large transformation errors. Note
that the thresholds are far from equally spaced. Figure 2c shows a sample bar
chart of y after applying the step function in Figure 2a to the y* variable in Figure 2b.
Note the differences in the distributional shapes of y* (symmetric) and y (skewed).
Transformation errors can also be large when the mean of y* is very high or very
low, which can lead to a high frequency of an extreme category even if the thresholds
are equally spaced.

A skewed y distribution does not by itself imply the existence of transformation
errors. Note that the same skewed observed y distribution in Figure 2c could have
been obtained from a skewed underlying distribution through a low-transformation
categorization. An example of such a situation can be seen in Figures 3a to 3c.

The literature refers to three major consequences of grouping and transformation
errors. The first one refers to bias in the covariances and correlations themselves.
In absolute value, the Pearson correlation between two observed y variables is
usually lower than the correlation between the continuous underlying y* variables
(Bollen & Barb, 1981; Neumann, 1982; O'Brien & Homer, 1987; Quiroga, 1992).
The difference can be large and easily exceed 0.1 if the number of categories is
small.
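A quick simulation illustrates this attenuation (a sketch under assumed values: a true underlying correlation of 0.6 and equally spaced thresholds, neither of which is taken from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Bivariate normal y* pair with an illustrative true correlation of 0.6
rho = 0.6
y1s = rng.standard_normal(n)
y2s = rho * y1s + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# 5-point categorization with equally spaced thresholds
tau = [-1.5, -0.5, 0.5, 1.5]
y1 = np.searchsorted(tau, y1s) + 1
y2 = np.searchsorted(tau, y2s) + 1

r_star = np.corrcoef(y1s, y2s)[0, 1]  # correlation of underlying variables
r_obs = np.corrcoef(y1, y2)[0, 1]     # Pearson correlation of ordinal variables
print(round(r_star, 2), round(r_obs, 2))  # the observed correlation is smaller
```

With only 5 categories, the observed Pearson correlation comes out visibly below the underlying one, in line with the attenuation results cited above.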

The remaining two consequences are better understood in the context of CFA
models and SEM with latent variables. Equation (1) is also applicable in this
framework, but the underlying y*i variable is assumed to be an indicator of a latent
factor ηk (The notation in Jöreskog & Sörbom, 1989; and Bollen, 1989 will be used
throughout):

                                     y*i = λikηk + εi                                  (2)

where λik is the factor loading of y*i on ηk and εi is the random error term associated
with y*i. When squared and standardised, λik can be interpreted as the measurement
quality of y*i.

In this framework, the two additional consequences of categorization are the
reduction of measurement quality and the emergence of correlated measurement
errors. The reduction of measurement quality can be understood as follows: the
correlation between yi and ηk is usually lower in absolute value than the correlation
between y*i and ηk.

This reduction is related to both transformation and grouping errors. Quite
intuitively, the measurement quality decreases as the number of scale-points
decreases (O'Brien, 1985).

To some extent, then, categorization errors have the same consequences as random
errors. They attenuate the covariances and correlations among variables and they
reduce the measurement quality.

However, these categorization errors deviate somewhat from the random behaviour.
In Figure 1a it can be seen that the categorization error depends on the score on the
underlying continuous variable. Given the value of y*, the categorization error
(vertical distance between the step function and the regression line) is fixed. For
instance, if y*=3, the categorization error is always negative, if y*=-0.25, always
positive, and so on.

It can also be seen that categorization errors corresponding to two yi variables can be
correlated (Johnson & Creech, 1983), which constitutes the third major consequence
of categorization. The two vertical broken lines in the graph in Figure 1a represent
the scores of one case on two y*i variables which are closely related, so that both
values are similar. Note that both categorization errors are also similar. This would
not be the case if the y*i variables had been independent.

Figure 1a. Step and linear functions. Low transformation
errors

Broken lines show the scores and the categorization errors of an individual on two
related variables.

Figure 1b. Histogram of a symmetric continuous variable

Figure 1c. Bar chart of the variable in Figure 1b
categorized with the step function in Figure 1a

Figure 2a. Step and linear functions. High transformation
errors

Figure 2b. Histogram of a symmetric continuous variable

Figure 2c. Bar chart of the variable in Figure 2b
categorized with the step function in Figure 2a

Figure 3a. Step and linear functions. Low transformation
errors

Figure 3b. Histogram of a skewed continuous variable

Figure 3c. Bar chart of the variable in Figure 3b
categorized with the step function in Figure 3a

Reasonably reliable indicators of the same factor, or of different but correlated
factors, are correlated. So are their associated categorization errors. These correlated
errors constitute a misspecification in most SEM. From Chapter 6 it is clear that this
misspecification can lead to bias in the estimates (for instance in the method
loadings, which play precisely the role of accounting for correlated errors) and to a
bad model fit. Chapter 6 also suggests that correlated errors are of considerable
magnitude only if transformation errors are present.

The Polychoric correlation coefficient
         Because of all the problems referred to in the section above, Polychoric
correlations have been suggested as an alternative to covariances and Pearson
correlations (Olsson, 1979a; Jöreskog, 1990).

Assuming that yi and yj are the ordinal measurements of the continuous underlying
variables y*i and y*j according to Equation (1), the Polychoric correlation coefficient
r(y*i,y*j) is the correlation coefficient between y*i and y*j estimated from the scores of
yi and yj.

The form of the distribution of the yi* variables has to be assumed so as to specify
the likelihood function of the Polychoric correlation. The bivariate normality
assumption is usually made (see Olsson, 1979a). The programs PRELIS and
PRELIS2 (Jöreskog & Sörbom, 1988, 1993a) carry out the estimation of the Polychoric
correlation coefficient by Limited Information Maximum Likelihood. The programs
follow a two-step procedure. The thresholds are first estimated from the cumulative
relative frequency distributions of yi and yj and the inverse of the N(0,1) distribution
function. The Polychoric correlation coefficient is next estimated by Restricted
Maximum Likelihood conditional on the threshold values.
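The two-step procedure just described can be sketched in Python with numpy and scipy (a simplified sketch, not PRELIS itself; the `polychoric` function name and the clipping of the infinite outer thresholds to ±8 are conveniences of this illustration):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.optimize import minimize_scalar

def polychoric(table):
    """Two-step polychoric correlation from a contingency table of two
    ordinal variables (rows = categories of y_i, cols = categories of y_j)."""
    table = np.asarray(table, dtype=float)

    # Step 1: thresholds from the cumulative marginal proportions and the
    # inverse of the N(0,1) distribution function (outer bounds clipped).
    def thresholds(margin):
        cum = np.cumsum(margin)[:-1] / margin.sum()
        return np.concatenate(([-8.0], norm.ppf(cum), [8.0]))

    a = thresholds(table.sum(axis=1))
    b = thresholds(table.sum(axis=0))

    # Step 2: maximise the likelihood over rho, thresholds held fixed.
    def negloglik(rho):
        cov = [[1.0, rho], [rho, 1.0]]
        Phi2 = lambda x, y: multivariate_normal.cdf([x, y], cov=cov)
        ll = 0.0
        for k in range(table.shape[0]):
            for l in range(table.shape[1]):
                # Bivariate-normal probability of cell (k, l)
                p = (Phi2(a[k + 1], b[l + 1]) - Phi2(a[k], b[l + 1])
                     - Phi2(a[k + 1], b[l]) + Phi2(a[k], b[l]))
                ll += table[k, l] * np.log(max(p, 1e-12))
        return -ll

    return minimize_scalar(negloglik, bounds=(-0.999, 0.999),
                           method="bounded").x
```

Applied to a table of categorized bivariate-normal data, the estimate recovers the correlation of the underlying continuous variables rather than that of the ordinal ones.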

In recovering the relationship among the y* variables, the Polychoric correlation
coefficient corrects both grouping and transformation errors. However, it constitutes
no panacea because of its reliance upon the normality assumption, which may lead
to bias if this assumption is not fulfilled.

Previous research on the choice between Pearson and Polychoric
correlations
        There is a consensus that covariances and Pearson Correlations are
seriously biased by categorization errors (Quiroga, 1992; O'Brien & Homer, 1987;
and many others). Polychoric correlations, on the contrary, are unbiased if the
underlying variables are normally distributed (Jöreskog & Sörbom, 1988) but biased
otherwise. Skewness seems to be the most serious type of departure from normality,
especially when the degree or direction of the skewness differs from variable to
variable.

Even if normality does not hold, the bias is usually higher for covariances and
Pearson correlations than for Polychoric correlations (Quiroga, 1992). It seems,
then, well established that Polychoric Correlations are always preferable if the
researcher is interested in relationships between pairs of variables or when the
researcher uses SEM without latent variables.

The robustness to categorization of SEM based on Pearson correlations has often
been dealt with in the literature under the assumption that the model estimates should
account for the correlation structure among the y*i variables (Babakus et al., 1987;
Olsson, 1979b). Since the correlations differ depending on whether the y*i or the yi
variables are considered, studies carried out from this point of view report high
biases in the parameter estimates and lead to a recommendation to use Polychoric
correlations. Particularly, the factor loadings are reported to be too low with respect
to the measurement quality of the y*i variables.

In our opinion, when covariances or Pearson correlations are used, factor loadings
should be related to the measurement quality of the observed yi variables instead,
because the use of such measures of association implicitly ignores the intervening y*i
variables but concentrates on the observed variables and the latent factors alone.
Considered from this point of view, the biases due to categorization are much lower.
Seemingly, SEM with latent variables can appropriately interpret the lower
covariances or Pearson correlations as a lower measurement quality and can take it
into account so as to yield approximately correct estimates for the factor correlations
(Johnson & Creech, 1983; Homer & O'Brien, 1988). From this point of view it is
precisely the lower loadings which make the models work properly.

In our opinion, when it comes to SEM with latent variables, the choice of measure
of association is not straightforward. Authors dealing with normally distributed
underlying y* variables (Rigdon & Ferguson, 1991; Jöreskog & Sörbom, 1988;
Babakus et al., 1987; Olsson, 1979b) recommend the use of Polychoric correlations.
Homer and O'Brien (1988) consider both normal and non-normal underlying
variables and show that in the context of latent-variable models the choice between
Pearson and Polychoric correlations is not so clearly balanced in favour of the latter
as when only bivariate correlations are concerned. The authors conclude that in
many cases it does not matter very much which measure of association one uses.
However, they do not state under which circumstances which measure should be
preferred. Johnson and Creech (1983) point at categorization error correlations as
the reason why latent-variable models may fail to perform completely well when
Pearson correlations are used, thus implying that transformation errors should
constitute the main source of distortion for these measures of association. In the
following sections we carry out a Monte Carlo experiment to shed further light on
this point.

Monte Carlo comparison of covariances and Polychoric
correlations in CFA models
         The previous section has pointed at transformation errors and non-normality
as the key issues regarding the performance of covariances and Pearson correlations
on the one side and Polychoric correlations on the other side. In this section, a
Monte Carlo experiment is designed to separately explore the effects of these two
main sources of distortion on covariances and Polychoric correlations. A
Confirmatory Factor Analysis (CFA) model, a particular case of SEM with latent
variables which is very relevant to the studies carried out in this book, is considered.

Emphasis will be placed on the point estimates of the types of parameters the book is
concerned with, namely factor correlations and measurement quality. The
experiment is thus complementary to that carried out in Chapter 6, which explores
several patterns of transformation errors but not non-normality, and which
concentrates on measurement quality alone.

The experiment consists of the simulation of 4 continuous variables y1* to y4*, which
follow a standardised CFA model with two factors (see path-diagram in Figure 4);
in their categorization into 5 categories (a number of scale points which is
frequently used in this book); and in the analysis of the categorized data with the
same CFA model which has been used in the simulation of the continuous data.

The simulated data are generated through the following equations:

                                     y*1 = λ11η1 + ε1
                                     y*2 = λ21η1 + ε2                                  (3)
                                     y*3 = λ32η2 + ε3
                                     y*4 = λ42η2 + ε4

                         yi = k   iff   τi,k-1 < yi* ≤ τik,   k = 1,...,5

The two latter distributions are obtained from weighted sums of three
independent χ2 variables1.

The design also considers three alternative marginal frequency distributions for the
yi variables: a symmetric distribution, a distribution in which all yi variables are
skewed in the same direction, and a distribution in which the indicators of different ηk
factors are skewed in opposite directions.

These 3 yi distributions can be obtained from the three possible distributions for the
ηk factors with equally spaced thresholds (obtained by dividing the range containing
the central 98% of probability of yi* into 5 equal-length intervals). In this case,
transformation errors are low (DC's 1 to 3). Unlike in Chapter 6, the design
allows for skewed observed data and low transformation errors
simultaneously. The two types of skewed yi distributions can also be obtained from
normal ηk factors if unequally spaced thresholds are used. In this case,
transformation errors are high (DC's 4 and 5). Table 2 also shows the population
frequency distribution of the y variables and their measurement quality (percentage
of their variance explained by η).

The number of replications was 200 for all DC's. The sample size was chosen to be
high (n=1000) so as to avoid mixing the problems of interest with the ones derived
from small sample sizes. The simulation was carried out with the program MINITAB 8
(Minitab Inc., 1991).
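The data-generation scheme for a skewed-factor condition can be sketched in Python (a sketch, not the original MINITAB code; the loading value 0.87 follows from the chapter's statement that the squared standardised loading equals 0.87² = 0.75, and the error variance 1 − λ² is assumed from the standardised model):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # sample size used in the chapter

def std_chi2_1(size):
    # Standardised chi-squared variable with 1 degree of freedom
    x = rng.chisquare(1, size)
    return (x - 1) / np.sqrt(2)

# Skewed factors as weighted sums of independent standardised chi2(1)
# variables, with the weights given in the footnote; the shared components
# x1 and x2 induce the correlation between the two factors.
x1, x2, x3 = std_chi2_1(n), std_chi2_1(n), std_chi2_1(n)
eta1 = 0.58 * x1 + 0.81 * x2
eta2 = 0.87 * x1 + 0.40 * x2 + 0.29 * x3

# Continuous indicators following Equation (3)
lam = 0.87
ys = np.column_stack([
    lam * eta1 + np.sqrt(1 - lam**2) * rng.standard_normal(n),
    lam * eta1 + np.sqrt(1 - lam**2) * rng.standard_normal(n),
    lam * eta2 + np.sqrt(1 - lam**2) * rng.standard_normal(n),
    lam * eta2 + np.sqrt(1 - lam**2) * rng.standard_normal(n),
])

# Equally spaced thresholds over the central 98% range, then 5-point
# categorization of each indicator
lo, hi = np.quantile(ys, 0.01, axis=0), np.quantile(ys, 0.99, axis=0)
y = np.empty_like(ys, dtype=int)
for i in range(4):
    tau = np.linspace(lo[i], hi[i], 6)[1:-1]  # 4 interior thresholds
    y[:, i] = np.searchsorted(tau, ys[:, i]) + 1
```

Replicating this generation 200 times per condition, and varying the factor distributions and thresholds, reproduces the design of the distributional conditions described here.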

For each DC the program PRELIS 2.02 (Jöreskog & Sörbom, 1993a) was used to
compute 200 covariance matrices and the program PRELIS 1.10 (Jöreskog &
Sörbom, 1988) to compute 200 Polychoric correlation matrices from the raw data
for the variables y1 to y4. Asymptotic sampling variances and covariances of the
elements of all matrices were also computed. The use of old versions of the program
is due to some anomalies found in the asymptotic variances/covariances of the
Polychoric correlations computed by PRELIS2.02.

1
        The weights allow for different signs and degrees of skewness and are shown below:

                                         η1 = 0.58x1 + 0.81x2
                                         η2 = 0.87x1 + 0.40x2 + 0.29x3

        where x1 to x3 are standardised variables distributed as a χ2 with 1 degree of freedom. A reversal in the
        sign of x2 leads to negative skewness in η1 and positive skewness in η2.

Table 2. Description of 5 distributional conditions (DC)

DC   distrib.    distrib.   transfor-   vars.     Freqs. (%)         % variance
     η factors   y vars     mation                 1  2  3  4  5     expl. by η
1    normal      symmetric  low         y1, y2     8 24 36 24  8       0.69
                                        y3, y4     8 24 36 24  8       0.69

2    skewsame    skewsame   low         y1, y2    26 46 19  6  3       0.66
                                        y3, y4    23 46 21  7  3       0.65

3    skewdiff    skewdiff   low         y1, y2     4 11 44 35  6       0.61
                                        y3, y4    12 47 30  8  3       0.62

4    normal      skewsame   high        y1, y2    26 46 19  6  3       0.64
                                        y3, y4    23 46 21  7  3       0.64

5    normal      skewdiff   high        y1, y2     4 11 44 35  6       0.66
                                        y3, y4    12 47 30  8  3       0.65

skewsame: skewed in the same direction; skewdiff: skewed in opposite directions.

The program LISREL 8.02 (Jöreskog & Sörbom, 1989, 1993b) was used to fit the
model in Figure 4 by Weighted Least Squares (WLS) on all matrices. WLS is the
best available large-sample estimation procedure for both measures of association.
WLS is the only method claimed to be correct for Polychoric correlations (Jöreskog,
1990; Jöreskog & Sörbom, 1993a). As regards covariances, WLS is equivalent
(Jöreskog & Sörbom, 1988; Jöreskog, 1990) to the Asymptotic Distribution-Free
(ADF) procedure (Browne, 1984), which is asymptotically optimal even if the data
are non-normal. Muthén and Kaplan (1985, 1989) report a good large-sample
behaviour of ADF estimates for a variety of distributions, at least for small models.
This possibility of dealing with non-normality is currently available for covariances
only and has to be taken into account when deciding which of the measures of
association to adopt.

The specification of the model in LISREL left all loadings, factor variances-
covariances and error variances in Figure 4 free, except λ11 and λ32, which were
constrained to be equal to 1 so as to fix the scale of the η factors. 100.0% of the
replications converged to an empirically identified and admissible LISREL solution
for both measures of association and all DC's. Graphical analyses of the Monte
Carlo results revealed no anomalous estimates.

Estimates of the correlation coefficients
          In this section the performance of the measures of association themselves,
i.e. their ability to reflect the relationships between pairs of variables correctly, is
evaluated.

In order to make the results comparable, the covariance matrices were rescaled
to correlations. The results are presented in the first columns of Table 3. The rows
of the table are the elements of the correlation matrix for a given DC, and the
columns contain criteria for the quality of the estimates for a given measure of
association: bias (Monte Carlo average of the correlation estimates minus the value
in Table 1), SD (Monte Carlo standard deviation of the correlation estimates) and
MSE (Mean Squared Error, the Monte Carlo average of the squared deviation of the
correlation estimates from the values in Table 1). Bias and SD are multiplied by 100
(for instance, a bias of 2 means that the average correlation is 0.02 higher than the
true value, and a SD of 2 means a standard deviation of 0.02), and for consistency
MSE is multiplied by 10,000. Because of the symmetry of the model (y1 is
generated in the same way as y2, y3 is generated in the same way as y4) the results for
some of the correlations are identical except for sampling fluctuations, which allows
us to show only 3 of them. Bias is omitted if it is not significantly different from
zero.
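The three criteria can be computed as follows (a minimal sketch; the function name and the illustrative estimates are not taken from the chapter):

```python
import numpy as np

def monte_carlo_criteria(estimates, true_value):
    """Bias, SD and MSE, scaled as in Table 3 (x100, x100, x10,000)."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    sd = estimates.std(ddof=1)
    mse = np.mean((estimates - true_value) ** 2)
    return 100 * bias, 100 * sd, 10_000 * mse

# Hypothetical run: 200 correlation estimates around a true value of 0.75,
# centred below it, as for an attenuated covariance-based estimate
rng = np.random.default_rng(1)
est = 0.69 + 0.02 * rng.standard_normal(200)
print([round(v, 1) for v in monte_carlo_criteria(est, 0.75)])
```

Note that MSE combines the two other criteria: it is approximately the squared bias plus the squared standard deviation, which is why a biased estimator can still have a small MSE if its sampling variability is low.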

For all DC's we find that covariances have the largest bias (around 0.10 in absolute
value), which is of negative sign (Quiroga, 1992). Note that the bias is larger for
DC's 4 and 5 than for DC 1, even though the distribution of the y* variables is the same.
This is due to the transformation errors.

For DC's 1, 4 and 5, we find that the assumptions of Polychoric correlations are all
met, and they show no significant bias. For the rest of the DC's, Polychoric
correlations are biased, but less than covariances (biases remain around 0.05 in
absolute value).

This confirms the results in the literature: Polychoric correlations reflect the
relationships between pairs of variables better than covariances or Pearson
correlations.

Estimates of factor correlations and measurement quality
        The results in this section reflect the situation in which several indicators
per factor are available and the researcher is mainly interested in the relationships
among the factors and the measurement quality. In this situation, getting wrong
correlations among the observed variables is fairly irrelevant as long as the
measurement quality and factor correlation estimates are unbiased.

Table 3. Results of the Monte Carlo experiment

             bivariate correlations                   CFA model estimates
            covariances    Polychoric               covariances    Polychoric
DC vars.    bias SD MSE   bias SD MSE   par.        bias SD MSE   bias SD MSE
   y2-y1     -6   2  37         2   3   ρ(η1,η2)     -0   2   4         2   3
1  y3-y1     -5   2  30         2   5   λ²11          1   3   9         3   9
   y4-y3     -6   2  37         2   3   λ²32          1   3   7         3   7

   y2-y1     -8   2  65    -4   2  24   ρ(η1,η2)      1   2   5    1    2   6
2  y3-y1     -6   3  45    -3   3  18   λ²11          2   3  11   -4    3  28
   y4-y3     -8   2  75    -5   2  30   λ²32          2   3  13   -5    4  38

   y2-y1    -12   2 139    -5   2  33   ρ(η1,η2)     -2   2   7    2    2   9
3  y3-y1    -10   2 109    -3   2  17   λ²11          3   3  17   -5    3  39
   y4-y3    -10   2 110    -6   2  41   λ²32          3   3  16   -6    3  45

   y2-y1     -8   2  62         2   3   ρ(η1,η2)     -1   2   6         2   4
4  y3-y1     -7   2  55         2   5   λ²11          4   3  21         3   8
   y4-y3     -8   2  66         2   3   λ²32          3   3  18         3   8

   y2-y1     -9   2  76         2   3   ρ(η1,η2)     -2   2   8         2   4
5  y3-y1     -8   2  74         2   5   λ²11          1   3   8         3   8
   y4-y3     -8   2  70         2   3   λ²32          2   3  11         3   9

Covariances have been rescaled to correlations. Bias is omitted if it is not
significantly different from zero.
bias:  Monte Carlo average minus population value (×100)
SD:    Monte Carlo standard deviation (×100)
MSE:   Monte Carlo mean squared deviation from the population value (×10,000)
λ²i:   squared standardised loadings. Bias and MSE are computed with respect to a true value of 0.75 for
       Polychoric correlations and with respect to the true values in Table 2 for covariances.

The results are presented in the last columns of Table 3 and concern the factor
correlation and the squared standardised λ estimates (Because of symmetry of the
model, it is enough to show the results for λ11 and λ32).

First of all it has to be considered that the squared standardised λ's mean something
different depending on whether one uses Polychoric correlations or covariances. In
both cases, the squared standardised λ's can be interpreted as
measurement quality. In particular, given the fact that the model in Figure 4 is not a
True Score model, the squared standardised λ's can be interpreted as the product of
reliability and validity.

If Polychoric correlations are used, the underlying y*i variables are focused upon, so
that the squared standardised λik is an estimate of the percentage of variance of y*i
explained by ηk, i.e. of the measurement quality of the hypothetical underlying
continuous variable y*i. Therefore, it has to be close to 0.87² = 0.75, as can be seen
from the parameter values of the simulated model.

If covariances are used, the ordinal observed yi variables are focused upon, so that
the squared standardised λik is an estimate of the percentage of variance of yi
explained by ηk, i.e. of the measurement quality of the ordinal yi variable. The yi
variables are contaminated by grouping and transformation errors, so that their
measurement quality is reduced and the population percentage of variance of yi
explained by ηk is less than 0.75. The true percentages are not known but they can be
closely approximated as the average of the 200 squared correlations between the yi
and ηk simulated data. These values are shown in the last column of Table 2 and
were used instead of 0.75 as a reference to compute the bias and MSE of the
squared standardised λ estimates based on covariances.
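This approximation of the population measurement quality of an ordinal y variable can be sketched as follows (an illustrative sketch with an assumed loading of 0.87 and assumed equally spaced thresholds; the `replication_quality` name is a convenience of this example):

```python
import numpy as np

rng = np.random.default_rng(2)

def replication_quality(n=1000, lam=0.87, tau=(-1.5, -0.5, 0.5, 1.5)):
    """Squared correlation between a categorized indicator and its factor
    for one simulated replication."""
    eta = rng.standard_normal(n)
    y_star = lam * eta + np.sqrt(1 - lam**2) * rng.standard_normal(n)
    y = np.searchsorted(tau, y_star) + 1  # 5-point categorization
    return np.corrcoef(y, eta)[0, 1] ** 2

# Averaging over 200 replications approximates the population quality of
# the ordinal y, which falls below the 0.75 quality of the continuous y*.
quality = np.mean([replication_quality() for _ in range(200)])
print(round(quality, 2))
```

The averaged value lands below 0.75, quantifying the loss of measurement quality due to categorization for this particular threshold pattern.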

For DC 1 all approaches perform very well. There is no bias for the Polychoric
approach and biases in the estimates based on covariances are very slight (around
0.01 in absolute value), which contrasts with the large bias in the correlations
themselves.

For DC's 2 and 3 (non-normality) the bias in the estimates based on covariances
remains low while it gets larger for Polychoric correlations. This is especially so for
DC 3, in which skewness in opposite directions is involved and biases can exceed 0.05
in absolute value. We can now come to a conclusion which the experimental design
in Chapter 6 did not allow us to reach: covariances and Pearson correlations can be
applied to skewed data, as long as the latent factors are also skewed.

For DC's 4 and 5 (transformation errors), the result is reversed. Polychoric
correlations lead to unbiased estimates whereas covariances lead to some bias
(sometimes close to 0.05 in absolute value), which confirms the results in Chapter 6.

It is then confirmed that Polychoric correlations rely on the normality assumption. It
is also confirmed that the use of covariances correctly results in lower measurement
quality estimates, which partially makes up for the biased covariance estimates, so
that parameters linking factors with each other are not as biased as might be
expected from the bias in the covariances themselves. Covariances seem to be fairly
robust unless large transformation errors are present. Polychoric correlations, on the
contrary, are correct regardless of the transformation errors as long as the underlying
variables are normal.

However, the biases which can be expected are altogether of reasonable magnitude
(usually below 0.05 in absolute value for standardised parameters), so that the
consequences of a wrong choice of measure of association will often not be
dramatic.

Determinants of the choice
         The choice between covariances or Pearson correlations and Polychoric
correlations should be based upon the interest of the researcher and upon the
existence or not of large transformation errors and deviations from normality of the
y* variables.

If the interest of the researcher is concentrated upon the y* variables and their
measurement quality, Polychoric correlations should be used. If it is concentrated
upon the y variables, covariances or Pearson correlations should be used. If the
researcher is interested in measurement errors which are not related to
categorization, then Polychoric correlations should be used. If the researcher is
interested in measurement quality altogether (including the effects of
categorization), or in assessing the effects of categorization on measurement quality,
then Pearson correlations should be used.

It has been shown that the measurement quality estimates mean something different
depending on whether Pearson or Polychoric correlations are used because the
measurement quality of y and y* differ. These measurement quality estimates are not
interchangeable. In this book it is suggested that correlation matrices be corrected
for attenuation and method effects by using appropriate measurement quality
estimates. Measurement quality estimates obtained from covariances or Pearson
correlations can only be used to correct covariance or Pearson correlation matrices for
measurement error. Measurement quality estimates obtained from Polychoric
correlations can only be used to correct Polychoric correlation matrices.
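As a concrete sketch of the correction for attenuation referred to above: an observed correlation is divided by the square root of the product of the two measurement quality (squared standardised loading) estimates. All numbers below are hypothetical, and, per the rule just stated, the quality estimates must come from the same type of matrix (here Pearson) as the correlation being corrected:

```python
import math

# Hypothetical measurement quality (squared standardised loading) estimates
# for two observed variables, and their observed Pearson correlation.
# All three numbers are illustrative only.
q1, q2 = 0.70, 0.60
r12 = 0.35

# Correction for attenuation: estimate of the correlation between the latent traits.
rho12 = r12 / math.sqrt(q1 * q2)
print(round(rho12, 3))   # 0.54
```

The same formula applies to a Polychoric correlation matrix, but then q1 and q2 must themselves be quality estimates obtained from Polychoric correlations.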

So, the results of the meta-analyses done in Chapters 12 and 13 can only be
compared to the results of other measurement quality studies in which the same type
of measure of association has been used.

As regards deviations from normality and transformation errors, the researcher
should not worry too much. The results in Table 3 suggest that if m=5, the
distortions are usually not very large. The bias in the relevant estimates of CFA
models remains below 0.05 in most cases, even though the skewness and
transformation errors present in some DC’s were quite high compared to what can
be expected in applied research. Coenders and Saris (1995) carried out another
simulation study using m=3 and obtained similar results. Therefore, in line with
Homer and O'Brien (1988), we suggest that the choice of measure of association has
no critical consequences for the substantive conclusions which may be drawn from
the estimates of a CFA model, apart from what has been said about the different
interpretation of the factor loadings.

In spite of what has been said in the previous paragraph, a researcher interested in a
fine-tuned choice of measure of association can take the following into account. If
normal underlying variables are categorized with about equally-spaced thresholds,
then Polychoric correlations should be preferred, but covariances and Pearson
correlations will also yield approximately correct results. If normal underlying
variables are subject to large transformation errors, then Polychoric correlations
should be preferred. If non-normal underlying variables are categorized with about
equally-spaced thresholds, then covariances or Pearson correlations should be
preferred.

The researcher is in a more difficult position than suggested by Chapter 6, as both
Pearson and Polychoric correlations can lead to wrong results on skewed y data
when y* is symmetric (Figure 2), and when y* is skewed (Figure 3) respectively. The
problem is that sometimes, when seeing a skewed ordinal variable, it is impossible
to know whether it comes from a normal underlying variable which has been
categorized with a set of unequally-spaced thresholds or from a skewed underlying
variable categorized with a set of equally-spaced thresholds, because the
conventional normality tests lack power. Quiroga (1992) reports powers which
are mostly below 50%, and often about as low as the type I risk, even for large
sample sizes and for deviations from normality capable of causing a bias as large as
0.1 in the Polychoric correlations.
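A small simulation illustrates why the observed distribution cannot settle the question. Both mechanisms below, with illustrative thresholds and a standardised chi-square as a stand-in for a skewed underlying variable, produce positively skewed 5-point variables:

```python
import numpy as np

def skew(y):
    """Sample skewness of an observed 5-point variable."""
    d = y - y.mean()
    return (d**3).mean() / y.std()**3

rng = np.random.default_rng(1)
n = 100_000

# Mechanism A: normal underlying variable categorized with
# unequally-spaced thresholds (a transformation error).
ya = np.digitize(rng.standard_normal(n), [-0.3, 0.4, 1.0, 1.7]) + 1

# Mechanism B: skewed underlying variable (standardised chi-square)
# categorized with equally-spaced thresholds.
z = rng.chisquare(4, n)
z = (z - z.mean()) / z.std()
yb = np.digitize(z, [-1.0, -0.33, 0.33, 1.0]) + 1

# Both observed variables are positively skewed, so the observed
# distribution alone cannot reveal which mechanism generated the data.
print(round(skew(ya), 2), round(skew(yb), 2))
```

Looking only at the observed frequencies (or their skewness), a researcher cannot tell whether Pearson or Polychoric correlations would be the biased choice for such data.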

Given this lack of power, exploratory data analysis may not help decide which
measure of association to use. Sometimes, theory or experience helps to determine
the shape of y*. For instance, it is known that data regarding the frequency of
behaviour are often positively skewed, whereas attitudinal variables (such as the
ones mostly dealt with in this book) can be closer to normality. Theory and
experience can also help to determine the kind of categorization generated by a
particular questionnaire response scale. Figure 5 shows two possible 5-point
response scales for the measurement of satisfaction. Large transformation errors
seem more likely to arise from the second scale with asymmetric category labels,
than from the first one, at least if the mean of the latent factor is not extremely high.

Figure 5. Two 5-point scales for the measurement of satisfaction leading to
different degrees of transformation

What is your degree of satisfaction concerning                ?

Scale 1:
1) Completely dissatisfied
2) Dissatisfied
3) Neither satisfied nor dissatisfied
4) Satisfied
5) Completely satisfied

Scale 2:
1) Dissatisfied
2) Neutral
3) Fairly satisfied
4) Very satisfied
5) Completely satisfied
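The intuition behind Figure 5 can be sketched numerically. If the asymmetric labels of Scale 2 correspond to thresholds shifted towards the dissatisfied end of the continuum, a normal underlying satisfaction variable yields a skewed observed variable, whereas the symmetric labels of Scale 1 do not. The threshold values below are illustrative guesses, not estimates from any questionnaire:

```python
import numpy as np

def skew(y):
    """Sample skewness of an observed 5-point variable."""
    d = y - y.mean()
    return (d**3).mean() / y.std()**3

rng = np.random.default_rng(2)
ystar = rng.standard_normal(100_000)   # normal underlying satisfaction y*

# Scale 1 (symmetric labels): roughly equally-spaced, symmetric thresholds.
scale1 = np.digitize(ystar, [-1.5, -0.5, 0.5, 1.5]) + 1

# Scale 2 (labels crowded on the satisfied side): thresholds shifted towards
# the dissatisfied end, so most responses fall in the upper categories.
scale2 = np.digitize(ystar, [-1.5, -1.0, -0.3, 0.5]) + 1

# Scale 1 yields a roughly symmetric observed variable; Scale 2 yields a
# clearly skewed one, i.e. a large transformation error.
print(round(skew(scale1), 2), round(skew(scale2), 2))
```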

References
Babakus, E., Ferguson, C. E., & Jöreskog, K. G. (1987). The Sensitivity of
        Confirmatory Maximum Likelihood Factor Analysis to Violations of
        Measurement Scale and Distributional Assumptions. Journal of Marketing
        Research, 24, 222-229.
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York:
        Wiley.
Bollen, K. A., & Barb, K. H. (1981). Pearson's r and Coarsely Categorized Data.
        American Sociological Review, 46, 232-239.
Browne, M. W. (1984). Asymptotically Distribution-Free Methods for the Analysis
        of Covariance Structures. British Journal of Mathematical and Statistical
        Psychology, 37, 62-83.
Coenders, G., & Saris, W. E. (1995). Alternative Approaches to Structural Modeling
        of Ordinal Data. A Monte Carlo Study. Paper presented at the 1995
        European Meeting of the Psychometric Society. Leiden.
Homer, P., & O'Brien, R. M. (1988). Using LISREL Models with Crude Rank
        Category Measures. Quality and Quantity, 22, 191-201.
Johnson, D. R., & Creech, J. C. (1983). Ordinal Measures in Multiple Indicator
        Models: A Simulation Study of Categorization Error. American
        Sociological Review, 48, 398-407.
Jöreskog, K. G. (1990). New Developments in LISREL. Analysis of Ordinal Variables
        using Polychoric Correlations and Weighted Least Squares. Quality and
        Quantity, 24, 387-404.
Jöreskog, K. G., & Sörbom, D. (1988). PRELIS: A Program for Multivariate Data
        Screening and Data Summarization. A Preprocessor for LISREL.
        Mooresville: Scientific Software, Inc.
Jöreskog, K. G., & Sörbom, D. (1989). LISREL 7: A Guide to the Program and
        Applications. SPSS Publications.
Jöreskog, K. G., & Sörbom, D. (1993a). New Features in PRELIS2. Scientific
        Software International.
Jöreskog, K. G., & Sörbom, D. (1993b). New Features in LISREL8. Scientific
        Software International.
Minitab Inc. (1991). MINITAB Reference Manual. Release 8. PC Version. State
        College: Minitab Inc.
Muthén, B., & Kaplan, D. (1985). A Comparison of Some Methodologies for the
        Factor Analysis of Non-Normal Likert Variables. British Journal of
        Mathematical and Statistical Psychology, 38, 171-189.
Muthén, B., & Kaplan, D. (1989). A Comparison of Some Methodologies for the
        Factor Analysis of Non-Normal Likert Variables. A Note on the Size of the
        Model. British Journal of Mathematical and Statistical Psychology, 45, 19-
        30.
Neumann, L. (1982). Effects of Categorization on the Correlation Coefficient.
        Quality and Quantity, 16, 527-538.
O'Brien, R. M. (1985). The Relationship Between Ordinal Measures and their
        Underlying Values: Why all the Disagreement? Quality and Quantity, 19,
        265-277.
O'Brien, R. M., & Homer, P. (1987). Correction for Coarsely Categorized Measures.
        LISREL's Polyserial and Polychoric Correlations. Quality and Quantity, 21,
        349-360.
Olsson, U. (1979a). Maximum Likelihood Estimation of the Polychoric Correlation
        Coefficient. Psychometrika, 44, 443-460.
Olsson, U. (1979b). On the Robustness of Factor Analysis against Crude
        Categorization of Observations. Multivariate Behavioral Research, 14, 485-500.
Quiroga, A. M. (1992). Studies of the Polychoric Correlation and Other
        Correlation Measures for Ordinal Variables. Doctoral dissertation.
        University of Uppsala.
Rigdon, E. E., & Ferguson, C. E., Jr. (1991). The Performance of the Polychoric
        Correlation Coefficient and Selected Fitting Functions in Confirmatory
        Factor Analysis with Ordinal Data. Journal of Marketing Research, 28,
        491-497.