
G. LOVISON ET AL.


           5.2. GLMMs as a solution to the “pseudo-replication” problem
           Hurlbert (1984), in a seminal paper, drew attention to the risks of applying standard statistical techniques to non-independent observations,
           which he called “pseudo-replications”.
            The pseudo-replication problem is defined by Hurlbert as “. . . the use of inferential statistics to test for treatment effects with data from
           experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent”.
            Pseudo-replication most commonly results from wrongly treating multiple samples from one experimental unit as (independent) multiple
           experimental units. This improper statistical treatment of such data implies an overestimate of the true variation, an increased risk of
           Type II error and, as a consequence, the danger of drawing invalid conclusions.
            A partial, and often inadequate, solution to the problem is “sub-sampling”: given repeated measurements on the same unit,
           only a random sub-sample of these measurements is analyzed, in order to attenuate correlation and obtain approximately independently
           distributed observations, to which standard statistical methods can then be applied. In its most extreme version, only one measurement is
           randomly drawn for each unit, i.e., the sub-sample size is one. As will be shown in Section 5.3, although sub-sampling attenuates
           correlation, it also implies a loss of information (due to the reduction of the total sample size) and therefore requires a higher number of
           sampling units to ensure a specified level of efficiency (in estimation) and power (in testing).
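
            The extreme version of sub-sampling described above can be sketched in a few lines of Python. This is only an illustration: the
           function name `subsample_one_per_unit`, the (unit, value) record layout and the toy measurements are hypothetical and are not taken
           from the paper.

```python
import random
from collections import defaultdict


def subsample_one_per_unit(records, seed=0):
    """Keep one randomly drawn measurement per unit (sub-sample size one).

    `records` is a list of (unit_id, value) pairs in long format.
    """
    rng = random.Random(seed)
    by_unit = defaultdict(list)
    for unit_id, value in records:
        by_unit[unit_id].append(value)
    # One random draw per unit; all other measurements are discarded.
    return {unit_id: rng.choice(values) for unit_id, values in by_unit.items()}


# Hypothetical toy data: three shoots with 3, 1 and 2 repeated measurements.
records = [
    ("shoot_a", 10.2), ("shoot_a", 9.8), ("shoot_a", 9.1),
    ("shoot_b", 7.4),
    ("shoot_c", 12.0), ("shoot_c", 11.5),
]
sub = subsample_one_per_unit(records)
print(len(records), "observations reduced to", len(sub))
```

            The sample size drops from the number of measurements to the number of units, which is the loss of information mentioned above.
           Applied to the Petrosino data of Section 5.3, a scheme of this kind would retain 112 of the 880 observations.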



           5.3. Longitudinal analysis of Sicily PosiData-1
           To illustrate the different results that can be reached when dependence in longitudinal data is handled through different approaches, we
           selected the data from the Petrosino site. At this site 112 shoots are available, for a total of 880 observations, giving an average
           of about 7.86 lepidochronological years per shoot; the number of years, i.e., the length of the longitudinal series, actually ranges
           from a minimum of 1 to a maximum of 21. The three sampled meadows are located at depths of 6, 15, and 25 m, respectively.
            We set out to model the dependence of Rhizome elongation on Year, Age of the shoot and Depth.
            We first do so by completely ignoring the longitudinal nature of the data and treating them as a sample of 880 independent observations,
           fitting a standard GLM with Gamma distribution and logarithmic link. The results are reported in Table 5.
            These results suggest that there are significant (negative) main effects of Year and Depth, while Age has no significant effect, either by
           itself or in interaction with Year.
            If we take dependence into account, however, we get a completely different picture. The results of fitting a GLMM with Gamma distribution
           and logarithmic link, which accounts for dependence by assuming random intercepts for the 112 shoots, are reported in Table 6.
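
            Under the description above, the random-intercept model can be written roughly as follows. The notation is an assumption introduced
           here for illustration and does not come from the paper: y_{ij} is the Rhizome elongation of shoot i in year j, \mu_{ij} its conditional
           mean, \nu a Gamma shape parameter, and b_i the shoot-specific random intercept, taken to be normally distributed as is customary for
           GLMMs. Depth is indexed by shoot only, on the assumption that it does not vary within a shoot. The fixed-effect terms are those
           listed in Table 6.

```latex
\begin{align*}
y_{ij} \mid b_i &\sim \mathrm{Gamma}(\mu_{ij}, \nu), \qquad i = 1, \dots, 112, \\
\log \mu_{ij} &= \beta_0 + \beta_1\,\mathrm{Year}_{ij} + \beta_2\,\mathrm{Age}_{ij} + \beta_3\,\mathrm{Depth}_{i}
                 + \beta_4\,\mathrm{Year}_{ij}\,\mathrm{Age}_{ij} + \beta_5\,\mathrm{Year}_{ij}\,\mathrm{Depth}_{i} + b_i, \\
b_i &\sim N(0, \sigma_b^2).
\end{align*}
```

            Setting b_i = 0 for every shoot gives the independence GLM of Table 5; in this notation, the entry St. Dev (Intercept) in Table 6
           corresponds to the estimate of \sigma_b.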
            Not only are the main effects of all three explanatory variables now highly significant, but the interactions of Year with Age
           and of Year with Depth are also significant, suggesting that environmental conditions have worsened over time and have a negative effect on
           the growth performance of P. oceanica, not only directly but also through the interaction with Age.
            The striking difference between the results obtained when ignoring and when considering intra-shoot dependence shows not only that such
           dependence exists, but also that its inclusion in the model strongly affects the substantive interpretation of the data. This conclusion contradicts that of


            Table 5. Petrosino: GLM Gamma-Log assuming independence

                                     Estimate    Std. Error    t-ratio    p-value
            Intercept                  5.0007        0.7046      7.098      0.000
            Year                      -0.0477        0.0164     -2.907      0.004
            Age                        0.0452        0.0655      0.690      0.490
            Depth                     -0.7715        0.3151     -2.448      0.014
            Year:Age                  -0.0011        0.0015     -0.716      0.474
            Year:Depth                 0.0109        0.0073      1.499      0.134




            Table 6. Petrosino: GLMM Gamma-Log assuming random intercepts

                                     Estimate    Std. Error    t-ratio    p-value
            Intercept                  6.4224        0.5910     10.866      0.000
            Year                      -0.0824        0.0141     -5.833      0.000
            Age                        0.1032        0.0396      2.603      0.009
            Depth                     -1.5152        0.2237     -6.773      0.000
            Year:Age                  -0.0022        0.0008     -2.574      0.010
            Year:Depth                 0.0274        0.0049      5.542      0.000
            St. Dev (Intercept)        0.4115
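
            The t-ratios in Tables 5 and 6 appear to be the estimates divided by their standard errors. The Python sketch below, which is not part
           of the paper, recomputes them from the printed magnitudes and compares the standard errors of the two fits. The names `glm`, `glmm`,
           `t_ratio` and `se_ratios` are illustrative, signs are ignored, and because the printed values are rounded the recomputed ratios can
           differ from the reported ones in the last digits.

```python
# (|estimate|, std. error, reported |t-ratio|) as printed in Tables 5 and 6.
glm = {
    "Intercept":  (5.0007, 0.7046, 7.098),
    "Year":       (0.0477, 0.0164, 2.907),
    "Age":        (0.0452, 0.0655, 0.690),
    "Depth":      (0.7715, 0.3151, 2.448),
    "Year:Age":   (0.0011, 0.0015, 0.716),
    "Year:Depth": (0.0109, 0.0073, 1.499),
}
glmm = {
    "Intercept":  (6.4224, 0.5910, 10.866),
    "Year":       (0.0824, 0.0141, 5.833),
    "Age":        (0.1032, 0.0396, 2.603),
    "Depth":      (1.5152, 0.2237, 6.773),
    "Year:Age":   (0.0022, 0.0008, 2.574),
    "Year:Depth": (0.0274, 0.0049, 5.542),
}


def t_ratio(estimate, std_error):
    """Ratio of an estimate to its standard error."""
    return estimate / std_error


def se_ratios(independent, mixed):
    """Standard error under the mixed model relative to the independence model."""
    return {term: mixed[term][1] / independent[term][1] for term in independent}


for term, ratio in se_ratios(glm, glmm).items():
    print(f"{term:<11} GLM |t| = {t_ratio(*glm[term][:2]):6.3f}   "
          f"GLMM |t| = {t_ratio(*glmm[term][:2]):6.3f}   SE ratio = {ratio:.2f}")
```

            For every term the standard error in Table 6 is smaller than the corresponding one in Table 5, which is why the same explanatory
           variables reach much larger t-ratios once intra-shoot dependence is modelled.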



           wileyonlinelibrary.com/journal/environmetrics  Copyright © 2010 John Wiley & Sons, Ltd.  Environmetrics 2011; 22: 370–382