Potential of interval partial least square regression in estimating leaf area index

FUNDING: None

Leaf area index (LAI) is a critical parameter in determining vegetation status and health. In tropical grasslands, reliable determination of LAI, useful in determining above ground biomass, provides a basis for rangeland management, conservation and restoration. In this study, interval partial least square regression (iPLSR) in forward mode was compared to partial least square regression (PLSR) to estimate LAI from in-situ canopy hyperspectral data on a heterogeneous grassland at different periods (onset, mid and end) during summer. The performance of the two techniques was determined using the lowest root mean square error of prediction relative to the mean (nRMSEP) and the highest coefficient of determination (R²p) between the predicted and the measured LAI. Results show that iPLSR models could explain LAI variation with R²p values ranging from 0.81 to 0.93 and low nRMSEP from 9.39% to 24.71%. The highest accuracies for estimates of LAI using iPLSR were at mid- and end of summer (R²p = 0.93 and nRMSEP = 9.39%; R²p = 0.89 and nRMSEP = 10.50%, respectively). Pooling data sets from the three assessed periods yielded the highest prediction error (nRMSEP = 24.71%). Results show that iPLSR performed better than PLSR, which yielded R²p and nRMSEP values ranging from 0.36 to 0.65 and from 28.44% to 33.47%, respectively. Overall, this study demonstrates the value of iPLSR in predicting LAI and therefore provides a basis for more accurate mapping and monitoring of canopy characteristics of tropical grasslands.


Introduction
Measurement of the spatio-temporal distribution of quantitative variables like leaf area index (LAI) and biomass is valuable for assessing the health and productivity of tropical grasslands.1 Several studies (e.g. Prins and Beekman2) have associated vegetation characteristics such as LAI and biomass with animal grazing patterns. Therefore, quantitative assessment of such characteristics offers great potential for determining grassland conditions, which is useful for generating optimal management guidelines for grazing and rangeland conservation and restoration. LAI has been recognised as a key biophysical parameter for determining vegetation characteristics.3 LAI determines vegetation biophysical processes such as photosynthesis, canopy water interception, transpiration, radiation extinction, carbon loads and nutrient sequestration.4,5 Consequently, LAI is commonly used as a key input for modelling vegetation foliage cover, growth and productivity and effects of disturbances such as drought and climate change on vegetation communities.6 Previous studies in which LAI was estimated on tropical grasslands have emphasised their spatial variation.7 However, LAI is a biophysical parameter that is spatially and temporally dynamic across a landscape. According to Shen et al.8, the performance of biophysical process models is highly sensitive to the temporal and spatial variation of LAI. For example, Xu and Baldocchi9 note that well-timed data collection on changes in LAI could be used to explain more than 84% of the variance in gross primary production, an important input in the carbon cycle of an ecosystem. Therefore, analysis of temporal and spatial changes in LAI at the canopy level provides a valuable opportunity for modelling biophysical processes.
Traditionally, direct (e.g. destructive sampling) and indirect (e.g. use of a ceptometer canopy analyser and hemispherical canopy photography) methods are used to determine LAI in grasslands.8,10,11 Typically, the direct methods consist of manually determining LAI using planimetric or volumetric techniques. Although these approaches are simple and reliable, they involve destructive sampling and are labour intensive, costly and time consuming.1,12 These factors limit the application of direct methods for estimating LAI, particularly in large spatial extents that require frequent monitoring.6 Indirect methods, like the use of a spectrometer, quantify LAI by measuring spectral reflectance which is then used as a proxy for LAI. Generally, such indirect methods are quick and can be automatically processed, thus allowing their application in a larger sampling area.10 Remotely sensed spectral data present an opportunity to indirectly retrieve LAI in heterogeneous grasslands.1 Techniques that rely on remotely sensed spectral data are non-destructive, relatively quick and cost-effective, and therefore valuable for large spatial and multi-temporal monitoring.8,13,14 The literature shows that canopy hyperspectral data, acquired using handheld spectrometers, have been widely adopted to derive LAI in heterogeneous grasslands.15,16 According to Hansen and Schjoerring16, such data provide hundreds or even thousands of spectral bands with information sensitive to specific vegetation variables valuable for modelling. Although Lee et al.17 demonstrated that models generated from hyperspectral data predicted LAI better than those from broadband spectral data, the large amount of spectral information that characterises hyperspectral data makes derivation of LAI from heterogeneous grasslands data challenging.7 Additionally, hyperspectral data sets suffer from multicollinearity that often occurs when many adjacent spectral bands present a high degree of redundancy and correlation.18 Tropical grassland LAI retrieval using canopy reflectance is further complicated by varying species composition, phenology and proportions and complex canopy architecture.
A number of studies (e.g. Nguyen and Lee19) that have adopted canopy reflectance hyperspectral data to derive LAI have demonstrated the superiority of partial least square regression (PLSR) over traditional regression techniques. The technique was introduced to solve multicollinearity and overfitting problems by reducing variables to fewer components.18 The PLSR technique is a full-spectrum method that simultaneously uses all available wavebands to create models. Compared to other algorithms, PLSR is less restrictive because it can be run on data for which the sample size is smaller than the number of predictor variables.20 The technique is particularly useful for removing uninformative bands and retaining those useful for predicting response variables. Consequently, it has become valuable for improving, inter alia, model predictions by reducing data collection costs, interpretation complexity and data dimensionality.5,21 Moreover, PLSR combines the characteristics of popular statistical techniques such as stepwise multiple regression and principal component regression. In several studies, PLSR turned out to be more robust than the regression techniques with which it was compared.7,22,23 Furthermore, similar performance was found between radiative transfer and PLSR models in estimating LAI.24 Although the use of PLSR, a full-spectrum technique, has gained popularity in hyperspectral data modelling,18,19,25 studies in fields like chemometrics have suggested that interval partial least squares regression (iPLSR), a variant of PLSR, can reduce hyperspectral data into band portions valuable for more accurate prediction.26,27 Developed by Norgaard et al.26, iPLSR is a graphically oriented technique for local regression modelling of spectral data. Unlike PLSR, it visually provides a general overview of relevant information in different spectral regions, thereby identifying important portions of the electromagnetic spectrum and discarding interference from irrelevant portions. Norgaard et al.26, for instance, used spectra for beer samples to retrieve original extract concentration by comparing iPLSR, PLSR and other algorithms. They found that iPLSR improved the determination coefficient and root mean square error of prediction of full-spectrum PLSR from 0.993 and 0.40% to 0.998 and 0.17%, respectively. Although this approach offers great promise in improving landscape modelling accuracy, no studies have used iPLSR on ground-based hyperspectral data collected from heterogeneous landscapes such as tropical grasslands.
To determine the value of specific spectral bands or regions to our models, we applied iPLSR to the entire electromagnetic spectrum. However, several studies have identified different spectral regions to relate to LAI variations. For example, Darvishzadeh et al.7 and Zhao et al.28 found that LAI-related bands were between near infrared (NIR) and short-wave infrared (SWIR) spectral regions. The same studies also noted that bands in the visible region (e.g. 440 nm) were valuable in LAI modelling. The relationship between LAI and red-edge bands has been established in several studies.18,29,30 Generally, the value of a spectral band or region in estimating LAI depends on the vegetation status. For instance, at senescence, the amount of chlorophyll drops, thus increasing the radiation of NIR and SWIR spectral bands and their contribution in modelling biochemical or biophysical parameters.28 Consequently, we sought to pursue three objectives: (1) to identify useful bands for modelling LAI using iPLSR, (2) to compare heterogeneous tropical grassland LAI estimates using iPLSR and PLSR models based on hyperspectral data and (3) to evaluate the robustness of the two models in estimating multi-temporal tropical grassland LAI (i.e. onset of, mid- and end of summer) and pooled reflectance data during summer.

The study area
The study area was located in the Ukulinga Research Farm at the University of KwaZulu-Natal in Pietermaritzburg (30°24'S, 29°24'E) (Figure 1). The area is characterised by warm to hot summers and mild winters, which are accompanied by occasional frost. Mean monthly temperature ranges from 13.2 °C to 21.4 °C, with an annual mean of 17 °C.31,32 The farm receives over 106 days of rain per year, with an annual precipitation of about 680 mm. Soils originate from shallow marine shales of the Lower Permian Ecca Group and are classified as Westleigh forms. The area is under the Southern Tall Grassveld and is predominantly herbaceous as a result of frequent mowing and long-term burning.32 Themeda triandra Forssk., Heteropogon contortus (L.) P. Beauv. ex Roem. & Schult. and Tristachya leucothrix Trin. ex Nees dominate the area.33

Field sampling
Data were collected during the southern hemisphere summer (October 2014 to March 2015). Stratified random sampling with clustering was adopted to select sampling sites. The grassland area was first digitised from an aerial photograph (Figure 1) and stratified into north, south, east and west aspects. To select the plots, 10 x-y coordinates were randomly generated within each stratum using the Hawth tool. In total, 40 plots (30 m × 30 m) were selected and located in the field using a GPS (Trimble GEO XT, with an estimated 100-mm accuracy). Two or three subplots of 1 m × 1 m were randomly chosen within each plot to generate a final sample size of 100 subplots. Spectral and LAI data were then collected within the subplots at the onset of, mid- and end of summer.

Data collection
At each sampling point, LAI was acquired with a LAI-2200C Plant Canopy Analyzer using the procedure described by Darvishzadeh et al.7 Canopy reflectance was acquired using an analytical spectral device (ASD FieldSpec® 3 spectrometer, Boulder, CO, USA). The spectral range of the ASD FieldSpec® 3 spectrometer is 350 nm to 2500 nm, with 1.4-nm and 2-nm sampling intervals for the ultraviolet to visible and NIR region (350-1000 nm) and the SWIR region (1000-2500 nm), respectively. To normalise the spectra collected, the radiance of a white standard panel coated with barium sulfate and of known reflectivity was first recorded. Canopy reflectance measurements were made under clear sky between 10:00 and 14:00 local time to minimise atmospheric effects. To account for any changes in the atmospheric condition and the sun irradiance, reflectance measurements were recorded with frequent normalisation using the standard panel.34 In total, 15 replicates of canopy reflectance within each subplot were collected and averaged, allowing for elimination of measurement noise arising from soil background.

Pre-processing of hyperspectral data
To separate overlapping bands, thereby amplifying fine differences in the electromagnetic spectrum, the first-order derivative at three nanometres was applied to the resulting mean spectral data.35,36 The first-order derivative is also known to be useful in minimising atmospheric and background noise.14,20 A number of researchers7,37,38 have applied the first-order derivative to hyperspectral data for LAI estimation. The spectral regions of 350-399 nm, 1355-1420 nm, 1810-1940 nm and 2470-2500 nm (Figure 2) are known to be noisy and were discarded from the spectra.5,39
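The pre-processing described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes spectra resampled to 1-nm steps and interprets "at three nanometres" as a finite difference across a 3-band window, and the function names `first_derivative` and `drop_noisy` are hypothetical.

```python
# Noisy regions (nm) listed in the text; bounds are treated as inclusive.
NOISY_REGIONS = [(350, 399), (1355, 1420), (1810, 1940), (2470, 2500)]


def first_derivative(wavelengths, reflectance, gap=3):
    """Finite-difference first derivative across a window of `gap` bands.

    Assumption: spectra are resampled to 1 nm, so gap=3 spans 3 nm.
    Returns (wavelength, derivative) pairs located at the window midpoint.
    """
    pairs = []
    for i in range(len(wavelengths) - gap):
        d_wl = wavelengths[i + gap] - wavelengths[i]
        d_ref = reflectance[i + gap] - reflectance[i]
        midpoint = (wavelengths[i] + wavelengths[i + gap]) / 2
        pairs.append((midpoint, d_ref / d_wl))
    return pairs


def drop_noisy(pairs, regions=NOISY_REGIONS):
    """Discard bands whose wavelength falls inside any noisy region."""
    return [(wl, d) for wl, d in pairs
            if not any(lo <= wl <= hi for lo, hi in regions)]


# Toy spectrum (not field data): reflectance rising by 0.001 per nm.
wl = list(range(350, 2501))
ref = [0.001 * (w - 350) for w in wl]
deriv = drop_noisy(first_derivative(wl, ref))
```

For the toy spectrum every retained derivative is close to 0.001 per nm, which gives a quick check that the window and units are handled as intended.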

Partial least squares regression
Partial least squares regression was originally an econometric technique created by Herman Wold in the 1960s to construct predictive models from highly collinear explanatory variables.25 The principle of PLSR is first to decompose the explanatory variables (X) into a few non-correlated latent variables or components using information contained in the response variable (Y), and then to regress the new components against the response variable.23,43 According to Tan and Li44, Wang et al.45 and Yeniay and Goktas25, the model that underlies PLSR consists of three phases. In the first phase, explanatory (X) and response (Y) variables are decomposed based on the expressions:

$$X = TP^{\mathsf{T}} + E$$
$$Y = UQ^{\mathsf{T}} + F$$

where T and U are the respective matrices of scores of X and Y; P and Q stand for the matrices of loadings; and E and F for the errors of the X and Y matrices. In the second phase, the Y-scores (U) are predicted using the X-scores (T) based on the expression:

$$U = bT + e$$

where b represents the regression coefficient and e the error matrix of the relationship between Y-scores and X-scores. In the final phase, the predicted Y-scores are used to build predictive models of the response variable using the expression:

$$Y = bTQ^{\mathsf{T}} + G$$

where G is the error matrix related to estimating Y.
In the present study, we used the PLS Toolbox (Eigenvector Research Inc.) with MATLAB (version R2013b) to build PLSR models. Before running PLSR, pre-processed hyperspectral data along with LAI data were autoscaled.11 This procedure mean-centres each waveband and scales it to unit standard deviation.46 The PLSR was then run on the data using a leave-one-out cross-validation method. The lowest root mean square error (RMSE) and the highest coefficient of determination (R²) between the predicted and the measured Y variable were the two criteria used to select the best model with the optimal number of components. The best model was suggested by the software.
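As an illustration of the workflow just described (autoscaling, PLSR, leave-one-out cross-validation and choice of the number of components), the sketch below implements a single-response PLS by the NIPALS algorithm using only the Python standard library. It is not the PLS Toolbox implementation used in the study; the function names and the toy data are hypothetical, and autoscaling is refitted inside each cross-validation fold as an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_comp):
    """Fit a single-response PLS model (NIPALS) on autoscaled data."""
    x_par = [(mean(c), stdev(c)) for c in zip(*X)]   # per-band mean and SD
    y_mu, y_sd = mean(y), stdev(y)
    Xs = [[(v - m) / s for v, (m, s) in zip(row, x_par)] for row in X]
    ys = [(v - y_mu) / y_sd for v in y]
    comps = []
    for _ in range(n_comp):
        cols = list(zip(*Xs))
        w = [dot(c, ys) for c in cols]               # weights: covariance with y
        norm = sqrt(dot(w, w))
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xs]              # X-scores (T)
        tt = dot(t, t)
        p = [dot(c, t) / tt for c in cols]           # X-loadings (P)
        b = dot(ys, t) / tt                          # regression of y on scores
        Xs = [[v - ti * pj for v, pj in zip(r, p)] for r, ti in zip(Xs, t)]
        ys = [v - b * ti for v, ti in zip(ys, t)]    # deflate X and y
        comps.append((w, p, b))
    return {"x_par": x_par, "y_par": (y_mu, y_sd), "comps": comps}


def pls1_predict(model, x):
    """Predict one sample by projecting it on each component in turn."""
    xs = [(v - m) / s for v, (m, s) in zip(x, model["x_par"])]
    y_hat = 0.0
    for w, p, b in model["comps"]:
        t = dot(xs, w)
        y_hat += b * t
        xs = [v - t * pj for v, pj in zip(xs, p)]
    y_mu, y_sd = model["y_par"]
    return y_hat * y_sd + y_mu


def rmsecv(X, y, n_comp):
    """Leave-one-out root mean square error of cross-validation."""
    sq_err = []
    for i in range(len(y)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        sq_err.append((pls1_predict(model, X[i]) - y[i]) ** 2)
    return sqrt(mean(sq_err))


# Toy data (not field data): 8 samples, 3 bands, noise-free linear response.
X = [[1.0, 2.0, 0.0], [2.0, 1.0, 1.0], [3.0, 4.0, 1.0], [4.0, 3.0, 3.0],
     [5.0, 7.0, 2.0], [6.0, 5.0, 5.0], [7.0, 8.0, 3.0], [8.0, 6.0, 7.0]]
y = [2 * r[0] - r[1] + 0.5 * r[2] + 1 for r in X]
scores = {a: rmsecv(X, y, a) for a in (1, 2, 3)}
best = min(scores, key=scores.get)   # number of components with lowest RMSECV
```

With real spectra the number of bands far exceeds the number of samples, so the RMSECV curve typically reaches its minimum at a small number of components; the toy data only serve to exercise the loop.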

Interval partial least squares regression
Interval partial least squares regression (iPLSR) is a variant of PLSR that develops local PLSR models on equidistant portions of the full spectrum.26,27 To predict a Y variable from spectra using iPLSR, the spectrum is split into a number of intervals of equal width. A PLSR model is then built on each spectral interval. Thereafter, all the models built on the wavebands of different intervals are compared to the full-spectrum model based on calibration parameters such as root mean square error of cross-validation (RMSECV). Finally, the local model with the lowest RMSECV is selected.21,47 The iPLSR can operate in two modes or variable selection directions: forward and backward. In forward mode, the algorithm starts with no variables selected and then develops the best PLSR model from the interval with the lowest RMSECV. This process can be repeated by including more intervals to enhance the model. In backward mode, the algorithm starts by selecting all variables and then discards the interval with the largest RMSECV.48

In this study, iPLSR in forward mode was used to select the best spectral intervals. As predictive bands of LAI are known to spread across the entire electromagnetic spectrum, as mentioned above, the interval size was set to a single variable. This approach is recommended when there is uniqueness of information in variables.46 After several adjustments, the process was repeated 40 times. Therefore, the output local model had 40 intervals or bands. The iPLSR in forward mode was implemented using the PLS Toolbox.
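The forward mode described above can be sketched as a greedy loop: at each step, the interval whose addition gives the lowest RMSECV is kept. The code below is an illustrative sketch under stated assumptions, not the PLS Toolbox routine: the names `pls1_loo_rmsecv` and `forward_ipls`, the cap of two components and the toy data are all hypothetical, and data are only mean-centred to keep the example short.

```python
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_loo_rmsecv(X, y, n_comp):
    """Leave-one-out RMSECV of a mean-centred single-response PLS model."""
    sq_err = []
    for i in range(len(y)):
        Xt, yt = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        x_mu = [sum(c) / len(c) for c in zip(*Xt)]
        y_mu = sum(yt) / len(yt)
        Xc = [[v - m for v, m in zip(row, x_mu)] for row in Xt]
        yc = [v - y_mu for v in yt]
        x_new = [v - m for v, m in zip(X[i], x_mu)]
        y_hat = y_mu
        for _ in range(n_comp):
            cols = list(zip(*Xc))
            w = [dot(c, yc) for c in cols]
            norm = sqrt(dot(w, w))
            if norm < 1e-12:               # no covariance left to model
                break
            w = [v / norm for v in w]
            t = [dot(row, w) for row in Xc]
            tt = dot(t, t)
            p = [dot(c, t) / tt for c in cols]
            b = dot(yc, t) / tt
            t_new = dot(x_new, w)          # score of the held-out sample
            y_hat += b * t_new
            x_new = [v - t_new * pj for v, pj in zip(x_new, p)]
            Xc = [[v - ti * pj for v, pj in zip(r, p)] for r, ti in zip(Xc, t)]
            yc = [v - b * ti for v, ti in zip(yc, t)]
        sq_err.append((y_hat - y[i]) ** 2)
    return sqrt(sum(sq_err) / len(sq_err))


def forward_ipls(X, y, intervals, n_select, max_comp=2):
    """Greedy forward selection of intervals by lowest RMSECV."""
    selected, history = [], []
    remaining = list(range(len(intervals)))
    for _ in range(n_select):
        best = None
        for k in remaining:
            cols = sorted(j for m in selected + [k] for j in intervals[m])
            Xk = [[row[j] for j in cols] for row in X]
            score = pls1_loo_rmsecv(Xk, y, min(max_comp, len(cols)))
            if best is None or score < best[0]:
                best = (score, k)
        selected.append(best[1])
        remaining.remove(best[1])
        history.append(best[0])
    return selected, history


# Toy data (not field data): 8 samples, 5 bands; the response depends on band 2.
X = [[0.2, 1.0, 0.5, 3.1, 0.9], [0.4, 0.8, 1.5, 2.7, 0.1],
     [0.1, 1.3, 2.5, 3.5, 0.7], [0.9, 0.2, 1.0, 2.9, 0.3],
     [0.5, 0.6, 3.0, 3.3, 0.8], [0.7, 1.1, 2.0, 2.5, 0.2],
     [0.3, 0.4, 3.5, 3.0, 0.6], [0.8, 0.9, 0.0, 2.8, 0.4]]
y = [3 * row[2] + 1 for row in X]
intervals = [[j] for j in range(5)]        # interval size of a single variable
selected, history = forward_ipls(X, y, intervals, n_select=2)
```

Setting each interval to one band, as in `intervals` above, mirrors the single-variable interval size reported in the text; wider intervals would simply list several column indices per entry.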

Validation
A leave-one-out cross-validation method was implemented to calibrate models using 70% of the data and to find the optimal number of components. Then, the performance of the trained models was validated using the remaining 30% of the data (independent data set). To assess model performance for prediction at the three sampling periods, the root mean square error of prediction relative to the mean (nRMSEP) and the coefficient of determination (R²p) were used.
Data splitting into training and independent test data sets was performed using an onion algorithm.43 The onion algorithm was chosen in this study to avoid arbitrary data splitting, which may cause biased results.7 The principle of the onion algorithm is to keep the samples on the outside of the covariance space plus a random selection of those in the inner space.49
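The two validation statistics can be written out explicitly. The sketch below assumes that nRMSEP is the RMSEP expressed as a percentage of the mean measured LAI, which is consistent with the paired values reported in the results (e.g. 0.56 m²/m² alongside 33.47%), and that R²p is the squared Pearson correlation between measured and predicted values; the second assumption and the function names are not confirmed by the text.

```python
from math import sqrt
from statistics import mean


def nrmsep(measured, predicted):
    """RMSEP expressed as a percentage of the mean measured value."""
    rmsep = sqrt(mean((m - p) ** 2 for m, p in zip(measured, predicted)))
    return 100 * rmsep / mean(measured)


def r2_pred(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    m_mu, p_mu = mean(measured), mean(predicted)
    cov = sum((m - m_mu) * (p - p_mu) for m, p in zip(measured, predicted))
    ss_m = sum((m - m_mu) ** 2 for m in measured)
    ss_p = sum((p - p_mu) ** 2 for p in predicted)
    return cov ** 2 / (ss_m * ss_p)


# Hypothetical LAI values (m²/m²) for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
error_pct = nrmsep(measured, predicted)
r2 = r2_pred(measured, predicted)
```

In this example the predictions carry a constant bias of 0.5 m²/m², so the squared correlation is 1 while nRMSEP is 20%, which illustrates why the two statistics are reported together.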

Variation in LAI and spectra data
The values of skewness (between 0.40 and -0.45) and kurtosis (between 0.86 and -0.11) indicate that the LAI of grass species canopy in the sampling plots had a normal distribution. Therefore, the LAI data were suitable for the ANOVA and Brown-Forsythe tests. LAI variation in grass species canopy was significant among the three multi-temporal periods (p<0.01). Samples in mid-summer had the highest mean (3.63 m²/m²) and variability (standard deviation = 1.10 m²/m²). Samples at the end of summer had the second highest mean (2.01 m²/m²) and lowest variability (standard deviation = 0.705 m²/m²). Samples at the beginning of summer had the lowest mean value of LAI (1.667 m²/m²) in grass species canopies, with the second lowest variability (0.821 m²/m²) in LAI.
To assess the change in reflectance at the different sampling periods, the mean spectra of all the sampling plots were averaged and upper and lower 95% confidence limits were derived. Results show that there was a change in averaged reflectance during the sampling periods (Figure 2). Visually, averaged reflectance was noticeably different across the electromagnetic spectrum. Canopy reflectance at the end, beginning and mid-summer presented the highest mean reflectance in the visible, NIR and SWIR regions, respectively. Figure 2 shows that first-derivative spectra differed in some spectral portions at the different sampling periods. The highest values of the first-order derivative of reflectance are located in the NIR and SWIR regions of the electromagnetic spectrum.

PLSR and iPLSR models
Table 1 presents results of the model performance of PLSR and iPLSR for the training data set at each of the sampling periods within summer.
Based on RMSECV and R², results show that the iPLSR models performed better than the PLSR models. At each period, iPLSR models were able to explain more than 85% of LAI variability (88.8% at the beginning, 90.3% at mid- and 89.6% at the end of summer) with RMSECV values that varied from 0.24 m²/m² to 0.32 m²/m². Although iPLSR had a slightly higher RMSECV value (0.53 m²/m²) for the pooled data, it still estimated LAI variability across the entire summer well (R²cv = 0.81). PLSR models, on the other hand, yielded high RMSECV values (0.55-0.77 m²/m²) and poorly explained the LAI variation (31.3-67.1%).
The contribution of each waveband in the selected PLSR factors is displayed in Figure 3. The most valuable bands for estimating LAI were distributed across the electromagnetic spectrum. However, the highest peaks for all the periods within summer, including all the periods combined, were mostly located in the NIR and SWIR regions.
For the iPLSR models with 40 intervals, Table 2 and Figure 4 present the selected bands and their location within the four regions of the electromagnetic spectrum, respectively, while Figure 5 provides the percentage of predictive bands in each region of the electromagnetic spectrum.

Model validation
Figure 6 shows the performance of PLSR and iPLSR (40 intervals) models on the independent test data set. PLSR models of all the periods within summer (including all the periods combined) increased the coefficient of determination for prediction (R²p) and slightly decreased the relative root mean square error for prediction (nRMSEP). The values of R²p and nRMSEP, respectively, varied from 0.36 to 0.65 and from 28.44% (0.69 m²/m²) to 33.47% (0.56 m²/m²). However, iPLSR models performed better than the full-spectrum PLSR models for all the sampling periods in summer. The predictive power of iPLSR models did not change much on the validation data set. More than 80% of new data of LAI could be explained by the iPLSR models at all periods within summer (including all the periods combined).

Discussion
We sought to determine the performance of two multivariate regression models (PLSR and iPLSR) in estimating canopy level LAI on tropical grassland during summer. Comparisons were determined using the coefficient of determination (R²) and the RMSE. Specifically, we examined the possibility of developing a model that can estimate LAI at different periods within summer (beginning, mid- and end) and across the entire summer period. Use of iPLSR to select the optimal bands for predicting LAI was also investigated.
Results showed that the PLSR algorithm run on first-derivative spectra to assess LAI variation at different periods did not perform well. The values of R²p and nRMSEP, respectively, ranged from 0.36 to 0.65 and from 28.44% to 34.53%. Although PLSR is known to reduce the dimensionality of data to a few uncorrelated (orthogonal) components, inclusion of all the wavebands was not useful in the predictive performance of PLSR models, a result consistent with Liu50, Chung and Keles51 and Filzmoser et al.52 However, when data dimensionality was reduced to useful bands using iPLSR, the performance of models (R² and RMSE) significantly improved. Overall, there were very close relationships between measured and predicted LAI values, with low values of RMSE and higher values of determination coefficients (R²) (Figure 6). Consistent with the findings of Zou et al.53, Norgaard et al.26 and Navea et al.27, our findings confirm the superiority of iPLSR over full-spectrum PLSR.
The best predictive performance was derived from canopy reflectance at mid- (R²p = 0.93 and nRMSEP = 9.39%) and end of summer (R²p = 0.89 and nRMSEP = 10.50%). The models performed the worst at the beginning of summer (R²p = 0.88 and nRMSEP = 17.37%) and for all the sampling periods combined (R²p = 0.81 and nRMSEP = 24.71%). The lower early summer prediction in comparison to the two other sampling periods can be attributed to higher soil background noise. According to Darvishzadeh et al.7, soil background often has a negative effect on the predictive power of hyperspectral data when LAI is low. The lower performance at the end of summer in comparison to mid-summer might also be caused by background reflectance emanating from litter.
Adoption of iPLSR was useful in identifying relevant wavebands for predicting LAI. In total, 40 intervals were identified for all the sampling periods. The success of iPLSR for band selection in this study may be attributed to successful separation of overlapping bands performed by the first-derivative technique on the spectra. The spectral regions (NIR and SWIR) of bands selected by iPLSR are consistent with the findings by Darvishzadeh et al.7, Thenkabail et al.38, Brown et al.54 and Gong et al.55 Within ±12 nm, the bands chosen (Figure 4) in this study were consistent with the known bands for estimating LAI. For example, bands near 793 nm, 1061 nm, 1062 nm, 1633 nm, 442 nm, 443 nm, 535 nm, 551 nm, 732 nm and 2190 nm were also identified by Wang et al.37 for estimating rice LAI at different growth phases. Furthermore, Gong et al.55 found that bands centred near 1201 nm, 1240 nm, 1062 nm, 1640 nm, 2097 nm and 2259 nm were useful for estimating forest LAI.
It is worth noting that the contribution of different spectral regions along with their wavebands to LAI estimation depends on a particular period within summer (Figure 4). This dependence might be explained by the fact that the positions of selected wavebands are sensitive to changes in LAI, as indicated by ANOVA and Brown-Forsythe test results. Thus, the positions vary when factors like biochemical (e.g. chlorophyll) and biophysical (e.g. canopy closure) parameters and background effects change with canopy growth phases.37 For example, at the end of summer, as the canopy senesces and the amount of chlorophyll declines, NIR and SWIR become more important in predicting LAI.28 Furthermore, in the combined period, the selected bands can be explained by the fact that they were insensitive to changes in LAI (see Table 2). Delegido et al.56 found that vegetation indices combining bands at 674 nm and 712 nm could overcome the aforementioned saturation problem, while Kim et al.57 found similar results with the ratio of 550 nm and 700 nm, which were insensitive to changes in chlorophyll concentration.
In this study, iPLSR models have proved to outperform full-spectrum PLSR models. However, model performance has been shown to depend on the period within summer, on vegetation and on site conditions. These limitations are expected because PLSR and its variants (e.g. iPLSR), which are linear regression techniques, empirically relate LAI to spectral reflectance, which makes the models non-transferable when environmental conditions of grassland (or vegetation cover in general) change.24 Further work should look at comparing iPLSR with other robust and flexible methods, such as physically based radiative transfer models, particularly for the combined period. Radiative transfer models use physical laws to explicitly relate biophysical variables and spectral variation of canopy reflectance. Consequently, these models are known to be more reproducible than linear regression models such as PLSR.58 Currently, rapid development is being undertaken on physically based radiative transfer models for application in the field of remote sensing.59 Further studies should also compare iPLSR with non-linear machine learning (e.g. random forest, support vector machine) techniques as they are able to cope with non-linear relationships between biophysical variables and canopy reflectance in dense grasslands.60

Conclusions
The following conclusions can be drawn:
• iPLSR can be used to simplify the relationship between LAI and canopy reflectance transformed using the first-derivative technique better than PLSR can. The best iPLSR relationship is at mid- and end of summer.
• By including all the variables, full-spectrum PLSR models yield a higher prediction error.
• iPLSR used as a single variable selection algorithm for LAI estimation can generate stable and reliable models with 40 bands.
• The period within summer, which is associated with vegetation growth, determines the selection and accuracy of LAI predictive bands.
Results show that appropriate band selection on in-situ hyperspectral data using iPLSR can overcome the challenge faced by remotely sensed data to accurately estimate LAI in a heterogeneous grassland.
The findings pave the way to more accurate mapping and monitoring of canopy characteristics in a tropical grassland from airborne and spaceborne hyperspectral data. However, the development of an iPLSR model for all the periods combined within summer needs further investigation, as its prediction error was higher than those for the separate periods.

Figure 1: The Ukulinga Research Farm near the city of Pietermaritzburg in the province of KwaZulu-Natal, South Africa.

Figure 2: Mean and respective first-order derivative of canopy spectra of all grass subplots at the (a) beginning of, (b) mid- and (c) end of summer.

Figure 3:

RMSECV, root mean square error of cross-validation

Figure 4: Optimal bands (in dark bars) selected by interval partial least square regression in developing leaf area index models at the (a) beginning of, (b) mid- and (c) end of summer and (d) pooled data.

Figure 5: Summary of predictive bands of leaf area index in different spectral regions.

Figure 6: One-to-one relationship (m²/m²) between measured and predicted leaf area index (LAI) for validating partial least square regression (PLSR) and interval partial least square regression (iPLSR) models on an independent test data set in (a) early summer, (b) mid-summer, (c) end of summer and for the (d) pooled data.
Analysis of variance and Brown-Forsythe tests
The combined test of skewness and kurtosis was first employed to evaluate the distribution of the collected LAI data. The test of normality is a prerequisite to assessing data variability. A perfect normal distribution has skewness and kurtosis values equal to zero.40 To assess LAI variations between periods within summer, one-way analysis of variance (ANOVA) and Brown-Forsythe tests (α=0.05) were implemented. The use of the Brown-Forsythe test, in addition to ANOVA, was justified by the smaller sample size at the end of summer (n=73) because of spectrometer failure. According to Maxwell and Delaney41 and Sheskin42, the Brown-Forsythe test is preferred over ANOVA when sample sizes are heterogeneous and is less affected by data that are not normally distributed.

Table 1: R²cv, root mean square error (RMSE) and the number of components of training partial least square regression (PLSR) and interval PLSR (iPLSR) model predictions for the three sampling periods in summer and the pooled data

http://www.sajs.co.za Volume 113 | Number 9/10 September/October 2017