Estimating soil organic carbon stocks under commercial forestry using topo-climate variables in KwaZulu-Natal, South Africa

Volume 116| Number 3/4 March/April 2020 Research Article https://doi.org/10.17159/sajs.2020/6339 © 2020. The Author(s). Published under a Creative Commons Attribution Licence. Estimating soil organic carbon stocks under commercial forestry using topo-climate variables in KwaZulu-Natal, South Africa AUTHORS: Omosalewa Odebiri1 Onisimo Mutanga1 John Odindi1 Kabir Peerbhay1.2 Steven Dovey2 Riyad Ismail1,2


Introduction
Commercial forests represent a large carbon pool with the potential to reduce net greenhouse gas emissions. 1 The world's commercial forests increased by over 105 million ha between 1990 and 2015, while natural forests decreased by 234 million ha. 2 Federici et al. 3 notes that carbon held by commercial forests was comparable to that of natural forests (i.e. 1.08 vs. 1.44 gigatonne CO 2 per year) during this period. Recent projections have also shown that commercial forest plantations will increase by 20 to 50% by the year 2030. 4,5 Hence, commercial forests are increasingly becoming valuable environmental assets. Specifically, commercial forests could be used to reduce harvest pressure on remnant natural forests, restore degraded ecosystems, conserve natural resources and design climate change mitigation policies. 4 Carbon in commercial forest ecosystems is stored in five different pools: above-and below-ground live tree biomass, dead wood, litter and soils. 6 These pools are dynamic and controlled by a wide range of environmental factors such as climate, topography, forest type, moisture, temperature, soil type and land use. 7 Within these pools, soil is the largest carbon reservoir and constitutes 50-80% of total carbon stocks. 6 As indicated by Liu et al. 8 , the global soil organic carbon (SOC) stock has been evaluated to be about 1500 Pg carbon in the upper 100-cm soil layer, which is approximately double the measure of carbon in the atmosphere and thrice the amount stored in terrestrial vegetation. 9 However, whereas the above-ground biomass carbon pools have been widely investigated, the spatial variability of SOC within forest landscapes is still poorly understood. 10 Hence, an in-depth understanding of SOC and its variability in relation to topographical and environmental factors is crucial in quantifying regional and global carbon balances and in examining the responses and feedbacks of the terrestrial ecosystem to climate change. 8 Furthermore, knowledge of SOC variability is useful for developing suitable management strategies to improve carbon assimilation 11 and to achieve total annual national and global carbon accounting objectives as well as Intergovernmental Panel on Climate Change and Kyoto Protocol objectives.
Environmental variables are critical to the spatial distribution of SOC within a forest landscape. 12 Some of these environmental factors occur naturally while others are human-induced. 13 Variables such as topography (e.g. elevation, slope and aspect) and climate (e.g. temperature and rainfall) significantly influence SOC distribution within a forest landscape. 14 Studies have also shown that climate and topography exert a strong influence on the amount of vegetation density within a forest landscape, which in turn determines SOC distribution. 15,16 However, Chaplot et al. 11 and Liu et al. 8 note that large-scale investigations on the connection between SOC distribution and the influence of multiple topo-climatic variables at a regional and global level are not sufficient. Explicitly significant relationships and models that link topo-climatic variables with SOC processes at large spatial extents are still very necessary. 17 Traditionally, SOC has been quantified through, among others, field surveys and wet soil analysis. 18 Although these strategies are highly accurate, they are difficult to conduct over large areas, expensive, timeconsuming and labour-intensive, and may lead to the generation of toxic waste such as chromate oxidation, which requires careful and proper disposal. 19 Geospatial techniques, on the other hand, offer more practical and economical means of predicting and quantifying soil parameters at local, regional and even global scales. 8,20 Recently, the adoption of spatial techniques in SOC studies has attracted significant attention 8,14,21,22 , particularly because spatially continuous topographic metrics that affect SOC distribution (e.g. slope, aspect, elevation, curvatures, catchment area) can now be easily generated from digital elevation models (DEMs) and satellite imagery. 14 Climatic variables such as rainfall and temperature can also be derived from WorldClim database (global climate layers) at a spatial resolution of about 1 km 2 . 23 The use of DEMs derived from satellite missions such as Light Detection and Ranging (LiDAR) and Shuttle Radar Topography Mission (SRTM) for terrain analysis has become particularly popular due to their relatively high resolutions. 24 Hence, the development of models with SRTM and LiDAR-derived topographic metrics in concert with selected bioclimatic variables could benefit studies characterised by limited observations and can be used to produce continuous SOC distribution. 21 Additionally, remotely sensed derived models (e.g. DEMs, Landsat and Sentinel sensors) are useful in determining regional and global land conditions, impacts and efficacy of land management and better at characterising forest landscapes. 14 Therefore, the objective of this study was to explore the use of remotely sensed SRTM DEM derived topographic metrics and other bioclimatic variables in the prediction of SOC stocks under commercial forestry in KwaZulu-Natal, South Africa.

Study site
Commercial forest plantations in South Africa cover approximately 1.4 million hectares, representing approximately 1.1% of the country's landmass. 6,25 South African commercial forest plantations are mostly found in the eastern part of the country, stretching from the Western Cape through the Eastern Cape, to the KwaZulu-Natal, Mpumalanga and Limpopo Provinces. 6,25,26 South Africa's commercial forest plantations are of great economic and environmental value. Commercial forestry contributes about 1.27% to South Africa's gross domestic product, provides direct and indirect employment to about half a million people, supplies South African forest and fibre needs of approximately 17 million air dry tons annually, and sequestrates about 4.1 million tons of CO 2 per year. 6,27 The country's forest plantations are predominantly pine (49.8%), eucalyptus (42.7%) and wattle (7.1%) species. 25 This study was conducted across commercial forest plantations in KwaZulu-Natal Province -a region located along the east coast of South Africa ( Figure  1). The Province is an important agricultural and forestry production area and is characterised by different forest species. The dominant commercial forest species in the region are Eucalyptus (hardwood) and Pinus (softwood) species, which cover approximately 371 034 and 115 922 ha, respectively. 25 The area experiences sub-tropical climatic conditions, with the main rainfall from October all through to March. The mean annual temperature of the study region is approximately 21.7 °C while the mean annual rainfall ranges from 700 mm to 1500 mm; thus conditions are Predicting soil organic carbon stocks Page 2 of 8 favourable for commercial forestry farming. 20,28 The topography of the area is generally characterised by undulating plains and the altitude rises from 800 m to 1400 m above mean sea level. The zone is underlain by geological formations including granite, arenite, basalt, tillite, shale, sandstones and mudstones, which result in clay to pure sandy soils. 19 Figure 1: Commercial plantations located within KwaZulu-Natal Province, South Africa.

Field data collection and analysis
SOC analysis and quantification is an important part of forest resource management. 6 The procedure of soil sample collection in the field should be typical of the area being sampled because the utilisation of the obtained results from the laboratory analysis relies on the sampling accuracy. 29,30 Commercial forest sites are categorised into broad regions of site quality (in reference to productivity), i.e. poor, medium and good 26 , hence soil samples must be taken to represent each identified site quality in the field. In this study, field data were collected during the rainy summer season (January and March 2013) -a period characterised by high above-ground biomass productivity. A stratified random sampling method was adopted, in which the area was divided into three different strata (i.e. low, medium and high productivity) based on prior knowledge of the commercial forest stands. The purpose of using a stratified random sample is to get a more representative sample of all the regions of commercial forest site qualities. A Hawth's Analysis tool in ArcGIS version 10.4.1 was used to generate random sample plots on a commercial forest cover map across all site qualities. These sample plots were allotted to the three various strata of homogeneous vegetation according to the stratified random sampling strategy and then uploaded into a handheld GPS (TRIMBLE GEO-7X) which was used to navigate to the field sites. Once the sample plots were located on the sites, soil was dug at each sample points to a depth of 30 cm using a soil auger, which is the recommended depth in spatial SOC inventories. 10,29,31 In total, 81 accessible soil samples were collected and transported to the laboratory where they were processed and analysed. SOC concentration was determined in the laboratory using the Walkley-Black32 dichromate oxidation method.

Topographic metrics
Previous studies have identified topographic variables as critical drivers to SOC distribution. 14,21 According to Li et al. 14 , spatial topographic variables are categorised into three main groups: local, non-local and combined topographical variables. Local topographical variables examine the surface geometry at a point on the land surface such as slope, elevation and curvatures while non-local attributes portray relative locations of selected points, such as relief, catchment area, openness and flow accumulation. Combined attributes are an integration of both the local and non-local topographic variables such as slope length factor, topographic wetness index (TWI) and stream power index. 14 In this study, 29 different topographical variables ( Table 1) that cut across the three classes (i.e. local, no-local and combined) were selected. These variables were derived from a 30-m resolution DEM created from SRTM data in SAGA GIS (2.3.2) and ArcGIS 10.4 software.

Bioclimatic data
Bioclimatic data sets including rainfall and temperature are critical determinants of SOC. 13,23 In this study, mean temperature and rainfall bioclimatic variables were used in concert with topographic variables to predict SOC occurrence. The bioclimatic variables were obtained from the 1-km 2 30 arc-seconds spatial resolution of the WorldClim archives (http://www.worldclim.org/) of the global climate conditions. The WorldClim climatic data sets are long-term (30-year) mean annual measurements (containing grids including rainfall and temperature as well as other climatic layers summaries such as the wettest, driest, coldest and hottest quarters and months of the year. The derived temperature (Bio01) and rainfall (Bio12) bioclimatic layers used in this study were resampled in order to match the SRTM-derived DEM spatial resolution (i.e. from 1 km 2 to 30 m).

Statistical analysis
To determine the relationship between SOC and the derived environmental variables, SOC within the commercial forest stands was predicted using the maximum entropy (Maxent) regression algorithm software (version 3.3.3). 33

Maximum entropy model
The maximum entropy model (Maxent) is a machine-learning algorithm proposed by Phillips et al. 33 Maxent models the probability of species presence based on environmental constraints and estimates the likelihood distribution with the maximum entropy, i.e. the most spread out distribution. Typically, Maxent produces an estimate of a probability of occurrence that ranges from 0 to 1, with 1 being the highest and 0 the lowest. It is a concise mathematical definition, and therefore can be adjusted to analyse data with efficient deterministic algorithms that are certain to give optimal probability distribution. Maxent performs efficiently, even with small sample sizes, is resistant to errors in occurrence data, and applies the presence-only data sets to estimate the suitability of habitat or the likelihood of target occurrence. 34 At the point where absence data exist for the species, a conditional model can be used to enable presence/ absence modelling. 33 Maxent uses background/pseudo-absence and presence points that evaluate the environmental space for model testing.
Environmental variables (continuous and categorical) and speciespresence data are used to run the model and the influence of each variable can be determined from the jackknife tool in Maxent. 33 In SOC modelling, the Maxent model begins with equal distribution and performs a number of repetitions based on the most important predictor variable until no further improvements in the spatial estimation of SOC are made. 35 The Maxent model aims to identify the maximum likelihood variability of SOC occurrences within the commercial forest stands in the study area. 36 The most uniform spread of the SOC occurrences is consequently identified by the model and selected from among many possible distributions. 37

Predictor variables selection
One common limitation with regression is the issue of multicollinearity, which occurs when two or more predictor variables are highly correlated. Hence, it is often advisable to use the best and fewest number of predictor variables useful in building a model. 38 Stepwise regression was adopted to solve any possible multicollinearity and to select the best and fewest predictive variables for the Maxent model. Stepwise regression identifies the statistical importance of a subset of predictors through forward selection, backward elimination or a combination of both. 39 In this study, the stepwise backward elimination method was conducted using the 'stepAIC' function in the 'MASS' package of R statistic software 3.5.1 to select the best predictor variables which were then used for the Maxent final model. Backward elimination works by removing predictor variables (n=31) that are not significant to the model until the ideal number of predictor variables is obtained. The ideal number of variables was selected after the backward elimination process. These selected variables were then used to predict SOC using Maxent (version 3.3.3).

Model calibration, evaluation and validation
Settings are an integral part of building any model in order to get the best results. The SOC data (n=81) used in this study were partitioned into training (70%) and test (30%) data sets. 28 The training data (n=54) were used in the model building while the test data (n=23) were used for model validation as proposed by Phillips et al. 35 The default Maxent Evaluation and validation are critical steps in any model building process.
In this study, we used the random test percentage settings in Maxent, i.e. 70% of the data set was used to train and 30% to test the performance of the model. The receiver operating characteristic (ROC) curve was used to evaluate and validate the model. The ROC curve is a method that describes the performance level of probabilistic and deterministic detection of forecast systems. 36 It shows the likelihood that presence (sensitivity) is correctly ordered by the classifier (in our case predictors) as compared to the absence (specificity) of SOC. A two-dimensional space is used to generate the ROC curve by plotting the sensitivity as Y and the specificity as X. Generally, models with high accuracy have an AUC value close or equal to 1, whereas a value equal or less than 0.5 shows a model that performs no better than random. 33 Previous studies have broadly and successfully demonstrated the application of the ROC curve to quantitatively evaluate and validate the effectiveness of probability modelling. 14,21,36 Table 2 shows the results of the stepwise backward elimination method conducted to remove redundant and highly correlated predictor variables. The procedure to select the best predictor variables was done using the 'stepAIC' function in the 'MASS' package of the R statistic software (version 3.5.1). The Akaike information criterion (AIC) acts like an examiner of the relative quality of models for a given set of data by assigning a value to the model (in this case 5.61), while the stepwise backward elimination method eliminates predictor variables that are less significant at each stage of the model until the lowest AIC value is attained (in this case -15.85). The general rule is that the lower the AIC value, the better the model. As shown in the table, 11 out of the 31 predictor variables used produced the lowest AIC value of -15.85 after the elimination procedure. Table 2 additionally shows the AIC value attached to each of the eleven selected variables, indicating the degree to which the lowest AIC value (-15.85) will increase should any of the variables be removed. Hence these selected variables are regarded as the best subset of predictors to be used in the prediction of SOC variability using the Maxent algorithm.

Analysis of Maxent model omission/commission and ROC
The omission/commission and the ROC analysis is useful in examining the performance of the Maxent model. The omission/commission rate is calculated on both the training and test data sets. The general rule of Maxent is that a good model will produce omission rates that are in close proximity with the predicted omission due to the definition of the cumulative threshold. The predicted omission rate is demonstrated by a straight black line on the cumulative threshold result output of the Maxent model. As shown in Figure 2a, the omission on test data sets (turquoise line) is in close proximity with the predicted omission rate (black line) from the Maxent distribution, which signifies a good Maxent model for SOC occurrence. Figure 2b shows the model evaluation for predicting SOC occurrence using the ROC of the randomly selected training and test data sets. The AUC ranges between 0 and 1. A good model will be closer or equal to 1 while a model with an AUC less than or equal to 0.5 signifies a poor model and was no better than random. As shown in Figure 2b,

Predictor variables contribution
A major strength of the Maxent algorithm is that it permits the assessment of individual predictor variables in order of their influence in the model. Table 3 shows the estimated percentage contribution of each predictor variable to the Maxent model. The increase in regularised gain is added to the contribution of the corresponding variable or subtracted from it in order to determine the estimates in each iteration of the training algorithm. The higher the percentage contribution of any of the variables, the higher its impact in predicting the SOC occurrence. The percentage contribution table generated by the Maxent model shows that 'rainfall (mean annual)' with a percentage contribution of 30.2% had the highest predictive contribution and was the most influential in predicting SOC occurrence. Also, variables such as temperature (22.9%), slope (16.1%), elevation (11.6%), TWI (8.9%) and direct insolation (3.1%) were contributors to SOC occurrence and accounted for over 60% of the total SOC occurrence. Predictor variables such as aspect, profile and longitudinal curvature had little influence on the Maxent model. Figure 3 shows the jackknife test results; the jackknife test is used to assess each predictor variable contribution in order of importance to the Maxent model. The turquoise bars indicate the overall model accuracy when each of the predictor variables is excluded from the model, while the blue bars show the individual performance and accuracy of predictors when used in isolation. The red bars depict the overall gain of the model when all the variables are used together. As shown in Figure 3, 'rainfall (mean annual)' is the environmental predictor variable that has the highest gain when utilised in isolation and, as a result, appears to have the most valuable information independent of any other variable. The environmental variable that most reduced the Maxent model overall gain when omitted was 'temperature (mean annual)', which as a result appears to have the bulk of information that is absent in other variables. Variables such as slope, DEM and TWI also had an impact on the Maxent model gain as there was a significant decrease in the final model gain (red bar) when they were excluded. Other variables such as aspect, catchment area, curvatures, direct insolation and positive openness, had little or no significant contribution to the final model and hence are regarded as unimportant in predicting SOC occurrence in this study.

Figure 3:
The jackknife of variable importance in modelling the spatial distribution of soil organic carbon.

Map generation of SOC distribution
The Maxent model generated a map that shows the spatial distribution of SOC occurrence within the study area. Figure 4 shows the likely occurrence of SOC based on the field observation points using topographic metrics and bioclimatic data as predictor variables. Areas with darker colours indicate better-predicted conditions of SOC occurrence while areas with lighter colours indicate moderate or low predicted conditions for SOC occurrence. A general assessment of the resultant SOC distribution map shows that the possible occurrence of SOC within the study region is relatively high in the southern and central plantations as compared to plantations located at the northernmost and the far eastern regions of the Province.

Discussion
The global expansion of commercial forestry and its potential to sequestrate carbon is increasingly becoming important in climate change mitigation. 6 Hence, there is a growing interest in commercial forest management practices to further enhance the ability to sequestrate carbon because of the carbon assimilation efficiency of commercial forests. 41 Although commercial forest soils sequestrate and store more carbon than other carbon pools, their role in carbon sequestration remains largely unexplored. 6 Rainfall, which contributed 30.2% to the Maxent model, was the most significant determinant of SOC occurrence in the study area. Forest stands with higher SOC occurrence are characterised by higher mean annual rainfall. This finding is consistent with that of Meier and Leuschner 42 who observed a substantial reduction (25%) in the SOC pool in forests stands that received less than 600 mm rainfall annually compared with stands that received more than 900 mm rainfall annually. Other studies, such as those of Jobbágy and Jackson 43 , Bhandari and Bam 7 , Chen et al. 2 , Ramifehiarivo et al. 13 , Hewins et al. 23 and Soucémarianadin et al. 12 , also noted that mean annual rainfall within a forest landscape strongly influences the amount of SOC. The relationship between rainfall and SOC occurrence can be attributed to the influences of rainfall on soil moisture and hydrological processes such as surface run-off and groundwater infiltration. 44 Zhou et al. 45 13,23 have also reported temperature to be strongly linked with SOC, as it can promote plant productivity, which in turn promotes the presence of SOC. Furthermore, the influence of temperature as one of the major drivers of SOC can also be attributed to its direct relationship with rainfall, because rainfall may influence soil moisture, and by extension the surface temperature, by controlling the partitioning between sensible and latent heat fluxes. 50 In addition, the amount of rainfall within a forest landscape can also be influenced by temperature because higher temperatures signify more water vapour in the atmosphere, thus increasing the chances of heavy downpours. 51 Slope (16.1%), elevation (11.9%) and topographic wetness index (8.9%) also showed influence on SOC. Areas with relatively steep slopessuch as the northernmost and the far eastern parts of the study area -had lower SOC occurrence than areas with relatively gentle slopes, such as the southernmost and central parts of the commercial forest areas. This finding can be attributed to higher erosion rates, which are commonly observed in areas of steeper slope as opposed to areas with less steep or gentle slopes. Erosion transports soil properties such as SOC from the upland to low-lying areas. 52 Previous studies by Li et al. 14 , Fissore et al. 21 , Young et al. 52 and García-Ruiz 53 , among others, all noted slope as a strong determinant of SOC distribution. Elevation was also a significant influence on SOC occurrence in commercial forestry. Low likelihood SOC occurrence was predicted for areas with higher altitude (>1300 m) while high probabilities were predicted for low lying areas (<1300 m). These findings are consistent with another study 54 in which it was noted that with increasing altitude, the amount of soil and vegetation became less abundant, which was also reflected in the amount of SOC. The variability of SOC due to elevation can be attributed to the fact that low-lying areas favour vegetation growth due to optimal soil development conditions that may include erosion of nutrient-rich topsoil from higher grounds that are deposited in low-lying areas. Additionally, most lowlying areas are characterised by higher soil moisture content, nutrients and deeper depth as opposed to higher grounds characterised by extreme environmental conditions, hindering vegetation growth due to limited soil microorganisms. 55 Furthermore, the influence of elevation on microclimate could also be the reason for SOC occurrence in our study as altitude influences temperature, wind flows and soil moisture peculiar to a region, which in turn impacts the amount of vegetation and by extension SOC.
Although TWI contributed less than 10% in the Maxent model, it can still be regarded as an important driver of SOC occurrence due to its influence on the overall gains shown in the jackknife results ( Figure 3). TWI determines soil moisture distribution along slopes 14 , hence areas with higher TWI (soil moisture) indicate higher SOC density than areas with low TWI. Previous studies [56][57][58] have also reported the significance of TWI in SOC distribution by observing a strong correlation between TWI and soil organic matter. For instance, Li et al. 14 identified TWI as the most significant topographic predictor variable in SOC variability.
Results indicated relatively little impact of longitudinal and profile curvatures, catchment area, aspect, direct insolation and positive openness to SOC variability. This finding is in contrast to other investigations that demonstrated their importance to SOC formation and distribution. 14 One possible reason for the significance of these environmental factors in other studies could be due to landscape variability in relation to our study area. 21,22,39 For instance, curvatures are generally more sensitive to places of higher relief than relatively moderate landscape and have been broadly used to depict flow acceleration. 14 As a result, curvature plays a significant role in the variability patterns of SOC in regions of higher landscapes.
The present study showed that the utility of Maxent based on key topographic and bioclimatic variables provides a useful and effective methodology for predicting SOC under commercial forest landscapes. Maxent also demonstrated the ability to show the percentage contribution of each predictor variable and their influence through the analysis of the jackknife results (Table 3; Figure 3). It also automatically generated a visually appealing SOC distribution map ( Figure 4) and accuracy assessments by producing the ROC curve, which is used to determine its predictive performance. Regardless of the benefits of the Maxent algorithm, it has some limitations. Maxent's logistic output relies on an assumption and not an estimation of SOC prevalence. It is also hard to compare the outputs of Maxent with other regression algorithms as it gives environmental suitability rather than predicted likelihood of SOC occurrence.

Conclusion
In this study, we investigated the impact of topographic and bioclimatic data on SOC distribution under commercial forestry using the Maxent algorithm. Results indicate that rainfall and temperature, as well as topographic variables such as slope, elevation and TWI, are effective in mapping SOC distribution. Our model was useful in predicting SOC occurrence and yielded an effective framework for continuous monitoring and assessment of SOC. The method developed in this study is cost-effective and suggests the use of other readily available climatic and topographic information for the prediction of SOC under commercial forestry in South Africa and indeed globally. Results from this study are important to achieve the national carbon accounting objective and are also valuable to forest managers, ecologists and relevant stakeholders in understanding the spatial distribution of SOC. We recommend further experiments be conducted using higher-resolution data sets to assess the performance of the Maxent algorithm in predicting SOC.