Abstract
Recent advances in forest ecosystem modeling allow the simulation of a suite of dynamics from site- to landscape-scale. In order to scale models efficiently from trees to landscapes, different model reduction strategies are employed. Yet, the results of these strategies and the assumptions they entail are rarely compared. Here, we conducted a model intercomparison exercise using two such forest biogeochemistry models, PPA-SiBGC and LANDIS-II NECN. We simulated past-decade conditions at flux tower sites in Harvard Forest, MA, USA and Jones Ecological Research Center, GA, USA. We mined the wealth of field data available for both sites to perform model parameterization, validation, and intercomparison. We assessed model performance using the following time-series metrics: net ecosystem exchange, aboveground net primary production, aboveground biomass, C, and N, belowground biomass, C, and N, soil respiration, and, species total biomass and relative abundance. We also assessed static observations of soil organic C and N, and concluded with an assessment of general model usability, performance, and transferability. Despite substantial differences in design, both models achieved good accuracy across the range of metrics. While LANDIS-II NECN performed better for interannual NEE fluxes due to its basis in the Century model, the PPA-SiBGC model indicated better overall correspondence to observational data for both sites across the 11 temporal and 2 static metrics tested .
1. Introduction
Forest models are thought to have began 350 years ago in China with yield tables known as the Lung Ch’uan codes, invented by a women of the Kuo family in Suichuan county, Jiangxi [1]. It was not until the 20th century that the first complex mathematical models of forests emerged. Digital computers enabled researchers, for the first time, to explicitly model forest dynamics. Following the development of matrix models [2] and empirical growth-and-yield models such as Prognosis [3,4], a vast array of gap [5], forest landscape [6–10], and terrestrial biosphere models [11–13] have been developed. Models of forest ecosystems vary substantially in their representation of crown geometry and biogeochemical processes.
Representation of canopy geometry varies from implicit to a single ‘big-leaf’ and detailed three-dimensional crown and root geometry (e.g., modern gap models such as MAESPA [14] and LES [15]). Models of growth range from simple allometric equations (e.g., growth-and-yield models) to light-use efficiency models [16] and first-principles mechanistic models of photosynthesis [17]. Belowground process models similarly vary in structure, from simple stoichiometric relations to carbon and nitrogen cycling with microbial dynamics to a fully mechanistic representation of energetic and biogeochemical processes based on thermodynamics. Current belowground models vary considerably in their process representation and accuracy, with much improvement left to be made [18]. Most belowground models in use globally rely on a variant of the classical Century model [19,20].
Model specialization and generalization ranges from pure research applications in narrowly defined areas (e.g., [14]) to simulating multiple loosely coupled landscape processes to simulating biogeochemical fluxes throughout the world’s forests. A trade-off is thought to exist between realism, precision, and generality [21], with more detailed models requiring higher parameterization costs. Yet, little is known about the net effects of variation in the structure of these models on the precision and accuracy of their predictions across temporal and spatial scales. While such model intercomparisons are common within classes of models such as terrestrial biosphere models, they are seldom applied to gap or forest landscape models. Models operating at different scales are seldom compared within sites. Yet, much can be learned by comparing models that differ in assumptions and structure.
Existing forest model intercomparison projects, or MIPs, in Europe include the stand-level Intersectoral Impact MIP (ISIMIP) [22] and landscape-level Comparison of Forest Landscape Models (CoFoLaMo) [23], the latter conducted under the European Union Cooperation on Science and Technology (COST) Action FP1304 “Towards robust projections of European forests under climate change” (ProFoUnd). Previous efforts in the United States include the Throughfall Displacement Experiment (TDE) Ecosystem Model Intercomparison Project at Walker Branch Watershed in Oak Ridge, Tennessee [24]. Presently, no other forest model intercomparison project is evident for North America. There is a critical need to conduct ongoing forest biogeochemistry model comparisons in this and other regions of the world in order to establish the regional foundation for robust global C cycle projections. In this work, we aim to begin this process for North America with a comparison of the Perfect Plasticity Approximation with Simple Biogeochemistry (PPA-SiBGC) and Landscape Disturbance and Succession with Net Ecosystem Carbon and Nitrogen (LANDIS-II NECN) models, which provide contrasting model structures for representing stand dynamics.
Modern forest landscape models are the result of five key model development phases, listed in chronological order: (1) growth-and-yield models; (2) fire models; (3) gap models; (4) physiological models; (5) hybrid models combining design principles from each [5,25,26]. Terrestrial biosphere models similarly trace their roots back to early one-dimensional physiological models, with land surface models currently in their third generation and dynamic global vegetation models in their second generation [27]. This latest generation of models was intended to address the lack of explicit representation of vegetation dynamics – a critical source of model uncertainty in future climate scenarios [28]. This inspired the aforementioned forest ecosystem model intercomparisons as well as new terrestrial biosphere model designs based on gap models, bypassing the trade-offs of medium-resolution forest landscape models.
Collectively, these efforts yielded a number of new terrestrial biosphere models based on the classical gap model, including the Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS) [29], the Ecosystem Demography model (ED/ED2) [30,31], and Land Model 3 with PPA (LM3-PPA) [32], based on the Perfect Plasticity Approximation (PPA) [33,34]. These models represent the current state-of-the-art in modeling vegetation dynamics globally. While individual-based global models have begun to merge, forest landscape models have remained in between, focused on spatial processes of fire, harvest, and biological disturbance. Yet, previous research has shown that such forest landscape models are often insensitive to landscape configuration and are therefore aspatial [35], counter to the main assumption and selling point of these models.
While most forest landscape and terrestrial biosphere models lack individual trees, the SAS [30] and PPA [33,36,37] model reduction strategies have demonstrated an ability to successfully up-scale gap dynamics to forest stands. Other up-scaling strategies exist as well. One recent forest landscape model participating in the CoFoLaMo intercomparison scales from individual trees to stands by pre-computing light tables [38]. Regardless of the model structure, it is clear that gap, forest landscape, and terrestrial biosphere models are beginning to merge into new models of the terrestrial biosphere. This trend is also attributable to improvements in computational efficiency with new processor designs and cluster or cloud computing infrastructure. As few, if any, existing models are designed for highly parallel architectures (e.g., general-purpose graphics processing units, or GPGPUs), there remains much potential for future model efficiency gains.
In this forest biogeochemistry model intercomparison, we focus on two sites on the East Coast of the United States, Harvard Forest (HF), Massachusetts and Jones Ecological Research Center (JERC), Georgia. The two sites were selected for their representativeness of the United States Eastern Seaboard and for the availability of data needed to parameterize and validate the models. Harvard Forest is one of the most-studied forests in the world, with Google Scholar returning 12,700 results for the site. We focus on results for the Environmental Measurement Station (EMS) eddy covariance (EC) flux tower site within the Little Prospect Hill tract – the longest-running eddy covariance flux tower in the world. Previous research at the EMS EC flux tower site found unusually high rates of ecosystem respiration in winter and low rates in mid-to-late summer compared to other temperate forests [39]. While the mechanisms behind these observed patterns remains poorly understood, this observation is outside the scope of the presented research.
Between 1992 and 2004, the site acted as a net carbon sink, with a mean annual uptake rate of 2.5MgCha−1year−1. Aging dominated the site characteristics, with a 101-115 Mg Cha-1 increase in biomass, comprised predominantly of growth of red oak (Quercus rubra). The year 1998 showed a sharp decline in net ecosystem exchange (NEE) and other metrics, recovering thereafter [40]. As Urbanski et al. [40] note of the Integrated Biosphere Simulator 2 (IBIS2) and similar models at the time, “the drivers of interannual and decadal changes in NEE are long-term increases in tree biomass, successional change in forest composition, and disturbance events, processes not well represented in current models.” The two models used in the intercomparison study, a SORTIE-PPA [33,34] variant and LANDIS-II with NECN succession [41,42], are intended to directly address these model shortcomings.
While there have been fewer studies at Jones Ecological Research Center, Georgia, USA, Google Scholar returns 1,370 results for the site, reflecting its growing role in forest sciences research. Our study focuses on the Red Dirt (RD) EC flux tower site within the mesic sector, for which a handful of relevant studies exist. Two recent studies [43,44] indicate that the mesic sector of this subtropical pine savanna functions as a moderate carbon sink (NEE = −0.83 Mg Cha−1year−1; −1.17 Mg Cha−1year−1), reduced to near-neutral uptake during the 2011 drought (NEE = −0.17 Mg Cha−1year−1), and is a carbon source when prescribed burning is taken into account. NEE typically recovered to pre-fire rates within 30-60 days. The mechanisms behind soil respiration rates here again appear to be complex, site-specific, and poorly understood [44].
Overall, existing research highlights the importance of fire and drought to carbon exchange in long-leaf pine (Pinus palustris) and oak (Quercus spp.) savanna systems [43–45] at JERC. This is in contrast to the secondary growth-dominated deciduous broadleaf characteristics of Harvard Forest. Species diversity at the EMS tower site is 350% greater than that of the JERC-RD site, with 14 species from a variety of genera compared to four species from only two genera, Pinus and Quercus.
In this work, we aim to establish a foundation for future forest biogeochemistry model intercomparisons. This includes open-source object-oriented software to facilitate model parameterization, validation, intercomparison, and simplified reproducibility of results. We perform the model intercomparison for two key research forests in the United States to assess the ability of each model to reproduce observed biogeochemistry pools and fluxes over time. We hypothesize that the inclusion of forest growth, compositional change, and mortality processes in both models will allow for accurate predictions of biomass and NEE dynamics, as suggested in previous research Urbanski et al. [40]. Accordingly, we compare both models to observations and to each other for a host of metrics related to biomass, C, N, and forest composition at the two research sites.
2. Materials and Methods
LANDIS-II NECN and PPA-SiBGC were parameterized for two forested sites in the eastern United States, Harvard Forest, Massachusetts and Jones Ecological Research Center, Georgia. At the HF site, we focus on Little Prospect Hill and the EMS EC flux tower (HF-EMS). At the JERC site, we focus on the mesic zone and RD EC flux tower (JERC-RD). Both sites provided local EC and meteorological measurements to conduct this study. Plots of EC flux and meteorological tower measurements for both sites are located in Appendix A; maps of both sites are located in Appendix B.
Both models were parameterized using data available for each site, including local (i.e., field measurements) and general information sources (e.g., species compendiums and other published sources). As these empirical or observational values were used to parameterize both models, further model calibration (i.e., parameter tuning) was not necessary. This is because tuning parameters away from measured values to improve model performance, or defining a separate set of tuning parameters, is known to produce model over-fitting (i.e., reduced generality) and thus false improvements in model accuracy through reduced parsimony [46]. We explicitly avoided this practice, as it is only appropriate when fitting empirical growth-and-yield models such as Prognosis, also known as the Forest Vegetation Simulator (FVS) [3,4]. All model parameters are provided in the Appendix C. We close the methodology section with descriptions of the metrics, models, and criteria used in the intercomparisons.
2.1. Model Descriptions
In the following sections, we provide a brief overview of the two forest ecosystem models used in this intercomparison study. For detailed information on each model, readers are encouraged to refer to the original publications.
2.1.1. LANDIS-II NECN
The LANDIS-II model is an extension of the original LANdscape DIsturbance and Succession (LANDIS) model [47–49] into a modular software framework [41]. Specifically, LANDIS-II is a model core containing basic state information that interfaces or communicates with external user-developed models known as “extensions” using a combination of object-oriented and modular design. This design makes LANDIS-II a modeling framework rather than a model. The LANDIS family of models, which also includes LANDIS PRO [50] and Fin-LANDIS [51,52], are stochastic hybrid models [25] based on the vital attributes/fuzzy systems approach of the LANDSIM model genre [53]. Perhaps unknowingly, this genre borrows heavily from cellular automata [54] and thus Markov Chains by applying simple heuristic rule-based systems, in the form of vital attributes, across two-dimensional grids.
Models of the LANDSIM genre focus on landscape-scale processes and assume game-theoretic vital attribute controls over successional trajectories following disturbance [55]. The LANDSIM model genre is thus a reasonable match for the classical forest fire model [56], given its local two-dimensional cellular basis. In contrast to the original LANDIS model, LANDIS-II is implemented in Microsoft C# rather than ISO C++98 [57], simplifying model development in exchange for a proprietary single-vendor software stack [41].
The latest version of LANDIS-II (v7) supports Linux through use of the Microsoft .NET Core developer platform. The modular design of LANDIS-II is intended to simplify the authorship and interaction of user-provided libraries for succession and disturbance. The centralized model core stores basic landscape and species state information and acts as an interface between succession and disturbance models. While there have been numerous forest landscape models over the years [6–10], the LANDIS family of models has enjoyed notable longevity and is currently united under the LANDIS-II Foundation. Part of its longevity is attributable to the prioritization of model functionality over realism in order to appeal to application-minded managers seeking a broad array of functionality.
The Net Ecosystem Carbon and Nitrogen (NECN) model [42] is a simplified variant of the classical Century model [19,20]. The original ten soil layers in Century have been replaced by a single soil layer, with functions for growth and decay borrowed directly from Century v4.5. The NECN succession model Figure 1 is thus a process-based model that simulates C and N dynamics along the plant-soil continuum at a native monthly timestep.
Atmospheric effects are included through monthly climate (i.e., temperature maxima, minima, means, and standard deviations, and precipitation means and standard deviations). Explicit geometric representation of tree canopies is forgone in favor of bounded statistical growth models based theoretically on Liebig’s Law of the Minimum. Functions for growth, mortality, and decay are adopted from Century [19] while hydrology is based on the simple bucket model [58]. The regeneration function is the only new process in NECN and is also based on Liebig’s Law. For a detailed description of the NECN model, readers may refer to the original model publication [42]. Parameterization of the LANDIS-II model for both sites was based on updating parameters used in recent [59–62] and ongoing (Flanagan et al., in review) work.
2.1.2. PPA-SiBGC
The PPA-SiBGC model belongs to the SORTIE-PPA family of models [33,36] within the SAS-PPA model genre, based on a simple and analytically tractable approximation of the classical SORTIE gap model [63,64]. The Perfect Plasticity Approximation, or PPA [33,34], was derived from the dual assumptions of perfect crown plasticity (e.g., space-filling) and phototropism (e.g., stem-leaning), both of which were supported in empirical and modeling studies [36]. The discovery of the PPA was rooted in extensive observational and in silico research [33]. The PPA model was designed to overcome the most computationally challenging aspects of gap models in order to facilitate model scaling from the landscape to global scale.
The PPA and its predecessor, the size-and-age structured (SAS) equations [30,65], are popular model reduction techniques employed in current state-of-the-art terrestrial biosphere models [13]. The PPA model can be thought of metaphorically as Navier-Stokes equations of forest dynamics, capable of modeling individual tree population dynamics with a one-dimensional von Foerster partial differential equation [33]. The simple mathematical foundation of the PPA model is provided in Equation 1. where k is the number of species, j is the species index, Nj(z) is the density of species j at height z, Aj(a*, z) is the projected crown area of species j at height z, and dz is the derivative of height. In other words, we discard the spatial location of individual trees and calculate the height at which the integral of tree crown area is equal to the ground area of the stand. This height is known as the theoretical z* height, which segments trees into overstory and understory classes [33].
The segmentation of the forest canopy into understory and overstory layers allows for separate coefficients or functions for growth, mortality, and fecundity to be applied across strata, whose first moment accurately approximates the dynamics of individual-based forest models. Recent studies have shown that the PPA model faithfully reduces the dynamics of the more recent neighborhood dynamics (ND) SORTIE-ND gap model [66] and is capable of accurately capturing forest dynamics [67,68].
In this work, we applied a simple biogeochemistry variant of the SORTIE-PPA model, PPA-SiBGC [Erickson and Strigul, In Review] Figure 2.
Empirical observations were relied upon for the C and N content of tree species compartments. Stoichiometric relations were used to estimate N from C, based on empirical measurements provided for both sites. All values were calculated directly from observations. Previously published equations [71] and parameters [72] were used to model crown allometry. Together with inventory data, general biomass equations were used to estimated dry weight mass (kg) for tree stems, branches, leaves, and, fine and coarse roots [73]. Carbon content is assumed to be 50% of dry mass, supported by data. Monthly soil respiration is modeled using the approach of Raich et al. [69], while soil organic C is modeled using the simple generalized approach of Domke et al. [70]. Species- and stratum-specific parameters for growth, mortality, and fecundity were calculated from observational data available for both sites.
2.2. Site Descriptions
In the following sections, we describe the two forested sites on the East Coast of the United States: HF-EMS and the JERC-RD. A critical factor in the selection of the sites was the availability of eddy covariance flux tower data needed to validate NEE in the models.
2.2.1. HF-EMS
The HF-EMS EC flux tower is located within the Little Prospect Hill tract of Harvard Forest (42.538°N, 72.171°W, 340 m elevation) in Petersham, Massachusetts, approximately 100 km from the city of Boston [40]. The tower has been recording NEE, heat, and meteorological measurements since 1989, with continuous measurements since 1991, making it the longest-running eddy covariance measurement system in the world. The site is currently predominantly deciduous broadleaf second-growth forests approximately 75-95 years in age, based on previous estimates [74]. Soils at Harvard Forest originate from sandy loam glacial till and are reported to be mildly acidic [40].
The site is dominated by red oak (Quercus rubra) and red maple (Acer rubrum) stands, with sporadic stands of Eastern hemlock (Tsuga canadensis), white pine (Pinus strobus), and red pine (Pinus resinosa). When the site was established, it contained 100 Mg C ha−1 in live aboveground woody biomass [74]. As noted by Urbanski et al. [40], approximately 33% of red oak stands were established prior to 1895, 33% prior to 1930, and 33% before 1940. A relatively hilly and undisturbed forest (since the 1930s) extends continuously for several km2 around the tower. In 2000, harvest operations removed 22.5 Mg C ha−1 of live aboveground woody biomass about 300 m S-SE from the tower, with little known effect on the flux tower measurements. The 40 biometric plots were designated via stratified random sampling within eight 500 m transects Urbanski et al. [40]. The HF-EMS tower site currently contains 34 biometric plots at 10 m radius each, covering 10,681 m2, or approximately one hectare, in area. Summary statistics for the EMS tower site are provided in Table 1.
A table of observed species abundances for the 2002-2012 period are provided in Table 2.
Data were collected here for a range of studies, as evidenced by the Harvard Forest Data Archive. Datasets used in model validation include HF001-04, HF004-02, HF069-09, HF278-04, HF069-06, HF015-05, HF006-01, and HF069-13. These include weather station and forest inventory time-series, eddy covariance flux tower measurements, soil respiration, soil organic matter, and studies on C:N stoichiometry. Standard measurement techniques were used for each. For both sites, local tree species, age, depth-at-breast-height (DBH), biomass, soil, and meteorological data were primarily used to parameterize the models.
2.2.2. JERC-RD
Jones Ecological Research Center at Ichauway is located near Newton, Georgina, USA (31°N, 84°W, 25-200 m elevation). The site falls within the East Gulf Coastal Plain and consists of flat to rolling land sloping to the southwest. The region is characterized by a humid subtropical climate with temperatures ranging from 5-34 °C and precipitation averaging 132 cm year-1. The overall site is 12,000 ha in area, 7,500 ha of which are forested [75]. The site also exists within a tributary drainage basin that eventually empties into the Flint River. Soils here are underlain by karst Ocala limestone and mostly Typic Quartzipsamments, with sporadic Grossarenic and Aquic Arenic Paleudults [76]. Soils here often lack well-developed organic horizons [75–77].
Forests here are mostly second-growth, approximately 65-95 years in age. Long-leaf pine (Pinus palustris) dominates the overstory, while the understory is comprised primarily of wiregrass (Aristida stricta) and secondarily of shrubs, legumes, forbs, immature hardwoods, and regenerating long-leaf pine forests [78]. Prescribed fire is a regular component of management here, with stands often burned at regular 1-5 year intervals [75]. This has promoted wiregrass and legumes in the understory, while reducing the number of hardwoods [75]. The RD EC flux tower is contained within the mesic/intermediate sector. This site consists of only four primary tree species from two genera: long-leaf pine (Pinus palustris), water oak (Quercus nigra), southern live oak (Quercus virginiana), and bluejack oak (Quercus incana). Measurements for the RD tower are available for the 2008-2013 time period. Summary statistics for the RD tower site are provided in Table 3.
A table of observed species abundances for the 2009-2013 period are provided in Table 4.
Datasets used in model validation at JERC-RD include JC010-02, JC010-01, JC003-04, JC004-01, JC003-07, and JC011-01. These include weather station and eddy covariance flux tower measurements, forest inventory data, soil respiration, soil organic matter, and studies on C:N stoichiometry. Standard measurement techniques were also used for each of these.
2.3. Site Data
To conduct this model intercomparison exercise at HF-EMS, we leveraged the large amount of data openly available to the public through the Harvard Forest Data Archive:
http://harvardforest.fas.harvard.edu/harvard-forest-data-archive
Jones Ecological Research Center has hosted multiple research efforts over the years, collectively resulting in the collection of a large data library. However, JERC-RD site data are not made openly available to the public and are thus only available by request. One may find contact information located within their website:
2.4. Scales, Metrics, and Units
The selection of simulation years was based on the availability of EC flux tower data used in model validation. Thus, we simulated the HF-EMS site for the years 2002-2012 and the JERC-RD site for the years 2009-2013. For both sites and models, we initialized the model state in the first year of simulations using field observations. The PPA-SiBGC model used an annual timestep while LANDIS-II NECN used a monthly timestep internally. Both models may be set to other timesteps if desired.
The areal extent of the single-site model intercomparisons were designed to correspond to available field measurements. At both sites, tree inventories were conducted in 10,000 m2, or one-hectare, areas. All target metrics were converted to an annual areal basis to ease interpretation, comparison, and transferability of results. Importantly, an areal conversion will allow comparison to other sites around the world. While flux tower measurements for both sites were already provided on an areal (m−2) basis, many other variables were converted to harmonize metrics between models and study sites. For example, moles CO2 measurements were converted to moles C through well-described molecular weights, all other measures of mass were converted to kg, and all areal and flux measurements were harmonized to m−2. A table of metrics and units used in the intercomparison of LANDIS-II and PPA-SiBGC is provided in Table 5.
In the subsequent section, we describe the model intercomparison methodology.
2.5. Model Intercomparison
Intercomparison of the PPA-SiBGC and LANDIS-II models at the HF-EMS and JERC-RD EC flux tower sites was conducted using a collection of object-oriented functional programming scripts written in the R language for statistical computing [79]. These scripts were designed to simplify model configuration, parameterization, operation, calibration/validation, plotting, and error calculation. The scripts and our parameters are available on GitHub (https://github.com/adam-erickson/ecosystem-model-comparison), making our results fully and efficiently reproducible. The R scripts are also designed to automatically load and parse the results from previous model simulations, in order to avoid reproducibility issues stemming from model stochasticity. We use standard regression metrics applied to the time-series of observation and simulation data to assess model fitness. The metrics used include the coefficient of determination (R2), root mean squared error (RMSE), mean absolute error (MAE), and mean error (ME) or bias, calculated using simulated and observed values. Our implementation of R2 follows the Bravais-Pearson interpretation as the squared correlation coefficient between observed and predicted values [80]. This implementation is provided in Equation 2. where n is the sample size, yi is the ith observed value, is the ith predicted value, is the mean observed value, and is the mean predicted value. The calculation of RMSE follows the standard formulation, as shown in Equation 3. where n is the sample size and et is the error for the tth value, or the difference between observed and predicted values. The calculation of MAE is similarly unexceptional, per Equation 4. where again n is the sample size and et is the error for the tth value. Our calculation of mean error (ME) or bias is the same as MAE, but without taking the absolute value.
While Nash-Sutcliffe efficiency (NSE) is often used in a simulation model context, we selected the Bravais-Pearson interpretation of R2 over NSE to simplify the interpretation of results. The NSE metric replaces 1 – (SSpredictions /SSobservations) with (SSobservations – SSpredictions)/SSobservations, where SS is the sum of squares. Thus, NSE is analogous to the standard R coefficient of determination used in regression analysis [81]. The implementation of R2 that we selected is important to note, as its results are purely correlative and quantify only dispersion, ranging in value between 0 and 1. This has some desirable properties in that no negative or large values are produced, and that it is insensitive to differences in scale. Regardless of the correlation metric used, complementary metrics are needed to quantify the direction (i.e., bias) and/or magnitude of error. We rely on RMSE and MAE to provide information on error or residual magnitude, and ME to provide information on bias. We utilize a visual analysis to assess error directionality over time, as this can be poorly characterized by a single coefficient, masking periodicity.
We compute R2, RMSE, MAE, and ME for time-series of the metrics described in Table 5 on page 10. These include NEE, above- and below-ground biomass, C, and N, soil organic C and N, soil respiration (rsoil), aboveground net primary production (ANPP), and, species aboveground biomass and relative abundance. All of these metrics are pools with the exception of NEE, rsoil, and ANPP fluxes. Finally, we diagnose the ability of both models to meet a range of logistical criteria related to deployment: model usability, performance, and transferability. Model usability is assessed per four criteria:
Ease of installation
Ease of parameterization
Ease of program operation
Ease of parsing outputs
Model software performance is assessed per a single metric: the speed of program execution for each site for the predefined simulation duration. The durations are 11 years and 5 years for the HF-EMS and JERC-RD EC flux tower sites, respectively. Simulation results are output at annual temporal resolution, the standard resolution for both models; while NECN operates on a monthly timestep, most other modules of LANDIS-II are annual. Finally, model transferability is assessed per the following five criteria:
Model generalizability
Availability of parameterization data
Size of the program
Cross-platform support
Ease of training new users
Each of these logistical criteria are compared in a qualitative analysis, with the exception of software performance.
3. Results and Discussion
Both PPA-SiBGC and LANDIS-II NECN showed strong performance for pools at the two model intercomparison sites, frequently achieving R2 values approaching unity. Yet, both models showed weak performance for fluxes. The models failed to accurately predict ANPP, while PPA-SiBGC showed stronger rsoil performance and LANDIS-II NECN showed stronger NEE performance. The R2 values for both models and sites are visualized in Figure 3.
On average, PPA-SiBGC outperformed LANDIS-II NECN across the sites and metrics tested, showing higher correlations, lower error, and less bias overall . This result is based on calculating mean values for R2, RMSE, MAE, and ME in order to clearly translate the overall results. The two models produced the following mean values for each of the four statistical metrics and two sites:
As shown in Table 6, PPA-SiBGC yielded higher R2 values and lower RMSE, MAE, and ME values in comparison to LANDIS-II, on average, across all sites and metrics tested. Below, we provide model intercomparison results individually for the two sites, HF-EMS and JERC-RD.
3.1. HF-EMS
For the HF-EMS site, PPA-SiBGC showed higher R2 values and lower RMSE, MAE, and ME values compared to LANDIS-II NECN across the range of metrics. While PPA-SiBGC predicted NEE and species relative abundance showed weaker correlations with observed values compared to LANDIS-II NECN, the magnitude of error was lower, as evidenced by lower RMSE, MAE, and ME values. While LANDIS-II NECN showed a lower magnitude of error for belowground N, this is the only metric where this is the case, while the correlation of this metric to observed values was also lower than that of PPA-SiBGC. Overall results for the HF-EMS site model intercomparison are shown in Table 7.
Time-series figures allow a visual analysis of the temporal dynamics between observations and model predictions in order to assess the ability of models to capture interannual variability. Both models effectively captured temporal dynamics in biomass, C, and, species biomass and abundance. In Figure 4, the temporal differences in modeled NEE and aboveground C are shown for the two models in comparison to observations for the HF-EMS site. While LANDIS-II NECN predicted NEE showed a higher correlation with observations, the magnitude of error and bias were also higher. Furthermore, LANDIS-II NECN predicted that the HF-EMS site is a net C source, rather than sink, in contrary to observations. Meanwhile, PPA-SiBGC outperformed LANDIS-II NECN in aboveground C per both R2 and RMSE. Both models overpredicted species cohort biomass, while LANDIS-II NECN underpredicted total aboveground C.
An analysis of simulated species biomass and abundance also shows greater fidelity of the PPA-SiBGC model to data, as shown in Figure 5. As LANDIS-II NECN does not contain data on individual trees, species relative abundance is calculated based on the number of cohorts of each species. Two species were simulated in LANDIS-II NECN, as there are no explicit trees in the model and the number of cohorts appears to have no effect on the total biomass. Results for PPA-SiBGC indicate that species relative abundance may be improved in future studies by optimizing mortality and fecundity rates. Meanwhile, species biomass predictions output by LANDIS-II NECN were inverted from those of the observations.
3.2. JERC-RD
For the JERC-RD site, both models showed stronger fidelity to data than for the HF-EMS site. Again, PPA-SiBGC showed higher R2 values and lower RMSE and MAE values compared to LANDIS-II NECN across the range of metrics tested. Yet, the margin between models was smaller for the JERC RD site. While PPA-SiBGC demonstrated higher correlations and lower errors for most metrics tested, LANDIS-II NECN outperformed PPA-SiBGC in a few cases. This includes lower error magnitude for NEE, aboveground N, belowground biomass, SOC, and SON. However, PPA-SiBGC showed correlations equal or higher for all metrics tested, and lower errors for all other metrics. Overall results for the JERC-RD site model intercomparison are shown in Table 8.
While both models showed higher performance at the JERC-RD site, an analysis of simulated species biomass and abundance again indicates greater fidelity of the PPA-SiBGC model to data, as shown in Figure 6. While LANDIS-II NECN overpredicts the rate of longleaf pine growth, PPA-SiBGC nearly perfectly matches observed species abundance and biomass trajectories for all species present. While the correlations are high, PPA-SiBGC overpredicts the magnitude of biomass here.
Our results for the HF-EMS and JERC-RD site model intercomparison exercise clearly indicate strong performance for both models at both sites. Results for the JERC-RD site are particularly close between the two models. Next, we assess results related to the logistics of model deployment to new computers, users, and modeling sites.
3.3. Model Usability, Performance, and Transferability
While the two models share a similar basis in forest dynamics and biogeochemistry modeling, they differ in important practical and conceptual terms. The command-line version of the PPA-SiBGC model used in this work, version 5.0, consists of approximately 500 lines of R code and is thus readily cross-platform, including cloud providers. Meanwhile, the LANDIS-II model core and NECN succession extension are an estimated 2,000 and 0.5 million lines of code, respectively. While this version of PPA-SiBGC fuses an explicit tree canopy geometry model with empirical data on fecundity, growth, mortality, and stoichiometry, the NECN extension of LANDIS-II borrows heavily from the process-based Century model [20], similar to the MAPSS-Century-1 (MC1) model [82]. This carries important implications for model parameterization needs. While PPA-SiBGC relies on typical forest inventory data, including tree species, age/size, and densities, LANDIS-II relies on species age/size and traits in the form of vital attributes, in addition to NECN parameters. Below, we summarize our findings regarding the logistics of model deployment.
3.3.1. Model Usability
In the following section, we provide an assessment of model usability based on four criteria.
Ease of installation
While LANDIS-II NECN requires the installation of two Windows programs, depending on the options desired, PPA-SiBGC is contained in a single R script and requires only a working R installation.
Ease of parameterization
While both models can be difficult to parameterize for regions with little to no observational data, the simple biogeochemistry in PPA-SiBGC requires an order of magnitude fewer parameters than LANDIS-II NECN. In addition, PPA-SiBGC uses commonly available forest inventory data while NECN requires a number of parameters that may be difficult to locate.
Ease of program operation
Both models use a command-line interface and are thus equally easy to operate. Yet, PPA-SiBGC is cross-platform and uses comma-separated-value (CSV) files for input tables, which are easier to work with than multiple tables nested within an unstructured text files. This additionally allows for simplification in designing model application programming interfaces (APIs), or model wrappers, a layer of abstraction above the models. These abstractions are important for simplifying model operation and reproducibility, and enable a number of research applications.
Ease of parsing outputs
All PPA-SiBGC outputs are provided in CSV files in a single folder while LANDIS-II NECN generates outputs in multiple formats in multiple folders. While the PPA-SiBGC format is simpler and easier to parse, the image output formats used by LANDIS-II carry considerable benefit for spatial applications. Both models may benefit by transitioning spatiotemporal data to the NetCDF scientific file format used by most general circulation and terrestrial biosphere models.
3.3.2. Model Performance
Next, we assess model performance in terms of the speed of operation on a consumer-off-the-shelf (COTS) laptop computer with a dual-core 2.8 GHz Intel Core i7-7600U CPU and 16 GB of DDR4-2400 RAM. We focus on a single performance metric, the timing of simulations. Other aspects of model performance in the form of precision and accuracy are described in previous sections. As shown in Table 9, PPA-SiBGC was between 1,200 and 2,800% faster than LANDIS-II NECN in our timing tests. This was surprising given that PPA-SiBGC models true cohorts (i.e., individual trees) in an interpreted language while LANDIS-II models theoretical cohorts (i.e., cohorts without a physical basis) in a compiled language. The difference in speed is likely attributable to the parsimony of the PPA-SiBGC model.
3.3.3. Model Transferability
Here, we discuss model transferability. In this section, we assess the effort required to transfer the models to new locations, new computer systems, or new users. All three are important logistical criteria for effective model deployment.
Model generalization
Both models appear to generalize effectively to different forested regions of the world, as both have shown strong performance in this study and others. No clear winner is evident in this regard. In terms of model realism, PPA-SiBGC has a more realistic representation of forest canopies while LANDIS-II NECN has more realistic processes, as it is a Century model variant.
Availability of parameterization data
While LANDIS-II NECN requires substantially greater parameterization data compared to PPA-SiBGC, it may often be possible to rely on previously published parameters. Meanwhile, the growth, mortality, and fecundity parameters used by PPA-SiBGC are easy to calculate using common field inventory data. PPA-SiBGC is simpler to transfer in this regard given the wide availability of forest inventory data.
Size of the program
PPA-SiBGC is approximately 500 lines of R code, while LANDIS-II NECN is estimated at 0.5 million lines of C# code.
Cross-platform support
While Linux support may soon be supported with Microsoft .NET Core, LANDIS-II NECN is written in C# and is thus limited to Microsoft Windows platforms. Meanwhile, PPA-SiBGC is written in standard R code and is fully cross-platform.
Ease of training new users
While both models have a learning curve, the practical simplicity of PPA-SiBGC may make it easier to train new users. While LANDIS-II NECN contains more mechanistic processes and related parameters, these come at the cost of confusing new users. The model wrapper library we developed as part of this work vastly eases the operation of both models. Future studies should measure the time required for new users to effectively operate both models.
3.4. Discussion
On average, the PPA-SiBGC model outperformed LANDIS-II NECN for the sites and metrics tested, showing stronger correlations, lower error, and less bias. Despite being a parsimonious model, PPA-SiBGC contains more realistic representation of stand canopy dynamics. Meanwhile, LANDIS-II NECN contains more realistic representation of biogeochemical processes, as a simplified variant of the Century model. These differences together with the results lend support to our hypothesis that vegetation dynamics drive biogeochemical pools to a higher degree than one-dimensional processes, while the latter better capture fluxes. This is evidenced by the higher performance of PPA-SiBGC in predicting pools and LANDIS-II NECN in predicting fluxes.
Empirical coefficients for the first moment of processes (e.g., growth, mortality, and fecundity) were used to parameterize the PPA-SiBGC model, while LANDIS-II NECN required a multitude of parameters for biogeochemical processes, some of which are often treated as tuning parameters. Yet, for this validation and intercomparison exercise, fully mechanistic processes were not required; both models required a host of empirical parameters that limit their prognostic abilities in their current form. Future developments with both models should improve upon this by adopting more mechanistic processes.
Replacing the simple biogeochemistry approach of PPA-SiBGC with mechanistic processes would vastly improve the energetic and biogeochemical realism of the PPA-SiBGC model. Meanwhile, LANDIS-II NECN and other variants of the Century model may improve their structural realism by incorporating canopy representations with a physical basis. This is because Century was not designed to be a plant production model [83] and contains no physically realistic representation of trees or canopies. Currently, the forest production model of Century is based theoretically on Liebig’s Law (i.e., limiting factors), employing allometric and stoichiometric relations with empirical constraints. In other words, plant production is strongly constrained by site-specific limits even though the Century model is mechanistic in other ways. Meanwhile, a lack of canopy representation strongly limits the potential number and resolution of modeled processes. In short, combining the approaches of both models may yield a more optimal solution.
In addition, our results suggest that improving the representation of forest dynamics in models may yield accurate biogeochemistry predictions even when simple allometric and stoichiometric biogeochemistry relations are used, as evidenced by the performance of PPA-SiBGC. This finding supports the current widespread focus on improving the representation of vegetation dynamics in global terrestrial biosphere models [27,28,84]. The PPA provides a uniquely efficient and tractable manner of incorporating three-dimensional canopy dynamics in global models. While many mechanistic processes may be considered mature, models of soil respiration are critical to accurate C projections and require continued development [18]. In order to achieve high levels of accuracy and precision, given the recent success of aboveground model approximations, next-generation models may require a similar breakthrough in belowground processes.
Future studies should expand upon the PPA with a first-principles representation of energetic and biogeochemical above- and below-ground processes in a modern component-based software framework. This work should fuse the new state-of-the-art forest biogeochemistry model with a model wrapper API written in R or Python, in order to expand native model functions to include sampling from parameter probability distributions, Monte Carlo methods, machine-learning model emulation, robust loss functions, and optimization. This would combine a high-performance forest model written in a compiled language with a user-friendly interface in an interpreted language.
3.5. Limitations
This study, similar to other forest modeling studies, was limited by the availability of observational data. The lack of temporal depth in this data poses substantial challenges in modeling the long-term effects of forest succession, as these processes can operate on a century timescale or longer. However, assessing succession predictions was not the aim of this study, as we instead focus on near-term validation of forest models using field measurements and EC flux tower data. Another limitation is that these methods may be challenging to implement for sites that are less well-characterized, particularly in the absence of EC flux tower data and/or tree species parameters. A combination of tower-based and remote sensing observations may help overcome this challenge in the coming years with advances in machine learning.
4. Conclusions
In conclusion, the PPA-SiBGC and LANDIS-II NECN models represent vegetation dynamics previously absent in modeling studies at these sites. These include, “…long-term increases in tree biomass, successional change in forest composition, and disturbance events, processes not well represented in current models,” which drive interannual variation in NEE [40]. While the timescale of our simulations were decidedly short-term due to data limitations, both models showed good performance. While PPA-SiBGC showed stronger performance across the range of metrics tested, including the logistics of model deployment, LANDIS-II NECN also performed well across the metrics tested. Further studies are needed to compare more aspects of these and other models based on an array of performance criteria.
Ultimately, we hope that this study serves as the foundation for future forest ecosystem model intercomparisons for the North American continent, similar in spirit to the former TDE Ecosystem Model Intercomparison project [24]. This may help create the impetus for a Global Forest Model Intercomparison Project (ForestMIP) together with modeling groups on other continents. The aims of this research were not to determine which model is ‘best’ for prognosis at two locations, but to improve the capabilities of existing models across a range of locations in order to advance earth system models. In this regard, there are beneficial aspects to both modeling approaches and the trade-offs presented largely depend on the desired application. Counter to the classical modeling trade-off of Levins [21], improvements in precision and generality resulted from realism.
Author Contributions
Individual contributions provided to complete this work include the following: conceptualization, N.S.; methodology, A.E. and N.S.; software, A.E.; validation, A.E.; formal analysis, A.E.; investigation, A.E.; resources, N.S.; data curation, A.E.; writing—original draft preparation, A.E.; writing—review and editing, A.E. and N.S.; visualization, A.E.; supervision, N.S.; project administration, N.S.; funding acquisition, N.S.
Funding
This research was funded by United States Army Corps of Engineers (USACE) contract number W912HQ-18-C-0007.
Acknowledgments
We would like to thank U.S. Department of Defense, Army Corps of Engineers, and the Environmental Security Technology Certification Program (ESTCP) for providing support necessary to conduct this work. We also thank Harvard University and Jones Ecological Research Center for kindly providing data to conduct the analyses. We would further like to thank those that provided guidance on parameterization of the LANDIS-II model. We thank Drs. Louise Loudermilk of USDA Forest Service and Steven Flanagan of Tall Timbers Research Station for providing guidance on parameterization for the Red Dirt flux tower site at JERC, and Dr. Matthew Duveneck of New England Conservatory for providing guidance on parameterization for the HF-EMS EC flux tower site at Harvard Forest. We also acknowledge Drs. Melissa Lucash and Robert Scheller, who provided comments. Last, we would like to thank Drs. Bradley Case, Hannah Buckley, Audrey Barket-Plotkin, David Orwig, Aaron Ellison and Zachary Robbins for providing some of the crown allometry parameters for Harvard Forest.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Abbreviations
The following abbreviations are used in this manuscript:
- ANPP
- Aboveground net primary production
- API
- Application programming interface
- BGC
- Biogeochemistry
- COST
- Cooperation in Science and Technology
- CPU
- Central processing unit
- CSV
- Comma-separated values
- DoD
- Department of Defense
- EC
- Eddy covariance
- ED
- Ecosystem Demography model
- EMS
- Environmental Measurement Station
- FVS
- Forest Vegetation Simulator
- GPGPU
- General-purpose graphics processing unit
- HF
- Harvard Forest
- IBIS2
- Integrated Biosphere Simulator 2
- JERC
- Jones Ecological Research Center
- L-systems
- Lindenmayer systems
- LANDIS-II
- Landscape Disturbance and Succession model 2
- LM3
- Land Model 3
- LPJ-GUESS
- Lund-Potsdam-Jena General Ecosystem Simulator
- MAE
- Mean absolute error
- MC1
- MAPSS-Century-1 model
- NECN
- Net Ecosystem Carbon and Nitrogen model
- NEE
- Net ecosystem exchange
- NSE
- Nash-Sutcliffe efficiency
- PPA
- Perfect Plasticity Approximation model
- ProFoUnd
- Towards robust projections of European forests under climate change
- RAM
- Random access memory
- RD
- Red Dirt
- RMSE
- Root mean squared error
- SAS
- Size- and age-structured equations
- SOC
- Soil organic carbon
- SON
- Soil organic nitrogen
- TDE
- Throughfall Displacement Experiment
Appendix A. Eddy covariance flux tower measurements
Appendix A.1. HF-EMS EC Flux Tower
Recent historical mean daily fluxes of temperature (°C), ecosystem respiration (µmol CO2 m−2), and NEE (µmol Cm−2) for the HF-EMS tower are shown in Figure A1.
Patterns in daytime and nighttime NEE are shown in Figure A2. This was calculated by taking daily mean NEE values for three-hour windows surrounding noon and midnight, respectively (1100-1300 and 2300-0100 hours). These patterns are important to diagnose, as they demonstrate responses to a gradient of light and temperature conditions.
Appendix A.2. JERC-RD EC Flux Tower
Recent historical mean daily fluxes of latent heat flux (LE) (W m−2), ecosystem respiration (µmol CO2 m−2), and NEE (µmol C m−2) for the RD flux tower are shown in Figure A3.
Patterns of daytime and nighttime NEE are shown in Figure A4. Again, this was calculated by taking daily mean NEE values for three-hour windows surrounding noon and midnight, respectively (1100-1300 and 2300-0100 hours).
Appendix B. Site maps
Below, we provide maps of the two research sites for reference. First is the HF-EMS EC flux tower with landcover classes Figure A5.
Next is the JERC-RD flux tower with landcover classes Figure A6.
Appendix C. Model Parameters
Appendix C.1. HF-EMS
Appendix C.1.1. PPA-SiBGC
Appendix C.1.2. LANDIS-II NECN
Appendix C.2. JERC-RD
Appendix C.2.1. PPA-SiBGC
Appendix C.2.2. LANDIS-II NECN
© 2018 by the authors. Submitted to Journal Not Specified for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
References
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.
- 8.
- 9.
- 10.↵
- 11.↵
- 12.
- 13.↵
- 14.↵
- 15.↵
- 16.↵
- 17.↵
- 18.↵
- 19.↵
- 20.↵
- 21.↵
- 22.↵
- 23.↵
- 24.↵
- 25.↵
- 26.↵
- 27.↵
- 28.↵
- 29.↵
- 30.↵
- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.↵
- 55.↵
- 56.↵
- 57.↵
- 58.↵
- 59.↵
- 60.
- 61.
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.↵
- 78.↵
- 79.↵
- 80.↵
- 81.↵
- 82.↵
- 83.↵
- 84.↵