Publications
- Google Scholar: https://scholar.google.com/citations?user=aCKgi18AAAAJ&hl=en
- ResearchGate: https://www.researchgate.net/profile/Lei-Zhang-360
- ORCID: https://orcid.org/0000-0002-1090-6338
2025
- ESSD: Mapping global distributions, environmental controls, and uncertainties of apparent top- and subsoil organic carbon turnover times. Lei Zhang, Lin Yang, Thomas W. Crowther, and 10 more authors. Earth System Science Data Discussions, Mar 2025
The turnover time (τ) of global soil organic carbon is central to the functioning of terrestrial ecosystems. Yet our spatially explicit understanding of depth-dependent variations and environmental controls of τ at a global scale remains incomplete. In this study, we combine multiple state-of-the-art observation-based datasets, including over ninety thousand geo-referenced soil profiles, the latest root observations distributed globally, and large amounts of satellite-derived environmental variables, to generate global maps of apparent τ in topsoil (0–0.3 m) and subsoil (0.3–1 m) layers with a spatial resolution of 30 arcsec (approximately 1 km at the Equator). We show that subsoil τ (385 [20, 3485] years [mean with a variation range from 2.5th to 97.5th percentile]) is over eight times longer than topsoil τ (15 [11, 137] years). The cross-validation shows that the fitted machine learning models effectively captured the variability in τ, with R² values of 0.87 and 0.70 for topsoil and subsoil τ mapping, respectively. The prediction uncertainties of the τ maps were quantified for better user applications. The environmental controls on top- and subsoil τ were investigated at global, biome, and local scales. Our analyses illustrate how temperature, water availability, physio-chemical properties, and depth jointly affect τ. The data-driven approaches allow us to identify their interactions, thereby enriching our comprehension of mechanisms driving nonlinear τ–environment relationships from global to local scales. The distributions of dominating factors of τ at local scales were mapped for identifying context-dependent controls on τ across different regions. By comparing model-derived maps with our observation-derived τ maps, we further reveal that the current Earth system models may underestimate τ. 
The resulting maps and the new insights demonstrated in this study facilitate future modelling of carbon cycle–climate feedbacks and support effective carbon management. The dataset is archived and freely available at https://doi.org/10.5281/zenodo.14560239 (Zhang, 2025).
- JOH: Applicability of three remote sensing based soil moisture variables for mapping soil organic matter in areas with different vegetation densities. Chenconghai Yang, Lin Yang, Lei Zhang, and 5 more authors. Journal of Hydrology, Feb 2025
Obtaining accurate spatial information on soil organic matter (SOM) is crucial for understanding the global carbon cycle. Digital soil mapping (DSM) has become an effective method for mapping SOM, in which selection of influential environmental covariates plays an important role. Soil moisture (SM) can serve as a potential covariate, especially because it can be estimated at large spatial scales thanks to remote sensing. The normalized shortwave-infrared difference bare soil moisture indices (NSDSIs), based on Landsat SWIR bands and generated for the bare soil period, have been employed in SOM mapping previously. However, soil is usually covered by vegetation; it is thus necessary to develop new SM indices applicable to areas covered with vegetation, and to examine how SM indices perform in areas with different vegetation densities. In this paper, we developed a new SM index by introducing NSDSIs to the Optical TRApezoid Model (OPTRAM-NSDSI), and compared it with the original OPTRAM with the shortwave infrared transformed reflectance (OPTRAM-STR), as well as NSDSIs. SM indices were generated across two study areas in China, i.e. Zhuxi, Fujian (104 samples and 43.93 km² with forestland and farmland as main land uses) and Heshan, Heilongjiang (106 samples and 60 km² with primarily farmland). The Integrated Nested Laplace Approximation with the Stochastic Partial Differential Equation approach was utilized as the SOM prediction model. The results suggest that adding SM variables to the commonly used environmental covariates improves the prediction accuracies. The highest accuracy improvement of 26.8% in terms of Lin’s concordance correlation coefficient in Zhuxi is obtained by NSDSIs, and the highest improvement of 56.7% in Heshan is obtained by OPTRAM-NSDSI. This may indicate that OPTRAM-NSDSI is more effective in areas with higher vegetation densities, while NSDSIs are more effective in areas with lower densities. 
Furthermore, the optimal image dates for SM estimation are probably at the vegetation “green-up” stage. This study provides a reference for using SM information to improve SOM mapping in areas covered with vegetation.
- ESE: Hydrology, vegetation, and soil properties as key drivers of soil organic carbon in coastal wetlands: A high-resolution study. Mao Guo, Lin Yang, Lei Zhang, and 3 more authors. Environmental Science and Ecotechnology, Jan 2025
Coastal wetlands are important blue carbon ecosystems that play a significant role in the global carbon cycle. However, there is insufficient understanding of the variations in soil organic carbon (SOC) stocks and the mechanisms driving these ecosystems. Here we analyze a comprehensive multi-source dataset of SOC in topsoil (0–20 cm) and subsoil (20–100 cm) across 31 coastal wetlands in China to identify the factors influencing their distribution. Structural equation models (SEMs) reveal that hydrology has the greatest overall effect on SOC in both soil layers, followed by vegetation, soil properties, and climate. Notably, the mechanisms driving SOC density differ between the two layers. In topsoil, vegetation type and productivity directly impact carbon density as primary sources of carbon input, while hydrology, primarily through seawater salinity, exerts the largest indirect influence. Conversely, in subsoil, hydrology has the strongest direct effect on SOC, with seawater salinity also influencing SOC indirectly through soil and vegetation mediation. Soil properties, particularly pH, negatively affect carbon accumulation, while climate influences SOC indirectly via its effects on vegetation and soil, with a diminishing impact at greater depths. Using Random Forest, we generate high-resolution maps (90 m × 90 m) of topsoil and subsoil carbon density (R² of 0.53 and 0.62, respectively), providing the most detailed spatial distribution of SOC in Chinese coastal wetlands to date. Based on these maps, we estimate that SOC storage to a depth of 1 m in Chinese coastal wetlands totals 74.58 ± 3.85 Tg C, with subsoil carbon storage being 2.5 times greater than that in topsoil. These findings provide important insights into the mechanisms driving the spatial pattern of blue carbon and into effective ways to assess carbon status on a national scale, thus contributing to the advancement of global blue carbon monitoring and management.
2024
- EST: Quantifying the cooling effect of urban greening driven by ecological restoration projects in China. Dong Xu, Tingting Bai, Lin Yang, and 19 more authors. Environmental Science & Technology, Nov 2024
Urban greening (UG) affects local climate by altering surface energy balance, while long-term UG cooling potential, patterns, and contribution to curbing urban warming remain unclear. Here, we designed a novel statistical model to evaluate the cooling potential of UG (CPUG) and created the first CPUG map for China. By exploring the trends in observed and simulated urban surface temperatures (UST), we quantified the CPUG of 0.20 K over the past two decades, which slowed down the warming trend by 14.17% in Chinese cities. We found that the CPUG varied significantly between the urban core and sprawl areas. Specifically, the CPUG in the urban core was approximately 1.01 K, and it contributed to curbing urban warming by 56.08%, which was more than 7.2 times higher than in the sprawl areas, where the CPUG was only 0.14 K and contributed to curbing urban warming by 9.93%. We further revealed that urbanization and major ecological restoration projects are the key factors influencing CPUG, emphasizing the need for anthropogenic vegetation management to curb urban warming. The proposed model in this study provides a powerful tool for quantitatively assessing the impact of long-term UG trends on urban warming. The results of the study are an important reference for building climate-adaptive cities.
- STOTEN: Using process-oriented model output to enhance machine learning-based soil organic carbon prediction in space and time. Lei Zhang, Gerard B.M. Heuvelink, Vera L. Mulder, and 3 more authors. Science of The Total Environment, Apr 2024
Monitoring and modelling soil organic carbon (SOC) in space and time can help us to better understand soil carbon dynamics and is of key importance to support climate change research and policy. Although machine learning (ML) has attracted a lot of attention in the digital soil mapping (DSM) community for its powerful ability to learn from data and predict soil properties, such as SOC, it is better at capturing soil spatial variation than soil temporal dynamics. By contrast, process-oriented (PO) models benefit from mechanistic knowledge to express physiochemical and biological processes that govern SOC temporal changes. Therefore, integrating PO and ML models seems a promising means to represent physically plausible SOC dynamics while retaining the spatial prediction accuracy of ML models. In this study, a hybrid modelling framework was developed and tested for predicting topsoil SOC stock in space and time for a regional cropland area located in eastern China. In essence, the hybrid model uses predictions of the PO model in unsampled years as additional training data of the ML model, with a weighting parameter assigned to balance the importance of SOC values from the PO model and real measurements. The results indicated that temporal trends of SOC stock modelled by PO and ML models were largely different, while they were notably similar between the PO and hybrid models. Cross-validation showed that the hybrid model had the best performance (RMSE = 0.29 kg m−2), with a 19 % improvement compared with the ML model. We conclude that the proposed hybrid framework not only enhances space-time soil carbon mapping in terms of prediction accuracy and physical plausibility, it also provides insights for soil management and policy decisions in the face of future climate change and intensified human activities.
- AGEE: Major contributions of agricultural management practices to topsoil organic carbon distribution and accumulation in croplands of East China over three decades. Yue Pu, Lin Yang, Lei Zhang, and 3 more authors. Agriculture, Ecosystems and Environment, Jan 2024
Multiple natural and anthropogenic factors jointly drive the spatial distribution of soil organic carbon (SOC) and its dynamics in croplands. Among these factors, agricultural management practices have caused considerable impacts. Previous studies on the driving factors of SOC in croplands have provided significant understanding on this matter. However, whether and how the effects and interplay of these drivers change over time is often unknown, especially for agricultural activities. To measure the effects of agricultural management practices on the spatial distribution and temporal change of SOC incorporating the network relationships with other natural drivers at the regional scale, we conducted partial least squares path analysis on the topsoil organic carbon content using two historical soil datasets from cropland samples in East China obtained during the 1980s and the 2010s. Eight indicators and their temporal changes reflecting climate, agricultural management, and edaphic conditions were used to quantify the driving mechanisms of the spatial distribution and temporal change of SOC. The drivers of SOC distribution showed that high SOC was mostly distributed in soils with a relatively low pH and high clay content in warm humid climates. High SOC was associated with the application of N fertilizer, crop residue input, and agricultural machinery in the 1980s and 2010s. The effect of N fertilization on SOC distribution increased from the 1980s to the 2010s, whereas the total edaphic effects significantly decreased from 0.62 to 0.25 (P < 0.01). Regarding the drivers of SOC change over the three decades, the edaphic effects presented the strongest effect (path coefficient of 0.74, P < 0.01), including the negative effects of topsoil acidity and baseline SOC level, as well as the positive effects of total nitrogen (TN) and clay content change. 
Croplands with lower intensity of management practices in the 1980s generally attained more development in agricultural modernization, which led to a considerable SOC increase. Our research revealed the importance of agricultural management practices on the spatial distribution and temporal change of cropland SOC at the regional scale. The results emphasize the need to measure the changing driving mechanisms of SOC dynamics. The findings indicate that indicators reflecting the effects of agricultural management practices should be included in digital mapping and process-based modeling of soil carbon at different spatiotemporal scales to improve prediction accuracy.
2023
- IJGIS: An adaptive uncertainty-guided sampling method for geospatial prediction and its application in digital soil mapping. Lei Zhang, A-Xing Zhu, Junzhi Liu, and 3 more authors. International Journal of Geographical Information Science, Feb 2023
Sampling design can significantly reduce the uncertainty in geospatial predictions. In this paper, we developed an adaptive uncertainty-guided stepwise sampling (AUGSS) method to select sampling locations to supplement existing legacy sample points whose representation should be improved. The proposed method selects supplemental samples in a stepwise manner as guided by an objective function with two weighted sub-objectives. One reduces the area with high prediction uncertainty, and the other minimizes the overall prediction uncertainty for the entire area. The method takes an adaptive approach to adjust weights for the two sub-objectives and to tune an uncertainty threshold controlling whether a location can be reliably predicted during the sampling procedure. A case study on soil property prediction shows that AUGSS outperforms the stratified random sampling (SRS) and the non-adaptive uncertainty guided sampling method (UGSS) in terms of RMSE and Lin’s concordance correlation coefficient with different sample sizes. This study shows that the AUGSS method offers a potential for effectively adding supplemental samples to existing samples which are insufficient for spatial prediction. The adaptive strategy guided by predicted uncertainty provides an efficient support to improve the spatial pattern of samples, which plays a key role in the result accuracy of geospatial predictive mapping.
- Geoderma: Soil organic matter mapping using INLA-SPDE with remote sensing based soil moisture indices and Fourier transforms decomposed variables. Chenconghai Yang, Lin Yang, Lei Zhang, and 1 more author. Geoderma, Sep 2023
Generating accurate spatial information on soil organic matter (SOM) is increasingly important in the context of global environmental change. Both prediction models and environmental covariates influence the mapping results and accuracy, making them important factors in SOM mapping. The Bayesian spatial model INLA-SPDE is an emerging model that has shown potential in digital soil mapping (DSM), but its application is still limited. Soil moisture, which affects soil water status and the decomposition of SOM, can be a potential predictor for mapping SOM. However, the difficulty of obtaining soil moisture measurements over a large area using ground-based methods hinders its application. Recently, high spatial resolution remote sensing (RS) has provided a possible way to generate soil moisture indices over a large area. However, the effectiveness of RS-based soil moisture indices for SOM mapping is unknown. Fourier transforms decomposed (FTD) variables based on vegetation indices have been proven effective in detecting time-series patterns of crop growth, thereby improving the mapping accuracy of farmland. Yet, the effectiveness of FTD variables has not been verified in other vegetation-covered areas. This paper examines the use of INLA-SPDE with three RS-based soil moisture indices (NSDSIs) and six FTD variables for SOM mapping, compared to Random Forest (RF), in a study area with diverse vegetation cover in Anhui Province, China. The findings indicate that with the optimal combination of environmental covariates, INLA-SPDE yields a higher prediction accuracy than RF, with an increase of 18% in R². Both the RS-based soil moisture indices and the FTD variables are effective in mapping SOM. When compared to using only natural environmental covariates, the best combination including RS-based soil moisture indices and FTD variables improved the mapping accuracy by 25% in terms of R², 21% in terms of LCCC, and 11% in terms of RMSE. 
Furthermore, quantitative prediction uncertainty maps are derived based on INLA-SPDE. This study demonstrates the effectiveness of the INLA-SPDE model with the RS-based soil moisture indices and Fourier transforms decomposed variables for SOM mapping.
- JEM: Quantifying the direct effects of long-term dynamic land use intensity on vegetation change and its interacted effects with economic development and climate change in Jiangsu, China. Feixue Shen, Lin Yang, Lei Zhang, and 3 more authors. Journal of Environmental Management, Jan 2023
Vegetation change reflects sensitive responses of the ecosystem environment to global climate change as well as land use. It is well known that land use type and its transformation affect vegetation change. However, how the changes in land use intensity (LUI) within different land use types impact vegetation and the interactions with other drivers remain poorly understood. We measured the LUI of Jiangsu Province, China, within the main land use types in 1995, 2000, 2005, 2010, 2015 and 2018 by combining remote sensing-based land use data with representative county-scale economic and social indicators. Structural equation models (SEMs) were built to quantify the influences of long-term LUI on vegetation change interacting with economic development, climate change and topographical conditions in transformed land, cropland, rural settlements and urbanized land, respectively. Seventy percent of significant vegetation change existed in non-transformed land use types. Although the area with a vegetation greening trend is larger than that with a vegetation browning trend, the vegetation browning areas are prominent in urbanized lands and some croplands in south basins. The constructed SEMs suggested a dominant negative effect of fast economic development regardless of land use types, while LUI had important but differing direct and indirect effects on vegetation change, interacting significantly with economic development and climate change in different land use types. Increasing LUI led to vegetation greening in cropland, with an effect stronger than that of climate warming and with both positive direct and indirect effects through its influence on climate change. The LUI change had negative effects on vegetation change in rural and urban areas, while a positive indirect effect of increasing LUI in urbanized land signaled the positive results of human management. We then provided some land use-specific suggestions at the basin scale for land management in Jiangsu. 
Our results highlight the necessity of long-term LUI quantification and promote the understanding of its effects on vegetation change in interaction with other drivers within different land use types. This can be very helpful for sustainable land use and management in regions with fast economic development.
2022
- Remote Sensing: A CNN-LSTM model for soil organic carbon content prediction with long time series of MODIS-based phenological variables. Lei Zhang, Yanyan Cai, Haili Huang, and 3 more authors. Remote Sensing, Jul 2022
The spatial distribution of soil organic carbon (SOC) serves as critical geographic information for assessing ecosystem services, climate change mitigation, and optimal agriculture management. Digital mapping of SOC is challenging due to the complex relationships between the soil and its environment. In addition to the well-known terrain and climate environmental covariates, vegetation that interacts with soils influences SOC significantly over long periods. Although several remote-sensing-based vegetation indices have been widely adopted in digital soil mapping, variables indicating long-term vegetation growth have been less used. Vegetation phenology, an indicator of vegetation growth characteristics, can be used as a potential time series environmental covariate for SOC prediction. A CNN-LSTM model was developed for SOC prediction with inputs of static and dynamic environmental variables in Xuancheng City, China. The spatially contextual features in static variables (e.g., topographic variables) were extracted by the convolutional neural network (CNN), while the temporal features in dynamic variables (e.g., vegetation phenology over a long period of time) were extracted by a long short-term memory (LSTM) network. The ten-year phenological variables derived from moderate-resolution imaging spectroradiometer (MODIS) observations were adopted as predictors with historical temporal changes in vegetation in addition to the commonly used static variables. The random forest (RF) model was used as a reference model for comparison. Our results indicate that adding phenological variables can produce a more accurate map, as tested by the five-fold cross-validation, and demonstrate that CNN-LSTM is a potentially effective model for predicting SOC at a regional spatial scale with long-term historical vegetation phenology information as an extra input. 
We highlight the great potential of hybrid deep learning models, which can simultaneously extract spatial and temporal features from different types of environmental variables, for future applications in digital soil mapping.
- Science Advances: Direct and indirect impacts of urbanization on vegetation growth across the world’s cities. Lei Zhang, Lin Yang, Constantin M. Zohner, and 7 more authors. Science Advances, Jul 2022
Urban environments, regarded as “harbingers” of future global change, may exert positive or negative impacts on urban vegetation growth. Because of limited ground-based experiments, the responses of vegetation to urbanization and its associated controlling factors at the global scale remain poorly understood. Here, we use satellite observations from 2001 to 2018 to quantify direct and indirect impacts of urbanization on vegetation growth in 672 worldwide cities. After controlling for the negative direct impact of urbanization on vegetation growth, we find a widespread positive indirect effect that has been increasing over time. These indirect effects depend on urban development intensity, population density, and background climate, with more pronounced positive effects in cities with cold and arid environments. We further show that vegetation responses to urbanization are modulated by a city’s developmental status. Our findings have important implications for understanding urbanization-induced impacts on vegetation and future sustainable urban development. Positive indirect effects of global urbanization on vegetation growth partially offset the negative direct impact.
- Geoderma: A multiple soil properties oriented representative sampling strategy for digital soil mapping. Lei Zhang, Lin Yang, Yanyan Cai, and 3 more authors. Geoderma, Jan 2022
Sampling design plays a key role in digital soil mapping (DSM). Efficient sampling design for multiple soil properties is increasingly needed for multivariate soil survey and mapping. However, most of the present sampling methods are not developed for multiple soil properties. Different soil properties have different influential covariates, but usually only one set of covariates is used in designing samples for multiple soil properties, which makes it difficult to map multiple soil properties accurately at the same time. This paper proposed a multiple soil properties oriented representative sampling strategy (MPRS) that considers the influential environmental covariates for each soil property. The method first selects the most influential set of environmental covariates for each soil property, then uses fuzzy c-means (FCM) clustering to generate environmental clusters relating to spatial variation patterns for each soil property, and finally selects samples that are representative of as many typical locations of environmental clusters for multiple soil properties as possible. The proposed sampling method was applied for mapping soil sand content and soil organic matter content at surface (0–20 cm) and subsurface (20–40 cm) layers in a 5900 km² study area located in Anhui Province, China, and compared with two methods, the purposive sampling (PS) method and the integrative hierarchical stepwise sampling (IHS) method. The results showed that the proposed sampling method achieved the most accurate prediction for most of the four soil properties over different sample sizes. The proposed sampling method also has the advantage of extracting representative samples that better cover multiple soil properties when the sample size is small. On average, the improvement of prediction accuracy by using the MPRS method was 38.1% and 36.3% compared with PS and IHS in terms of R², 4.8% and 4.6% in terms of RMSE, and 11.7% and 13.7% in terms of CCC, respectively. 
Our case study confirmed the necessity of considering the differences in influential environmental variable combinations when designing samples for multiple soil properties. We conclude that MPRS is a potentially effective method for supporting DSM for multiple soil properties.
- ERL: A review on digital mapping of soil carbon in cropland: progress, challenge, and prospect. Haili Huang, Lin Yang, Lei Zhang, and 6 more authors. Environmental Research Letters, Dec 2022
Cropland soil carbon not only serves food security but also contributes to the stability of the terrestrial ecosystem carbon pool due to the strong interconnection with atmospheric carbon dioxide. Therefore, better monitoring of soil carbon in cropland is helpful for carbon sequestration and sustainable soil management. However, severe anthropogenic disturbance in cropland, mainly in gentle terrain, creates uncertainty in obtaining accurate soil information with limited sample data. Within the past 20 years, digital soil mapping has been recognized as a promising technology in mapping soil carbon. Herein, to advance existing knowledge and highlight new directions, the article reviews the research on mapping soil carbon in cropland from 2005 to 2021. There is a significant shift from linear statistical models to machine learning models because nonlinear models may be more efficient in explaining the complex soil-environment relationship. Climate covariates and parent material play an important role in soil carbon on the regional scale, while on a local scale, the variability of soil carbon often depends on topography, agricultural management, and soil properties. Recently, several kinds of agricultural covariates have been explored in mapping soil carbon based on survey or remote sensing techniques, while obtaining agricultural covariates with high resolution remains a challenge. Based on the review, we identified several challenges in three categories: sampling, agricultural covariates, and representation of soil processes in models. We thus propose a conceptual framework with four future strategies: representative sampling strategies, establishing a standardized monitoring and sharing system to acquire more efficient crop management information, exploring time-series sensing data, and integrating pedological knowledge into predictive models. 
It is intended that this review will support prospective researchers by providing knowledge clusters and gaps concerning the digital mapping of soil carbon in cropland.
- Remote Sensing: Quantitative Analysis on Coastline Changes of Yangtze River Delta based on High Spatial Resolution Remote Sensing Images. Qi Wu, Shiqi Miao, Haili Huang, and 4 more authors. Remote Sensing, Jan 2022
The coastline situation reflects socioeconomic development and the ecological environment in coastal zones. Analyzing coastline changes clarifies the current coastline situation and provides a scientific basis for making environmental protection policies, especially for coastlines with significant human interference. As human activities become more intense, coastline types and their dynamic changes become more complicated, which requires more detailed identification of coastlines. High spatial resolution images can help provide detailed information on coastal zones over large areas. This study aims to map the position and status of the Yangtze River Delta (YRD) coastline using an NDWI threshold method based on 2 m Gaofen-1/Ziyuan-3 imagery and to analyze coastline change and coastline type distribution characteristics. The results showed that natural and artificial coastlines in the YRD region accounted for 42.73% and 57.27% in 2013 and 41.56% and 58.44% in 2018, respectively. The coastline generally advanced towards the sea, causing a land area increase of 475.62 km². The changes in the YRD coastline mainly resulted from a combination of large-scale artificial construction and natural factors such as silt deposition. This study provides a reference for high-resolution remote sensing monitoring of coastlines over large areas and a better understanding of land use in coastal zones.
2021
- CG: Spatiotemporal causal convolutional network for forecasting hourly PM2.5 concentrations in Beijing, China. Lei Zhang, Jiaming Na, Jie Zhu, and 3 more authors. Computers & Geosciences, Oct 2021
Air pollution in Northeastern Asia is a serious environmental problem, especially in China where PM2.5 levels are quite high. Accurate PM2.5 predictions are significant to environmental management and human health. Recently, deep learning has received increasing attention from relevant researchers. In this work, a spatiotemporal causal convolutional neural network (ST-CausalConvNet) for short-term PM2.5 prediction is proposed. The distinguishing characteristic of the proposed model is that the convolutions in the model architecture are causal, where an output at a certain time step is convolved only with elements from the same or earlier time steps in the previous layer. Accordingly, no information leakage is induced from the future to the past in this model. The spatial dependence between multiple monitoring stations was also considered in the model. Spatiotemporal correlation analysis was performed to select relevant information from monitoring stations that have a high relationship with the target station. The information from the target and related stations was then employed as the input and fed into the model. A case study from May 1, 2014 to April 30, 2015 in Beijing, China was conducted. The next-hour PM2.5 concentration was predicted by the proposed model using historical air quality and meteorological data from 36 monitoring stations. Experimental results show that the trends of the predicted PM2.5 concentrations and the observed values were consistent. The proposed method achieved a better prediction performance than the other three comparative models, namely artificial neural network (ANN), gated recurrent unit (GRU), and long short-term memory (LSTM). Furthermore, the effects of the important parameters and the model transferability were also examined. We conclude that the proposed ST-CausalConvNet is a potentially effective model for air pollution forecasting.
- Geoderma: A self-training semi-supervised machine learning method for predictive mapping of soil classes with limited sample data. Lei Zhang, Lin Yang, Tianwu Ma, and 3 more authors. Geoderma, Feb 2021
Numerous machine learning models have been developed for constructing the relationship between soil classes or properties and their environmental covariates in digital soil mapping (DSM). Most machine learning models are trained with a supervised learning (SL) method based on training samples. However, the collected sample data are often limited in practice because field sampling is expensive and time-consuming. Insufficient samples may limit the learning ability of the model to a large extent. Semi-supervised machine learning, a machine learning paradigm that makes use of both unsampled data and a small amount of sampled data in the learning process, can be a potentially effective method for DSM. In this study, we present a self-training semi-supervised learning (SSL) method for DSM. Unlike the SL method for machine learning models, the SSL method utilizes not only the sampled locations but also the abundant environmental covariate information at the unvisited locations. Its basic idea is to iteratively enlarge the training data set by adding the unsampled points with high prediction confidence from the unvisited locations until a stopping criterion is reached. The proposed SSL method was applied in machine learning models for predicting soil classes in Heshan Farm of Nenjiang County in Heilongjiang Province, China. Three machine learning models, including multinomial logistic regression (MLR), k-nearest neighbor (KNN) and random forest (RF), were selected to evaluate the efficiency of the SSL method. The entropy threshold was an important parameter in the SSL method, and a sensitivity analysis on this parameter was conducted using a series of entropy thresholds. The SSL method was compared with the SL method for the three machine learning models for soil prediction. A cross-validation was employed to evaluate the accuracy of the predicted soil class maps generated based on each method. 
The results showed that the prediction accuracies (the proportion of correctly predicted samples over the total number of validation samples) of the SSL method were higher than those of the SL method for MLR, KNN, and RF by 5.9%, 12.2%, and 6.0%, respectively. RF-SSL was the most accurate model in the study area, followed by KNN-SSL. Meanwhile, the self-training SSL method yielded the largest improvement for the KNN model compared with the other two models. Furthermore, the predicted soil maps using the SSL method showed a more reasonable spatial variation pattern of soil classes. In the study area, a suitable value of the entropy threshold was 0.8 to 1.0. We concluded that the SSL method improved the soil prediction accuracy compared with the SL method when applying machine learning models for DSM, and thus is a potentially efficient method for DSM with limited sample data.
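The self-training loop described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy k-nearest-neighbor classifier, the rule "accept a point when the entropy of its predicted class probabilities is at or below the threshold", the stopping rule (no confident points left, or `max_iter` reached), and all function and parameter names are hypothetical.

```python
import math
from collections import Counter


def knn_proba(train_X, train_y, point, classes, k=3):
    """Class probabilities as vote fractions among the k nearest labelled points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], point))
    nearest = order[:k]
    votes = Counter(train_y[i] for i in nearest)
    return [votes[c] / len(nearest) for c in classes]


def entropy(probs):
    """Shannon entropy (natural log); lower values mean higher confidence."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def self_train(X, y, unlabeled, entropy_threshold=0.5, k=3, max_iter=10):
    """Iteratively add confidently predicted unlabeled points to the training set."""
    X, y, pool = list(X), list(y), list(unlabeled)
    classes = sorted(set(y))
    for _ in range(max_iter):
        confident, remaining = [], []
        for p in pool:
            probs = knn_proba(X, y, p, classes, k)
            if entropy(probs) <= entropy_threshold:
                confident.append((p, classes[probs.index(max(probs))]))
            else:
                remaining.append(p)
        if not confident:  # assumed stopping criterion: nothing new to add
            break
        for p, label in confident:
            X.append(p)
            y.append(label)
        pool = remaining
    return X, y, pool


# Toy covariate space with two well-separated classes and three unvisited points.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["A", "A", "A", "B", "B", "B"]
unvisited = [(0.5, 0.5), (10.5, 10.5), (5, 5.2)]
X1, y1, pool1 = self_train(samples, labels, unvisited, max_iter=1)
X2, y2, pool2 = self_train(samples, labels, unvisited)
```

After one iteration only the two unambiguous points are pseudo-labelled; in this toy data the in-between point becomes confident in a later iteration once the training set has grown, which is the mechanism the abstract describes.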
- JCP: Quantifying influences of natural and anthropogenic factors on vegetation changes using structural equation modeling: A case study in Jiangsu Province, China. Lin Yang, Feixue Shen, Lei Zhang, and 3 more authors. Journal of Cleaner Production, Jan 2021
Vegetation coverage in highly developed areas has been significantly altered in response to multiple disturbances over recent decades. However, the major driving factor of vegetation coverage change in these areas remains unclear, with climate change and anthropogenic factors playing interactive roles under different soil and terrain conditions. Comprehensively understanding the underlying drivers of vegetation change can provide references for environmental management and the prevention of vegetation degradation. In this paper, a structural equation modeling (SEM) method was employed to quantify the effects of the fundamental natural environment (i.e. the relatively stable variables including soil and topography), climate change and human activity change on vegetation coverage change in Jiangsu province, China from 2000 to 2015. Four variables, namely land use, population density, road impact and night lights, were used to indicate human activities. The results showed that the increase of NDVI smaller than 0.10 covered 39.13% of the study area while the decrease of NDVI larger than 0.10 accounted for 20.23%. Areas with NDVI increase were mainly distributed in croplands in northern Jiangsu. This could be explained by the increase of crop yield due to the development of modern agriculture. The decrease of NDVI was mainly observed in southern Jiangsu, which has a higher urbanization level, and in city centers in northern Jiangsu, indicating the effect of rapid urbanization on vegetation degradation. The constructed SEM model suggested that the total effects (influential coefficients) of fundamental natural environment, climate change, and human activity change on NDVI change in Jiangsu were −0.24, 0.17, and −0.74, respectively. Although the fundamental natural environment did not have a direct effect on NDVI change, it had an indirect effect through interactions with human activities. 
We also constructed SEM models for northern and southern Jiangsu separately, due to their different natural environments and patterns of climate change. The results indicated different driving mechanisms of NDVI change in northern and southern Jiangsu. Furthermore, the results suggested night light as the best indicator of human activity change, followed by the road impact index. We concluded that our study offered a framework to better understand and explain the complex interrelationships behind the spatiotemporal change of NDVI.
- JAG: A deep learning method to predict soil organic carbon content at a regional scale using satellite-based phenology variables. Lin Yang, Yanyan Cai, Lei Zhang, and 3 more authors. International Journal of Applied Earth Observation and Geoinformation, Oct 2021
Obtaining spatial distribution information on soil organic carbon (SOC) is important for quantifying the carbon budget and guiding land management for mitigating carbon emissions. Digital soil mapping of SOC at a regional scale is challenging due to the complex SOC-environment relationships. Vegetation phenology, which directly indicates long-term vegetation growth characteristics, can provide potential environmental covariates for SOC prediction. Deep learning has recently been developed for soil mapping owing to its ability to construct high-level features from raw data. However, only dozens of predictors were used in most of those studies. It is not clear how deep learning with long-term land surface phenology products performs for SOC prediction at a regional scale. This paper explored the effectiveness of ten years of MODIS MCD12Q2 phenology variables for SOC prediction with a convolutional neural network (CNN) model in Anhui province, China. Random forest (RF) was applied for comparison with CNN using three groups of environmental variables. The results showed that adding the land surface phenology variables to the pool of natural environmental variables improved the prediction accuracy of CNN by 5.57% in RMSE and 31.29% in R2. Adding phenology variables yielded a larger accuracy improvement than adding Normalized Difference Vegetation Indices. The CNN obtained a higher prediction accuracy than RF regardless of which group of variables was used. This study proved that land surface phenology metrics were effective predictors and CNN was a promising method for soil mapping at a regional scale.
- CATENA: Soil organic carbon prediction using phenological parameters and remote sensing variables generated from Sentinel-2 images. Xianglin He, Lin Yang, Anqi Li, and 4 more authors. CATENA, Oct 2021
It is important to predict the spatial distribution of SOC accurately for mitigating carbon emissions and for sustainable soil management. Environmental variables influence the accuracy of SOC prediction with digital soil mapping (DSM) approaches. In addition to the commonly used natural predictors, remote sensing variables have recently been used in DSM. However, it remains challenging to determine which variables are effective for predicting SOC in farmland. Although phenological parameters have recently been used to indicate human activities that affect SOC in farmland, few studies employ phenological parameters in SOC prediction. Therefore, this study investigates the feasibility of SOC prediction with phenological parameters and numerous remote sensing variables extracted from Sentinel-2 at high temporal and spatial resolutions. From 34 Sentinel-2 time series images from 2018 to 2019, 17 phenological parameters were extracted for Xuanzhou, Anhui Province using a dynamic threshold method. Furthermore, fifteen remote sensing predictors comprising vegetation indices, brightness-related indices, and moisture indices were generated from the Sentinel-2 images. The phenological parameters and remote sensing variables were combined with natural variables to predict SOC contents at the surface soil layer using random forest. The results showed that the auxiliary parameters, i.e., the phenological parameters and remote sensing predictors, enhanced the predictability of SOC with an increase in R2 by 171% and a decrease in RMSE by 14%. This study also identified the relatively more important auxiliary parameters for SOC prediction: the largest data value for the fitted function during the season (a6), rate of increase at the beginning of the season (a8), large seasonal integral (a10), SATVI, and Band8. Therefore, this study verified that the phenological parameters and remote sensing predictors extracted from the Sentinel-2 EVI time series are effective for DSM in farmland.
- IJGIS: Extracting knowledge from legacy maps to delineate eco-geographical regions. Lin Yang, Xinming Li, Qinye Yang, and 4 more authors. International Journal of Geographical Information Science, Feb 2021
Legacy ecoregion maps contain knowledge on relationships between eco-region units and their environmental factors. This study proposes a method to extract knowledge from legacy area-class maps to formulate a set of fuzzy membership functions useful for regionalization. We develop a buffer zone approach to reduce the uncertainty of boundaries between eco-region units on area-class maps. We generate buffer zones with a Euclidean distance perpendicular to the boundaries; the original eco-region units without buffer zones then serve as the basic units to generate the probability density functions (PDFs) of environmental variables. Then, we transform the PDFs to fuzzy membership functions for class-zones on the map. We demonstrate the proposed method with a climatic zone map of China. The results showed that the buffer zone approach effectively reduced the uncertainties of boundaries. A buffer distance of 10–15 km was recommended in this study. The climatic zone map generated based on the extracted fuzzy membership functions showed a higher spatial stratification heterogeneity than the original map. Based on the fuzzy membership functions with climate data of 1961–2015, we also prepared an updated climatic zone map. This study demonstrates the prospects of using fuzzy membership functions to delineate area classes for regionalization purposes.
2020
- Geoderma: Comparison of conditioned Latin hypercube and feature space coverage sampling for predicting soil classes using simulation from soil maps. Tianwu Ma, Dick J. Brus, A-Xing Zhu, and 2 more authors. Geoderma, Jul 2020
This study investigates sampling design for mapping soil classes based on multiple environmental features associated with the soil classes. Two types of sampling design for calibrating the prediction models are compared: conditioned Latin hypercube sampling (CLHS) and feature space coverage sampling (FSCS). Simple random sampling (SRS), which does not utilize the environmental features, is added as a reference design. The sample sizes used are 20, 30, 40, 50, 75, and 100 points, and at each sample size 100 sample sets were drawn using each of the three types of design. Each of these sample sets was then used to calibrate three prediction models: random forest (RF), individual predictive soil mapping (iPSM), and multinomial logistic regression (MLR). These sampling designs were compared based on the overall accuracy of predicted soil class maps obtained by these three prediction methods. The comparison was conducted in two study areas: Ammertal (Germany) and Raffelson (USA). For each of these two areas a detailed legacy soil class map is available. These soil class maps were used as references in a simulation study for the comparison. Results of both study areas show that on average FSCS outperforms CLHS and SRS for all three prediction methods. The difference in estimated medians of overall accuracy with CLHS and SRS was marginal. Moreover, the variation in overall accuracy among sample sets of the same size was considerably smaller for FSCS than that for CLHS. These results in the two study areas suggest that FSCS is a more effective sampling design.