Funding: Supported by the National Natural Science Foundation of China (41431177 and 41601413), the National Basic Research Program of China (2015CB954102), the Natural Science Research Program of Jiangsu Province, China (BK20150975 and 14KJA170001), and the Outstanding Innovation Team in Colleges and Universities in Jiangsu Province, China.
Abstract: In addition to soil samples, conventional soil maps, and experienced soil surveyors, text about soils (e.g., soil survey reports) is an important potential data source for extracting soil–environment relationships. Because the words describing soil–environment relationships are often mixed with unrelated words, the first step is to extract the needed words and organize them in a structured way. This paper applies natural language processing (NLP) techniques to automatically extract and structure information about soil–environment relationships from soil survey reports. The method includes two steps: (1) construction of a knowledge frame and (2) information extraction using either a rule-based method or a statistic-based method, depending on the type of information. For information written in a uniform style, the rule-based approach was used; the variables of this type are slope, elevation, accumulated temperature, annual mean temperature, annual precipitation, and frost-free period. For information written in diverse styles, the statistic-based method was adopted; the variables of this type are landform and parent material. The soil species of China soil survey reports were selected as the experimental dataset. Precision (P), recall (R), and F1-measure (F1) were used to evaluate the performance of the method. For the rule-based method, the P values were 1, the R values were above 92%, and the F1 values were above 96% for all the variables involved. For the method based on conditional random fields (CRFs), the P, R, and F1 values for parent material were 84.15%, 83.13%, and 83.64%, respectively; the values for landform were 88.33%, 76.81%, and 82.17%, respectively. To explore the impact of text type on the performance of the CRFs-based method, CRFs models were trained and validated separately on the descriptive texts of soil types and of typical profiles. For parent material, the maximum F1 value for the descriptive text of soil types was 90.7%, while the maximum F1 value for the descriptive text of soil profiles was only 75%. For landform, the maximum F1 value for the descriptive text of soil types was 85.33%, which was similar to that for the descriptive text of soil profiles (85.71%). These results suggest that NLP techniques are effective for extracting and structuring soil–environment relationship information from a text data source.
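As a quick consistency check on the figures above, the F1-measure can be recomputed from the reported precision and recall as their harmonic mean. The sketch below is illustrative only and not the authors' evaluation code; the function name `f1_score` and the `reported` dictionary are assumptions introduced here, and only the numbers are taken from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # convention when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) in percent, as stated in the abstract for the CRFs-based method.
reported = {
    "parent material": (84.15, 83.13, 83.64),
    "landform": (88.33, 76.81, 82.17),
}

for variable, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{variable}: computed F1 = {f1:.2f}%, reported F1 = {f1_reported:.2f}%")
```

Under the standard definition, both recomputed values agree with the reported ones to two decimal places.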
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant No. 41071267), the Scientific Research Foundation for Returned Scholars, Ministry of Education of China (Grant No. [2012]940), and the Science & Technology Department of Fujian Province, China (Grant Nos. 2012I0005, 2012J01167).
Abstract: The scale- and location-specific control on vegetation distribution was investigated using continuous wavelet transform approaches in the subtropical mountain-hill region of Fujian, China. The Normalized Difference Vegetation Index (NDVI) was calculated as an indicator of vegetation greenness from Chinese Environmental Disaster Reduction Satellite images along latitudinal and longitudinal transects. Four scales of variation were identified from the local wavelet spectrum of NDVI, with much stronger wavelet variances observed at larger scales. The characteristic scale of vegetation distribution within the mountainous and hilly regions of Southeast China was around 20 km. Significantly strong wavelet coherency was generally observed in regions with very diverse topography, typically characterized as small mountains and hills fractured by rivers and residents. The continuous wavelet-based approaches provided valuable insight into the hierarchical structure of ecosystems and its corresponding characteristic scales, which might be applied in defining proper levels in multilevel models and optimal bandwidths in Geographically Weighted Regression.
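For readers unfamiliar with the index, NDVI is conventionally defined as (NIR - Red) / (NIR + Red), where NIR and Red are reflectances in the near-infrared and red bands. The following is a minimal sketch of that standard definition only; the abstract does not state which sensor bands or preprocessing were used, so the function name `ndvi` and the sample reflectance values are hypothetical.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Standard NDVI from near-infrared and red reflectance.

    Returns NaN when both reflectances are zero, since the ratio is undefined.
    """
    denominator = nir + red
    if denominator == 0:
        return math.nan
    return (nir - red) / denominator


# Hypothetical reflectance pairs (NIR, Red) along a transect, for illustration only.
transect = [(0.50, 0.10), (0.35, 0.20), (0.20, 0.20)]
values = [ndvi(nir, red) for nir, red in transect]
print([round(v, 3) for v in values])
```

Higher values indicate denser green vegetation, and a pixel with equal NIR and red reflectance gives an NDVI of zero.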
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant No. 41071267), the Scientific Research Foundation for Returned Scholars, Ministry of Education of China ([2012]940), and the Science Foundation of Fujian Province (Grant Nos. 2012I0005, 2012J01167).
Abstract: The complex spatiotemporal vegetation variability in the subtropical mountain-hill region was investigated through a multilevel modeling framework. Three levels (parcel, landscape, and river basin) were selected to capture the vegetation variability induced by climatic, geomorphic, and anthropogenic processes at different levels. The wavelet transform method was used to construct the annual maximum Enhanced Vegetation Index and the amplitude of the annual phenological cycle from the 16-day time series of 250 m Moderate Resolution Imaging Spectroradiometer Enhanced Vegetation Index datasets for 2001-2010. Results revealed that land use strongly influenced the overall vegetation greenness and the magnitude of phenological cycles. Topographic variables also contributed considerably to the models, reflecting the positive influence of altitude and slope. Additionally, climate factors played an important role: precipitation had a considerable positive association with vegetation greenness, whereas the temperature difference had a strong positive influence on the magnitude of vegetation phenology. The multilevel approach leads to a better understanding of the complex interactions among the hierarchical ecosystem, human activities, and climate change.