Funding: Project supported by the National Natural Science Foundation of China (No. 40101014) and the Science and Technology Committee of Zhejiang Province (No. 001110445), China
Abstract: This article presents two approaches for the automated building of knowledge bases for soil resources mapping. These methods used decision tree and Bayesian predictive modeling, respectively, to generate knowledge from training data. With these methods, building a knowledge base for automated soil mapping is easier than using the conventional knowledge acquisition approach. The knowledge bases built by these two methods were used by the knowledge classifier for soil type classification of the Longyou area, Zhejiang Province, China, using bi-temporal TM imagery and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by these two methods were of good quality for mapping the distribution of soil classes over the study area.
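The Bayesian predictive-modeling route described above can be sketched as a minimal naive Bayes learner over categorical environmental attributes. The feature names, attribute values, and soil classes below are illustrative assumptions, not the paper's actual training data:

```python
from collections import Counter, defaultdict

# Hypothetical training samples: (elevation_class, land_use, soil_type).
# Attribute names and soil classes are invented for illustration only.
train = [
    ("low", "paddy", "Paddy soil"),
    ("low", "paddy", "Paddy soil"),
    ("high", "forest", "Red soil"),
    ("high", "orchard", "Red soil"),
    ("low", "orchard", "Paddy soil"),
]

def learn_knowledge_base(samples):
    """Estimate P(class) and P(feature=value | class) counts from training data."""
    priors = Counter(s[-1] for s in samples)
    cond = defaultdict(Counter)  # (feature_index, class) -> value counts
    for *features, label in samples:
        for i, v in enumerate(features):
            cond[(i, label)][v] += 1
    return priors, cond

def classify(priors, cond, features):
    """Pick the soil class maximizing prior * product of conditional probabilities."""
    best, best_score = None, -1.0
    total = sum(priors.values())
    for label, n in priors.items():
        score = n / total
        for i, v in enumerate(features):
            score *= cond[(i, label)][v] / n  # zero if value unseen for this class
        if score > best_score:
            best, best_score = label, score
    return best

priors, cond = learn_knowledge_base(train)
print(classify(priors, cond, ("low", "paddy")))  # → Paddy soil
```

In practice each training sample would be derived by overlaying the field-surveyed soil map with the GIS and imagery layers at sample locations.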
Funding: Project 40771143 supported by the National Natural Science Foundation of China, and Project 2007AA12Z162 by the Hi-tech Research and Development Program of China
Abstract: In this study, analyses are conducted on the information features of a construction site, a cornfield and subsidence seeper land in a coal mining area with a synthetic aperture radar (SAR) image of medium resolution. Based on features of land cover of the coal mining area, and on texture feature extraction and a selection method for a gray-level co-occurrence matrix (GLCM) of the SAR image, we propose that the optimum window size for computing the GLCM is an appropriately sized window that can effectively distinguish different types of land cover. Next, a band combination was carried out over the texture feature images and the band-filtered SAR image to obtain a new multi-band image. After transformation of the new image with principal component analysis, a classification was conducted selectively on the three principal component bands with the most information. Finally, through training and experimenting with the samples, a three-layered BP neural network was established to classify the SAR image. The results show that, assisted by texture information, the neural network classification improved the accuracy of SAR image classification by 14.6%, compared with a classification by maximum likelihood estimation without texture information.
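The core texture step above, computing a GLCM over a window and deriving a texture measure from it, can be sketched in plain Python. The tiny 4x4 window, the horizontal (0, 1) offset, and the use of Haralick contrast as the example statistic are illustrative choices, not the paper's actual parameters:

```python
# Minimal sketch: a gray-level co-occurrence matrix (GLCM) for one window
# at a single pixel offset, plus the Haralick contrast texture measure.

def glcm(window, levels, dx=0, dy=1):
    """Count co-occurrences of gray levels at offset (dx, dy) within the window."""
    m = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dx, c + dy
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[window[r][c]][window[r2][c2]] += 1
    return m

def contrast(m):
    """Haralick contrast: sum of (i-j)^2 weighted by normalized co-occurrence counts."""
    total = sum(sum(row) for row in m) or 1
    return sum((i - j) ** 2 * v / total
               for i, row in enumerate(m) for j, v in enumerate(row))

# A hypothetical 4x4 window of quantized gray levels (4 levels).
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
print(contrast(glcm(img, levels=4)))  # ≈ 0.583 for this window
```

Choosing the window size then amounts to recomputing such statistics at several sizes and keeping the smallest window at which the land-cover classes remain separable.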
Abstract: Some products are sold only in certain regions and some only in certain months, so the database contains many gaps. Selecting too much of this vacant data may affect knowledge extraction. Therefore, for the specific sales database, we chose to study six representative products whose sales are relatively complete across all regions and months, comprising two conventional products, two special products and two accessories; we also chose seven sales regions for detailed analysis. We selected nine sales characteristics from the database: monthly sales volume, monthly revenue, monthly average selling price, profit, program sales volume, completed percentage of a program, sales month, sales territory and product type.
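The nine selected characteristics can be pictured as one record per product, region and month. The field names, types and sample values below are a hypothetical sketch of such a record, not the paper's actual schema:

```python
from dataclasses import dataclass

# Hypothetical record carrying the nine sales characteristics named above.
@dataclass
class SaleRecord:
    monthly_sales_volume: int
    monthly_revenue: float
    monthly_avg_price: float
    profit: float
    program_sales_volume: int
    program_completed_pct: float
    sales_month: int        # 1-12
    sales_territory: str    # one of the seven chosen regions
    product_type: str       # conventional / special / accessory

rec = SaleRecord(120, 36000.0, 300.0, 8000.0, 100, 0.83, 6, "East", "conventional")
print(rec.product_type)  # → conventional
```

Records with vacant fields (products unsold in a given region or month) would simply be absent, which is why restricting the study to products with relatively complete coverage matters for extraction.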
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61100205 and No. 60873001, the Hi-tech Research and Development Program of China under Grant No. 2011AA010705, and the Fundamental Research Funds for the Central Universities under Grant No. 2009RC0212
Abstract: Webpage classification differs from traditional text classification in its irregular words and phrases and its massive, unlabeled features, which makes it harder to obtain effective features. To cope with this problem, we propose two scenarios to extract meaningful strings, based on document clustering and term clustering with multiple strategies, to optimize a Vector Space Model (VSM) in order to improve webpage classification. The results show that document clustering works better than term clustering in coping with document content. However, a better overall performance is obtained by spectral clustering combined with document clustering. Moreover, since images exist on the same webpage as the document content, the proposed method is also applied to extract meaningful terms for images, and experimental results also show its effectiveness in improving webpage classification.
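The Vector Space Model underlying the approach above represents each document as a vector over a vocabulary and compares documents by cosine similarity, the measure on which clustering steps such as those described typically rest. The toy documents and plain term-frequency weighting below are illustrative assumptions:

```python
import math
from collections import Counter

def vsm(doc_tokens, vocab):
    """Term-frequency vector over a fixed vocabulary (a simplified VSM)."""
    tf = Counter(doc_tokens)
    return [tf[t] for t in vocab]

def cosine(a, b):
    """Cosine similarity between two term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical tokenized webpages: two on sport, one on finance.
docs = {
    "d1": "soccer match goal soccer".split(),
    "d2": "goal soccer league".split(),
    "d3": "stock market price".split(),
}
vocab = sorted({t for toks in docs.values() for t in toks})
vecs = {d: vsm(toks, vocab) for d, toks in docs.items()}
print(cosine(vecs["d1"], vecs["d2"]) > cosine(vecs["d1"], vecs["d3"]))  # → True
```

Document clustering groups such vectors by pairwise similarity; term clustering instead groups the vocabulary dimensions, which is one way to read the paper's comparison between the two scenarios.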
Abstract: The paper considers the problem of semantic processing of web documents by designing an approach which combines an extracted semantic document model with a domain-related knowledge base. The knowledge base is populated with learnt classification rules categorizing documents into topics. Classification provides for the reduction of the dimensionality of the document feature space. The semantic model of retrieved web documents is semantically labeled by querying a domain ontology and processed with a content-based classification method. The model obtained is mapped to the existing knowledge base by implementing an inference algorithm. It enables models of the same semantic type to be recognized and integrated into the knowledge base. The approach provides for domain knowledge integration and assists the extraction and modeling of web document semantics. Implementation results of the proposed approach are presented.
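The step of applying learnt classification rules to categorize document models into topics can be sketched as simple rule firing over a document's extracted terms. The rules, terms, and topic names below are hypothetical illustrations, not the paper's actual rule base:

```python
# Minimal sketch of rule-based topic categorization over a document's
# semantic model, represented here as a set of extracted terms.

rules = [
    # (terms required in the document model, topic assigned when they all occur)
    ({"symptom", "treatment"}, "medicine"),
    ({"court", "verdict"}, "law"),
]

def categorize(doc_terms, rules, default="unknown"):
    """Fire the first rule whose required terms are all present in the document."""
    for required, topic in rules:
        if required <= doc_terms:   # subset test: all required terms occur
            return topic
    return default

doc = {"patient", "symptom", "treatment", "dosage"}
print(categorize(doc, rules))  # → medicine
```

Mapping a categorized model into the knowledge base would then reduce to checking whether a model of the same topic and semantic type already exists and merging the two, which is the role of the inference step in the abstract.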