Funding: Supported by the Research Fund for International Young Scientists of the National Natural Science Foundation of China under Grant No. 61550110248 and the Tibet Autonomous Region Key Scientific Research Projects under Grant No. Z2014A18G2-13.
Abstract: Focused crawlers (also known as subject-oriented crawlers), the core component of a vertical search engine, collect as many topic-specific web pages as they can to form a subject-oriented corpus for later data analysis or user querying. This paper shows that the popular algorithms used in focused web crawling fall essentially into two groups: webpage analysis algorithms and crawling strategies (which prioritize the uniform resource locators (URLs) in the queue). The first experiment shows the advantages and disadvantages of three crawling strategies and indicates that best-first search with an appropriate heuristic is a smart choice for topic-oriented crawling, while depth-first search is of little use in focused crawling. A second experiment compares improved versions (with a webpage analysis algorithm added) and verifies that crawling strategies alone are not very efficient for focused crawling, so in most cases the two kinds of algorithm are used together. In light of the experimental results and recent research, some points on research trends in focused crawler algorithms are suggested.
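As a rough illustration of the best-first strategy mentioned in the abstract, the sketch below keeps the URL queue in a priority heap and gives each discovered link the relevance score of the page it was found on. It is not the paper's implementation: the names `relevance` and `best_first_crawl`, the in-memory `graph` and `texts` dictionaries that stand in for the web, and the term-frequency heuristic are all assumptions made for this example.

```python
import heapq


def relevance(text, topic_terms):
    """Hypothetical heuristic: fraction of words that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in topic_terms for w in words) / len(words)


def best_first_crawl(graph, texts, seeds, topic_terms, limit):
    """Visit up to `limit` pages, always expanding the most promising URL.

    graph: dict mapping a URL to the URLs it links to (stand-in for fetching).
    texts: dict mapping a URL to its page text (stand-in for parsing).
    """
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        # Links inherit the relevance of the page they were found on.
        page_score = relevance(texts.get(url, ""), topic_terms)
        for link in graph.get(url, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-page_score, link))
    return visited


# Toy link graph: page "a" is on topic, page "b" is not.
graph = {"seed": ["a", "b"], "a": ["a1"], "b": ["b1"], "a1": [], "b1": []}
texts = {
    "seed": "crawler news",
    "a": "focused crawler crawler",
    "b": "cooking recipes",
    "a1": "crawler",
    "b1": "soup",
}
order = best_first_crawl(graph, texts, ["seed"], {"crawler", "focused"}, 4)
```

In this toy graph the link found on the on-topic page `a` is visited before the off-topic page `b`, whereas a breadth-first crawl would finish the first level before going deeper.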
Funding: Supported by the Research Fund for International Young Scientists of the National Natural Science Foundation of China under Grant No. 61550110248 and the Sichuan Science and Technology Program under Grant No. 2019YFG0190.
Abstract: This paper explores a way of implementing the classical genetic algorithm (GA) with memristors. The memristor is a circuit device that combines storage and computing, which makes it resemble biological components such as neurons, and the structure of a memristor-based array is similar to that of chromosomes in genetics. The array also resembles the gray-value matrix of an image, so it can be applied to image restoration with GA. A memristor-based GA is therefore proposed, and an image restoration experiment using it is carried out in this paper. Parameters such as the initial population size and the number of iterations are set to different values in the experiment, which demonstrates the feasibility of implementing GA with memristors.
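The correspondence between a chromosome and a gray-value matrix can be sketched in plain software as follows. This is only a toy GA under stated assumptions, not the memristor-based method of the paper: the memristor hardware is not modeled, the function names `fitness` and `evolve` are hypothetical, and fitness is measured against a known reference image, which a real restoration task would not have available.

```python
import random


def fitness(chrom, reference):
    """Negative total gray-value error; higher is better (assumed metric)."""
    return -sum(abs(a - b) for a, b in zip(chrom, reference))


def evolve(reference, pop_size=20, generations=50, mutation_rate=0.05, seed=0):
    """Evolve flattened gray-value matrices (values 0..255) toward `reference`.

    Returns the best chromosome found and the best fitness in the initial
    population. Population size and generation count are the tunable
    parameters the abstract mentions.
    """
    rng = random.Random(seed)
    n = len(reference)
    population = [[rng.randrange(256) for _ in range(n)] for _ in range(pop_size)]
    initial_best = max(fitness(c, reference) for c in population)
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        population.sort(key=lambda c: fitness(c, reference), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally replace a gene with a random gray value.
            child = [
                rng.randrange(256) if rng.random() < mutation_rate else gene
                for gene in child
            ]
            children.append(child)
        population = parents + children
    best = max(population, key=lambda c: fitness(c, reference))
    return best, initial_best


# A 4x4 image flattened row by row into a 16-gene chromosome.
reference = [0, 64, 128, 255] * 4
best, initial_best = evolve(reference)
```

Because the surviving parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next; varying `pop_size` and `generations` is one way to see how those parameters affect the result.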