Jian Pei, in Data Mining (Third Edition), 2012
1.6.2 Web Search Engines
A Web search engine is a specialized computer server that searches for information on the Web. The search results of a user query are often returned as a list (sometimes called hits). The hits may consist of web pages, images, and other types of files. Some search engines also search and return data available in public databases or open directories. Search engines differ from web directories in that web directories are maintained by human editors, whereas search engines operate algorithmically or by a mixture of algorithmic and human input.
Web search engines are essentially very large data mining applications. Various data mining techniques are used in all aspects of search engines, ranging from crawling (e.g., deciding which pages should be crawled and the crawling frequencies) and indexing (e.g., selecting pages to be indexed and deciding to what extent the index should be constructed) to searching (e.g., deciding how pages should be ranked, which advertisements should be added, and how the search results can be personalized or made “context aware”).
Search engines pose grand challenges to data mining. First, they have to handle a huge and ever-growing amount of data. Typically, such data cannot be processed using one or a few machines. Instead, search engines often need to use computer clouds, which consist of thousands or even hundreds of thousands of computers that collaboratively mine the huge amount of data. Scaling up data mining methods over computer clouds and large distributed data sets is an area for further research.
Second, Web search engines often have to deal with online data. A search engine may be able to afford constructing a model offline on huge data sets. For example, it may construct a query classifier that assigns a search query to predefined categories based on the query topic (e.g., whether the search query “apple” is meant to retrieve information about a fruit or a brand of computers). Even if the model is constructed offline, applying the model online must be fast enough to answer user queries in real time.
Another challenge is maintaining and incrementally updating a model on fast-growing data streams. For example, a query classifier may need to be maintained incrementally and continuously, since new queries keep emerging and both the predefined categories and the data distribution may change. Most of the existing model training methods are offline and static and thus cannot be used in such a scenario.
Third, Web search engines often have to deal with queries that are asked only a very small number of times. Suppose a search engine wants to provide context-aware query recommendations. That is, when a user poses a query, the search engine tries to infer the context of the query using the user's profile and query history in order to return more customized answers within a small fraction of a second. However, although the total number of queries asked can be huge, most of the queries may be asked only once or a few times.
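The incrementally maintained query classifier discussed in this section can be illustrated with a minimal sketch. It is not the method of any particular search engine: it assumes a multinomial naive Bayes model over whitespace-separated query terms, and the class name `IncrementalQueryClassifier`, the `smoothing` parameter, and the category labels are all hypothetical names chosen for illustration. The point it shows is that each newly labeled query only increments a few counters, so the model can follow a query stream, including categories that did not exist at the start, without offline retraining.

```python
import math
from collections import Counter, defaultdict


class IncrementalQueryClassifier:
    """Multinomial naive Bayes over query terms, updated one query at a time.

    Illustrative sketch only; tokenization is a plain whitespace split.
    """

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing                # additive (Laplace) smoothing
        self.category_counts = Counter()          # labeled queries per category
        self.term_counts = defaultdict(Counter)   # category -> term -> count
        self.vocabulary = set()                   # all terms seen so far

    def update(self, query, category):
        """Fold one newly labeled query into the model without retraining."""
        terms = query.lower().split()
        self.category_counts[category] += 1
        self.term_counts[category].update(terms)
        self.vocabulary.update(terms)

    def classify(self, query):
        """Return the most probable category, or None if nothing was learned."""
        if not self.category_counts:
            return None
        terms = query.lower().split()
        total_queries = sum(self.category_counts.values())
        vocab_size = len(self.vocabulary) or 1
        best_category, best_score = None, -math.inf
        for category, seen in self.category_counts.items():
            counts = self.term_counts[category]
            denom = sum(counts.values()) + self.smoothing * vocab_size
            # Log prior plus smoothed log likelihood of each query term.
            score = math.log(seen / total_queries)
            for term in terms:
                score += math.log((counts[term] + self.smoothing) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


if __name__ == "__main__":
    clf = IncrementalQueryClassifier()
    clf.update("apple pie recipe", "fruit")
    clf.update("apple macbook price", "computers")
    print(clf.classify("apple recipe"))   # fruit
    # A new category can appear later in the stream.
    clf.update("jaguar top speed", "animals")
    print(clf.classify("jaguar speed"))   # animals
```

Classification touches only the counters for the terms in the query, which keeps the online step cheap. A production system would need far richer features and a way to age out stale counts as the data distribution drifts; neither is attempted here.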