A web search engine is a software system designed to search for information on the World Wide Web. Search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler. Internet content that cannot be searched by a web search engine is generally described as the deep web.
History
Internet search engines themselves predate the debut of the Web in December 1990. The Whois user search dates back to 1982, and the Knowbot Information Service multi-network user search was first implemented in 1989. The first well-documented search engine that searched content files, namely FTP files, was Archie, which debuted on September 10, 1990.
Prior to September 1993, the World Wide Web was entirely indexed by hand. There was a list of webservers edited by Tim Berners-Lee and hosted on the CERN webserver. One snapshot of the list from 1992 remains, but as more and more web servers went online, the central list could no longer keep up. On the NCSA site, new servers were announced under the title "What's New!"
The first tool used for searching content (as opposed to users) on the Internet was Archie. The name stands for "archive" without the "v". It was created by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal, Quebec, Canada. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites, since the amount of data was so limited it could be readily searched manually.
The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.
In the summer of 1993, no search engine existed for the web, though numerous specialized catalogues were maintained by hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically mirrored these pages and rewrote them into a standard format. This formed the basis for W3Catalog, the web's first primitive search engine, released on September 2, 1993.
In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web Wanderer, and used it to generate an index called "Wandex". The purpose of the Wanderer was to measure the size of the World Wide Web, which it did until late 1995. The web's second search engine, Aliweb, appeared in November 1993. Aliweb did not use a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a particular format.
NCSA's Mosaic was not the first web browser, but it was the first to make a major splash. In November 1993, Mosaic v1.0 broke away from the small pack of existing browsers with features, such as icons, bookmarks, a more attractive interface, and pictures, that made the software easy to use and appealing to "non-geeks."
JumpStation (created in December 1993 by Jonathon Fletcher) used a web robot to find web pages and to build its index, and used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources available on the platform it ran on, its indexing and hence searching were limited to the titles and headings found in the web pages the crawler encountered.
One of the first "all text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it allowed users to search for any word in any web page, which has become the standard for all major search engines since. It was also the first one widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and became a major commercial endeavor.
Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function operated on its web directory, rather than on full-text copies of web pages. Information seekers could also browse the directory instead of doing a keyword-based search.
In 1996, Netscape was looking to give a single search engine an exclusive deal as the featured search engine on the Netscape web browser. There was so much interest that instead Netscape struck deals with five of the major search engines: for $5 million a year, each search engine would be in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos, Infoseek, and Excite.
Google adopted the idea of selling search terms in 1998, from a small search engine company named goto.com. This move had a significant effect on the search engine business, which went from struggling to one of the most profitable businesses on the Internet.
Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s. Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken down their public search engine and are marketing enterprise-only editions, such as Northern Light. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.
Around 2000, Google's search engine rose to prominence. The company achieved better results for many searches with an innovation called PageRank, as was explained in the paper Anatomy of a Search Engine written by Sergey Brin and Larry Page, the later founders of Google. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others. Google also maintained a minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal. In fact, the Google search engine became so popular that spoof engines emerged such as Mystery Seeker.
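As a rough illustration of the idea, the following Python sketch runs the iterative rank computation on a toy link graph. The graph and iteration count are invented for illustration; only the damping factor of 0.85 follows the original paper, and none of this reflects Google's production system.

```python
# Minimal PageRank sketch on a toy graph (illustrative, not Google's system).
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start from a uniform distribution
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                     # dangling page: spread rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share    # each outlink passes on a share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(graph))  # "c" ranks highest: it receives the most link weight
```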
In 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! switched to Google's search engine until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.
Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from Looksmart, blended with results from Inktomi. For a short time in 1999, MSN Search used results from AltaVista instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).
Microsoft's rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal in which Yahoo! Search would be powered by Microsoft Bing technology.
Approach
A search engine maintains the following processes in near real time:
- Web crawling
- Indexing
- Searching
Web search engines get their information by web crawling from site to site. The "spider" checks for the standard filename robots.txt, addressed to it, before sending certain information back to be indexed, depending on many factors, such as the titles, page content, JavaScript, Cascading Style Sheets (CSS), headings, as evidenced by the standard HTML markup of the informational content, or its metadata in HTML meta tags. "No web crawler may actually crawl the entire reachable web. Due to infinite websites, spider traps, spam, and other exigencies of the real web, crawlers instead apply a crawl policy to determine when the crawling of a site should be deemed sufficient. Some websites are crawled exhaustively, while others are crawled only partially."
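The robots.txt check described above can be sketched with Python's standard library alone; the URLs and the user-agent name below are hypothetical placeholders, and a real crawler would add queueing, politeness delays, and error handling.

```python
# Minimal crawl-politeness sketch: consult robots.txt before fetching a page.
from urllib import robotparser
from urllib.request import urlopen

USER_AGENT = "example-spider"          # hypothetical crawler name

def fetch_if_allowed(url, robots_url):
    rp = robotparser.RobotFileParser()
    rp.set_url(robots_url)
    rp.read()                          # download and parse the site's robots.txt
    if not rp.can_fetch(USER_AGENT, url):
        return None                    # the site asked crawlers to stay away
    with urlopen(url) as response:
        return response.read()         # raw HTML, ready for the indexer

page = fetch_if_allowed("https://example.com/", "https://example.com/robots.txt")
```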
Indexing means associating words and other definable tokens found on web pages to their domain names and HTML-based fields. The associations are stored in a public database, made available for web search queries. A query from a user can be a single word. The index helps find information relating to the query as quickly as possible. Some of the techniques for indexing and caching are trade secrets, whereas web crawling is a straightforward process of visiting all sites systematically.
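The association step can be illustrated with a toy inverted index in Python. Real indexes also record token positions, fields, and weights, all of which this sketch omits, and the pages themselves are made up.

```python
# A toy inverted index: each token maps to the set of pages containing it.
import re
from collections import defaultdict

def build_index(pages):
    """pages maps a URL to its plain-text content."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(url)      # associate the token with the page
    return index

pages = {
    "https://example.com/a": "Search engines crawl the web",
    "https://example.com/b": "Crawlers feed the search index",
}
index = build_index(pages)
print(index["search"])  # both pages contain the word "search"
```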
Between visits by the spider, the cached version of the page (some or all of the content needed to render it) stored in the search engine's working memory is quickly sent to an inquirer. If a visit is overdue, the search engine can just act as a web proxy instead. In this case the page may differ from the search terms indexed. The cached page holds the appearance of the version whose words were previously indexed, so a cached version of a page can be useful to the website when the actual page has been lost, but this problem is also considered a mild form of linkrot.
Typically when a user enters a query into a search engine it is a few keywords. The index already has the names of the sites containing the keywords, and these are instantly obtained from the index. The real processing load is in generating the web pages that are the search results list: every page in the entire list must be weighted according to information in the indexes. Then the top search result item requires the lookup, reconstruction, and markup of the snippets showing the context of the keywords matched. These are only part of the processing each search results web page requires, and further pages (next to the top) require more of this post-processing.
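The snippet step might look roughly like the following sketch, which cuts a short excerpt around the first occurrence of a matched keyword. The page text is a made-up placeholder, and production engines highlight multiple terms and pick the densest passage.

```python
# Sketch of snippet markup: excerpt the context around the first keyword hit.
def snippet(text, keyword, width=30):
    pos = text.lower().find(keyword.lower())
    if pos == -1:
        return ""                               # keyword not on this page
    start = max(0, pos - width)
    end = pos + len(keyword) + width
    return "..." + text[start:end].strip() + "..."

page_text = "Web crawlers systematically browse the web and feed the indexer."
print(snippet(page_text, "feed"))  # shows the keyword in its context
```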
Beyond simple keyword lookups, search engines offer their own GUI- or command-driven operators and search parameters to refine the search results. These provide the necessary controls for the user engaged in the feedback loop created by filtering and weighting while refining the search results, given the initial pages of the first search results. For example, since 2007 the Google.com search engine has allowed one to filter by date by clicking "Show search tools" in the leftmost column of the initial search results page, and then selecting the desired date range. It is also possible to weight by date because each page has a modification time. Most search engines support the use of the boolean operators AND, OR and NOT to help end users refine the search query. Boolean operators are for literal searches that allow the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords. There is also concept-based searching, where the research involves using statistical analysis on pages containing the words or phrases you search for. As well, natural language queries allow the user to type a question in the same form one would ask it to a human. A site like this would be Ask.com.
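One way to picture the boolean operators is as set algebra over an inverted index: AND intersects posting sets, OR unions them, and NOT subtracts from the set of all known pages. The index contents below are invented for illustration.

```python
# Boolean retrieval as set operations over a toy inverted index.
index = {                                  # term -> pages containing it
    "search":  {"page1", "page2"},
    "engine":  {"page1"},
    "crawler": {"page2", "page3"},
}
all_pages = {"page1", "page2", "page3"}

def boolean_query(must=(), may=(), must_not=()):
    results = set(all_pages)
    for term in must:                      # AND: every term is required
        results &= index.get(term, set())
    if may:                                # OR: at least one term must match
        results &= set().union(*(index.get(t, set()) for t in may))
    for term in must_not:                  # NOT: exclude pages with this term
        results -= index.get(term, set())
    return results

print(boolean_query(must=["search"], must_not=["crawler"]))  # {'page1'}
```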
The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve. Two main types of search engine have evolved: one is a system of predefined and hierarchically ordered keywords that humans have programmed extensively. The other is a system that generates an "inverted index" by analyzing the texts it locates. This second form relies much more heavily on the computer itself to do the bulk of the work.
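As one simple, publicly documented stand-in for such proprietary ranking methods, the sketch below scores pages by TF-IDF: a page scores higher when the query term is frequent within it but rare across the collection. The pages are invented, and actual engines combine many more signals.

```python
# TF-IDF ranking sketch over a toy page collection (illustrative only).
import math
import re

pages = {
    "page1": "search engines rank pages by relevance",
    "page2": "web pages link to other pages across the web",
    "page3": "ranking methods are trade secrets of search engines",
}

def tfidf_ranking(term):
    term = term.lower()
    token_lists = {url: re.findall(r"[a-z0-9]+", text.lower())
                   for url, text in pages.items()}
    matching = [url for url, toks in token_lists.items() if term in toks]
    if not matching:
        return []
    idf = math.log(len(pages) / len(matching))     # rarer terms weigh more
    scores = [(toks.count(term) / len(toks) * idf, url)
              for url, toks in token_lists.items() if url in matching]
    return sorted(scores, reverse=True)            # best match first

print(tfidf_ranking("pages"))
```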
Most web search engines are commercial ventures supported by advertising revenue, and thus some of them allow advertisers to have their listings ranked higher in search results for a fee. Search engines that do not accept money for their search results make money by running search-related ads alongside the regular search engine results. The search engines make money every time someone clicks on one of these ads.
Google is the world's most popular search engine, with a market share of 74.52 percent in February 2018.
A handful of other search engines also hold a worldwide market share of more than 1 percent each.
East Asia and Russia
In some East Asian countries and in Russia, Google is not the most popular search engine.
In Russia, Yandex has a market share of 61.9 percent, compared to Google's 28.3 percent. In China, Baidu is the most popular search engine. South Korea's homegrown search portal, Naver, is used for 70 percent of online searches in the country. Yahoo! Japan and Yahoo! Taiwan are the most popular avenues for Internet searches in Japan and Taiwan, respectively.
Europe
Most countries' markets in Western Europe are dominated by Google, except for the Czech Republic, where Seznam is a strong competitor.
Search engine bias
Although search engines are programmed to rank websites based on some combination of their popularity and relevancy, empirical studies indicate various political, economic, and social biases in the information they provide and the underlying assumptions about the technology. These biases can be a direct result of economic and commercial processes (e.g., companies that advertise with a search engine can become more popular in its organic search results) and political processes (e.g., the removal of search results to comply with local laws). For example, Google will not surface certain neo-Nazi websites in France and Germany, where Holocaust denial is illegal.
Biases can also be a result of social processes, as search engine algorithms are frequently designed to exclude non-normative viewpoints in favor of more "popular" results. The indexing algorithms of major search engines skew toward coverage of U.S.-based sites, rather than websites from non-U.S. countries.
Google Bombing is an example of an attempt to manipulate search results for political, social or commercial reasons.
Some scholars have studied the cultural changes triggered by search engines, and representation of certain controversial topics in their results, such as terrorism in Ireland and conspiracy theories.
Customized results and filter bubbles
Many search engines such as Google and Bing provide customized results based on the user's activity history. This leads to an effect that has been called a filter bubble. The term describes a phenomenon in which websites use algorithms to selectively guess what information a user would like to see, based on information about the user (such as location, past click behaviour, and search history). As a result, websites tend to show only information that agrees with the user's past viewpoint, effectively isolating the user in a bubble that tends to exclude contrary information. Prime examples are Google's personalized search results and Facebook's personalized news stream. According to Eli Pariser, who coined the term, users get less exposure to conflicting viewpoints and are isolated intellectually in their own informational bubble. Pariser related an example in which one user searched Google for "BP" and got investment news about British Petroleum, while another searcher got information about the Deepwater Horizon oil spill, and noted that the two search results pages were "strikingly different". The bubble effect may have negative implications for civic discourse, according to Pariser. Since this problem has been identified, competing search engines have emerged that seek to avoid it by not tracking or "bubbling" users, such as DuckDuckGo. Other scholars do not share Pariser's view, finding the evidence in support of his thesis unconvincing.
Christian, Islamic and Jewish search engines
The global growth of the Internet and electronic media in the Arab and Muslim world during the last decade has encouraged Muslims in the Middle East and the Asian subcontinent to attempt their own search engines, their own filtered search portals that would enable users to perform safe searches. More than the usual safe-search filters, these Islamic web portals categorize websites as being either "halal" or "haram", based on interpretation of Islamic law. ImHalal came online in September 2011. Halalgoogling came online in July 2013. These use haram filters on the collections from Google and Bing (and others).
While the lack of investment and the slow pace of technology adoption in the Muslim world have hampered progress and thwarted the success of Islamic search engines targeting mainstream Muslim consumers, projects like Muxlim, a Muslim lifestyle site, did receive millions of dollars from investors like Rite Internet Ventures, and it also faltered. Other religion-oriented search engines are Jewgle, the Jewish version of Google, and SeekFind.org, which is Christian. SeekFind filters sites that attack or degrade their faith.
Search engine submission
Search engine submission is a process in which a webmaster submits a website directly to a search engine. While search engine submission is sometimes presented as a way to promote a website, it generally is not necessary because the major search engines use web crawlers that will eventually find most websites on the Internet without assistance. Webmasters can submit one web page at a time, or the entire site using a sitemap, but it is normally only necessary to submit the home page of a website, as search engines are able to crawl a well-designed website. There are two remaining reasons to submit a website or web page to a search engine: to add an entirely new website without waiting for a search engine to discover it, and to have a website's record updated after a substantial redesign.
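For illustration, a minimal sitemap file could be generated with Python's standard library as below. The URLs are placeholders, and real sitemaps often add optional elements such as lastmod and changefreq for each URL.

```python
# Generate a minimal sitemap.xml using the standard sitemap namespace.
import xml.etree.ElementTree as ET

def make_sitemap(urls, path="sitemap.xml"):
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in urls:
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url                 # one <url><loc>...</loc></url> per page
    ET.ElementTree(urlset).write(path, encoding="utf-8", xml_declaration=True)

make_sitemap(["https://example.com/", "https://example.com/about"])
```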
Source of the article: Wikipedia