MSN Search Engine - history and highlights
The MSN Search engine is fighting for dominance with the Google Search Engine and Yahoo Search Engine.
- The MSN Search engine is one of the big four. The others are: Ask Jeeves, Google, and Yahoo.
In early 2005 MSN stopped piggy-backing off Yahoo and introduced its own search engine. Like the other major search engines, MSN employs its own pure, algorithm-driven technology.
- The MSN search engine, when launched, indexed about 5 billion pages. The size of the index is now comparable to that of the Google and Yahoo indexes.
MSN search engine: differences
The MSN Search Engine includes some features that differ from those found in Google, Yahoo and Ask Jeeves. For instance, there are options beside the search box which allow you to perform focussed searches. Some of these searches are unique to MSN.
MSN allows search in news, local and image indexes. But perhaps the neatest and most distinctive feature is the ability to search the Microsoft Encarta encyclopedia.
Also, beside the search box, there is an option to use a "search builder". This offers the same ability as the "advanced search" features in other search engines. Some users might find it easier to use. It's worth a look; you might find it works better for your needs.
MSN search engine: ranking pages
How does the MSN search engine rank pages? The answer is: in much the same way as other search engines. It uses text in content, anchor text in links, heading tags, alt text and the title attribute in links (anything visible to the user in a standard browser) to weight its search results. But, like most other search engines, it does not use the meta keywords tag as a weight, because it's not visible and has a history of being spammed. The tags <b> and <strong> are weighted equally, as are <i> and <em>.
- The MSN search engine team emphasize the use of static URLs; dynamic pages are less likely to get links.
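As a rough illustration of weighting by field, here is a minimal sketch. The field names and weight values are hypothetical, chosen only to show the idea; they are not MSN's actual figures, and the only detail taken from the text is that the meta keywords tag carries no weight.

```python
# Hypothetical field weights: illustrative values only, not MSN's real numbers.
FIELD_WEIGHTS = {
    "title": 3.0,
    "heading": 2.0,
    "anchor_text": 2.0,
    "alt_text": 1.0,
    "body": 1.0,
    "meta_keywords": 0.0,  # ignored, as described above
}


def score_page(fields, query):
    """Sum weighted counts of query terms across a page's fields."""
    terms = query.lower().split()
    total = 0.0
    for name, text in fields.items():
        weight = FIELD_WEIGHTS.get(name, 0.0)  # unknown fields count for nothing
        words = text.lower().split()
        total += weight * sum(words.count(term) for term in terms)
    return total


page = {
    "title": "MSN search tips",
    "body": "Tips for using search builder",
    "meta_keywords": "search search search search",
}
print(score_page(page, "search"))
```

In this sketch, stuffing the meta keywords field makes no difference to the score, which is the behaviour the paragraph describes.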
MSN use a subset of 569 different properties to predict the relevancy of a particular document. This is more than most other companies, so you may find that pages which take account of any conceivable property do particularly well in the MSN search engine.
Microsoft research labs are avidly pursuing search technologies, and might be going in some interestingly different directions. For instance, Microsoft added MSN RankNet, a neural network, to its search engine algorithm in June 2005. It learns from human search patterns.
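The article doesn't say how RankNet works inside MSN Search. As a sketch of the general idea only, the published RankNet approach trains on pairs of documents: the network scores each one, and the cost is low when the document that should rank higher gets the higher score. The function name and the example scores below are made up for illustration.

```python
import math


def ranknet_pair_loss(score_preferred, score_other):
    """Pairwise cross-entropy cost when the first document should outrank the second.

    Equivalent to -log(sigmoid(score_preferred - score_other)), written in a
    form that avoids overflow for large score differences.
    """
    x = score_other - score_preferred
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


print(ranknet_pair_loss(2.0, 0.5))  # smaller cost: the pair is ordered correctly
print(ranknet_pair_loss(0.5, 2.0))  # larger cost: the pair is ordered wrongly
```

Training would adjust the network so that this cost, summed over many judged pairs, goes down.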
Andy Edmonds, lead program manager, and Erik Selberg are two of the geeks who work on making MSN Search better. They had a frank talk about how the MSN Search engine works on Channel 9 Forums in October 2005. They said what they're doing to beat the competition.
Some points they make:
- They don't know what Google are doing. There's no information exchange between the main players.
- They receive 8000 emails a day telling them about garbage pages and the like. They take notice, and look for patterns to feed to the neural net.
- The questioner asked them whether the neural net looks at what people click on in the SERPs. There was no comment, but you are left with the impression that this is happening.
- The team are focussing on enabling the neural network to determine the type of search. Is it aimed at purchase? Information? Comparison?
- You can choose to do a dedicated blog search, aimed at RSS, Atom or XML feeds.
- They boast that MSN search does a better search on MSDN than its competitors. (It should!)
- MSN are not aiming to be the largest database. They'd rather have the five billion most important pages than simply more pages, like Google or Yahoo.
To be really effective, the "neural net" needs to parse the writing on the page in the same way as a human being. (I'll let pass the question of "Which human being?") The neural net needs to "understand" the concepts being presented on the site. Google, basically, just ranks pages based on how many people reference them. But this can be exploited, as with the "miserable failure" bug. Parsing the thought on the page is hard, and (although they're not saying) it's unlikely that MSN have this anywhere near cracked.
The technology
Super high-end 64-bit Windows machines using parallel processing of neural nets.
- The technology passes W3C search engine validation. Yahoo comes close, Google does not!
- MSN search is the only engine that allows you to alter the SERPs by the freshness of the data.
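How MSN folds freshness into its results isn't described. One simple way to picture a freshness control is as a blend of two scores, sketched below. The function, the score ranges and the example URLs are all assumptions made for illustration.

```python
def rerank(results, freshness_weight):
    """Order results by a blend of relevance and freshness.

    Each result is (url, relevance, freshness), with both scores in [0, 1].
    freshness_weight is a hypothetical user setting in [0, 1]:
    0 means rank by relevance only, 1 means rank by freshness only.
    """
    def blended(result):
        _, relevance, freshness = result
        return (1 - freshness_weight) * relevance + freshness_weight * freshness

    return [url for url, _, _ in sorted(results, key=blended, reverse=True)]


results = [
    ("example.com/old-classic", 0.9, 0.1),
    ("example.com/new-post", 0.6, 0.95),
]
print(rerank(results, 0.0))  # relevance only: the older page comes first
print(rerank(results, 0.8))  # mostly freshness: the newer page comes first
```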
Advertising
Google hasn't nailed advertising. The long tail has not been dealt with. Finding obscure products for small numbers of people is an MSN priority.
For details of topics like aggregation and query expansion, Erik has a lot of information on his site. He created MetaCrawler, a metasearch engine now owned by InfoSpace.