Matt Cutts
Matt Cutts is a Google engineer famous for his blog, Matt Cutts: Gadgets, Google, and SEO.
This page gives explanations and summaries of what Matt Cutts and the cutting-edge Google engineers are talking about. If you are wondering what things like Bigdaddy or canonicalization are, then this is the place to start. I'll update this page frequently. For more information, see the entry on Matt's blog for the date given. Unattributed quotes are from Matt.
May 16, 2006: Matt discusses Bigdaddy, a test data centre where you could search the results of Google's pre-release crawler and indexer. It went live in December 2005 and was fully deployed by March 29, 2006.
"The sites that fit “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling."
Some sites are asking for trouble, and don't get crawled by Bigdaddy. He gives an example of a real estate site that links to an SEO contest and an Omega 3 fish oil site. Links like these may be the result of reciprocal linking, and may lead to the site not being indexed.
Affiliate sites are also frowned upon. If you just copy text from a site you are affiliated with, you add no value and can expect not to be indexed.
"the depth of the directory doesn’t make any difference for us; PageRank is a much larger factor. So without knowing your site, I’d look at trying to make sure that your site is using your PageRank well. A tree structure with a certain fanout at each level is usually a good way of doing it."
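As a rough illustration of the tree structure Matt mentions, the sketch below counts how many pages sit within a given number of clicks of the home page when every page links to a fixed number of child pages. The function name and the example numbers are hypothetical, and this only shows how far a fanout reaches; it says nothing about how Google actually computes or distributes PageRank.

```python
def pages_within(fanout, depth):
    """Count pages reachable within `depth` clicks of the home page,
    assuming every page links to exactly `fanout` new child pages.
    (Hypothetical helper for illustration only.)"""
    # Level 0 is the home page itself, level 1 has `fanout` pages, and so on.
    return sum(fanout ** level for level in range(depth + 1))


# With 10 links per page, three clicks are enough to reach 1111 pages.
print(pages_within(10, 3))
```

Under these assumptions, a modest fanout keeps even a large site only a few clicks deep, whatever its directory layout looks like.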
"I’d recommend people spend less time on trying to gather links [by asking for a link exchange] or via some automated network, and more on making a great site with a creative angle or two that makes the site stand out from the crowd."
"Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages." In one sense, the following are the same:
- www.example.com
- www.example.com/index.html
But they are different URLs. When Google “canonicalizes”, it tries to pick the best URL. You should pick the URL you prefer and use it for all your internal links. If someone links to your second choice, you can make your webserver do a 301 (permanent) redirect to your first choice, which helps Google know which URL is canonical.
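The redirect logic can be sketched in a few lines of Python. This is a minimal sketch, not how any particular webserver does it: it assumes the preferred form is the bare directory URL (so a trailing index.html is dropped), and the names INDEX_NAMES, canonical_url and redirect_for are made up for the example. In practice you would normally configure the redirect in the webserver itself.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the site owner prefers the bare directory URL,
# so a trailing "index.html" is treated as the second-choice form.
INDEX_NAMES = ("index.html",)


def canonical_url(url):
    """Return the preferred form of `url` (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment in INDEX_NAMES:
        # Drop the index file name but keep the trailing slash.
        path = path[: len(path) - len(last_segment)]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if `url` is not canonical, otherwise None."""
    target = canonical_url(url)
    if target == url:
        return None
    return 301, target


print(redirect_for("http://www.example.com/index.html"))
print(redirect_for("http://www.example.com/"))
```

The first call asks for a permanent redirect to the preferred URL; the second returns None because the request already uses the preferred form.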