Today, we’re going to talk about canonical URLs, or the fabled rel canonical tag. We’re going to talk about what good canonicalization looks like and how this can really mess up your website if you get it wrong. If you get it right, you can clean up a lot of really bad duplicate content and fix a ton of technical behind-the-scenes problems. Stay for the whole video because we’re going to talk about every possible edge case as well.
So, canonical URLs, or the rel canonical tag: we’re going to dive into this. We’re going to go pretty heavy into the technical weeds here. Just a quick reminder before we do that. Canonical tags are part of SEO and SEO is just one piece of digital marketing. This is one channel, so keep that in mind as we do this. We’re zoning in on one particular channel of what should be a much larger, more comprehensive digital marketing strategy.
Just keep that top of mind as we talk about this. Then within SEO, canonical URLs or the rel canonical tag is just one tiny component of a much larger piece of this entire puzzle. Canonical URLs, it’s really just a fancy, overly technical way of saying, “This is how we deal with duplicates.” The rel canonical tag that we use on our webpages tells search engines which version is the original.
It’s effectively pointing to the master copy of a page. Modern web applications create this massive problem for us where we have lots and lots and lots of different versions of the same thing. If we weren’t doing this, if we weren’t handling this, it would be a massive problem for the internet and for any search engine to handle. A canonical tag on a webpage tells a search engine which version of the page you want ranking, and so this is really, really vital for eCommerce sites and for any modern web application that has a sorting problem.
How to Optimize Canonical URLs
I used to manage search engine optimization at Airbnb and this was a massive problem for us. If you have a list of a thousand homes in a particular city and you have lots of different filters, there’s lots of different ways to organize and arrange that page. Sure, you have a thousand homes but some of them are the entire home. Some of them are shared. Some of them are two bedrooms, four bedrooms, eight bedrooms. Some of them have a pool. Some of them are family friendly. Some of them have a sauna. Some of them are instant books.
Every time you add an additional layer of filtering, you’re essentially rendering a new, different type of page on the same dataset, the same thousand listings. If you do this ad nauseam for as many possible filters as you can think of, you essentially get infinite pages and that becomes a giant mess for any search engine to deal with. The basic idea here is we’re saying, okay, we still want our users to be able to filter. We still want our users to be able to render the page in different ways, but we want to tell the search engine, “Hey, Google.
I know it looks like we have a thousand pages but really we only have one.” That’s what the canonical URL is. It’s a very specific suggestion to search engines to say, “This is the master copy of the URL. Only put this one in search results and ignore the rest.” A couple of other things on canonical URLs and a good way to think about this. Duplicate content is bad. Duplicate content can hurt your crawl budget. Google and other search engines allocate a certain number of requests per day or per week or per month, and if they’re spending all that time crawling garbage that you don’t want them to crawl, that’s hurtful to you.
That’s not good for you as a webmaster. Duplicate content can lower rankings. If Google is seeing a pattern of duplicates on your site over and over and over again, that can drop you in the rankings. Obviously that’s not good. Let’s say you don’t care about crawl budget. Let’s say duplicate content isn’t hurting your rankings yet. If users are finding your unnecessary garbage in search results, that’s bad. That’s not good at all. Do keep that in mind.
These are all reasons why you want to solve the duplicate content problem because it’s not good for search engines, it’s not good for you and your rankings and it’s not good for your users either. That’s why duplicate content is bad. Let’s look at some examples here. Let’s say we are nike.com and we have a page on our website called men’s shoes and we want it ranking number one in Google for the term “men’s shoes.” On nike.com/mens-shoes, this is the original version of the page and we would implement a canonical tag here: link rel="canonical" and the href is nike.com/mens-shoes.
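As a rough sketch of what that tag looks like when a template generates it, here is a small Python helper. The function name and the https://www. prefix on the URL are assumptions for illustration; the video only gives nike.com/mens-shoes.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Build the <link rel="canonical"> element that goes in a page's <head>."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# Assumed full form of the master URL; scheme and host are illustrative.
MENS_SHOES = "https://www.nike.com/mens-shoes"

if __name__ == "__main__":
    print(canonical_link_tag(MENS_SHOES))
```

Served on nike.com/mens-shoes itself, this is the self-referential case: the page names its own URL as the master copy.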
This is called a self-referential canonical tag. The original master copy of the page is pointing to itself and that’s fine. We can go ahead and do that. There’s no problem at all. We are self-canonicalizing here. No issues at all and that’s totally fine. This is the master copy. The canonical tag is pointing to itself. This is a fine suggestion for Google. Let’s look at that same page with a filter on it. Let’s say we go to nike.com/mens-shoes and we want to sort this page by everything that’s size 10, and so we add a filter.
It’s ?size=10. We now have a new page that has a new filter on it, but we don’t want that page in search results, so the canonical tag stays the same. We have this page. It exists for users but if Google were to ever find it, it sees that canonical tag in the head and we’re here saying, “You’re on this page, but actually don’t index it.
We don’t want to index this.” This would be an example where there isn’t any searcher intent for this particular type of phrase. We didn’t want this page ranking. There wasn’t any search volume for it and so we were pointing it back to the master copy. If there’s search volume for Nike shoe size 10, maybe we would leave this alone and self-canonicalize. Maybe we would want it in the search results.
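To check what a crawler would actually read on the filtered page, you can pull the canonical URL out of the HTML yourself. This is a minimal sketch using Python's standard html.parser; the class name, function name and sample markup are made up for illustration, and it says nothing about how Google parses pages internally.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> that appears."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the markup, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for the ?size=10 page, pointing back at the master copy.
FILTERED_PAGE = """
<html><head>
  <title>Men's Shoes, Size 10</title>
  <link rel="canonical" href="https://www.nike.com/mens-shoes">
</head><body></body></html>
"""

if __name__ == "__main__":
    print(find_canonical(FILTERED_PAGE))
```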
Now we’re looking at all the Nike men’s shoes that are size 10 and color red. It’s the same thing. It’s another page. It’s a different set of filtering. It’s more specific but we don’t want this page in the index. We’re going to keep the canonical tag back on the master copy. As this URL gets more and more parameters on it, we’re still pointing back to the master copy. Link rel="canonical", href, nike.com/mens-shoes. Let’s add another parameter, another filter: nike.com/mens-shoes?size=10&color=red&sale=yes.
Again, we don’t want this in the index so we’re going to go ahead and canonicalize back to the master copy. Cool. In every one of those situations, we had essentially [inaudible 00:07:19] searches, like filtering, which is too lengthy and not necessary, not useful to users coming in from search engines. Just too many pages, so we wanted to kill all of them. We want to kill all those pages so we would canonicalize those URLs up to the master copy.
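The rule in these examples boils down to "drop the filters and point at the bare URL." A minimal sketch of that in Python, assuming every query parameter is a filter with no search demand (the function name and the https://www. prefix are illustrative):

```python
from urllib.parse import urlsplit, urlunsplit


def master_copy(url: str) -> str:
    """Drop the query string and fragment so every filtered URL maps to one page."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    for filtered in (
        "https://www.nike.com/mens-shoes?size=10",
        "https://www.nike.com/mens-shoes?size=10&color=red",
        "https://www.nike.com/mens-shoes?size=10&color=red&sale=yes",
    ):
        # Each filtered URL resolves to the same master copy.
        print(filtered, "->", master_copy(filtered))
```

Whatever this returns is what goes in the href of the canonical tag on each filtered page.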
However, let’s look at a different situation here. It’s the same URL however there’s a certain type of filtering that’s really important to us. In this case, it’s Jordans. Nike’s Jordan shoe. Let’s say we are doing a promotion on the homepage and there’s 25,000 people a month searching for Nike men’s Jordan shoes, and we want this page in the index. We would self-canonicalize this URL. Same URL, nike.com/mens-shoes?type=jordans. This is a different type of filtering. In this particular case, we would not canonicalize back up to the master copy.
We would self-canonicalize. What we’re doing here is, when Google gets this URL, we’re saying to Google, “Hey, actually this page is the master copy. Please put it in your index.” In this particular situation, the page is important to us. It’s a unique and good experience for users. It has search volume behind it. We would self-canonicalize and essentially create a new page. You do want to watch out and make sure that it’s not a complete duplicate of the original men’s shoes page.
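One way to sketch this exception in code is an allowlist: parameters that have search volume stay in the canonical URL, and everything else is stripped. The INDEXABLE_PARAMS set and the function name below are hypothetical; which parameters belong on the list is a judgment call about your own search demand.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: filters with enough search volume to deserve their own page.
INDEXABLE_PARAMS = {"type"}


def canonical_for(url: str) -> str:
    """Keep allowlisted filters in the canonical URL and strip all the others."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in INDEXABLE_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    # Self-canonicalizes: the Jordans page is its own master copy.
    print(canonical_for("https://www.nike.com/mens-shoes?type=jordans"))
    # Canonicalizes up to the plain men's shoes page.
    print(canonical_for("https://www.nike.com/mens-shoes?size=10&color=red"))
```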
Make sure there’s some differentiation there for sure. This would be one way to capture a bunch of search volume that you may not be capturing if you were to canonicalize back up to the core page, because you’re not able to get it into the index if you do that. The way to think about the rel canonical tag and canonicalization in general is, what would search results look like if Google didn’t have a way to remove duplicate content? Next time you’re on an eCommerce site, every time you click anything, watch the URL bar.
It is a massive, massive, massive problem. The point here is that the internet would suck without this. It’s really good that Google has built in a technical way to handle a lot of this. We have a ton of URLs here that are effectively all the same thing: mens-shoes, mens-shoes?size=10, mens-shoes?size=10&color=red. Think about every possible permutation of this. It can start to get really messy really, really quick.
The internet would be terrible if we couldn’t handle this. This is one way to handle duplicate content. It makes it easier for both search engines and for users. A couple more tips on rel canonical and canonical URLs. Self-referential canonical tags are fine. There’s some debate out there at a really high technical level with massive web applications. TripAdvisor is doing some really interesting stuff on this.
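To get a feel for how fast this grows, here is a small counting sketch. The filter names come from the examples above, but the option lists are invented for illustration. It only counts URLs with the parameters in one fixed order; allowing the same filters in a different order would multiply the total further.

```python
from itertools import product
from math import prod

# Invented option lists; a real store would have far more filters and values.
FILTERS = {
    "size": ["8", "9", "10", "11", "12"],
    "color": ["red", "black", "white"],
    "sale": ["yes"],
}


def count_filter_urls(filters):
    """Each filter is either unset or set to one of its options."""
    return prod(len(options) + 1 for options in filters.values())


def filter_urls(base, filters):
    """Yield every URL reachable by combining the filters in a fixed order."""
    names = list(filters)
    choices = [[None] + filters[name] for name in names]
    for combo in product(*choices):
        query = "&".join(f"{n}={v}" for n, v in zip(names, combo) if v is not None)
        yield f"{base}?{query}" if query else base


if __name__ == "__main__":
    print(count_filter_urls(FILTERS))  # 6 * 4 * 2 = 48 URLs for one underlying page
```

Three small filters already turn one page into 48 URLs, and every one of them would carry the same canonical tag.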
Take a look at their source code and see what they are doing. I’ve seen evidence for and against this. It really depends on your situation. If you’re just getting into this, self-referential canonical tags are fine. Canonicalize your homepage. Homepages I find are one of the most oddly linked to things. There’s the http version. There’s https. There’s http://www. There’s https://www. There’s www.website.com/index.php, /index.html.
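The homepage variants just listed can all be collapsed onto one URL. A minimal sketch, assuming you have picked https://www.website.com/ as the core version (that choice, and every name in the code, is illustrative):

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed choice of core homepage; pick whichever variant you actually want indexed.
CANONICAL_HOME = "https://www.website.com/"
HOME_HOSTS = {"website.com", "www.website.com"}
HOME_PATHS = {"", "/", "/index.php", "/index.html"}


def homepage_canonical(url: str) -> Optional[str]:
    """Return the core homepage URL if `url` is one of its variants, else None."""
    parts = urlsplit(url)
    is_variant = (
        parts.scheme in {"http", "https"}
        and parts.netloc.lower() in HOME_HOSTS
        and parts.path in HOME_PATHS
    )
    return CANONICAL_HOME if is_variant else None


if __name__ == "__main__":
    for variant in (
        "http://website.com",
        "https://website.com/",
        "http://www.website.com",
        "https://www.website.com/index.php",
        "https://www.website.com/index.html",
    ):
        print(variant, "->", homepage_canonical(variant))
```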
There’s a ton of different ways to render homepages. People mess it up all the time. Canonicalizing your homepage to the core one, picking the core one and then canonicalizing every other variation around that is really helpful, so I highly recommend you do that. Robots.txt is generally a directive. 301 redirects are generally a directive. You’re telling a search engine, “Hey, you have to do this.” Canonical tags are a suggestion. Google has put out a lot of information on this.
Their point here is that webmasters mess this up a lot. Webmasters mess up a lot of stuff a lot and so this is one of those situations where they actively admit that they’re allowed to ignore you if they think you’re shooting yourself in the foot. Canonical tags are one additional suggestion that we give to search engines to advise them on how to handle your duplicate content. A lot of people say, “Hey, I put a canonical tag and it’s not working.
What happened?” Take a look a little more deeply at what you have going on because Google may be getting mixed signals from you, which we’ll talk about next. But the basic idea here is they’re not taking full responsibility for this tag as a directive. They’re not saying that they will absolutely enforce it. But more often than not, I see that they do. Cross-domain canonicalization is okay. One example of this might be, let’s say you’re a publishing site …
You have 20 different websites and every time you write a new blog post, it cascades across all 20 of your domains. It is totally fine if you write a blog post for website1.com. It is totally fine to post that on website2.com and website3.com and website4.com and cross-domain canonicalize back to the original. Anything that does cross-domain canonicalize won’t show up in the index. You’re essentially saying, “Hey, Google, don’t put this in the index.
The original version is over here.” However, any links that you get on any of those pages should be attributed back to the original master copy. Cross-domain canonicalization is totally fine so feel free to do that if you want. Don’t send mixed signals. There’s a lot of ways to mess this up. Taking two pages and canonicalizing them to each other, taking two pages, canonicalizing one to one and 301 redirecting one to the other, there’s a lot of different ways to mess this up.
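Both of those mistakes can be caught with a quick audit before a crawler ever sees them. This is a rough sketch that assumes you already have two maps for your site, one from each page to its canonical target and one from each page to its 301 target; the function name and the sample paths are made up.

```python
from typing import Dict, List


def find_mixed_signals(canonicals: Dict[str, str], redirects: Dict[str, str]) -> List[str]:
    """Flag canonical targets that canonicalize or 301 somewhere else."""
    problems = []
    for page, target in sorted(canonicals.items()):
        if target == page:
            continue  # self-referential canonical, nothing to check
        next_hop = canonicals.get(target, target)
        if next_hop != target:
            # Covers two pages canonicalizing to each other, and longer chains.
            problems.append(f"{page} -> {target}, but {target} canonicalizes to {next_hop}")
        if target in redirects:
            problems.append(f"{page} -> {target}, but {target} 301s to {redirects[target]}")
    return problems


if __name__ == "__main__":
    canonicals = {
        "/a": "/b",  # /a and /b canonicalize to each other
        "/b": "/a",
        "/c": "/d",  # /c canonicalizes to /d, which 301s back to /c
        "/mens-shoes": "/mens-shoes",
        "/mens-shoes?size=10": "/mens-shoes",
    }
    redirects = {"/d": "/c"}
    for problem in find_mixed_signals(canonicals, redirects):
        print(problem)
```

A clean setup returns an empty list: every canonical target is its own master copy and does not redirect.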
Don’t send mixed signals. Figure out what your plan is. Figure out what you want your master copy to be and hammer that plan. Make sure it’s very, very clear in every element. The other thing to think about is 301 redirects versus canonical tags. A lot of people say, “Okay. Essentially I want to kill a bunch of duplicate pages. I want to consolidate them all into one page.”
The general consensus is this. First of all, 301 redirect seems to be a stronger signal in terms of link equity. If you have two deprecated pages, you want to kill them, you want to pass all those links over to a different page, it feels like most people see that 301 redirects are generally more helpful in that sense. You generally seem to get more link equity when you do a 301 redirect. Do keep that in mind.
The other thing to think about is that the experience for the user is different with a canonical tag. With a 301 redirect, the user moves to the new end page. With a canonical tag, they do not. You’re saying to Google, “Hey, Google. The page you’re on is a copy. Ignore it and send any links over to this other page.” But the user is still staying there so keep that in mind. The first thing people think of is they think about manipulation.
They say, “Okay. Then I can rank super high and win the internet.” It doesn’t work that way. It looks like what Google is doing is finding a document relevance piece to this.
They want to make sure that it’s an actual copy. If the page you’re canonicalizing to is dramatically different than the one that you’re currently on, it’s going to be ignored. If you have a page about blue widgets and you’re canonicalizing to a page about gorillas, it’s not going to work is the basic point there so do keep that in mind when you think about 301 redirects versus canonical tags.
That’s it. That’s really all there is to canonical URLs and the rel canonical tag. I hope that was useful. If it was helpful and if you learned something today, go ahead and click “subscribe” down below for even more digital marketing tactics and tips from us. If you’re on YouTube, I would love a comment. What do you think? Is this how you implement rel canonical tags? Have you seen it done differently? I’d love to hear from you.
I read every single one. Finally, if you want even more from us with a super comprehensive SEO checklist, there’s a free downloadable.