Canonicalization: Fix Duplicate Content

One shop, two addresses, zero sales.

Bea runs a jewellery shop online. She had been selling handmade earrings and bracelets for over a year, but her traffic from Google was oddly flat. She was publishing product descriptions, writing blog posts about jewellery care, and doing everything the guides told her to do. Nothing moved the needle.

When she finally asked for a technical audit, the problem was embarrassingly simple. Her website was accessible at both www.beahandmade.com and beahandmade.com, two different addresses loading the exact same content. Google was treating them as two separate websites. Instead of building ranking power on one strong site, her authority was being split across two identical copies. Every link, every customer visit, every social media share counted towards one version or the other, never both.

Bea had no idea this was happening. As far as she could tell, her website was fine. It loaded, it looked good, customers could place orders. But behind the scenes, Google was confused. And when Google is confused, your rankings suffer.

This is one of the most common technical SEO problems in existence. And the fix takes about five minutes.

The problem

What duplicate content actually means.

Duplicate content means the same content (or very similar content) appears at more than one web address. To a human, this seems harmless. If your about page loads at both yoursite.com/about and yoursite.com/about/, you would never notice — the page looks identical. But to Google, those are two completely different URLs, and it has to decide which one to show in search results.

Here is an analogy. Imagine you own a physical shop, and the city accidentally listed your business at two different street addresses. One listing says you are at 123 Main Street. The other says 456 Main Street. Both addresses lead to the same shop. But when a delivery driver looks you up, they do not know which address to use. Some deliveries go to one, some go to the other, and your business reputation gets split between two listings instead of building on one.

That is exactly what happens with duplicate content online. Google sees multiple addresses for the same content and has to guess which one matters. Sometimes it guesses wrong. Sometimes it splits your ranking signals across both versions, weakening each one.

This is more common than you think

Most small business websites have duplicate content without the owner knowing. Here are the most common causes:

www vs. non-www: www.yoursite.com and yoursite.com both load the same site
HTTP vs. HTTPS: http://yoursite.com and https://yoursite.com both work
Trailing slashes: yoursite.com/services and yoursite.com/services/ both exist
URL parameters: yoursite.com/shoes and yoursite.com/shoes?color=red&size=8 show the same product
Pagination: Blog archive page 1 and blog archive page 2 share introductory content
Sorting and filtering: yoursite.com/products and yoursite.com/products?sort=price-low show the same products in different order

If even one of these situations applies to your site, you have duplicate content. And if you have not addressed it, Google is guessing which version to rank.
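To make the first three causes concrete, here is a rough sketch of how a list of URLs could be grouped by the single address they ought to share. The function names are made up for illustration, and the chosen preference (https, no www, trailing slash) is only an assumption; your site may prefer a different form. Query strings are left alone here.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def preferred_form(url: str) -> str:
    """Collapse scheme, www, and trailing-slash variants into one form.

    Assumed preference: https, no www, trailing slash on every path.
    Expects a full URL that includes the scheme (http:// or https://).
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"
    # The query string is kept as-is; only the first three causes are handled.
    return urlunsplit(("https", host, path, parts.query, ""))


def group_duplicates(urls):
    """Return {preferred URL: [variants]} for every group with 2+ variants."""
    groups = defaultdict(list)
    for url in urls:
        groups[preferred_form(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


urls = [
    "http://www.yoursite.com/services",
    "https://yoursite.com/services/",
    "https://yoursite.com/about/",
]
duplicates = group_duplicates(urls)
print(duplicates)
```

Any group that comes back with more than one entry is a set of addresses that should be reduced to one with a redirect or a canonical tag.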

Why it matters

How duplicate content hurts your visibility.

Lorenzo runs a travel blog. He writes detailed guides about destinations across the Philippines. His blog uses pagination — when visitors reach the bottom of his articles list, they click “Next Page” to see older posts. The problem? His blog’s introductory text, sidebar content, and meta description appeared on every paginated page. Pages 1, 2, 3, and 4 of his blog archive all looked substantially similar to Google.

Lorenzo noticed that his blog posts were not ranking as well as competitors who wrote shorter, less detailed content. The reason was not his writing. It was that Google could not figure out which of his archive pages was the main one. Instead of sending all the ranking power to page 1 of his blog, Google was distributing it across four near-identical pages.

Here is what happens when Google encounters duplicate content:

Diluted ranking signals. Links, social shares, and visitor engagement are split between duplicate versions instead of concentrating on one URL. If ten websites link to your content but five link to version A and five link to version B, each version only gets half the benefit.
Wasted crawl budget. Google allocates a certain amount of time to crawl your website. Every minute it spends crawling a duplicate page is a minute it could have spent discovering your new content.
Wrong page in search results. Google might decide to show the wrong version of your page. Instead of your clean, well-written product page, it might show the version with messy URL parameters appended.
Inconsistent user experience. If Google indexes the non-www version but your internal links point to the www version, visitors bounce between two versions. One of those versions might not have your SSL certificate properly configured, so some visitors see a security warning.

The common misconception is that Google “penalises” duplicate content. Technically, it does not apply a formal penalty. But the practical effect — lower rankings, confused indexing, wasted crawl budget — feels exactly like a penalty. Whether you call it a penalty or not, the result is the same: less visibility for your business.

The solution

Canonical tags tell Google which version is the real one.

A canonical tag (also written as rel="canonical") is a small piece of HTML code placed in the <head> section of a web page. Its job is simple: it points to the “official” version of that page.

Think of it as a sign on a building that says, “This is a copy. The real office is at this address.” Google reads the sign, nods, and sends all the ranking power to the address you specified.

Here is what the tag looks like in HTML:

<link rel="canonical" href="https://yoursite.com/preferred-page/">

You place this tag in the <head> section of every page on your site. On your preferred (official) page, the canonical tag points to itself. On duplicate versions, it points to the preferred version.
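If your pages come from a template or a script, the tag can be generated rather than typed by hand. The helper below is a minimal sketch; the name canonical_link_tag is hypothetical. It uses plain straight quotes, because curly quotes pasted from a word processor will not be read as valid HTML attribute quotes.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> tag that goes inside <head>.

    preferred_url should be the full, absolute URL of the official version.
    """
    # escape() turns characters such as & and " into safe HTML entities.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# The preferred page and its duplicates all carry the same tag.
print(canonical_link_tag("https://yoursite.com/preferred-page/"))
```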

Real example — how Bea fixed her jewellery site

Bea’s developer added a canonical tag to every page on her site pointing to the non-www, HTTPS version. So the page at www.beahandmade.com/earrings/gold-hoops/ now contains a canonical tag pointing to https://beahandmade.com/earrings/gold-hoops/. They also set up a 301 redirect from the www version to the non-www version, so visitors always land on the correct URL.

Within three weeks, Google consolidated her rankings. Pages that had been bouncing between positions 15 and 25 moved up to between positions 8 and 12. Her organic traffic increased by about 40 percent in the first month, not because she created any new content, but because Google finally understood which pages to rank.

When to use a canonical tag vs. a redirect

This is a question Hazel had. She runs an e-commerce store selling skincare products. Some of her product pages were accessible through multiple URLs because of filtering parameters — things like ?category=moisturisers or ?ref=homepage-banner appended to the URL.

The rule of thumb is straightforward:

Use a 301 redirect when the duplicate page should not exist at all. Visitors should never land on it. Example: redirecting http:// to https://, or www. to non-www.
Use a canonical tag when both URLs need to stay accessible but you want Google to focus on one. Example: a product page with colour filter parameters. Visitors might arrive through /moisturiser/?shade=light, and you need that URL to work, but you want Google to rank the clean /moisturiser/ URL.

Hazel used canonical tags on her filtered product pages and redirects for her http-to-https and www-to-non-www issues. The combination covered all her duplication scenarios.
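The rule of thumb can be written down as a small decision function. This is only a sketch under stated assumptions: the preferred scheme and host are placeholders, plan_response is a hypothetical name, and a real site would apply the same logic in its server or platform settings rather than in a standalone script.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"       # assumption: the site has chosen https
PREFERRED_HOST = "yoursite.com"  # assumption: the site has chosen non-www


def plan_response(url: str):
    """Decide how a requested URL on this site should be handled.

    Returns one of:
      ("redirect", target)        wrong scheme or host, send a 301
      ("canonical", target)       URL must keep working, tag points at clean URL
      ("self-canonical", target)  already the preferred URL
    """
    parts = urlsplit(url)
    clean = urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, parts.path, "", ""))
    if parts.scheme != PREFERRED_SCHEME or parts.netloc.lower() != PREFERRED_HOST:
        # Keep the query string so filtered links still land on the right view.
        target = urlunsplit(
            (PREFERRED_SCHEME, PREFERRED_HOST, parts.path, parts.query, "")
        )
        return ("redirect", target)
    if parts.query:
        return ("canonical", clean)
    return ("self-canonical", clean)


print(plan_response("http://www.yoursite.com/moisturiser/"))
print(plan_response("https://yoursite.com/moisturiser/?shade=light"))
print(plan_response("https://yoursite.com/moisturiser/"))
```

The first call is the redirect case, the second is the canonical tag case, and the third is a page that simply points to itself.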

How to do it

Setting up canonical tags on your website.

WordPress (with Yoast SEO or Rank Math)

Good news: if you use Yoast SEO or Rank Math, canonical tags are added to every page automatically. The plugin sets the canonical URL to the page’s own URL by default. If you need to change it (for example, if you have two similar pages and want to point one to the other), you can do so in the SEO settings panel on each page or post editing screen. Look for the “Canonical URL” field under the Advanced tab.

Shopify

Shopify adds canonical tags to all pages automatically. Product pages, collection pages, blog posts, and regular pages all get a self-referencing canonical tag. Shopify also handles the common issue of products appearing in multiple collections by canonicalising them to the main product URL. In most cases, no action is needed.

Wix

Wix generates canonical tags automatically for all pages. If you need to customise the canonical URL for a specific page, go to that page’s SEO settings (found in the page settings panel) and edit the canonical URL field.

Squarespace

Squarespace adds canonical tags automatically. It handles trailing slash and pagination issues on its own. No manual intervention is required for most sites.

Custom-built or static HTML sites

If your site was built from scratch, you (or your developer) need to add the canonical tag manually to every page. Place this line in the <head> section:

<link rel="canonical" href="https://yoursite.com/this-page/">

Replace the URL with the full, absolute URL of the preferred version of that page. Every page should have a canonical tag, even if it points to itself. This is called a self-referencing canonical, and it is considered best practice because it removes all ambiguity.
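For a folder of static HTML files, adding the tag page by page gets tedious. The sketch below shows one way a developer might insert a self-referencing canonical into a page's source when none is present. The function name is hypothetical, and the simple pattern matching assumes reasonably tidy HTML with a closing </head> tag; it is not a full HTML parser.

```python
import re
from html import escape


def add_self_canonical(page_html: str, page_url: str) -> str:
    """Insert a self-referencing canonical tag just before </head>.

    If the page already has a canonical tag, it is returned unchanged.
    """
    already_tagged = re.search(
        r'<link[^>]*rel=["\']canonical["\']', page_html, flags=re.IGNORECASE
    )
    if already_tagged:
        return page_html
    tag = f'<link rel="canonical" href="{escape(page_url, quote=True)}">'
    # A function is used as the replacement so the tag text is inserted literally.
    new_html, count = re.subn(
        r"</head>",
        lambda match: tag + "\n" + match.group(0),
        page_html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no closing </head> tag found")
    return new_html


page = "<html><head><title>About</title></head><body></body></html>"
print(add_self_canonical(page, "https://yoursite.com/about/"))
```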

What Dennis did for his location pages

Dennis runs a food delivery service with separate pages for each city they cover — Manila, Makati, Pasig, Quezon City, and so on. The problem was that each location page shared about 70 percent of the same content (menu descriptions, ordering process, delivery times) with only the city name and a few local details changed.

Dennis could not use canonical tags here because each page targeted a different city — they were supposed to be different pages. Instead, he rewrote each page with substantially unique content: unique customer testimonials from each area, specific delivery zone maps, local restaurant partners, and neighbourhood-specific tips. He kept canonical tags self-referencing on each page to confirm that each location page was its own official version.

This is an important distinction. Canonical tags solve the problem of identical pages at different URLs. They do not solve the problem of pages that are too similar in content. If your pages are supposed to be different, make the content genuinely different.

Common causes

The six most common sources of duplicate content.

1. www vs. non-www

This was Bea’s problem. Both www.yoursite.com and yoursite.com load your website, but Google treats them as different sites. The fix is to pick one version (either works, as long as you use it consistently) and redirect the other to it. Then add canonical tags pointing to the chosen version on every page.

2. HTTP vs. HTTPS

If your site has an SSL certificate (the padlock icon in the browser), make sure the http:// version redirects to https://. If both versions load without a redirect, Google sees two copies of your entire website. Most hosting providers let you enable “Force HTTPS” with a single toggle in your dashboard.

3. Trailing slashes

Do yoursite.com/about and yoursite.com/about/ both load? If so, you have duplicates. Pick one format (with or without the trailing slash) and be consistent. Set up redirects for the version you do not want, and make sure your canonical tags use the format you chose.

4. URL parameters

Hazel’s skincare store had product URLs like /vitamin-c-serum/ that could also be accessed as /vitamin-c-serum/?ref=newsletter, /vitamin-c-serum/?utm_source=facebook, and /vitamin-c-serum/?variant=30ml. Each parameter created a new URL in Google’s eyes, even though the page content was identical or nearly identical. The fix was adding canonical tags to all parameterised URLs pointing back to the clean, parameter-free URL.
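Working out the clean URL for a parameterised address is a mechanical step. The sketch below strips the query string to get the canonical target. The optional keep argument is an illustrative addition, not something Hazel's fix used: it would let a parameter stay in the canonical URL if it genuinely changes the page content.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_target(url: str, keep=frozenset()) -> str:
    """Return the URL a parameterised page should name in its canonical tag.

    By default every query parameter is dropped. Parameter names listed in
    keep (a hypothetical allow-list) are preserved.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in keep
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


for url in [
    "https://yoursite.com/vitamin-c-serum/?ref=newsletter",
    "https://yoursite.com/vitamin-c-serum/?utm_source=facebook",
    "https://yoursite.com/vitamin-c-serum/?variant=30ml",
]:
    print(url, "->", canonical_target(url))
```

All three example URLs resolve to the same parameter-free address, which is the one their canonical tags should name.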

5. Pagination

Lorenzo’s travel blog had this issue. When your blog archive, product category, or search results span multiple pages, Google can get confused about which page to rank. The solution is to add a self-referencing canonical to each paginated page. Page 1 of your blog should carry a canonical tag pointing to page 1. Page 2 should point to page 2. Do not point the canonical tag on every page at page 1, because pages 2, 3, and 4 contain unique content (different blog posts). The key is making sure each paginated page has its own canonical and that Google can follow the pagination links to discover all content.
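As a small illustration of self-referencing canonicals on paginated pages, the sketch below builds the canonical URL for each archive page. The /page/N/ pattern is an assumption made for the example; use whatever pattern your own site produces.

```python
def paginated_canonical(archive_url: str, page: int) -> str:
    """Return the self-referencing canonical URL for one archive page.

    Assumes archive_url ends with a slash and that later pages live at
    <archive_url>page/<N>/ (a hypothetical URL pattern).
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page == 1:
        return archive_url
    return f"{archive_url}page/{page}/"


for number in range(1, 5):
    print(number, paginated_canonical("https://yoursite.com/blog/", number))
```

Each of the four pages gets a different canonical URL, which is the point: none of them is declared a copy of page 1.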

6. Product and content variants

Marites runs a clothing store. She sells a cotton blouse that comes in five colours: white, cream, blush, navy, and black. On her site, each colour has its own URL: /cotton-blouse-white/, /cotton-blouse-cream/, /cotton-blouse-blush/, and so on. The product description, sizing chart, material details, and care instructions are identical across all five pages. Only the product photo and colour name change.

Marites had two options. She could combine all variants onto a single page with a colour selector (the approach most e-commerce platforms recommend) and eliminate the duplicate URLs entirely. Or she could keep separate pages but add canonical tags pointing all colour variants to one primary version — usually the most popular colour. She chose the first approach, consolidating everything onto one product page with a colour dropdown. Her single “Cotton Blouse” page now ranks far better than any of the five individual colour pages ever did.

AI search impact

Duplicate content confuses AI search engines too.

In 2026, your content does not just appear in Google’s traditional search results. It can also be cited by AI search tools like ChatGPT, Perplexity, Gemini, and Microsoft Copilot. These AI systems pull from indexed web content to generate their answers. When they encounter duplicate versions of your pages, several things can go wrong.

First, AI systems may cite the wrong URL. If Perplexity finds your content at both www.yoursite.com/services/ and yoursite.com/services/, it might link to the version you do not want visitors landing on — perhaps the one without a properly configured SSL certificate or the one with broken tracking parameters.

Second, AI systems may view your duplicated content as lower quality. When the same text appears at multiple addresses, it dilutes the perceived authority of the content. AI tools favour content that appears authoritative and well-maintained. A site with clean, deduplicated URLs signals that kind of quality.

Third, AI crawlers (like GPTBot, PerplexityBot, and others) have their own crawl budgets. When they waste time crawling duplicate versions of your pages, they have less capacity to discover your new or updated content. This means your freshest material takes longer to appear in AI-generated answers.

Proper canonicalization ensures that when AI search tools encounter your content, they find a single, authoritative version. This increases the likelihood of correct citations and links back to your preferred pages.

Action steps

How to check your own site right now.

You do not need any special tools to check for the most common duplicate content issues. Try these steps:

Test www vs. non-www. Type www.yoursite.com into your browser. Then type yoursite.com. Does one redirect to the other, or do both load independently? If both load without redirecting, you have a duplicate content issue.
Test HTTP vs. HTTPS. Type http://yoursite.com into your browser. Does it redirect to https://yoursite.com? If the http version loads on its own without redirecting, you have another duplicate.
Test trailing slashes. Visit yoursite.com/about and then yoursite.com/about/. Does one redirect to the other? If both load independently, that is another source of duplication.
Check your canonical tags. On any page of your site, right-click and select “View Page Source.” Search for “canonical” in the source code. You should find a line that looks like <link rel="canonical" href="...">. If it is missing, your pages do not have canonical tags.
Search Google for your site. Type site:yoursite.com into Google. Look through the results. Do you see the same page listed multiple times with slightly different URLs? That confirms Google has indexed duplicates.
Check Google Search Console. If you have Search Console set up, go to the Pages report. Look for pages marked as “Duplicate without user-selected canonical” or “Duplicate, Google chose different canonical than user.” These alerts tell you exactly where your duplication issues are.

If you find issues in any of these tests, do not panic. Most can be fixed with a combination of redirects and canonical tags. If you use WordPress, Shopify, Wix, or Squarespace, the platform handles most of this for you — but it is still worth checking that everything is configured correctly.
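The canonical tag check can also be automated once you have a page's source saved or copied. The sketch below uses Python's built-in HTML parser to list every canonical URL a page declares. The class and function names are made up for this example, and fetching the page from the web is left out on purpose.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(page_source: str):
    """Return the list of canonical URLs declared in the page source."""
    finder = CanonicalFinder()
    finder.feed(page_source)
    return finder.canonicals


source = (
    "<html><head><title>About</title>"
    '<link rel="canonical" href="https://yoursite.com/about/">'
    "</head><body></body></html>"
)
print(find_canonicals(source))
```

An empty list means the tag is missing. More than one entry means the page declares conflicting canonicals, which is worth fixing too.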

Common questions

Frequently asked questions about canonicalization.

What is duplicate content and why is it bad for SEO?

Duplicate content means two or more URLs on your website show the same (or very similar) content. It is bad for SEO because Google does not know which version to rank, so it may split your ranking power between the duplicates or pick the wrong version to show in search results. This can cause your important pages to rank lower than they should.

What is a canonical tag and what does it do?

A canonical tag is a small piece of HTML code you add to the head section of a web page. It tells Google which version of a page is the “official” one when multiple versions exist. The tag looks like this: <link rel="canonical" href="https://yoursite.com/preferred-page/">. Google treats the tag as a strong hint rather than a command, so it will usually focus its ranking power on the URL you specify and treat the other versions as copies.

How do I know if my website has duplicate content issues?

Try accessing your website with and without “www” (www.yoursite.com and yoursite.com). Try it with http and https. Add a trailing slash and remove it. If both versions load without redirecting to a single version, you likely have duplicates. You can also use Google Search Console to check for duplicate pages under the Pages report, or search Google for site:yoursite.com and look for multiple listings of the same content.

Will Google penalise my site for duplicate content?

Google does not apply a formal penalty for most duplicate content situations. However, it does get confused about which version to rank, which dilutes your ranking signals and can cause your pages to perform worse in search results. The practical effect is the same as a penalty — lower visibility — even though it is technically just confusion rather than punishment.

Do Shopify, WordPress, and Wix handle canonical tags automatically?

Yes, all three platforms add canonical tags to your pages automatically. Shopify adds them to product and collection pages. WordPress SEO plugins like Yoast and Rank Math generate canonical tags for every page. Wix adds them automatically as well. However, you should still check that they are pointing to the correct URLs, especially if you have products with multiple variants or pages accessible through different URL structures.

What is the difference between a canonical tag and a 301 redirect?

A 301 redirect physically sends visitors from one URL to another — the old page is no longer accessible. A canonical tag keeps both URLs accessible to visitors but tells Google to only count one of them for ranking purposes. Use a redirect when you want to permanently remove a page. Use a canonical tag when you need both URLs to remain accessible (such as product pages with different sorting or filtering options) but want Google to focus on one version.

Does duplicate content affect how AI search engines cite my site?

Yes. AI search tools like ChatGPT, Perplexity, Gemini, and Copilot pull from indexed content. When duplicate versions of your pages exist, these AI systems may cite the wrong version, cite an outdated version, or become confused about which page is authoritative. Proper canonicalization helps AI systems identify your preferred content, increasing the chance that they cite the correct, most up-to-date version of your pages.

Quick glossary

Terms used in this article.

Duplicate content: The same or substantially similar content appearing at more than one URL on your website. Confuses search engines about which version to rank.
Canonical tag (rel="canonical"): An HTML tag placed in the head section of a page that tells search engines which URL is the preferred, official version of that content.
Self-referencing canonical: A canonical tag on a page that points to itself. Best practice for every page on your site, confirming to Google that this URL is the official version.
301 redirect: A permanent redirect that automatically sends visitors and search engines from one URL to another. The original URL becomes inaccessible.
URL parameters: The extra information added to a URL after a question mark, such as ?color=red or ?sort=price. Often creates duplicate content by generating new URLs for the same page.
Crawl budget: The limited amount of time and resources Google and AI crawlers allocate to scanning your website. Duplicate pages waste this budget.
Pagination: Splitting content across multiple pages (page 1, page 2, page 3, etc.), commonly used for blog archives, product categories, and search results.

Bottom line: Duplicate content is one of the most common and most overlooked technical SEO issues. Bea’s jewellery shop lost months of ranking potential because of a simple www vs. non-www split. Hazel’s product URLs were multiplying with every tracking parameter. Marites had five pages competing against each other for the same product. In every case, the fix was straightforward — canonical tags, redirects, or both. Check your own site using the steps above. If you find duplicates, fix them. Your rankings will thank you, and so will every AI search engine trying to figure out which version of your content to cite.
