Duplicate Content
What This Means
Duplicate content occurs when identical or substantially similar content appears on multiple URLs, either within your own website (internal duplication) or across different websites (external duplication). Search engines struggle to determine which version to index and rank, leading to diluted SEO value and potentially lower rankings.
Types of Duplicate Content
Internal Duplication:
- Same content on multiple URLs on your site
- HTTP vs HTTPS versions
- WWW vs non-WWW versions
- Trailing slash vs non-trailing slash
- URL parameters creating duplicate pages
- Print versions of pages
- Mobile vs desktop versions (if separate URLs)
External Duplication:
- Your content copied to other websites (scraped)
- Syndicated content without proper attribution
- Product descriptions copied from manufacturers
- Press releases on multiple sites
Technical Duplication:
- Session IDs in URLs
- Tracking parameters (utm_, etc.)
- Faceted navigation creating URL variations
- Pagination without proper handling
- Case-sensitive URLs treated as different pages
Impact on Your Business
Search Rankings:
- Search engines don't know which version to rank
- Ranking power is diluted across duplicate URLs
- Original content may not rank if others outrank you
- Google usually filters duplicates out of results rather than penalizing them; manual actions are reserved for copying that is deliberately deceptive
Crawl Budget:
- Search engines waste time crawling duplicates
- Less time spent on unique, valuable pages
- Important pages may not get crawled
- Slower indexing of new content
Link Equity:
- Backlinks split across duplicate URLs
- Individual URLs have less ranking power
- Link value is diluted instead of consolidated
- Harder to build strong page authority
User Experience:
- Confusing to find same content on multiple URLs
- Inconsistent URLs make sharing difficult
- May encounter outdated versions
- Reduced trust in website quality
How to Diagnose
Method 1: Google Search Console
- Log into Google Search Console
- Navigate to the "Page indexing" report under Indexing → Pages (called "Coverage" in older versions of Search Console)
- Look for:
  - "Duplicate without user-selected canonical"
  - "Duplicate, Google chose different canonical than user"
  - Multiple versions of same page indexed
- Open each duplicate status to see which URLs are affected
- Check "Sitemaps" for URLs submitted vs indexed
What to Look For:
- Pages flagged as duplicates
- Canonical tag conflicts
- Multiple versions of homepage indexed
- Parameter-based duplicates
Method 2: Site: Search Operator
- Google: site:yourwebsite.com "exact page title"
- Review how many results appear
- Check if multiple URLs have same content
- Look for HTTP/HTTPS and WWW variations
What to Look For:
- Multiple results for same title
- Different URLs with identical content
- Protocol variations (http/https)
- Subdomain variations (www/non-www)
Method 3: Screaming Frog SEO Spider
- Download Screaming Frog
- Crawl your website
- Navigate to "Content" → "Duplicate" tab
- Review:
  - Duplicate pages (exact match)
  - Near duplicates (similar content)
  - Duplicate titles
  - Duplicate meta descriptions
What to Look For:
- Pages with 100% content similarity
- Pages with >90% similarity (near duplicates)
- Duplicate title tags
- URL patterns creating duplicates
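To make the similarity percentages above concrete, here is a minimal sketch of one common way to score near duplicates: Jaccard similarity over word shingles (overlapping word n-grams). This is an illustrative stand-in, not a description of how Screaming Frog computes its figures; the function names and the shingle size of 3 are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole text as a single shingle
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty documents are trivially identical
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    """Flag pairs at or above the threshold (0.9 mirrors the >90% guideline)."""
    return similarity(a, b) >= threshold
```

Because case and punctuation are ignored, two pages that differ only in formatting score 1.0, while pages that share a template but have different body text score much lower.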
Method 4: Copyscape or Similar Tools
- Visit Copyscape
- Enter your page URL
- Search for duplicate content online
- Review results for:
  - External sites with your content
  - How much content is duplicated
  - Whether proper attribution exists
What to Look For:
- Content scrapers copying your pages
- Syndicated content without canonical
- Competitor sites with your content
- Product descriptions on multiple sites
Method 5: Check URL Variations
Manually test common duplicates:
# Test these URL variations for your homepage:
https://www.example.com
https://example.com
http://www.example.com
http://example.com
https://www.example.com/
https://www.example.com/index.html
https://www.example.com/index.php
https://www.example.com/?
What to Look For:
- Multiple variations loading successfully
- 200 OK status on all variations
- No redirects to preferred version
- Different URLs serving same content
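The manual check above can be organized with a small helper. The sketch below builds the same list of homepage variations for any domain and, given the status code and Location header you observed for each one (for example from `curl -I`), reports the variations that do not 301-redirect to the preferred URL. It makes no network requests itself; the function names and the shape of the `results` mapping are assumptions made for illustration.

```python
def homepage_variants(domain: str) -> list:
    """List common homepage URL variations for a bare domain like 'example.com'."""
    variants = []
    for scheme in ("https", "http"):
        for host in (f"www.{domain}", domain):
            variants.append(f"{scheme}://{host}")
    base = f"https://www.{domain}"
    variants += [f"{base}/", f"{base}/index.html", f"{base}/index.php", f"{base}/?"]
    return variants


def unredirected(results: dict, preferred: str) -> list:
    """Return tested URLs that do not 301-redirect to the preferred URL.

    results maps each tested URL to a (status_code, location_header) pair.
    """
    problems = []
    for url, (status, location) in results.items():
        # An empty path and "/" are the same request, so both count as preferred
        if url in (preferred, preferred.rstrip("/")):
            continue
        if status != 301 or location != preferred:
            problems.append(url)
    return problems
```

Any URL returned by `unredirected` is either serving content directly (200 OK) or redirecting with the wrong status or target, which are the signs listed above.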
General Fixes
Fix 1: Set Preferred Domain with Canonical Tags
Tell search engines which version is the original:
Add canonical tag to all pages:
<head>
  <link rel="canonical" href="https://www.example.com/page/">
</head>
Point all duplicate versions to canonical:
<!-- On https://example.com/page/ -->
<!-- On http://www.example.com/page/ -->
<!-- On http://example.com/page/ -->
<link rel="canonical" href="https://www.example.com/page/">
Self-referencing canonical on preferred version:
<!-- On https://www.example.com/page/ -->
<link rel="canonical" href="https://www.example.com/page/">
Canonical for URL parameters:
<!-- On https://www.example.com/page/?utm_source=email -->
<link rel="canonical" href="https://www.example.com/page/">
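To audit canonical tags across many pages, the tag can be read programmatically from each page's HTML. The following is a minimal sketch using only Python's standard library; `CanonicalFinder` and `find_canonicals` are hypothetical names, and fetching the HTML is left to whatever tool you already use.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens and any letter case
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list:
    """Return all canonical URLs declared in the given HTML, in document order."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

An empty list means the page has no canonical tag, and more than one entry means conflicting tags; both cases are worth fixing.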
Fix 2: Implement 301 Redirects
Permanently redirect duplicates to preferred version:
Redirect HTTP to HTTPS:
# Nginx
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://www.example.com$request_uri;
}
# Apache .htaccess
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]
Redirect non-WWW to WWW (or vice versa):
# Nginx - non-WWW to WWW
server {
    listen 443 ssl;
    server_name example.com;
    # ssl_certificate and ssl_certificate_key directives go here
    return 301 https://www.example.com$request_uri;
}
# Apache .htaccess - non-WWW to WWW
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]
Redirect trailing slash inconsistencies:
# Nginx - add trailing slash (skips URIs containing a dot, such as file names)
rewrite ^([^.]*[^/])$ $1/ permanent;
Redirect old URLs to new URLs:
# Apache .htaccess
Redirect 301 /old-page.html https://www.example.com/new-page/
Redirect 301 /old-category/ https://www.example.com/new-category/
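When testing redirect rules like these, it helps to know in advance where each URL should end up. The sketch below mirrors the three normalization rules (HTTPS, WWW host, trailing slash) in Python so the expected target can be compared with what the server actually returns. It assumes `www.example.com` over HTTPS is the preferred version and that every incoming host belongs to the same site; `PREFERRED_HOST` and `redirect_target` are illustrative names.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: WWW over HTTPS is preferred


def redirect_target(url: str):
    """Return the URL a request should be 301-redirected to,
    or None if the URL is already in the preferred form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Same condition as the Nginx rule: no dot in the path, no trailing slash
    if "." not in path and not path.endswith("/"):
        path += "/"
    target = urlunsplit(("https", PREFERRED_HOST, path, parts.query, ""))
    return None if target == url else target
```

Each URL should reach its target in a single 301 hop; chains such as HTTP → HTTPS → WWW still work but cost extra round trips.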
Fix 3: Handle URL Parameters
Google retired the URL Parameters tool (legacy "Crawl" → "URL Parameters" in Search Console) in April 2022, so parameter handling can no longer be configured there. Classify your parameters yourself and handle them on the site:
- Passive - Doesn't change page content (e.g., utm_source)
- Active - Changes content (e.g., color, size)
- For passive parameters: Point the canonical tag at the parameter-free URL and keep these parameters out of internal links
- For active parameters: Choose a representative URL and point the canonical tags of its variations to it
Common parameters:
- utm_* - Passive (tracking)
- sessionid - Passive (tracking)
- sort - Active (changes content)
- page - Active (pagination)
- color, size - Active (filters)
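The passive/active split above can also be applied in code, for example when generating canonical tags or deduplicating a crawl export. This is a minimal sketch that drops the passive parameters from the list above and sorts the remaining ones; the constant and function names are hypothetical, and the parameter lists should be adapted to your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PASSIVE_PREFIXES = ("utm_",)     # tracking parameters
PASSIVE_NAMES = {"sessionid"}    # session identifiers


def strip_passive_params(url: str) -> str:
    """Remove parameters that do not change page content and sort the rest."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in PASSIVE_NAMES
        and not name.lower().startswith(PASSIVE_PREFIXES)
    ]
    kept.sort()  # stable order so ?a=1&b=2 and ?b=2&a=1 collapse to one URL
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Active parameters are kept because they change what the page shows; whether those URLs should be indexed at all is a separate decision.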
Fix 4: Implement Rel="Next" and Rel="Prev" for Pagination
Handle paginated content properly. Google announced in 2019 that it no longer uses rel="next" and rel="prev" as an indexing signal, so treat these tags as optional hints that other search engines may still use, and rely on a self-referencing canonical on each page:
On paginated series:
<!-- Page 1 (https://example.com/blog/) -->
<head>
  <link rel="canonical" href="https://example.com/blog/">
  <link rel="next" href="https://example.com/blog/page/2/">
</head>
<!-- Page 2 (https://example.com/blog/page/2/) -->
<head>
  <link rel="canonical" href="https://example.com/blog/page/2/">
  <link rel="prev" href="https://example.com/blog/">
  <link rel="next" href="https://example.com/blog/page/3/">
</head>
<!-- Page 3 (https://example.com/blog/page/3/) -->
<head>
  <link rel="canonical" href="https://example.com/blog/page/3/">
  <link rel="prev" href="https://example.com/blog/page/2/">
</head>
Or use "View All" page approach:
<!-- On paginated pages -->
<link rel="canonical" href="https://example.com/blog/all/">
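The per-page tags in the paginated series follow a regular pattern, so a template can generate them. Here is a minimal sketch, assuming page 1 lives at the base URL and later pages at `page/N/` beneath it, as in the example; `pagination_links` is a hypothetical helper name.

```python
def pagination_links(base: str, page: int, last_page: int) -> list:
    """Return the <link> tags for one page of a paginated series.

    Page 1 lives at `base`; page n > 1 lives at `base + "page/n/"`.
    """
    if not 1 <= page <= last_page:
        raise ValueError("page must be between 1 and last_page")

    def url(n: int) -> str:
        return base if n == 1 else f"{base}page/{n}/"

    # Every page canonicalizes to itself, not to page 1
    tags = [f'<link rel="canonical" href="{url(page)}">']
    if page > 1:
        tags.append(f'<link rel="prev" href="{url(page - 1)}">')
    if page < last_page:
        tags.append(f'<link rel="next" href="{url(page + 1)}">')
    return tags
```

For example, `pagination_links("https://example.com/blog/", 2, 3)` yields the three tags shown for page 2 in the series.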
Fix 5: Fix Faceted Navigation
E-commerce and filtered pages:
Use canonical tags for filtered URLs:
<!-- https://example.com/shoes?color=red&size=10 -->
<link rel="canonical" href="https://example.com/shoes/">
Or use noindex for filter combinations:
<!-- On filtered pages -->
<meta name="robots" content="noindex, follow">
Use clean URLs for important filters:
<!-- Instead of: /shoes?color=red -->
<!-- Use: /shoes/red/ -->
<link rel="canonical" href="https://example.com/shoes/red/">
AJAX-based filters (don't change URL):
// Filter content without changing URL
// No duplicate URL created
Fix 6: Handle Print and Mobile Versions
Separate print/mobile URLs:
Print versions:
<!-- On print version page -->
<link rel="canonical" href="https://example.com/article/">
<!-- Or use CSS print styles instead of separate URL -->
<style>
  @media print {
    /* Print styles */
  }
</style>
Mobile versions (if using separate m. subdomain):
<!-- On desktop version (www.example.com) -->
<link rel="alternate" media="only screen and (max-width: 640px)" href="https://m.example.com/page/">
<!-- On mobile version (m.example.com) -->
<link rel="canonical" href="https://www.example.com/page/">
Preferred approach: Responsive design (no separate URLs):
<!-- Single URL serves both desktop and mobile -->
<!-- No duplicate content issue -->
Fix 7: Handle Syndicated Content
When publishing content on multiple sites:
Add canonical tag on syndicated versions:
<!-- On partner site publishing your content -->
<link rel="canonical" href="https://www.yoursite.com/original-article/">
Wait before syndicating:
- Publish on your site first
- Wait 1-2 weeks for indexing
- Then syndicate to other sites
- Include canonical tag or "originally published" link
Add attribution:
<p>Originally published on <a href="https://www.yoursite.com/article/">YourSite.com</a></p>
Use excerpt or modified version:
- Don't publish 100% duplicate
- Syndicate excerpt with link to full article
- Or create unique version for syndication
Verification
After implementing fixes:
Check redirects:
curl -I https://example.com
curl -I http://example.com
curl -I http://www.example.com
# All should 301 redirect to preferred version
Verify canonical tags:
- View source on all pages
- Confirm canonical tag present
- Verify pointing to correct URL
- Check consistency across site
Google Search Console:
- Wait 2-4 weeks for re-crawling
- Check the "Page indexing" report (formerly "Coverage")
- Verify duplicate warnings reduced
- Monitor indexed pages count
Site: search test:
- Google: site:yourwebsite.com "page title"
- Should see only one result
- Verify preferred version appears
- Check other versions redirect
Screaming Frog re-crawl:
- Run new crawl
- Check duplicates tab
- Verify duplicates eliminated
- Confirm redirects in place
Common Mistakes
- No canonical tags - Letting search engines guess
- Inconsistent canonical tags - Different tags on same content
- Canonical pointing to a non-preferred URL - e.g. a canonical that references the HTTP or non-WWW version
- No 301 redirects - Relying only on canonical tags, which search engines treat as hints (use both)
- Ignoring URL parameters - Creating unlimited duplicates
- Multiple domains with same content - Splitting authority
- Not handling WWW vs non-WWW - Common duplicate source
- HTTP and HTTPS both accessible - Protocol duplication
- Trailing slash inconsistencies - Both versions accessible
- Syndicating without canonical - External duplicates hurting SEO
Duplicate Content Checklist
Technical Setup:
- Preferred domain set (WWW or non-WWW)
- HTTPS enforced site-wide
- 301 redirects from non-preferred versions
- Canonical tags on all pages
- Self-referencing canonical on originals
- URL parameters handled properly
Content Management:
- No identical content on multiple URLs
- Pagination handled with self-referencing canonicals (rel="next"/"prev" optional)
- Filtered/faceted navigation uses canonical
- Print versions point to canonical
- Mobile versions handled (or responsive design)
- Syndicated content has canonical attribution
Monitoring:
- Google Search Console checked for duplicates
- Regular site: searches performed
- Screaming Frog crawls for duplicates
- External duplicate content monitored
- New content checked for duplication