Crawl Errors
What This Means
Crawl errors occur when search engine bots (like Googlebot) cannot access pages on your website. This prevents those pages from being indexed and appearing in search results.
Impact:
- Pages not indexed in search engines
- Lost organic traffic
- Broken user experience
- Wasted crawl budget
Types of Crawl Errors
Server Errors (5xx)
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
Client Errors (4xx)
- 404 Not Found
- 410 Gone
- 403 Forbidden
Redirect Errors
- Redirect chains (too many redirects)
- Redirect loops
- Broken redirects
Robots.txt Issues
- Blocking important pages
- Malformed robots.txt
- Overly restrictive rules
How to Diagnose
1. Google Search Console
- Go to Search Console > Indexing > Pages
- Check "Not indexed" section
- Review crawl errors under each category
- Click into specific issues for affected URLs
2. Crawl Your Site
Use a crawling tool (for example, Screaming Frog or Sitebulb) to fetch every URL on the site and record the HTTP status code each one returns, as in the sketch below.
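A minimal sketch of such a status check, assuming the `requests` library and a hand-maintained URL list; in practice you would pull the list from your sitemap or a full crawl:

```python
import requests

# URLs to audit -- in practice, pull these from your sitemap or a crawl.
urls = [
    "https://yoursite.com/",
    "https://yoursite.com/products/",
    "https://yoursite.com/blog/",
]

for url in urls:
    try:
        # HEAD keeps the check lightweight; disable redirects so we see
        # the raw status code the crawler receives for this exact URL.
        response = requests.head(url, allow_redirects=False, timeout=10)
        if response.status_code >= 400:
            print(f"{response.status_code}  {url}")
    except requests.RequestException as exc:
        print(f"FAILED  {url}  ({exc})")
```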
3. Server Log Analysis
Check server logs for:
- 4xx and 5xx response codes
- Googlebot requests
- Error patterns (the parsing sketch below pulls these out of a raw log)
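A minimal parsing sketch, assuming an access log in the common/combined format at `access.log`; note that matching "Googlebot" in the user agent string is not proof of a genuine Googlebot visit (that requires a reverse-DNS check):

```python
import re
from collections import Counter

# Matches the common/combined access log format; adjust to your server's format.
LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

errors = Counter()
with open("access.log") as log:  # log path is an assumption
    for line in log:
        if "Googlebot" not in line:  # UA match only; verify via reverse DNS
            continue
        m = LINE.search(line)
        if m and m.group("status")[0] in "45":  # 4xx and 5xx responses only
            errors[(m.group("status"), m.group("path"))] += 1

# Most frequently failing URLs are usually the best place to start.
for (status, path), count in errors.most_common(20):
    print(f"{count:5d}  {status}  {path}")
```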
General Fixes
Fix 1: 404 Not Found Errors
Identify source:
- Find pages linking to 404 URLs
- Check if URL changed or page deleted
- Review external links pointing to missing pages
Solutions:
- Redirect to relevant existing page (301 redirect)
- Restore the missing content
- Update internal links
- Request external sites update their links
- If the page was intentionally removed, return 410 (Gone); the sketch below shows both responses
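As a minimal sketch of the 301 and 410 handling, assuming a Flask app; the paths in `MOVED` and `GONE` are hypothetical:

```python
from flask import Flask, redirect

app = Flask(__name__)

# Hypothetical mapping of moved URLs to their replacements.
MOVED = {"/old-page": "/new-page"}
GONE = {"/discontinued-product"}  # intentionally removed, no replacement

@app.route("/<path:path>")
def handle(path):
    path = "/" + path
    if path in MOVED:
        return redirect(MOVED[path], code=301)  # permanent redirect
    if path in GONE:
        return "This page has been removed.", 410  # Gone
    return "Page not found.", 404

if __name__ == "__main__":
    app.run()
```

On most platforms you would configure these responses in the web server or CMS rather than in application code, but the logic is the same.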
Fix 2: Server Errors (5xx)
Common causes:
- Server overload or traffic spikes
- Unhandled application exceptions
- Slow or failing database queries
- Server, proxy, or CDN misconfiguration
Solutions:
- Review server error logs
- Increase server resources
- Fix application bugs
- Optimize database queries
- Consider better hosting if errors persist; the sketch below helps separate transient blips from persistent failures
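When Search Console reports 5xx errors, it helps to know whether they are transient (load spikes) or persistent (bugs, misconfiguration). A minimal retry sketch, assuming the `requests` library; the URL is a placeholder:

```python
import time
import requests

def check_with_retries(url, attempts=3, backoff=5):
    """Retry a URL to separate transient 5xx blips from persistent failures."""
    for attempt in range(1, attempts + 1):
        try:
            status = requests.get(url, timeout=10).status_code
        except requests.RequestException:
            status = None  # connection-level failure, treat like a 5xx
        if status is not None and status < 500:
            return f"recovered with {status} on attempt {attempt}"
        time.sleep(backoff * attempt)  # give the server room to recover
    return f"still failing (last status: {status}) after {attempts} attempts"

print(check_with_retries("https://yoursite.com/"))  # URL is an assumption
```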
Fix 3: Redirect Issues
Redirect chains force the crawler through extra hops before it reaches the final URL:
- A → B → C → D (bad)
- A → D (good)
Solutions:
- Update redirects to point directly to final URL
- Keep each chain to at most one or two hops
- Remove redirect loops
- Audit redirects regularly, for example with a chain inspector like the sketch below
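A small inspector, assuming the `requests` library, that walks a redirect chain and reports each hop; the starting URL is a placeholder:

```python
import requests

def inspect_redirects(url):
    """Print each hop in a redirect chain and flag multi-hop chains."""
    response = requests.get(url, allow_redirects=True, timeout=10)
    for hop in response.history:  # one entry per intermediate redirect
        print(f"{hop.status_code}  {hop.url}")
    print(f"{response.status_code}  {response.url}  (final)")
    if len(response.history) > 1:
        print(f"Chain of {len(response.history)} redirects -- "
              "point the first URL directly at the final one.")

# requests raises TooManyRedirects on loops, so catch it explicitly.
try:
    inspect_redirects("https://yoursite.com/old-page")  # URL is an assumption
except requests.TooManyRedirects:
    print("Redirect loop detected.")
```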
Fix 4: Robots.txt Problems
Check robots.txt:
https://yoursite.com/robots.txt
Common issues:
# Too restrictive - blocks everything
User-agent: *
Disallow: /
# Blocking important directories
User-agent: *
Disallow: /products/
Disallow: /blog/
Solution:
- Allow important pages to be crawled
- Only block admin/private areas
- Test with the robots.txt report in Search Console, or check rules programmatically as sketched below
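Python's standard library can evaluate robots.txt rules much as a crawler does. A minimal check, with yoursite.com and the sample paths as placeholders; note that `robotparser` follows the original robots.txt spec and may not match Google's wildcard handling exactly:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://yoursite.com/robots.txt")  # URL is an assumption
rp.read()

# Check whether Googlebot may fetch key pages.
for path in ["/", "/products/widget", "/blog/post", "/admin/"]:
    allowed = rp.can_fetch("Googlebot", "https://yoursite.com" + path)
    print(f"{'ALLOWED' if allowed else 'BLOCKED':7s}  {path}")
```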
Fix 5: Soft 404s
Soft 404s are pages that return a 200 (OK) status code but serve "not found" or otherwise empty content.
Detection:
- Google Search Console flags these
- A site crawl can flag thin or duplicate "not found" content (see the sketch after the solution list)
Solution:
- Return proper 404 status code
- Or provide valuable content on the page
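A rough detector, assuming the `requests` library; the phrase list and URL are illustrative and should be tuned to your site's templates:

```python
import requests

# Phrases that often appear on "not found" pages served with a 200 status.
NOT_FOUND_PHRASES = ["page not found", "no longer available", "nothing here"]

def looks_like_soft_404(url):
    response = requests.get(url, timeout=10)
    if response.status_code != 200:
        return False  # a real error status code is not a soft 404
    body = response.text.lower()
    return any(phrase in body for phrase in NOT_FOUND_PHRASES)

print(looks_like_soft_404("https://yoursite.com/missing-page"))  # placeholder URL
```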
Platform-Specific Guides
| Platform | Guide |
|---|---|
| Shopify | Shopify Redirects |
| WordPress | WordPress 404 Handling |
| Squarespace | Squarespace URL Mapping |
Prevention
- Monitor Search Console weekly
- Set up alerts for crawl errors
- Test redirects after site changes
- Audit links during site migrations
- Use proper status codes
Verification
After fixes:
- Request a re-crawl of fixed URLs with the URL Inspection tool in Search Console
- Watch the error count trend downward
- Confirm fixed pages return to the index