Today, for the millionth time, I had to explain why https://www.example.co.za and https://example.co.za aren’t the same thing.
Not similar. Not “basically the same.” Not “Google will figure it out.”
They’re duplicates: two hostnames serving the same content, which a crawler treats as two separate sites. And duplicates are killing your search visibility while you obsess over AI-generated blog posts and hero images.
The Domain Split That Costs Millions
Here’s what happens when you don’t resolve your canonical domain:
Googlebot arrives. It sees example.co.za. It crawls 500 pages. Then it discovers www.example.co.za—and crawls those same 500 pages again.
Congratulations. You just halved your crawl budget without adding a single new piece of content.
For e-commerce sites, this isn’t theoretical. It’s hemorrhaging authority. While your competitors capture fresh SERP real estate, Google is stuck re-crawling your homepage for the fifteenth time because you couldn’t decide if you’re “www” or not.
The fix is a few lines in .htaccess. One DNS record so both hostnames resolve and the redirect can fire. One canonical tag. Yet it remains undone in boardrooms where “SEO hygiene” sounds less exciting than “AI content strategy.”
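If you want to see the rule spelled out before touching server config, here is a minimal Python sketch of the decision a host redirect makes. It assumes non-www HTTPS is the preferred version (pick whichever you like, as long as you pick one). `PREFERRED_HOST`, `KNOWN_HOSTS`, and `canonical_redirect` are names invented for this illustration, not part of any server or library.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the non-www HTTPS host is the preferred one.
PREFERRED_HOST = "example.co.za"
KNOWN_HOSTS = {"example.co.za", "www.example.co.za"}


def canonical_redirect(url: str) -> Optional[str]:
    """Return the 301 target for a non-canonical URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return None  # not one of our hostnames, so leave it alone
    if parts.scheme == "https" and host == PREFERRED_HOST:
        return None  # already the canonical scheme and host
    # Change only scheme and host; path, query, and fragment are preserved.
    return urlunsplit(("https", PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment))


if __name__ == "__main__":
    for candidate in (
        "http://www.example.co.za/shoes?size=9",
        "https://www.example.co.za",
        "https://example.co.za/shoes",
    ):
        print(candidate, "->", canonical_redirect(candidate))
```

Every variant maps straight to the final URL in a single hop, which is what you want from the real redirect too: no http-to-https-to-non-www chains.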
Crawl Budget Isn’t a “Nice-to-Have”
Let me be blunt: Crawl budget is your site’s oxygen supply.
Google doesn’t have infinite resources to index the web. When it wastes cycles on your duplicate protocols (http vs https), subdomains (shop. vs www.), and trailing slashes, it has less budget for:
- Your new product pages
- Your updated pricing
- Your actual competitive differentiators
Big corporations with 100,000+ SKUs are particularly guilty here. I’ve seen ecommerce giants with 40% of their crawl budget consumed by parameter variations of the same three category pages. Meanwhile, their Black Friday landing pages—built with blood, sweat, and marketing budget—sit unindexed for weeks.
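You can put a rough number on this from your own server logs. The sketch below groups crawled URLs by what is left after stripping parameters that don't change the content, then reports the share of requests that were repeats. The `IGNORED_PARAMS` list is an assumption for a hypothetical shop; your site's list will differ, and whether `sort` belongs on it is a judgment call.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that do not change page content on this hypothetical site.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort", "ref"}


def normalize(url: str) -> str:
    """Drop ignored parameters and sort the rest so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


def wasted_crawl_share(crawled_urls) -> float:
    """Fraction of crawled URLs that duplicate an already-seen normalized URL."""
    groups = defaultdict(int)
    for url in crawled_urls:
        groups[normalize(url)] += 1
    total = sum(groups.values())
    duplicates = total - len(groups)
    return duplicates / total if total else 0.0


if __name__ == "__main__":
    sample = [
        "https://example.co.za/shoes",
        "https://example.co.za/shoes?sort=price",
        "https://example.co.za/shoes?utm_source=mail&sort=price",
        "https://example.co.za/shoes?color=red",
        "https://example.co.za/bags?sessionid=abc",
    ]
    print(f"{wasted_crawl_share(sample):.0%} of crawled URLs were parameter duplicates")
```

Feed it the URLs Googlebot actually requested, not your sitemap, since the sitemap only tells you what you hoped would be crawled.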
Pretty pictures don’t rank if Google never sees them.
Duplicate Content: The Silent Killer
“But we write unique descriptions!”
Irrelevant. When your site serves identical content across multiple URLs, you’re not just confusing crawlers—you’re diluting your link equity. That high-authority backlink pointing to www. doesn’t fully count for the non-www version. That social share? Split between two URLs. Your PageRank? Divided, conquered, and diminished.
The result? Neither version ranks as well as either could alone.
And don’t get me started on:
- Mobile vs desktop subdomains (still happening in 2026)
- Staging sites left crawlable (indexed, ranking, competing with production)
- Print-friendly pages (yes, really)
- Session IDs in URLs (killing ecommerce sites daily)
The AI Distraction
We’re in an arms race of AI content generation. Everyone’s pumping out “optimized” articles at scale, chasing the latest LLM integration, building chatbots that nobody asked for.
Meanwhile, the fundamentals rot.
I’ve audited sites with 10,000 AI-generated blog posts and broken canonical chains. Sites with “semantic SEO” strategies that can’t handle basic pagination. Sites spending six figures on content generation while their XML sitemaps return 404s.
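Broken canonical chains are easy to find once you have a crawl export. Here is a minimal sketch, assuming you already have a mapping from each URL to the canonical it declares; the `canonical_of` dict and the `trace_canonical` helper are hypothetical names for this example, and the status labels are mine, not a standard.

```python
def trace_canonical(url, canonical_of):
    """Follow declared canonicals from url; return (path, status).

    status is one of:
      "ok"      - self-referencing, or one hop to a self-referencing page
      "chain"   - more than one hop before reaching a self-referencing page
      "loop"    - the declarations point back to a URL already visited
      "missing" - the last page in the path declares no canonical at all
    """
    path = [url]
    while True:
        target = canonical_of.get(path[-1])
        if target is None:
            return path, "missing"
        if target == path[-1]:
            hops = len(path) - 1
            return path, "ok" if hops <= 1 else "chain"
        if target in path:
            return path + [target], "loop"
        path.append(target)


if __name__ == "__main__":
    # Hypothetical crawl data: page -> canonical URL it declares.
    canonical_of = {
        "/a": "/a",
        "/b": "/a",
        "/c": "/b",
        "/x": "/y",
        "/y": "/x",
        "/z": "/gone",
    }
    for page in canonical_of:
        print(page, trace_canonical(page, canonical_of))
```

Anything that comes back as "chain", "loop", or "missing" is a page whose canonical signal a crawler has to guess at.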
AI didn’t fix your information architecture. It didn’t resolve your hreflang conflicts. It didn’t consolidate your duplicate product variations.
You built a Ferrari on a cracked foundation.
The Checklist Nobody Wants
Before you generate another “10 Best [Product] for 2026” article, fix this:
- Pick one domain. www or non-www. 301 everything else. Yesterday.
- Consolidate protocols. HTTPS only. No exceptions.
- Canonical tags. Self-referencing on every page you want indexed; every variant points at its preferred URL. No exceptions.
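- Parameter handling. Search Console’s URL Parameters tool was retired in 2022, so handle it on the site: canonical tags on parameterized URLs, robots.txt rules for pure tracking and session parameters, and internal links that only ever use the clean URL.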
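- Pagination. Google stopped using rel=next/prev as an indexing signal in 2019. Give every paginated page its own crawlable URL with a self-referencing canonical, link the pages with plain anchor links, and make sure infinite scroll falls back to those URLs.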
- Hreflang. If you’re multinational, audit it monthly. It breaks silently.
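- Robots.txt + meta robots. Noindex the junk, disallow the waste, crawl the gold. Just don’t do both to the same URL: a page blocked in robots.txt never gets crawled, so its noindex is never seen.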
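The hreflang item is the one that most deserves automation, because the usual failure is a missing return link: page A names page B as an alternate, but B never points back, and the annotation gets ignored. A minimal sketch of that check follows, assuming you have already extracted each page's hreflang annotations into a dict. The data shape, the example.com URLs, and the `missing_return_links` name are all assumptions for illustration.

```python
def missing_return_links(hreflang):
    """Find hreflang annotations whose target page does not link back.

    hreflang maps page URL -> {language code: alternate URL}.
    Returns a sorted list of (page, language, alternate) tuples.
    """
    problems = []
    for page, alternates in hreflang.items():
        for lang, alternate in alternates.items():
            if alternate == page:
                continue  # the self-reference needs no return link
            back_links = hreflang.get(alternate, {})
            if page not in back_links.values():
                problems.append((page, lang, alternate))
    return sorted(problems)


if __name__ == "__main__":
    # Hypothetical annotations: the UK page forgot to point back at the ZA page.
    annotations = {
        "https://example.com/za/": {
            "en-za": "https://example.com/za/",
            "en-gb": "https://example.com/uk/",
        },
        "https://example.com/uk/": {
            "en-gb": "https://example.com/uk/",
        },
    }
    for problem in missing_return_links(annotations):
        print("no return link:", problem)
```

Run something like this against a fresh crawl on a schedule, since a template change on one regional site is enough to break the pair without anyone noticing.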
The Bottom Line
SEO isn’t sexy when it’s technical. It doesn’t demo well in stakeholder meetings. “We fixed our canonicalization strategy” doesn’t trend on LinkedIn like “We implemented AI-powered content workflows.”
But here’s what fixes do: They make everything else work.
Your AI content? Actually gets indexed. Your pretty pictures? Actually appear in image search. Your ad spend? Actually efficient because your landing pages resolve to one URL and pass authority.
The web is built on fundamentals. Ignore them for the shiny object, and you’re not innovating—you’re decorating a burning house.
Resolve your domains. Consolidate your duplicates. Respect the crawl budget.
Then, and only then, should you worry about making it pretty.
Found yourself explaining canonicalization for the millionth time? You’re not alone. But maybe—just maybe—this time it’ll stick.
