Fix Google Search Console coverage issues, canonical problems, sitemap errors, and crawl budget waste in Next.js apps.
Covers generateMetadata, robots.js, and sitemap routes.

| Status | Meaning | Fix |
|---|---|---|
| Crawled – not indexed | Google crawled but chose not to index | Improve content quality + canonical + internal links |
| Duplicate without canonical | Multiple URLs serve same content, no canonical | Add explicit canonical to the preferred URL |
| Excluded by noindex | noindex tag present | Remove noindex if page should be indexed |
| Duplicate, Google chose different canonical | Google prefers a different URL than you specified | Align canonical with the URL Google naturally picks |
| Alternate page with proper canonical | Correct: a non-preferred duplicate that points to the canonical | Expected behavior, not a problem |
| Not found (404) | Page deleted or URL changed | Add redirect or restore page |
| Discovered – not indexed | Google knows it exists but hasn't crawled it | Improve internal linking + crawl budget |
| Page with redirect | Redirect chain or redirect to wrong target | Shorten redirect chain, verify destination |
// app/blog/my-post/page.js
export const metadata = {
  title: 'My Post Title',
  alternates: {
    canonical: 'https://www.yourdomain.com/blog/my-post',
  },
};
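
// app/blog/[slug]/page.js
export async function generateMetadata({ params }) {
  // `params` is a Promise in Next.js 15+; awaiting it is harmless on older versions
  const { slug } = await params;
  return {
    alternates: {
      canonical: `https://www.yourdomain.com/blog/${slug}`,
    },
  };
}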
// ❌ WRONG: relative URL with no metadataBase set to resolve it against
canonical: '/blog/my-post'
// ❌ WRONG: inconsistent trailing slashes between URLs
// (pick one form and stick with it sitewide)
// ✓ CORRECT: absolute URL, consistent scheme + subdomain
canonical: 'https://www.yourdomain.com/blog/my-post'
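One way to keep canonicals consistent is to build every one through a single helper. The sketch below is only an illustration: `SITE_ORIGIN` and `canonicalUrl` are hypothetical names, and the "no trailing slash" convention is an assumption (flip it if your site uses trailing slashes).

```javascript
// Hypothetical helper; SITE_ORIGIN is an assumed constant for the preferred origin.
const SITE_ORIGIN = 'https://www.yourdomain.com';

function canonicalUrl(path) {
  // Resolve relative paths against the preferred origin.
  const url = new URL(path, SITE_ORIGIN);
  // Force the preferred scheme and host even if an absolute URL was passed in.
  url.protocol = 'https:';
  url.host = new URL(SITE_ORIGIN).host;
  // Canonicals should not carry query strings or fragments.
  url.search = '';
  url.hash = '';
  // Assumed sitewide convention: no trailing slash, except on the root.
  if (url.pathname.length > 1) {
    url.pathname = url.pathname.replace(/\/+$/, '');
  }
  return url.toString();
}
```

In `generateMetadata` this could be used as `alternates: { canonical: canonicalUrl('/blog/' + slug) }`, so scheme, subdomain, and slash style are decided in one place.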
Find pages that are accidentally noindexed:
# Search for noindex in metadata (string form and the `index: false` object form)
rg -n --glob '*.{js,ts,jsx,tsx}' 'noindex|index:\s*false' app pages
# Check the root layout: a noindex here affects ALL pages
grep -n "robots" app/layout.*
In the Next.js App Router, `robots` metadata set in the root layout is inherited by every page that does not override it. Only set it there if you want the whole site affected.
// app/layout.js — only set robots if you need sitewide control
export const metadata = {
  // ✓ Allow indexing
  robots: { index: true, follow: true },
  // ❌ This would noindex the entire site:
  // robots: { index: false }
};
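Searching the source only shows where noindex is declared, not what a page actually serves. A rough way to check rendered output is sketched below. `findNoindex` is a hypothetical helper, and it uses regular expressions rather than a real HTML parser, so treat it as a quick check under that assumption.

```javascript
// Hypothetical helper: reports why a response would be treated as noindex.
// `html` is the response body; `headers` is a plain object of response headers.
function findNoindex(html, headers = {}) {
  const reasons = [];

  // 1. X-Robots-Tag response header (header names are case-insensitive).
  const headerValue = Object.entries(headers)
    .filter(([name]) => name.toLowerCase() === 'x-robots-tag')
    .map(([, value]) => String(value))
    .join(', ');
  if (/noindex/i.test(headerValue)) reasons.push('x-robots-tag header');

  // 2. <meta name="robots"> or <meta name="googlebot"> containing noindex.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const isRobotsMeta = /\bname\s*=\s*["']?(robots|googlebot)["']?/i.test(tag);
    const hasNoindex = /\bcontent\s*=\s*["'][^"']*noindex[^"']*["']/i.test(tag);
    if (isRobotsMeta && hasNoindex) reasons.push('meta robots tag');
  }
  return reasons;
}
```

A typical use would be to fetch a production URL, pass the body text and headers to `findNoindex`, and flag any page that returns a non-empty list.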
# Expect a 200 status and an XML content type
curl -sI https://www.yourdomain.com/sitemap.xml | grep -iE "^HTTP|content-type"
# Inspect the first entries
curl -s https://www.yourdomain.com/sitemap.xml | head -20
// app/sitemap.js
export default async function sitemap() {
  const baseUrl = 'https://www.yourdomain.com';
  // Static pages
  const staticPages = [
    { url: baseUrl, lastModified: new Date(), changeFrequency: 'daily', priority: 1.0 },
    { url: `${baseUrl}/about`, lastModified: new Date(), changeFrequency: 'monthly', priority: 0.8 },
  ];
  // Dynamic pages (fetch from DB or CMS)
  const posts = await getPosts(); // your data fetch
  const dynamicPages = posts.map(post => ({
    url: `${baseUrl}/blog/${post.slug}`,
    lastModified: new Date(post.updatedAt),
    changeFrequency: 'weekly',
    priority: 0.7,
  }));
  return [...staticPages, ...dynamicPages];
}
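Before returning the array, it can help to sanity-check the entries, since relative URLs, mixed origins, and duplicates are common sources of sitemap errors. The following is a minimal sketch; `validateSitemapEntries` is a hypothetical helper, not part of Next.js.

```javascript
// Hypothetical helper: returns a list of human-readable problems found in
// sitemap entries shaped like { url: '...' }.
function validateSitemapEntries(entries, baseUrl) {
  const expectedOrigin = new URL(baseUrl).origin;
  const seen = new Set();
  const problems = [];

  for (const { url } of entries) {
    let parsed;
    try {
      parsed = new URL(url); // throws on relative or malformed URLs
    } catch {
      problems.push(`${url}: not an absolute URL`);
      continue;
    }
    if (parsed.origin !== expectedOrigin) {
      problems.push(`${url}: origin differs from ${expectedOrigin}`);
    }
    if (seen.has(parsed.href)) {
      problems.push(`${url}: duplicate entry`);
    }
    seen.add(parsed.href);
  }
  return problems;
}
```

Calling it on the combined array inside `sitemap()` and logging (or throwing on) a non-empty result would surface these issues at build time rather than in Search Console.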
// app/sitemap-tools/sitemap.js
// app/sitemap-blog/sitemap.js
// Each returns an array of URL entries
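When a single section grows large, keep in mind that the sitemaps protocol caps each file at 50,000 URLs. A small sketch of splitting entries into compliant groups follows; `chunkSitemapEntries` is a hypothetical helper, and how each chunk is wired to a route is left to your setup.

```javascript
// Limit defined by the sitemaps protocol: at most 50,000 URLs per sitemap file.
const MAX_URLS_PER_SITEMAP = 50000;

// Hypothetical helper: splits a flat array of entries into arrays of at most `size`.
function chunkSitemapEntries(entries, size = MAX_URLS_PER_SITEMAP) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < entries.length; i += size) {
    chunks.push(entries.slice(i, i + size));
  }
  return chunks;
}
```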
Google reads SEO tags most reliably when they are already in the initial HTML response, so important pages should be statically generated or server-rendered with their metadata in the HTML.
# Check build output: pages should show ○ (static) or ● (SSG), not λ (dynamic; shown as ƒ in newer Next.js versions)
npm run build 2>&1 | grep -E "○|●|λ|ƒ|/blog|/tools"
○ /about (static)
● /blog/[slug] (SSG) ← good
λ /api/data (serverless) ← expected for APIs
If important pages are λ (fully dynamic with no static generation), add:
// app/blog/[slug]/page.js
export async function generateStaticParams() {
  const posts = await getPosts();
  return posts.map(post => ({ slug: post.slug }));
}
Pages with zero internal links are rarely indexed. Every important page should be reachable from at least one other page on the site.
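# Find pages that have no inbound links from other pages
# (manual check: grep for the slug; --include does not expand braces, so list each extension)
grep -r "/blog/my-orphan-post" . \
  --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx" --include="*.md" \
  --exclude-dir=node_modules --exclude-dir=.next \
  | grep -v "sitemap\|the-page-itself"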
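If you can collect the outbound links of each page (for example from a crawl or from your CMS), the same check can be automated. The sketch below assumes a hypothetical `linksByPage` object mapping each page path to the paths it links to; `findOrphanPages` is an illustrative name.

```javascript
// Hypothetical helper: returns pages that no other page links to.
// `linksByPage` maps a page path to an array of internal paths it links to.
// `entryPoints` are pages that do not need inbound links (assumed: the homepage).
function findOrphanPages(linksByPage, entryPoints = ['/']) {
  const linked = new Set();
  for (const [page, links] of Object.entries(linksByPage)) {
    for (const target of links) {
      if (target !== page) linked.add(target); // self-links do not count
    }
  }
  return Object.keys(linksByPage)
    .filter((page) => !linked.has(page) && !entryPoints.includes(page))
    .sort();
}
```

For example, a page that links out to the homepage but is never linked from anywhere is still reported, because only inbound links count.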
# Find all redirects in Next.js config
grep -A 3 "redirects" next.config.js
# Check for redirect chains (A → B → C — should be A → C)
# Test a suspected chain:
curl -sI https://www.yourdomain.com/old-url | grep -i location
// next.config.js: keep redirects flat (no chains)
module.exports = {
  async redirects() {
    return [
      {
        source: '/old-url',
        destination: '/new-url', // Must NOT itself redirect
        permanent: true, // 308 for SEO
      },
    ];
  },
};
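Chains tend to creep in as rules accumulate, so it can be useful to flatten the list before returning it. This is a sketch under a simplifying assumption: it only follows exact string matches between `destination` and `source`, not path patterns such as `/blog/:slug`. `flattenRedirects` is a hypothetical helper.

```javascript
// Hypothetical helper: rewrites each rule so its destination is the final
// target of the chain (A -> B -> C becomes A -> C), and throws on loops.
function flattenRedirects(redirects) {
  const next = new Map(redirects.map((rule) => [rule.source, rule.destination]));

  return redirects.map((rule) => {
    const visited = new Set([rule.source]);
    let destination = rule.destination;
    // Follow the chain while the current destination is itself a redirect source.
    while (next.has(destination)) {
      if (visited.has(destination)) {
        throw new Error(`Redirect loop detected at ${destination}`);
      }
      visited.add(destination);
      destination = next.get(destination);
    }
    return { ...rule, destination };
  });
}
```

The flattened array could then be returned from `redirects()` so every old URL reaches its final target in one hop.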
curl -s https://www.yourdomain.com/robots.txt
# ✓ Good
User-agent: *
Allow: /
Sitemap: https://www.yourdomain.com/sitemap.xml
# ❌ Bad — disallows crawling of important content
Disallow: /blog/
Disallow: /tools/
// app/robots.js (Next.js App Router)
export default function robots() {
  return {
    rules: { userAgent: '*', allow: '/' },
    sitemap: 'https://www.yourdomain.com/sitemap.xml',
  };
}
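To confirm that a deployed robots.txt does not block important paths, a simplified check can be run against its text. The sketch below is an approximation: it only reads the `User-agent: *` group, uses plain prefix matching with the longest match winning, and ignores `*` and `$` wildcards. `isBlockedForAllAgents` is a hypothetical name.

```javascript
// Hypothetical helper: true if `path` is disallowed for `User-agent: *`.
// Simplified: prefix matching only, longest rule wins, Allow wins ties.
function isBlockedForAllAgents(robotsTxt, path) {
  let inWildcardGroup = false;
  let lastWasAgent = false;
  const rules = [];

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*/, '').trim(); // strip comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === 'user-agent') {
      // Consecutive User-agent lines belong to the same group.
      inWildcardGroup = lastWasAgent ? inWildcardGroup || value === '*' : value === '*';
      lastWasAgent = true;
      continue;
    }
    lastWasAgent = false;
    if (inWildcardGroup && (field === 'allow' || field === 'disallow') && value) {
      rules.push({ allow: field === 'allow', prefix: value });
    }
  }

  let best = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.prefix)) continue;
    const longer = !best || rule.prefix.length > best.prefix.length;
    const tieAllow = best && rule.prefix.length === best.prefix.length && rule.allow;
    if (longer || tieAllow) best = rule;
  }
  return best ? !best.allow : false;
}
```

Running it over a short list of must-index paths (for example `/blog/my-post`) after each deploy gives an early warning if a `Disallow` rule slips in.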
generateStaticParams added for dynamic routes with known slugs