The Quick Answer
A sitemap.xml is a file that tells search engines which pages on your site exist and when they were last updated:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-02-05</lastmod>
  </url>
</urlset>
Place it at yoursite.com/sitemap.xml, reference it in robots.txt, and submit it to Google Search Console. Those are the essentials.
The rest of this guide covers the full format, optional fields, deployment, and mistakes to avoid.
Why Sitemaps Exist
Search engines discover pages by following links. But not every page is well-linked:
- New pages may have no inbound links yet
- Orphan pages aren't linked from your navigation
- Deep pages sit many clicks from the homepage
- JavaScript-rendered content may not expose links to crawlers
A sitemap provides a direct list of URLs for crawlers to check. It does not guarantee indexing — it just makes discovery reliable.
Key point: A sitemap is a suggestion, not a command. Search engines decide independently what to crawl and index.
Sitemap XML Format
Required Elements
Every valid sitemap needs:
| Element | Description |
|---|---|
| <?xml version="1.0" encoding="UTF-8"?> | XML declaration |
| <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> | Root element with namespace |
| <url> | Container for each URL entry |
| <loc> | Full URL including protocol (https://) |
Optional Elements
| Element | Description | Used by Google? |
|---|---|---|
| <lastmod> | Last modification date (YYYY-MM-DD) | Yes — if accurate |
| <changefreq> | Expected change frequency | Largely ignored |
| <priority> | Relative importance (0.0–1.0) | Largely ignored |
A Complete Example
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-02-05</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2025-11-20</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://example.com/blog/xml-sitemaps</loc>
    <lastmod>2026-02-05</lastmod>
    <changefreq>yearly</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
The lastmod Tag — The One That Matters
Of the three optional tags, lastmod is the only one Google actively uses. But it only works if it's accurate.
Good use of lastmod
Set it to the date the page content was actually last modified:
<lastmod>2026-01-15</lastmod>
Bad use of lastmod
Setting today's date on every page every time you regenerate the sitemap:
<!-- Don't do this — makes lastmod meaningless -->
<lastmod>2026-02-05</lastmod> <!-- on every URL, every day -->
When every URL has the same recent date, search engines cannot tell which pages actually changed. The signal becomes noise.
Accepted date formats
- 2026-02-05 (date only — most common)
- 2026-02-05T10:30:00+00:00 (full W3C datetime)
- 2026-02-05T10:30:00Z (UTC)
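Producing these shapes from a JavaScript Date is a one-liner each. A minimal sketch (the helper names are mine, not a standard API):

```javascript
// Date-only form (YYYY-MM-DD) — the most common lastmod value
function lastmodDateOnly(date) {
  return date.toISOString().split('T')[0];
}

// Full W3C datetime in UTC, second precision (trailing "Z")
function lastmodUtc(date) {
  return date.toISOString().replace(/\.\d{3}Z$/, 'Z');
}

console.log(lastmodDateOnly(new Date(Date.UTC(2026, 1, 5, 10, 30, 0)))); // "2026-02-05"
console.log(lastmodUtc(new Date(Date.UTC(2026, 1, 5, 10, 30, 0))));      // "2026-02-05T10:30:00Z"
```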
Sitemap Limits
| Constraint | Limit |
|---|---|
| URLs per sitemap | 50,000 |
| File size (uncompressed) | 50 MB |
| Gzipped files | Accepted (.xml.gz) |
| Protocol | Must match site protocol |
| Scope | Same domain/subdomain only |
If your site exceeds these limits, use a sitemap index file.
Sitemap Index Files
A sitemap index references multiple sitemap files:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-02-05</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
    <lastmod>2026-02-03</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-01-28</lastmod>
  </sitemap>
</sitemapindex>
When to use a sitemap index
- Site has more than 50,000 URLs
- You want to organize sitemaps by content type (pages, blog, products)
- Different sections update at different frequencies
- You want to track indexing per section in Search Console
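Splitting a large URL list into files of at most 50,000 entries and emitting the index can be sketched like this (chunkUrls, buildSitemapIndex, and the base URL are illustrative, not a standard API):

```javascript
// Chunk URLs into groups of at most 50,000 — one group per sitemap file
function chunkUrls(urls, maxPerFile = 50000) {
  const chunks = [];
  for (let i = 0; i < urls.length; i += maxPerFile) {
    chunks.push(urls.slice(i, i + maxPerFile));
  }
  return chunks;
}

// Build the sitemap index that references each child sitemap file
function buildSitemapIndex(filenames, base = 'https://example.com') {
  const entries = filenames
    .map((f) => `  <sitemap>\n    <loc>${base}/${f}</loc>\n  </sitemap>`)
    .join('\n');
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries +
    '\n</sitemapindex>'
  );
}
```

With 120,000 URLs this yields three sitemap files (50,000 + 50,000 + 20,000) plus one index referencing them.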
Deployment Checklist
1. Create your sitemap
Use the Sitemap XML Generator or generate it programmatically from your CMS or build process.
2. Upload to your root directory
https://yoursite.com/sitemap.xml
The sitemap can technically live at any URL, but the root is the convention and the easiest for crawlers to find.
3. Reference in robots.txt
Add this line to your robots.txt:
Sitemap: https://yoursite.com/sitemap.xml
This is how crawlers auto-discover your sitemap without you needing to submit it manually.
4. Submit to search engines
- Google: Search Console → Sitemaps → Add
- Bing: Webmaster Tools → Sitemaps → Submit
5. Monitor
Check back after a few days. Search Console shows:
- How many URLs were discovered
- How many were indexed
- Any errors (404s, redirects, blocked pages)
Common Mistakes
1. Including non-canonical URLs
If a page has <link rel="canonical" href="..."> pointing to a different URL, only include the canonical URL in your sitemap. Mixed signals confuse crawlers.
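A simple guard in a sitemap generator is to drop any entry whose canonical points elsewhere. A sketch, assuming a { url, canonical } entry shape from your own data model:

```javascript
// Keep only self-canonical entries (no canonical set, or canonical === url)
function selfCanonicalOnly(entries) {
  return entries.filter((e) => !e.canonical || e.canonical === e.url);
}
```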
2. Including URLs that return errors
Only include URLs that return HTTP 200. Remove URLs that:
- Return 404 (not found)
- Return 301/302 (redirects)
- Return 5xx (server errors)
- Are blocked by robots.txt
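If you record each URL's last-checked status, this filter is straightforward. A sketch, assuming a { url, status, blockedByRobots } shape from your own crawl or health check:

```javascript
// Keep only entries that returned HTTP 200 and are not blocked by robots.txt
function indexableOnly(entries) {
  return entries.filter((e) => e.status === 200 && !e.blockedByRobots);
}
```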
3. Inconsistent trailing slashes
/about and /about/ are technically different URLs. If your server treats them differently, pick one and be consistent. If it doesn't matter, still pick one for the sitemap.
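Normalizing at generation time keeps the sitemap consistent regardless of how URLs are stored. A sketch enforcing the no-trailing-slash convention (except the root, which is always "/"):

```javascript
// Strip a trailing slash from the path, leaving the root path alone
function normalizeTrailingSlash(href) {
  const u = new URL(href);
  if (u.pathname !== '/' && u.pathname.endsWith('/')) {
    u.pathname = u.pathname.slice(0, -1);
  }
  return u.toString();
}
```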
4. Stale sitemaps
A sitemap generated once and never updated gradually becomes useless. New pages aren't listed, deleted pages return 404s, and lastmod dates are wrong.
Automate it. Most frameworks and CMS platforms can generate sitemaps at build time or on a schedule.
5. Listing pages you don't want indexed
If a page has <meta name="robots" content="noindex">, don't include it in your sitemap. Including noindex pages wastes crawl budget and creates contradictory signals.
6. Fake lastmod dates
Setting all lastmod dates to today or to the same date makes the field meaningless. Only set lastmod when you know the actual last-modified date.
Generating Sitemaps Automatically
Static site generators
Most static site generators (Eleventy, Hugo, Next.js, Gatsby, Astro) have sitemap plugins that auto-generate sitemap.xml at build time from your pages.
CMS platforms
WordPress, Drupal, and other CMS platforms typically include sitemap generation or have plugins for it. WordPress has had built-in sitemap support since version 5.5.
Custom generation
For dynamic sites, generate sitemaps programmatically by querying your database for published URLs and writing the XML:
// Simplified Node.js example; getPublishedUrls() stands in for your own data layer
const urls = await getPublishedUrls(); // e.g. [{ href, updatedAt }]

// Escape the characters XML reserves in text content
const escapeXml = (s) =>
  s.replace(/[<>&'"]/g, (c) =>
    ({ '<': '&lt;', '>': '&gt;', '&': '&amp;', "'": '&apos;', '"': '&quot;' }[c]));

let xml = '<?xml version="1.0" encoding="UTF-8"?>\n';
xml += '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n';
for (const url of urls) {
  xml += `  <url>\n`;
  xml += `    <loc>${escapeXml(url.href)}</loc>\n`;
  if (url.updatedAt) {
    // Date only (YYYY-MM-DD) is sufficient for lastmod
    xml += `    <lastmod>${url.updatedAt.toISOString().split('T')[0]}</lastmod>\n`;
  }
  xml += `  </url>\n`;
}
xml += '</urlset>';
Run this on a schedule (daily or on content change) and write the output to your public directory.
Sitemap vs. robots.txt
These two files are complementary, not interchangeable:
| File | Purpose | Mechanism |
|---|---|---|
| robots.txt | Controls access — which URLs crawlers may request | Restrict / allow |
| sitemap.xml | Suggests discovery — which URLs you want crawlers to find | Recommend |
A common setup:
- robots.txt blocks admin pages, staging URLs, and duplicate content
- sitemap.xml lists all public, canonical, indexable pages
- robots.txt references the sitemap location
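Put together, a minimal pairing might look like this (the disallowed paths are illustrative):

```text
# robots.txt
User-agent: *
Disallow: /admin/
Disallow: /staging/

Sitemap: https://yoursite.com/sitemap.xml
```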
You can use the Robots.txt Generator to create both files together.
Image and Video Sitemaps
Google supports XML namespace extensions for images and videos:
Image sitemap example
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/gallery</loc>
    <image:image>
      <image:loc>https://example.com/images/photo1.jpg</image:loc>
      <image:caption>Sunset over the mountains</image:caption>
    </image:image>
  </url>
</urlset>
Video sitemap example
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/videos/tutorial</loc>
    <video:video>
      <video:thumbnail_loc>https://example.com/thumbs/tutorial.jpg</video:thumbnail_loc>
      <video:title>Getting Started Tutorial</video:title>
      <video:description>A step-by-step walkthrough.</video:description>
    </video:video>
  </url>
</urlset>
These extensions help media-heavy sites surface content in Google Images and Google Video search.
Quick Reference
| What | Value |
|---|---|
| Max URLs per file | 50,000 |
| Max file size | 50 MB (uncompressed) |
| Accepted compression | gzip (.xml.gz) |
| Required tag | <loc> |
| Most useful optional tag | <lastmod> |
| Standard location | /sitemap.xml |
| Declaration in robots.txt | Sitemap: https://yoursite.com/sitemap.xml |
| Protocol | Must match your site (https) |
| Scope | Same host only |
Next Steps
- Generate your sitemap with the Sitemap XML Generator
- Create a robots.txt with the Robots.txt Generator and add your sitemap reference
- Submit to search engines via Google Search Console and Bing Webmaster Tools
- Set up auto-generation so your sitemap stays current as your site grows