Free Tool — No Signup Required

Sitemap Checker & XML Sitemap Validator

Instantly check whether your website's sitemap.xml exists, is publicly accessible, and passes key validation checks — including XML structure, URL count, sitemap index detection, and common errors. Free, no signup required.

Supports example.com, https://example.com, or https://www.example.com — we check /sitemap.xml automatically.

  • No GSC account needed
  • Works on competitors & client sites
  • Detects sitemap index vs standard
  • 9-point validation
Background

What is a sitemap.xml?

The XML file that tells search engines — and AI crawlers — what pages your site has and when they were last updated.

A sitemap.xml is a structured XML file placed at the root of your domain (e.g. https://example.com/sitemap.xml) that lists the pages on your website. It is part of the Sitemap Protocol, which is supported by Google, Bing, Yahoo, and other major search engines.

While search engine crawlers will discover most pages through internal links, a sitemap provides a reliable fallback — especially for large sites, new content, pages buried deep in the site architecture, or pages that aren't well linked internally.

The Sitemap Protocol also extends to image, video, and news sitemaps, each a specialised variant of the same XML format that surfaces media content to search engines.

Sitemap Index vs Standard Sitemap

<urlset> (Standard)

Lists individual page URLs directly. Max 50,000 URLs per file.

Best for: Sites up to ~50,000 pages

<sitemapindex> (Index)

References multiple child sitemap files. Used for large or segmented sites.

Best for: Large or multi-section sites
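For reference, a minimal sitemap index file looks like this (the child sitemap URLs are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
    <lastmod>2026-02-20</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-03-01</lastmod>
  </sitemap>
</sitemapindex>
```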

SEO Impact

Why a Valid Sitemap Matters for SEO

🔍

Faster Crawl Discovery

Search engines don't have to rely solely on following links to find your pages. A sitemap acts as a direct map — new and updated pages get discovered and potentially indexed faster.

🗂️

Large Site Management

For sites with thousands of pages, a sitemap index makes it easy to organise URLs into logical groups (blog, products, authors) and ensure complete coverage without hitting the 50,000 URL per file limit.

📅

Freshness Signals

The <lastmod> tag in your sitemap tells crawlers when a page was last updated. Accurate lastmod dates help prioritise crawl frequency for content that changes regularly.

An honest perspective on sitemaps and rankings

A sitemap alone does not improve your Google rankings. It helps search engines find your pages, but whether those pages rank depends on content quality, authority, and relevance. Think of a sitemap as removing a technical obstacle — not as a ranking lever. A site with excellent content but no sitemap will almost always outrank a poorly-written site with a perfect sitemap.

Best Practices

Sitemap Best Practices for Technical SEO

What separates a good sitemap from a problematic one — and common mistakes to avoid.

01

Only include indexable pages

Your sitemap should list only pages you want indexed. Exclude pages with noindex directives, pagination variants, thin content pages, 404s, redirects, or canonicalised duplicates.

02

Use accurate <lastmod> dates

The <lastmod> tag should reflect when the page's content last substantively changed — not simply when the CMS ran a batch update. Inaccurate lastmod dates can cause crawlers to deprioritise your sitemap data.

03

Stay within the 50,000 URL limit

Each sitemap file is limited to 50,000 URLs and 50MB. If your site exceeds this, use a sitemap index file that references multiple individual sitemap files — each within the limit.
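As a sketch of how such a split can be automated (the function names here are our own illustration, not from any library):

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the Sitemap Protocol

def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a flat URL list into sitemap-sized chunks."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_sitemap(urls):
    """Render one <urlset> sitemap for a chunk of URLs."""
    entries = "\n".join(f"  <url><loc>{u}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n</urlset>"
    )

# Example: 120,000 URLs become three sitemap files (50k + 50k + 20k),
# which a sitemap index would then reference.
urls = [f"https://example.com/page-{i}" for i in range(120_000)]
chunks = chunk_urls(urls)
print(len(chunks))
```

Each generated file stays under the limit; the index file then lists one <sitemap> entry per chunk.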

04

Reference your sitemap in robots.txt

Add a Sitemap directive to your robots.txt file: "Sitemap: https://yourdomain.com/sitemap.xml". This ensures any crawler that reads your robots.txt can discover your sitemap, even without a Search Console submission.
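For example, a minimal robots.txt that advertises the sitemap (the sitemap URL is a placeholder for your own):

```text
User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
```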

05

Submit to Search Console and Bing Webmaster Tools

Submit your sitemap directly in Google Search Console and Bing Webmaster Tools. This gives you visibility into how many URLs were submitted vs indexed, and any errors the crawlers encountered.

Common Mistakes

  • Including noindex or redirected pages
  • Using relative URLs instead of absolute ones
  • Exceeding the 50,000 URL per file limit
  • Serving sitemap with wrong Content-Type
  • Returning a non-200 HTTP status
  • Using inaccurate or auto-generated <lastmod> dates

Best Practices

  • List only pages you want indexed
  • Use full absolute HTTPS URLs
  • Include accurate <lastmod> dates in ISO 8601
  • Use sitemap index for sites over 50,000 pages
  • Reference sitemap in robots.txt
  • Submit to Search Console and monitor coverage
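The ISO 8601 <lastmod> values recommended above can be generated in any language; as a quick Python sketch:

```python
from datetime import datetime, timezone

# Full W3C Datetime timestamp, e.g. 2026-03-01T12:00:00+00:00
lastmod = datetime.now(timezone.utc).isoformat(timespec="seconds")

# A date-only value (YYYY-MM-DD) is also valid per the protocol
lastmod_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")

print(lastmod, lastmod_date)
```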

Example sitemap.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/blog/my-post</loc>
    <lastmod>2026-02-20</lastmod>
  </url>
</urlset>

⚠ Note: Google ignores <changefreq> and <priority> values. Use <lastmod> to signal content freshness — it is the only one of these optional fields that Google uses for crawl prioritisation.

About This Tool

How This Sitemap Checker Works

What we check — and what each result means.

🌐

HTTP 200 response

We fetch /sitemap.xml and confirm the server returns HTTP 200. Non-200 responses (404, 403, 5xx) are flagged.

📄

Content-Type header

We verify the sitemap is served with an XML-appropriate Content-Type (application/xml or text/xml).

🏗️

XML structure

We detect whether the file has a valid <urlset> or <sitemapindex> root element. Malformed XML is flagged.

🗂️

Sitemap type detection

We identify whether this is a standard URL sitemap or a sitemap index file, and display the appropriate metrics.

🔢

URL / child sitemap count

We count <url> entries in standard sitemaps and <sitemap> entries in index files. Empty sitemaps are flagged as errors.

🔗

URL validity

We sample <loc> values and flag relative URLs, empty <loc> tags, or malformed URLs that violate the Sitemap Protocol.

📋

Duplicate URL detection

We check for duplicate <loc> entries within the sampled portion of the sitemap.

📅

<lastmod> presence

We check whether the sitemap includes <lastmod> dates, which help search engines understand content freshness.

⚠️

URL limit warning

We flag standard sitemaps approaching or exceeding the 50,000 URL per file limit.
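A simplified sketch of several of these checks in Python (our own illustration, not the tool's actual code; it validates already-fetched response data rather than making a network request):

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def check_sitemap(status, content_type, body):
    """Run a subset of the checks described above.

    status: HTTP status code, content_type: Content-Type header value,
    body: response body as a string. Returns a dict of results.
    """
    results = {
        "http_200": status == 200,
        "xml_content_type": any(t in content_type
                                for t in ("application/xml", "text/xml")),
    }
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        results["well_formed_xml"] = False
        return results
    results["well_formed_xml"] = True
    tag = root.tag.removeprefix(NS)  # str.removeprefix needs Python 3.9+
    results["type"] = {"urlset": "standard", "sitemapindex": "index"}.get(tag, "unknown")
    locs = [el.text or "" for el in root.iter(NS + "loc")]
    results["url_count"] = len(locs)
    results["absolute_urls"] = all(u.startswith(("http://", "https://")) for u in locs)
    results["no_duplicates"] = len(set(locs)) == len(locs)
    results["within_limit"] = len(locs) <= 50_000

    return results

sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""
print(check_sitemap(200, "application/xml; charset=utf-8", sample))
```

The real tool additionally samples large files rather than parsing them whole, and distinguishes pass/warning/fail severities per check.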

Understanding Results

Pass

This check meets the Sitemap Protocol or best practice standards. No action needed.

Warning

A recommended improvement. Your sitemap is still functional, but this is worth addressing.

Fail

A critical issue that should be fixed. Search engine crawlers may not process this sitemap correctly.

FAQ

Frequently Asked Questions About Sitemaps

What is a sitemap.xml?
A sitemap.xml is an XML file that lists the pages on your website, helping search engine crawlers discover and index your content more efficiently. It forms part of the Sitemap Protocol supported by Google, Bing, and other major search engines. A well-structured sitemap tells crawlers which pages exist, when they were last updated, and — optionally — how frequently they change.
Why is having a sitemap important for SEO?
A sitemap is especially valuable for large websites, newly launched sites, or pages that lack strong internal linking. It acts as a direct signal to search engines about what pages you want indexed. Without a sitemap, crawlers must discover pages purely through links — which can lead to important pages being missed or crawled less frequently. For AI crawlers, sitemaps are increasingly used to understand site scope and structure.
What is a sitemap index?
A sitemap index is a special XML file that does not contain individual page URLs — instead, it references multiple child sitemap files. It is the recommended approach for large websites with more than 50,000 pages, or sites that organise URLs into logical groups (e.g. a blog sitemap, a product sitemap, and an image sitemap). A sitemap index file uses <sitemapindex> as the root element, while standard sitemaps use <urlset>.
What is the URL limit for a sitemap?
Each individual sitemap file can contain a maximum of 50,000 URLs and must not exceed 50MB in size. If your site has more than 50,000 pages, you must split them into multiple sitemap files and reference them using a sitemap index. This limit is defined by the Sitemap Protocol at sitemaps.org.
How do I submit my sitemap to Google?
You can submit your sitemap to Google in two ways: (1) through Google Search Console under the Sitemaps section, by entering the full URL (e.g. https://example.com/sitemap.xml), or (2) by referencing it in your robots.txt file using "Sitemap: https://example.com/sitemap.xml". Once submitted, Google will crawl your sitemap periodically. Verification of indexation should be confirmed in Search Console, not assumed from submission alone.
What should a sitemap.xml include?
A well-structured sitemap.xml should include: (1) a valid XML declaration, (2) correct namespace declarations in the root <urlset> element, (3) one <url> block per page with a <loc> element containing the full absolute URL, (4) optional <lastmod> dates in ISO 8601 format, and (5) optionally <changefreq> and <priority> values, though Google ignores these. Pages should be indexable — avoid including URLs that are noindexed, redirected, or returning errors.
Should I include noindex pages in my sitemap?
No. Pages with a noindex directive should not be included in your sitemap. Including them sends conflicting signals to search engines — you are telling them the page exists (sitemap) while also telling them not to index it (noindex). This can confuse crawlers and waste crawl budget. Your sitemap should only contain URLs you actively want indexed.
What does this sitemap checker actually test?
This tool performs a live server-side check of your website's /sitemap.xml. It tests: HTTP status (must be 200), Content-Type headers, XML structure validity, sitemap type detection (index vs standard), URL count, duplicate URL detection, empty <loc> checks, invalid URL formatting, and <lastmod> presence. All checks are performed in real time — no caching, no stored data.
My sitemap returns HTTP 200 but search engines are not indexing all my pages — why?
A healthy sitemap is a necessary but not sufficient condition for full indexation. Google and other search engines make independent indexation decisions based on content quality, crawl budget, PageRank, duplicate content, and technical issues. Submitting a valid sitemap gives search engines the information they need, but does not guarantee every URL will be indexed. Use Google Search Console's URL Inspection tool to investigate specific pages.
Does a sitemap improve my Google rankings?
A sitemap does not directly improve rankings — it aids discoverability and crawling efficiency. Better crawl coverage means more of your content can be considered for ranking, but the quality and relevance of your content remains the primary ranking factor. Think of a sitemap as infrastructure: it ensures search engines can find your pages, but the work of ranking is done by the content itself.
Where should a sitemap be located?
Your sitemap should be placed at the root of your domain: https://yourdomain.com/sitemap.xml. Under the Sitemap Protocol, a sitemap can only cover URLs on the same host, at or below the directory it lives in. Sitemaps can also sit in subdirectories (e.g. /sitemap/sitemap.xml), but they will then only be valid for URLs within or below that directory path. The simplest and most widely compatible approach is to keep it at the domain root.

Check Another Domain

Run the sitemap validator on any website — a competitor, a client site, or your own domain after making improvements.

← All Free Tools

Want deeper AI visibility insights? GEOflux tracks your brand across ChatGPT, Gemini, and Perplexity.

Start a free trial →