Robots.txt Tester & Validator

Test and validate your robots.txt file

Robots.txt Content

Paste your robots.txt content below to validate syntax and test URL paths. All processing happens in your browser.

Test URL Path

Robots.txt Directives Reference

| Directive | Example | Description |
| --- | --- | --- |
| User-agent | User-agent: Googlebot | Specifies which crawler the rules apply to. Use * for all crawlers. |
| Disallow | Disallow: /admin/ | Blocks access to the specified path. An empty Disallow: allows everything. |
| Allow | Allow: /public/ | Explicitly allows access to a path. Overrides broader Disallow rules. |
| Sitemap | Sitemap: https://example.com/sitemap.xml | Points crawlers to your XML sitemap. Multiple sitemaps can be listed. |
| Crawl-delay | Crawl-delay: 10 | Minimum seconds between requests. Supported by some crawlers (not Google). |
| * (wildcard) | Disallow: /*.pdf$ | Matches any sequence of characters; $ anchors the pattern to the end of the URL. |

Wildcard Pattern Matching

| Pattern | Example Rule | Matches | Does Not Match |
| --- | --- | --- | --- |
| * | Disallow: /*admin | /admin, /user/admin, /admin-panel | /blog, /about |
| $ | Disallow: /*.pdf$ | /doc.pdf, /files/report.pdf | /doc.pdf?print=1 |
| / | Disallow: /admin/ | /admin/, /admin/users | /administrator, /admin |
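The matching rules above can be sketched in a few lines of Python. This is an illustrative matcher following RFC 9309 semantics (prefix match, * as any character sequence, trailing $ as an end anchor), not any specific crawler's implementation:

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Check whether a robots.txt rule pattern matches a URL path.

    RFC 9309 matching: patterns are prefix matches, '*' matches any
    sequence of characters, and a trailing '$' anchors the pattern
    to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then restore '*' as '.*'
    regex = re.escape(pattern).replace(r"\*", ".*")
    regex = "^" + regex + ("$" if anchored else "")
    return re.match(regex, path) is not None

print(rule_matches("/*admin", "/admin-panel"))      # True (prefix match)
print(rule_matches("/*.pdf$", "/doc.pdf"))          # True
print(rule_matches("/*.pdf$", "/doc.pdf?print=1"))  # False ($ anchor)
```

Note that /*admin matches /admin-panel as well, because * can match the empty string and robots.txt rules are prefix matches: the rule does not need to cover the whole path.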

Common Robots.txt Examples

Allow All Crawlers

User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml

Block All Crawlers

User-agent: *
Disallow: /

WordPress Site

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml

E-commerce Site

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /*?sort=
Disallow: /*?filter=

Allow: /

Sitemap: https://example.com/sitemap.xml

Blog with Multiple User-Agents

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /

User-agent: BadBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml

About Robots.txt

The robots.txt file is a standard used by websites to communicate with web crawlers and search engine bots. It sits at the root of your domain (e.g., https://example.com/robots.txt) and tells crawlers which parts of your site they can and cannot access.

While robots.txt helps control crawler access, it is not a security mechanism. The file is publicly accessible and crawlers may choose to ignore it. For sensitive content, use proper authentication and access controls.

Robots.txt is defined by the RFC 9309 standard, though many search engines support additional directives and patterns beyond the base specification.

Frequently Asked Questions

What is robots.txt and why do I need it?

Robots.txt is a text file that tells search engine crawlers which pages or sections of your website they should not access. It helps prevent crawlers from indexing duplicate content, admin pages, or pages that consume server resources. Every website should have a robots.txt file, even if it just allows all access.

Where should I place the robots.txt file?

The robots.txt file must be placed at the root of your domain: https://example.com/robots.txt. It will not work in subdirectories. Each subdomain requires its own robots.txt file (e.g., https://blog.example.com/robots.txt).

Does robots.txt block pages from appearing in search results?

No. Robots.txt prevents crawlers from accessing pages, but does not guarantee they will not appear in search results. If other sites link to a blocked page, search engines may still index the URL without crawling the content. To prevent indexing, use the noindex meta tag or X-Robots-Tag HTTP header.

What is the difference between Disallow and Allow?

Disallow blocks crawler access to a path. Allow explicitly permits access, which is useful for creating exceptions to broader Disallow rules. For example, you can block /admin/ but allow /admin/public/.
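Google resolves conflicts between Allow and Disallow by picking the most specific (longest) matching rule, with Allow winning ties. A minimal sketch of that precedence, simplified to plain prefix rules without wildcards:

```python
def is_allowed(rules, path):
    """Decide access using the most-specific (longest) matching rule.

    `rules` is a list of (directive, pattern) pairs. Simplified:
    patterns are plain path prefixes (no wildcard support), and
    Allow wins on ties, mirroring Google's tie-break.
    """
    best = None  # (pattern_length, directive)
    for directive, pattern in rules:
        if pattern and path.startswith(pattern):
            length = len(pattern)
            if (best is None or length > best[0]
                    or (length == best[0] and directive == "allow")):
                best = (length, directive)
    # No matching rule means the path is allowed by default.
    return best is None or best[1] == "allow"

rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(is_allowed(rules, "/admin/public/page.html"))  # True  (longer Allow wins)
print(is_allowed(rules, "/admin/secret"))            # False (Disallow applies)
```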

Can I use wildcards in robots.txt?

Yes. Google and most modern crawlers support * (matches any sequence of characters) and $ (denotes end of URL). For example, Disallow: /*.pdf$ blocks all PDF files. Wildcard support is not universal, so test your rules carefully.

How do I test if my robots.txt is working?

Use this tool to validate syntax and test paths. Google Search Console also provides a robots.txt report (under Settings) that shows which robots.txt files Google found and any parsing problems. You can also view your live robots.txt file by visiting https://yoursite.com/robots.txt in a browser.
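You can also test rules programmatically. Python's standard library ships a robots.txt parser, urllib.robotparser, which checks whether a given agent may fetch a URL (note that its matching is simple prefix matching and may differ from Google's wildcard and precedence handling):

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/admin/secret"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/blog/post"))     # True
```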

What happens if I have multiple User-agent sections?

Crawlers use the most specific matching user-agent section. If you have rules for User-agent: Googlebot and User-agent: *, Googlebot will follow its specific rules and ignore the * section. Other bots will use the * rules.
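Group selection can be sketched as follows: pick the group whose User-agent token is the longest substring match of the crawler's name, falling back to the * group. This is an illustrative sketch of the RFC 9309 selection rule, with hypothetical group names:

```python
def select_group(groups, user_agent):
    """Pick the rule group for a crawler.

    `groups` maps a User-agent token to its list of rule lines.
    The longest token contained in the crawler name wins
    (case-insensitive); unmatched crawlers fall back to '*'.
    """
    ua = user_agent.lower()
    best_token, best_rules = None, None
    for token, rules in groups.items():
        t = token.lower()
        if t != "*" and t in ua:
            if best_token is None or len(t) > len(best_token):
                best_token, best_rules = t, rules
    if best_rules is not None:
        return best_rules
    return groups.get("*", [])

groups = {
    "*": ["Disallow: /admin/"],
    "googlebot": ["Disallow: /private/"],
}
print(select_group(groups, "Googlebot-Image/1.0"))  # googlebot group
print(select_group(groups, "SomeBot/2.0"))          # falls back to '*'
```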

Should I block CSS and JavaScript files?

No. Google recommends allowing access to CSS and JavaScript files so crawlers can properly render and understand your pages. Blocking these files may negatively impact how search engines interpret your site.

Can robots.txt prevent malicious bots?

No. Robots.txt is a voluntary standard. Malicious bots ignore it. For security, use proper authentication, rate limiting, CAPTCHA, and server-level access controls. Robots.txt is only for legitimate crawlers that respect the standard.

What is Crawl-delay and who supports it?

Crawl-delay specifies the minimum number of seconds between requests from a crawler. Bing and some other search engines support it, but Google does not. Google uses its own algorithms to determine crawl rate, which you can configure in Google Search Console.

How do I block specific bots but allow others?

Create separate User-agent sections. For example, to block BadBot but allow all others:

User-agent: BadBot
Disallow: /

User-agent: *
Disallow:

Does robots.txt affect page ranking?

Not directly, but improper use can harm SEO. Blocking important pages prevents them from being indexed. Blocking duplicate or low-quality content can help search engines focus on your best pages. Always test changes carefully.

Privacy & Limitations

  • All processing runs entirely in your browser; nothing is sent to any server.
  • Validation follows standard robots.txt parsing rules; individual crawlers may interpret directives differently.


Robots.txt Tester FAQ

What is Robots.txt Tester?

Robots.txt Tester is a free web and SEO tool that tests and validates robots.txt files, including URL path checking.

How do I use Robots.txt Tester?

Paste your robots.txt content, optionally enter a URL path to test, and review the validation results. The results update instantly in your browser.

Is Robots.txt Tester private?

Yes. Calculations run locally in your browser. Inputs are not uploaded to a server by default, and refreshing the page clears session data.

Does Robots.txt Tester require an account or installation?

No. You can use this tool directly in your browser without sign-up or software installation.

How accurate are results from Robots.txt Tester?

This tool applies deterministic parsing and matching logic based on the robots.txt standard. Individual crawlers may interpret directives slightly differently, so verify critical rules against each search engine's own documentation.

Can I save or share outputs from Robots.txt Tester?

You can bookmark this page and copy outputs manually. Results are not persisted in your account and are typically not embedded in the URL.
