robots.txt Allow and Disallow Examples for Common SEO Tasks

Published on June 25, 2026

robots.txt is a crawl control file. It tells compliant crawlers which paths they may fetch, but it is not a reliable way to remove a URL from search results.

That distinction matters. A disallowed URL can still be discovered through links and appear in search with limited information.

Basic Allow and Disallow

Block an admin area:

User-agent: *
Disallow: /admin/

Allow everything:

User-agent: *
Disallow:

Block everything:

User-agent: *
Disallow: /

Use the last rule set carefully. It blocks compliant crawlers from every URL on the site.
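The three rule sets above can be checked before deployment. The following is a minimal sketch using Python's standard-library urllib.robotparser; the helper name can_crawl, the user agent ExampleBot, and the example.com URLs are assumptions made for illustration, and this parser is not guaranteed to match every search engine's behavior.

```python
from urllib.robotparser import RobotFileParser

# The three rule sets from the section above, as they would appear in robots.txt.
BLOCK_ADMIN = "User-agent: *\nDisallow: /admin/\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if this parser considers `url` fetchable under `robots_txt`.

    `ExampleBot` is a hypothetical crawler name; it falls under `User-agent: *`.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Print the verdict for a few placeholder URLs under each rule set.
for label, rules in [("block admin", BLOCK_ADMIN),
                     ("allow all", ALLOW_ALL),
                     ("block all", BLOCK_ALL)]:
    for url in ("https://example.com/admin/login", "https://example.com/blog/"):
        print(label, url, can_crawl(rules, url))
```

Running a check like this against the real paths you care about is a cheap guard against deploying the "block everything" rule set by mistake.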

Allow a Subfolder Inside a Blocked Folder

For Google and other crawlers that support Allow, you can open a specific path inside a blocked area:

User-agent: *
Disallow: /private/
Allow: /private/public-guide/

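Google, and crawlers that follow RFC 9309, apply the most specific matching rule, meaning the one with the longest path, so /private/public-guide/ stays crawlable here. Other crawlers may behave differently, so always test important rules.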
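To make the precedence concrete, here is a minimal sketch of longest-match evaluation, the approach described in RFC 9309 and used by Google. It is a simplification: it handles plain path prefixes only and ignores the * and $ wildcards. The function name longest_match_allows and the rule tuples are assumptions made for illustration.

```python
def longest_match_allows(path: str, rules: list[tuple[str, str]]) -> bool:
    """Decide whether `path` may be crawled under longest-match precedence.

    `rules` is a list of ("allow" | "disallow", path_prefix) pairs.
    The longest matching prefix wins; on a tie, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        # An empty prefix matches nothing; skip rules that do not apply.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = kind == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The rule set from the example above.
RULES = [
    ("disallow", "/private/"),
    ("allow", "/private/public-guide/"),
]

for path in ("/private/notes", "/private/public-guide/intro", "/blog/"):
    print(path, longest_match_allows(path, RULES))
```

A parser that evaluates rules in file order instead of by path length can give a different answer for the same file, which is one reason to test important rules against the crawlers you care about.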

Do Not Block CSS and JavaScript by Accident

Search engines may need page assets to render and understand content. Avoid broad rules that block important static files:

User-agent: *
Disallow: /api/
Disallow: /internal-search/
Allow: /_next/static/
Allow: /assets/

robots.txt Is Not noindex

Use robots.txt for crawl management. Use noindex or removal tools when you need indexing control. If Google cannot crawl a page because of robots.txt, it may not see a noindex tag on that page.
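The conflict described above can be caught with a small audit. This is a rough sketch, assuming you already have each page's HTML on hand; the names RobotsMetaFinder and hidden_noindex, the user agent ExampleBot, and the example.com URLs are hypothetical. It only looks at the robots meta tag, not at X-Robots-Tag response headers.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        content = attr.get("content", "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def hidden_noindex(robots_txt: str, pages: dict[str, str],
                   user_agent: str = "ExampleBot") -> list[str]:
    """Return URLs whose noindex tag a crawler cannot see because
    robots.txt blocks the crawl of that URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hidden = []
    for url, html in pages.items():
        finder = RobotsMetaFinder()
        finder.feed(html)
        if finder.noindex and not parser.can_fetch(user_agent, url):
            hidden.append(url)
    return hidden


ROBOTS = "User-agent: *\nDisallow: /old/\n"
NOINDEX_PAGE = '<html><head><meta name="robots" content="noindex"></head></html>'
PLAIN_PAGE = "<html><head><title>Page</title></head></html>"
PAGES = {
    "https://example.com/old/page": NOINDEX_PAGE,   # blocked and noindex: conflict
    "https://example.com/new/page": NOINDEX_PAGE,   # crawlable, so noindex is visible
    "https://example.com/old/other": PLAIN_PAGE,    # blocked, but no noindex to hide
}

print(hidden_noindex(ROBOTS, PAGES))
```

Any URL this reports needs one of two changes: lift the Disallow so the noindex tag can be read, or rely on a different removal method.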

Quick Answer

Use Disallow to reduce crawling of low-value or private paths, use Allow to reopen important subpaths or assets, and test rules before deployment. Do not use robots.txt as your only method for removing pages from search.

What to Double-Check

| Check               | Why it matters                                                                |
| ------------------- | ----------------------------------------------------------------------------- |
| Final URL path      | A missing slash can change which URLs are blocked.                            |
| Important assets    | CSS, JavaScript, images, and sitemap files should stay crawlable when needed. |
| noindex vs Disallow | Blocking crawl can prevent crawlers from seeing a noindex tag.                |
| Staging protection  | Private environments need authentication or noindex, not only robots.txt.     |
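The "missing slash" check is easy to see with prefix matching, which is how Disallow paths are compared against URLs. The sketch below is illustrative only: the helper blocked_paths and the sample paths are assumptions, and wildcards are ignored.

```python
def blocked_paths(disallow_prefix: str, paths: list[str]) -> list[str]:
    """Return the paths that a single Disallow prefix would block.

    Matching is a plain prefix comparison, so "/admin" and "/admin/"
    do not block the same set of URLs.
    """
    return [p for p in paths if p.startswith(disallow_prefix)]


# Hypothetical site paths used only for this comparison.
PATHS = [
    "/admin",
    "/admin/",
    "/admin/users",
    "/administrator/",
    "/admin-help.html",
]

print("Disallow: /admin  ->", blocked_paths("/admin", PATHS))
print("Disallow: /admin/ ->", blocked_paths("/admin/", PATHS))
```

Without the trailing slash, the rule also catches unrelated paths that merely start with the same characters. With the slash, only the folder and its contents match.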

FAQ

Can robots.txt remove a page from Google?

No. robots.txt controls crawling, not indexing. A URL that Google already knows about, or discovers through links, can still appear in results with limited information. Use noindex, removal tools, or access control when the goal is to keep a page out of search results.

robots.txt is a crawl hint, not access control

Use robots.txt to guide search crawlers, not to hide private content. A disallowed URL can still be discovered through links, logs, or shared URLs, and different crawlers may interpret rules differently. For private pages, use authentication, noindex where appropriate, or remove the page from public access entirely.

Publishing checks that matter

The examples above cover three jobs: blocking crawl traps, keeping important assets crawlable, and keeping crawl control separate from noindex. Test the configuration against the final URL, not only the generated snippet. After editing, check how crawlers or social apps see the page: whether assets are crawlable, whether previews use the intended image, and whether structured data matches the visible content.

Use Generate robots.txt to produce the first draft, then keep a note explaining why the rule exists. Robots, Open Graph, and schema changes often live in templates, so future edits are safer when the original intent is written down.
