robots.txt Allow and Disallow Examples for Common SEO Tasks

robots.txt is a crawl control file. It tells compliant crawlers which paths they may fetch, but it is not a reliable way to remove a URL from search results.

That distinction matters. A disallowed URL can still be discovered through links and appear in search with limited information.

Basic Allow and Disallow

Block an admin area:

User-agent: *
Disallow: /admin/

Allow everything:

User-agent: *
Disallow:

Block everything:

User-agent: *
Disallow: /

Use the block-everything rule carefully. It removes crawl access to the entire site for every compliant crawler.
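One way to sanity-check these three files before deployment is Python's standard-library urllib.robotparser. This is a minimal sketch, not a definitive test: the example.com URLs and the can_crawl helper are placeholders for illustration, and this parser is not guaranteed to evaluate rules the same way a given search engine does.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="*"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


BLOCK_ADMIN = "User-agent: *\nDisallow: /admin/"
ALLOW_ALL = "User-agent: *\nDisallow:"
BLOCK_ALL = "User-agent: *\nDisallow: /"

# Placeholder URLs; swap in paths from your own site.
URLS = ("https://example.com/", "https://example.com/admin/login")

for name, rules in (
    ("block admin", BLOCK_ADMIN),
    ("allow all", ALLOW_ALL),
    ("block all", BLOCK_ALL),
):
    for url in URLS:
        print(f"{name:12} {url:35} crawlable={can_crawl(rules, url)}")
```

Running the loop against a list of URLs you care about makes an accidental "Disallow: /" easy to spot before it ships.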

Allow a Subfolder Inside a Blocked Folder

For Google and other crawlers that support Allow, you can open a specific path inside a blocked area:

User-agent: *
Disallow: /private/
Allow: /private/public-guide/

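Google and other crawlers that follow RFC 9309 apply the most specific matching rule, meaning the one with the longest matching path, and prefer Allow when an Allow and a Disallow rule are equally specific. Other crawlers can behave differently, so always test important rules.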
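To make the precedence idea concrete, here is a simplified sketch of longest-match evaluation for the rules above. It assumes plain prefix matching only (no * or $ wildcards, no percent-encoding handling), and the longest_match_allows helper and sample paths are hypothetical. The standard-library parser is run alongside for comparison, because parsers that apply rules in file order can return a different answer for this rule order.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, as (kind, path prefix) pairs.
RULES = [
    ("disallow", "/private/"),
    ("allow", "/private/public-guide/"),
]


def longest_match_allows(path, rules):
    """Return True if the longest matching prefix rule allows `path`."""
    best_len = -1
    allowed = True  # no matching rule means the path is crawlable
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = kind == "allow"
        # A longer prefix wins; on a tie, prefer Allow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


for path in ("/private/notes", "/private/public-guide/intro", "/blog/"):
    print(path, "allowed:", longest_match_allows(path, RULES))

# Comparison only: this parser may not use longest-match precedence.
parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /private/", "Allow: /private/public-guide/"])
print(
    "urllib.robotparser says:",
    parser.can_fetch("*", "https://example.com/private/public-guide/intro"),
)
```

If the two answers differ, that is a sign the rule depends on precedence behavior and deserves a test in each search engine's own tooling.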

Do Not Block CSS and JavaScript by Accident

Search engines may need page assets to render and understand content. Avoid broad rules that block important static files:

User-agent: *
Disallow: /api/
Disallow: /internal-search/
Allow: /_next/static/
Allow: /assets/
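A quick check for accidentally blocked assets might look like the sketch below. It assumes Python's standard-library urllib.robotparser is a close enough stand-in for your target crawler; the asset URLs and the blocked_assets helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Disallow: /internal-search/
Allow: /_next/static/
Allow: /assets/
"""

# Hypothetical asset URLs a page might need in order to render.
RENDER_ASSETS = [
    "https://example.com/_next/static/chunks/main.js",
    "https://example.com/assets/site.css",
]


def blocked_assets(robots_txt, asset_urls, agent="*"):
    """Return the asset URLs that `agent` is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in asset_urls if not parser.can_fetch(agent, url)]


print(blocked_assets(ROBOTS_TXT, RENDER_ASSETS))
```

An empty list means none of the listed assets are blocked. Swapping in a broad rule such as "Disallow: /_next/" would make the function report the JavaScript URL.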

robots.txt Is Not noindex

Use robots.txt for crawl management. Use noindex or removal tools when you need indexing control. If Google cannot crawl a page because of robots.txt, it cannot see a noindex tag on that page, so the URL can remain in the index.
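The conflict described above can be detected mechanically. The following sketch, using only the Python standard library, flags a page whose HTML carries a robots meta tag with noindex while robots.txt blocks crawling of its URL. The helper names, sample rules, and sample HTML are all hypothetical, and the check ignores noindex sent through HTTP headers.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content of <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.directives.append((attr.get("content") or "").lower())


def noindex_is_hidden(robots_lines, url, html):
    """True when the page carries noindex but robots.txt blocks crawling it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    has_noindex = any("noindex" in d for d in finder.directives)

    parser = RobotFileParser()
    parser.parse(robots_lines)
    blocked = not parser.can_fetch("*", url)
    return has_noindex and blocked


robots = ["User-agent: *", "Disallow: /old/"]
page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(noindex_is_hidden(robots, "https://example.com/old/page", page))
```

When the function returns True, the usual fix is to lift the Disallow for that URL until crawlers have seen the noindex.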

Quick Answer

Use Disallow to reduce crawling of low-value or private paths, use Allow to reopen important subpaths or assets, and test rules before deployment. Do not use robots.txt as your only method for removing pages from search.

What to Double-Check

| Check               | Why it matters                                                                |
| ------------------- | ----------------------------------------------------------------------------- |
| Final URL path      | A missing slash can change which URLs are blocked.                            |
| Important assets    | CSS, JavaScript, images, and sitemap files should stay crawlable when needed. |
| noindex vs Disallow | Blocking crawl can prevent crawlers from seeing a noindex tag.                |
| Staging protection  | Private environments need authentication or noindex, not only robots.txt.     |
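The trailing-slash check is easy to demonstrate, because robots.txt paths match as prefixes. This sketch uses Python's standard-library urllib.robotparser; the is_blocked helper, the example.com host, and the /administrator path are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(rule_path, url_path):
    """True if `Disallow: rule_path` blocks `url_path` for all user agents."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    return not parser.can_fetch("*", f"https://example.com{url_path}")


# Compare the rule with and without its trailing slash.
for rule in ("/admin", "/admin/"):
    for path in ("/admin", "/admin/users", "/administrator"):
        print(f"Disallow: {rule:8} {path:15} blocked={is_blocked(rule, path)}")
```

Without the slash, the rule also catches unrelated paths that merely start with the same characters, such as /administrator. With the slash, it covers only URLs inside the folder and leaves the bare /admin URL crawlable.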

FAQ

Can robots.txt remove a page from Google?

No. robots.txt controls crawling, not indexing. If a URL is already known, it can still appear in results with limited information. Use noindex, removal tools, or access control when the goal is to keep a page out of search results.

robots.txt is a crawl hint, not access control

Use robots.txt to guide search crawlers, not to hide private content. A disallowed URL can still be discovered through links, logs, or shared URLs, and different crawlers may interpret rules differently. For private pages, use authentication, noindex where appropriate, or remove the page from public access entirely.

Publishing checks that matter

The examples above cover three jobs: blocking crawl traps, keeping important assets crawlable, and keeping crawl control separate from noindex. Test SEO configuration against the final, deployed URL, not only the generated snippet. After editing, check how crawlers and social apps see the page: whether assets are crawlable, whether previews use the intended image, and whether structured data matches the visible content.

Use Generate robots.txt to produce the first draft, then keep a note explaining why the rule exists. Robots, Open Graph, and schema changes often live in templates, so future edits are safer when the original intent is written down.
