What is robots.txt?

A simple text file that tells search engines where they can and cannot go on your site.

We strongly recommend keeping this check green in Articlero. Without the right setup, Articlero may not deliver the results you expect from your content.

In short

Think of robots.txt as a sign at the entrance to your site. Google arrives, reads it, and learns whether it may crawl the whole site or should skip some parts. It is not a security lock, and it does not guarantee that a page stays out of Google: a blocked URL can still show up in search results if other pages link to it. It is an instruction for well-behaved crawlers, nothing more.

Why it matters for Articlero

Articlero can generate good articles for you, but if robots.txt accidentally blocks the whole site or important article sections, Google may not reach them. The result can then be weaker than you expect, because the content exists but the search engine cannot crawl it normally.

What should be right

The file exists at /robots.txt.
It does not block the whole site with User-agent: * and Disallow: /.
It includes a Sitemap: https://your-site.com/sitemap.xml line.
After changes, it passes the audit in the project overview.

How to set it up

1. Open your site admin, hosting, or CMS.
2. Find the robots.txt settings. In some CMSs an SEO plugin generates it.
3. Allow crawling of important pages and articles.
4. Add a Sitemap line with the full URL of your sitemap.
5. In Articlero, open the Project overview and run the check again.
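
If your CMS lets you paste the file contents directly, steps 3 and 4 can be sketched as a small helper that produces the text. This is only an illustration: the function name build_robots_txt and its parameters are hypothetical, and the sitemap URL is the same placeholder used elsewhere on this page.

```python
def build_robots_txt(sitemap_url, disallow=()):
    """Return robots.txt text that allows crawling except for the given paths."""
    lines = ["User-agent: *"]
    if disallow:
        # Block only the listed paths; everything else stays crawlable.
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        # Nothing to block: allow the whole site explicitly.
        lines.append("Allow: /")
    # The Sitemap line takes the full URL, not a relative path.
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt("https://your-site.com/sitemap.xml"))
```

Called with no disallow paths, the helper yields the same four lines as the simple example on this page.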

Common mistakes

Disallow: / for all crawlers on a production site.
robots.txt keeps its staging rules (typically Disallow: /) after the site goes live.
The sitemap is on the site but not listed in robots.txt.
Blocking CSS or JavaScript files that Google needs to understand the page.
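
One rough way to catch these mistakes before they reach production is to test a few representative paths against the file. The sketch below uses Python's standard urllib.robotparser. The helper name blocked_paths and the sample paths are assumptions for illustration. Note that this parser does simple prefix matching and does not follow Google's wildcard or longest-match rules, so treat its answer as an approximation of what Googlebot would do.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_text, paths, user_agent="Googlebot"):
    """Return the paths that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


if __name__ == "__main__":
    # Hypothetical file left over from staging: everything is blocked.
    staging = "User-agent: *\nDisallow: /\n"
    # Hypothetical paths for an article, a stylesheet, and a script.
    sample = ["/blog/my-article", "/assets/site.css", "/assets/app.js"]
    print(blocked_paths(staging, sample))
```

An empty list means none of the sampled paths is blocked. Any article, CSS, or JavaScript path that comes back deserves a second look.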

What Articlero checks

The audit checks whether robots.txt exists, whether it includes a Sitemap line, and whether it appears to block the whole site.
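
To get a feel for what these three checks involve, here is a minimal sketch using Python's standard urllib.robotparser (Python 3.8 or later for site_maps()). It is not Articlero's actual implementation. The function name audit_robots_txt and the result keys are hypothetical, and the function expects the text of the file as already downloaded, or None if the file was not found.

```python
from urllib.robotparser import RobotFileParser


def audit_robots_txt(text):
    """Run three rough checks on the text of a robots.txt file (None if missing)."""
    text = text or ""
    parser = RobotFileParser()
    parser.parse(text.splitlines())

    # site_maps() returns None when the file has no Sitemap line.
    sitemaps = parser.site_maps() or []
    return {
        "exists": bool(text.strip()),
        "has_sitemap": len(sitemaps) > 0,
        # If a generic crawler may not fetch the home page, assume a site-wide block.
        "blocks_whole_site": not parser.can_fetch("*", "/"),
    }


if __name__ == "__main__":
    example = "User-agent: *\nAllow: /\n\nSitemap: https://your-site.com/sitemap.xml\n"
    print(audit_robots_txt(example))
```

Testing only the home page is a deliberate simplification, which matches the "appears to" wording above: a file can leave / open and still block every article section.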

A simple example

User-agent: *
Allow: /

Sitemap: https://your-site.com/sitemap.xml

Sources

Google Search Central: robots.txt
Google: create robots.txt