ArticleroBot

We fetch web pages to set up and run our users' projects, typically on behalf of the owners of those websites. This page explains how to recognize our requests, what we fetch and when, how we behave, and how to block us.

How to recognize us

In your server's access logs we identify ourselves with this User-Agent string:

ArticleroBot/1.0 (+https://www.articlero.com/crawler)
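If you want to find these requests in your own logs, the following is a minimal sketch, assuming your server writes the common "combined" access log format, where the User-Agent is the last quoted field on each line. The function names and the sample log lines are hypothetical and only serve as illustration. Keep in mind that a User-Agent header can be set by any client, so a match shows what the request claims to be, not proof of its origin.

```python
import re

# Matches the product token "ArticleroBot/<version>" anywhere in a User-Agent value.
ARTICLERO_UA = re.compile(r"\bArticleroBot/\d+(?:\.\d+)*")


def is_articlerobot(user_agent: str) -> bool:
    """Return True if the User-Agent value contains the ArticleroBot token."""
    return bool(ARTICLERO_UA.search(user_agent))


def count_articlerobot_hits(log_lines) -> int:
    """Count log lines whose last quoted field (assumed to be the User-Agent) matches."""
    hits = 0
    for line in log_lines:
        quoted_fields = re.findall(r'"([^"]*)"', line)
        if quoted_fields and is_articlerobot(quoted_fields[-1]):
            hits += 1
    return hits


# Hypothetical sample lines in the combined log format.
sample_log = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "ArticleroBot/1.0 (+https://www.articlero.com/crawler)"',
    '198.51.100.4 - - [10/Oct/2024:13:56:02 +0000] "GET /about HTTP/1.1" 200 2048 '
    '"-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = count_articlerobot_hits(sample_log)
```

To scan a real log, you could pass an open file object instead of the sample list, since the function only iterates over lines.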

What we fetch and when

ArticleroBot is not a classic crawler. It does not roam the web on its own, follow links from page to page, or index content. It fetches a specific page only as part of setting up or running an Articlero user's project.

Website homepage: During the project eligibility check and project setup, to understand what the website is about.
robots.txt and sitemap.xml: When discovering the site map, to learn which articles already exist on the website.
llms.txt: When checking project settings, to verify the file is reachable.
Pages we may reference: While preparing a user's content, to confirm that a page is reachable before we point to it.

How we behave

Targeted GET requests for specific pages, no broad crawling or harvesting of the internet.
Request timeout under 15 seconds and response size caps of a few MB.
No cookies, no JavaScript execution, no login attempts.
Fetched content is used only to run the relevant project. We do not archive it, publish it or use it for model training.
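
For readers who want a concrete picture of these limits, here is a rough sketch of a fetch with the same constraints, written with Python's standard library. It is not Articlero's implementation. The constant values (10 seconds, 2 MiB) and the function names are assumptions chosen to fit "under 15 seconds" and "a few MB".

```python
import urllib.request

USER_AGENT = "ArticleroBot/1.0 (+https://www.articlero.com/crawler)"
TIMEOUT_SECONDS = 10         # assumed value; the page only says "under 15 seconds"
MAX_BYTES = 2 * 1024 * 1024  # assumed value; the page only says "a few MB"


def read_capped(stream, max_bytes=MAX_BYTES):
    """Read at most max_bytes from a file-like object.

    Returns (data, truncated), where truncated is True if more data was available.
    """
    chunks = []
    remaining = max_bytes + 1  # one extra byte tells us whether the cap was exceeded
    while remaining > 0:
        chunk = stream.read(min(65536, remaining))
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b"".join(chunks)
    return data[:max_bytes], len(data) > max_bytes


def fetch_page(url):
    """Issue a single GET request for one URL, with a timeout and a size cap.

    The default urllib opener stores no cookies and executes no JavaScript.
    Note that urlopen's timeout applies to each blocking socket operation,
    not to the request as a whole.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": USER_AGENT}, method="GET"
    )
    with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as response:
        body, truncated = read_capped(response)
        return response.status, body, truncated
```

Reading one byte past the cap is a simple way to tell a response that fits exactly from one that was cut off.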

How to block us

If you do not want us to fetch your website, add this to your robots.txt file:

User-agent: ArticleroBot
Disallow: /
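
To check that the rule reads the way you intend before deploying it, you can parse it locally. The sketch below uses Python's standard urllib.robotparser. The example.com URL and the second bot name are placeholders, and the result reflects how Python's parser interprets the rules, which may differ in edge cases from other implementations.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from above, as it would appear in robots.txt.
rules = """\
User-agent: ArticleroBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Placeholder URL on a hypothetical site.
test_url = "https://example.com/any/page"

# True if the rule blocks ArticleroBot for this URL.
articlerobot_blocked = not parser.can_fetch("ArticleroBot", test_url)

# Other user agents are unaffected because the rule names only ArticleroBot.
other_bot_allowed = parser.can_fetch("SomeOtherBot", test_url)
```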

Some of our requests are one-off fetches tied to a specific user action, similar to link previews in chat apps, and may happen despite robots.txt rules. Recurring automated access respects the block. If you want your website excluded entirely, write to us and we will take care of it.

Contact

Send questions about fetching, problem reports or website exclusion requests via our contact page.