Last updated: April 2026
If your site is on Cloudflare, there is a single setting that may be silently preventing every AI system from ever seeing your content. Since July 2025, Cloudflare blocks AI bots by default. If you have not checked this setting, your GEO work — your structured content, your FAQ schema, your answer-first formatting — is invisible to ChatGPT, Perplexity, and Claude at the infrastructure layer, before any of those systems ever try to crawl you.
This takes five minutes to check. Do it now.
What Changed in Cloudflare in July 2025?
In July 2025, Cloudflare added a dedicated control for AI scraper and crawler traffic to its Security > Bots dashboard. For newly onboarded domains, the default state of this control is Block, and existing zones may have enabled it as well, which is why checking takes priority over assuming.
The reason for the default-block behavior is straightforward from Cloudflare's commercial perspective: AI crawlers can generate substantial server load, and Cloudflare's core product value is protecting servers from unwanted traffic. Blocking AI scrapers by default is a reasonable infrastructure default for most of their customer base — particularly for sites that are not trying to get cited by AI systems.
The problem is that most site owners have no idea this setting exists. Cloudflare did not send email notifications when the setting was added. There is no visible alert in the dashboard if you are not looking for it. The traffic is dropped silently. Your server logs show nothing, your analytics show nothing, and your GA4 referral data from AI platforms stays at zero — which most site owners interpret as "AI search isn't sending us traffic yet" rather than "our infrastructure is blocking AI crawlers entirely."
If you set up GEO optimizations in 2025 and saw no results, this may be the entire explanation.
How Do You Check and Fix This Right Now?
Step 1: Log into your Cloudflare dashboard
Go to dash.cloudflare.com and select the domain you want to check.
Step 2: Navigate to Security > Bots
In the left sidebar, click Security, then click Bots. This opens the bot management settings panel.
Step 3: Find the "AI Scrapers and Crawlers" setting
Look for a section or toggle labeled "AI Scrapers and Crawlers." If this setting exists on your plan and is set to Block, that is your problem. Toggle it to Allow if you want to permit all AI crawlers, or click through to the custom rule options to allow specific bots while blocking others.
Step 4: Configure specific bots using the table below
Not all AI bots have the same purpose. Use this table to decide which bots to allow based on your goals.
| Bot Name | Who Runs It | Allow for Citations? | Block for Training? |
|---|---|---|---|
| OAI-SearchBot | OpenAI (ChatGPT Search) | Yes | No — this is a citation bot |
| ChatGPT-User | OpenAI (browsing plugin) | Yes | No — this is a citation bot |
| GPTBot | OpenAI (training data) | Optional | Yes, if training opt-out matters |
| ClaudeBot | Anthropic (training data) | Optional | Yes, if training opt-out matters |
| Claude-SearchBot | Anthropic (Claude search) | Yes | No — this is a citation bot |
| Claude-User | Anthropic (user sessions) | Yes | No — this is a citation bot |
| PerplexityBot | Perplexity AI | Yes | No — this is a citation bot |
| Google-Extended | Google (AI training) | Optional | Yes, if training opt-out matters |
| Bingbot | Microsoft (Bing + Copilot) | Yes | No — this also serves Bing SEO |
| CCBot | Common Crawl (training) | No strong reason | Yes, if training opt-out matters |
If GEO is your goal: Allow all citation bots (OAI-SearchBot, ChatGPT-User, Claude-SearchBot, Claude-User, PerplexityBot, Bingbot) unconditionally. These bots are how AI systems discover and cite your content at query time. Blocking them is equivalent to blocking Googlebot for traditional SEO.
If you want citations but not training data use: Allow the citation bots above and add custom WAF rules to block GPTBot, ClaudeBot, and CCBot separately. (Google-Extended is a robots.txt product token rather than a distinct crawler user agent, so opt out of it in robots.txt instead of a user-agent rule.) These are the training-data controls that feed into future model versions rather than real-time citation retrieval.
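In Cloudflare's custom rules, training crawlers can be matched by user agent. A sketch of the rule expression, to be applied with the Block action (verify the exact crawler names against each vendor's current documentation before deploying; Google-Extended, being a robots.txt token rather than a user agent, is omitted here):

```
(http.user_agent contains "GPTBot")
or (http.user_agent contains "ClaudeBot")
or (http.user_agent contains "CCBot")
```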
If you block all AI bots: Your content will not appear in AI search citations at all. This is a valid choice if your business model is not dependent on AI search visibility, but it is incompatible with running a GEO strategy.
What Is the Difference Between Allowing Citation Bots and Allowing Training Bots?
This distinction matters and is frequently confused.
Citation bots crawl your content to power real-time retrieval. When a user asks ChatGPT a question today, OAI-SearchBot may fetch relevant pages at query time to provide a current, cited answer. Allowing citation bots directly enables AI systems to cite your content in their responses; this is the mechanism that drives AI-referred traffic. Blocking citation bots means AI systems cannot retrieve your content at query time and therefore cannot cite you.
Training bots crawl your content to incorporate it into future versions of AI models during the next training run. Your content is used as training data, which means it influences how the model understands your topic area but does not necessarily result in your site being cited. Allowing training bots contributes your content to the model's base knowledge without guaranteeing citation. Blocking training bots opts your content out of future training datasets.
The decision framework:
If your primary goal is AI citations for traffic and authority, allow citation bots unconditionally. The question of training bot access is a separate policy decision about how you feel about your content being used to train future AI models. It is a legitimate concern, but it is orthogonal to your GEO strategy. You can allow citation bots and block training bots simultaneously using Cloudflare's custom WAF rules — they are different user agent strings.
For most content publishers running GEO: allow all citation bots, and make a deliberate decision about training bots based on your own policy preferences rather than leaving Cloudflare's default block in place for all AI traffic.
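The per-bot policy above can be captured in a small lookup table, which is handy if you generate robots.txt files or WAF rules programmatically. This is an illustrative sketch, not a Cloudflare API; the bot names come from the table earlier in this section:

```python
# Illustrative sketch: classify each crawler by role, then derive a
# per-bot decision from a site-wide policy flag.
CITATION_BOTS = {"OAI-SearchBot", "ChatGPT-User", "Claude-SearchBot",
                 "Claude-User", "PerplexityBot", "Bingbot"}
TRAINING_BOTS = {"GPTBot", "ClaudeBot", "Google-Extended", "CCBot"}

def decide(bot: str, allow_training: bool) -> str:
    """Return 'allow' or 'block' for a GEO-focused policy."""
    if bot in CITATION_BOTS:
        return "allow"          # citation bots power query-time retrieval
    if bot in TRAINING_BOTS:
        return "allow" if allow_training else "block"
    return "block"              # unknown AI crawlers: default deny

if __name__ == "__main__":
    for bot in sorted(CITATION_BOTS | TRAINING_BOTS):
        print(f"{bot}: {decide(bot, allow_training=False)}")
```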
Does This Also Affect robots.txt?
Yes, and this is a critical architectural point that most practitioners miss.
Robots.txt is a file on your server that tells crawlers which paths they are and are not permitted to access. It operates at the application layer. A bot must first establish a network connection to your server, and then must request and read your robots.txt file, before it can know your crawl permissions.
Cloudflare operates at the network edge — it sits between the internet and your server. A Cloudflare block happens before the bot establishes a connection to your server. The bot cannot read your robots.txt because it never reaches your server. Your robots.txt Allow directives are completely irrelevant if Cloudflare is dropping the connection at the edge.
This means two independent configuration layers both need to be correct:
Cloudflare bot settings: Must allow the AI crawler user agents before they can connect to your server.
robots.txt: Must allow the AI crawler user agents after they connect, so they know which paths to crawl.
If Cloudflare allows the bot but your robots.txt blocks it, the bot connects but does not crawl. If your robots.txt allows the bot but Cloudflare blocks it, the bot never connects. Both layers must be permissive for AI crawlers to successfully index your content.
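The two-layer interaction reduces to a single conjunction. A minimal sketch (the function name is mine, for illustration only):

```python
def bot_can_crawl(cloudflare_allows: bool, robots_allows: bool) -> bool:
    """A crawl succeeds only if BOTH layers permit it."""
    if not cloudflare_allows:
        # Edge layer: the connection is dropped before the bot can even
        # request robots.txt, so robots_allows is never consulted.
        return False
    # Application layer: the bot has connected and now obeys robots.txt.
    return robots_allows

# All four combinations; only allow+allow results in a crawl.
for cf in (True, False):
    for rb in (True, False):
        print(f"Cloudflare={cf} robots.txt={rb} -> crawl={bot_can_crawl(cf, rb)}")
```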
robots.txt configuration for AI crawlers:

```
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /
```
If you want to block training bots at the robots.txt layer (as a secondary control), replace Allow: / with Disallow: / for GPTBot and ClaudeBot. Note that this only works if Cloudflare has already allowed those bots to connect — the Cloudflare setting takes precedence at the network layer.
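You can sanity-check a robots.txt policy offline with Python's standard-library `urllib.robotparser` before deploying it. The sketch below uses a shortened version of the file above, with the training bots flipped to Disallow to illustrate the "citations yes, training no" variant:

```python
from urllib.robotparser import RobotFileParser

# Abbreviated robots.txt: citation bots allowed, training bots opted out.
ROBOTS = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

print(parser.can_fetch("OAI-SearchBot", "/any/page"))  # True
print(parser.can_fetch("GPTBot", "/any/page"))         # False
```

Remember that this only verifies the application layer; it says nothing about whether Cloudflare will let the bot connect in the first place.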
Frequently Asked Questions
Does Cloudflare block AI crawlers?
Yes. Since July 2025, Cloudflare's default configuration blocks AI bots. If you are on Cloudflare and did not actively change this setting, your content may be invisible to ChatGPT, Perplexity, Claude, and other AI systems that crawl for citations. The block operates at the infrastructure layer, meaning your robots.txt permissions are irrelevant: Cloudflare drops the traffic before it reaches your server.
How do I allow AI bots on Cloudflare?
Log into your Cloudflare dashboard, navigate to Security > Bots, and look for the "AI Scrapers and Crawlers" setting. Toggle it from Block to Allow or create a custom rule that permits specific AI user agents. The key bots to allow for citations: OAI-SearchBot (ChatGPT Search), ChatGPT-User, Claude-SearchBot, Claude-User, and PerplexityBot. If you also want Bing Copilot citations, allow Bingbot. If you want AI citations but not AI training data, allow only the search bots and block the training bots (GPTBot, ClaudeBot, Google-Extended) separately.
Should I block AI bots or allow them?
For a personal brand or educational content site where AI citations are the goal, allow all citation bots and selectively block only training bots if you are concerned about your content being used for LLM training. The bots to allow for citation: OAI-SearchBot, ChatGPT-User, Claude-SearchBot, Claude-User, PerplexityBot. The bots to optionally block for training: GPTBot, ClaudeBot, Google-Extended, CCBot. If you are running GEO as a strategy, blocking all AI bots entirely defeats the purpose.
Does robots.txt work for AI bots if I use Cloudflare?
Not on its own. Cloudflare operates at the network edge, in front of your server: if it blocks a bot, that bot can never even read your robots.txt, so your Allow directives are irrelevant. You must configure bot permissions at the Cloudflare dashboard level AND maintain a correct robots.txt for the bots that are allowed through (or for traffic that bypasses Cloudflare). They are independent controls and both need to be configured correctly.
Share this with anyone on Cloudflare who is doing GEO and has not checked this setting.