Cloudflare Updates AI Crawler Access Rules: Key Changes Explained


Cloudflare introduces new AI traffic classification mechanisms to balance content protection and compensation for original material.

New AI Traffic Classification Mechanisms

Cloudflare has introduced new mechanisms for managing AI traffic in three distinct categories: Search, Agent, and Training. The controls are available to all Cloudflare users, including those on the Free tier, and let website administrators regulate how each kind of AI crawler interacts with their content. The goal is to balance content protection with compensation for the original material that content providers create. The company acknowledges that rigid restrictions do not suit every site, and it emphasizes the need for more nuanced options than blanket blocks on automation.

The revised framework replaces the binary allow-or-block approach with function-based categorization. Bots are now classified by what they do rather than by whether they are AI or non-AI, so administrators can apply a separate policy to each category. The classification also takes into account how content is used after it is crawled, and it encourages operators to run specialized crawlers for specific tasks, which improves transparency and control.

Default Settings and Policy Changes

Starting September 15, 2026, new domains will default to blocking Training and Agent crawlers on ad-displaying pages, while Search crawlers remain permitted by default. A multi-functional crawler that performs both Search and Training is evaluated against both policies. As a result, if Training crawlers are restricted, bots such as Googlebot, Applebot, and BingBot will be blocked even when Search crawlers are allowed. Administrators can opt out of the new defaults in Security settings before the deadline, which preserves their existing configuration for Training crawlers that also perform Search. Cloudflare will send advance notifications so that users can review and adjust their settings.
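The dual evaluation described above can be illustrated with a minimal sketch. It assumes a crawler is allowed only when every category it operates in is allowed. The names `Category`, `DEFAULT_POLICY`, and `is_allowed` are hypothetical and not part of any Cloudflare API, and the sketch ignores the ad-displaying page scope of the defaults.

```python
from enum import Enum


class Category(Enum):
    SEARCH = "search"
    AGENT = "agent"
    TRAINING = "training"


# Hypothetical policy table mirroring the described defaults for new domains:
# Search allowed, Agent and Training blocked.
DEFAULT_POLICY = {
    Category.SEARCH: True,
    Category.AGENT: False,
    Category.TRAINING: False,
}


def is_allowed(crawler_categories, policy=DEFAULT_POLICY):
    """Allow a crawler only if every category it operates in is allowed."""
    if not crawler_categories:
        # Assumption made for this sketch: a crawler with no known
        # category is treated as blocked.
        return False
    return all(policy.get(category, False) for category in crawler_categories)


# A search-only crawler passes; a combined Search + Training crawler fails
# because the Training policy blocks it.
print(is_allowed({Category.SEARCH}))                     # True
print(is_allowed({Category.SEARCH, Category.TRAINING}))  # False
```

Under this reading, a single blocked category is enough to block a multi-functional crawler, which is why operators are encouraged to split tasks across separate crawlers.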

BotBase Expansion and Enterprise Tools

BotBase, a centralized repository of verified bots and AI agents, expands visibility for Enterprise Bot Management customers. This database enables administrators to access detailed classifications, behaviors, and traffic filters for known bots, with the ability to copy detection IDs for security rule integration. Future updates will introduce additional controls for managing automated traffic through the same interface.

Content Use Controls and Verification

Cloudflare is introducing content use controls for Enterprise Bot Management users, allowing them to specify how content may be used after crawling through three tiers: Immediate (no storage or reuse), Reference (indexing, excerpts, and backlinks), and Full (summaries or reproduction). The company is extending the robots.txt format with a use parameter to communicate these preferences. Because robots.txt does not enforce compliance, BotBase will report whether Verified Bots adhere to the policies that sites declare. Bots that do not comply, for example by reproducing content in full against a declared policy, may lose their Verified status. Previously, all Verified Bots were automatically permitted. Under the updated model, verification only confirms a bot's identity, and access is determined by the bot's classification and the website's policies. Cloudflare plans to make the verification process more transparent and to develop tools that let bot operators manage their classifications and Verified status.
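One way to reason about the three tiers is as an ordered scale, where a site sets the broadest use it permits and a bot's declared use is checked against it. The sketch below assumes the tiers nest in the order Immediate, Reference, Full, which the announcement does not state explicitly. The names `ContentUse`, `parse_use`, and `within_policy` are hypothetical.

```python
from enum import IntEnum


class ContentUse(IntEnum):
    # Assumed ordering, from most to least restrictive.
    IMMEDIATE = 1  # no storage or reuse
    REFERENCE = 2  # indexing, excerpts, and backlinks
    FULL = 3       # summaries or reproduction


def parse_use(value: str) -> ContentUse:
    """Map a declared value such as 'reference' to a tier.

    Raises KeyError for values outside the three tiers.
    """
    return ContentUse[value.strip().upper()]


def within_policy(declared: ContentUse, site_max: ContentUse) -> bool:
    """Return True when the declared use does not exceed what the site permits."""
    return declared <= site_max


site_max = ContentUse.REFERENCE
print(within_policy(parse_use("reference"), site_max))  # True
print(within_policy(parse_use("full"), site_max))       # False
```

The check only compares declarations. As the article notes, actual compliance cannot be enforced through robots.txt and would be observed separately.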

Transitive Trust Model and Privacy Considerations

The company is also proposing a transitive trust model for AI agents that operate through third-party platforms. The approach uses the HTTP Forwarded header to identify the original requester behind intermediaries, and the header can also carry declared content use parameters such as use=reference. Website owners can then apply trust and access policies based on the primary bot operator rather than on the intermediary. Cloudflare acknowledges that the model may have limitations in scenarios that require privacy or anonymous access.
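To make the idea concrete, here is a minimal sketch of reading a Forwarded header and picking out the original requester. It assumes the general `name=value;name=value, ...` layout of RFC 7239, in which the first element describes the hop closest to the original client. The exact syntax of Cloudflare's proposal is not given here, so the `use` parameter placement, the `_exampleAgent` identifier, and the function names are illustrative assumptions.

```python
def parse_forwarded(header: str):
    """Parse a Forwarded header into a list of dicts, one per hop.

    Simplified: splits on ',' and ';' directly, so it does not handle
    those characters appearing inside quoted values.
    """
    hops = []
    for element in header.split(","):
        params = {}
        for pair in element.split(";"):
            if "=" not in pair:
                continue
            name, _, value = pair.partition("=")
            # Parameter names are case-insensitive; strip optional quotes.
            params[name.strip().lower()] = value.strip().strip('"')
        if params:
            hops.append(params)
    return hops


def original_requester(header: str):
    """Return the first hop, which describes the original client, or None."""
    hops = parse_forwarded(header)
    return hops[0] if hops else None


# Hypothetical header: an agent identified by an obfuscated token, relayed
# through one intermediary.
header = "for=_exampleAgent;use=reference, for=198.51.100.17"
requester = original_requester(header)
print(requester["for"], requester.get("use"))  # _exampleAgent reference
```

A site could then apply its policy to the requester found in the first hop rather than to the intermediary that delivered the request, provided it trusts the intermediary to set the header honestly.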

Key Changes and Future Outlook

The updates reflect ongoing efforts to refine AI traffic management while addressing content protection and operational flexibility. Key changes include revised default settings, enhanced bot classification tools, and expanded controls for content usage and trust models.