Meta
Meta AI Crawlers
Agents used to build Meta's Llama models and power AI features across Facebook, Instagram, and WhatsApp.
Purpose: Llama model training and AI features
📊 Popularity & Traffic
#2 ranking among AI crawlers
One of the largest consumers of web data for LLM training globally.
🤖 User Agent Strings
Use these patterns to identify Meta AI Crawlers in your server logs or configure your robots.txt file.
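For log analysis, a minimal Python sketch of this matching might look like the following. It assumes only the two tokens named in this section (Meta-ExternalAgent and facebookexternalhit). The case-insensitive match and the function name `classify_user_agent` are illustrative choices, not anything Meta specifies, and the sample strings are made up for the demo.

```python
import re
from typing import Optional

# Tokens taken from this page. Case-insensitive matching is an assumption,
# made so that differently cased variants in logs are still caught.
META_UA_PATTERN = re.compile(r"meta-externalagent|facebookexternalhit", re.IGNORECASE)


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the matched Meta crawler token in lowercase, or None if absent."""
    match = META_UA_PATTERN.search(user_agent)
    return match.group(0).lower() if match else None


if __name__ == "__main__":
    # Hypothetical user-agent values, for illustration only.
    samples = [
        "Meta-ExternalAgent/1.1",
        "facebookexternalhit/1.1",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for ua in samples:
        print(classify_user_agent(ua), "<-", ua)
```

A user-agent header is trivially spoofed, so a match here is best treated as a hint to be combined with an IP check rather than as proof of origin.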
Meta-ExternalAgent
Respects robots.txt. Primary Meta AI training crawler.
User-agent string: Meta-ExternalAgent
facebookexternalhit
May ignore robots.txt. Legacy social metadata fetcher (link previews).
User-agent string: facebookexternalhit
🌐 IP Ranges
Source: Meta ASN ranges (AS32934, AS54115, AS63293)
Identified IP Ranges: 13 ranges (six listed here)
31.13.24.0/21: subnet with 2,048 addresses
31.13.64.0/18: subnet with 16,384 addresses
45.64.40.0/22: subnet with 1,024 addresses
57.144.0.0/14: subnet with 262,144 addresses
66.220.144.0/20: subnet with 4,096 addresses
69.63.176.0/20: subnet with 4,096 addresses
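To check whether an address from your logs falls inside one of the ranges listed above, a short sketch using Python's standard `ipaddress` module could look like this. The range list is copied from this page and covers only the six ranges shown here. The helper name `in_listed_meta_range` is hypothetical.

```python
import ipaddress

# The six CIDR blocks listed on this page (not the complete set of Meta ranges).
META_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "31.13.24.0/21",
        "31.13.64.0/18",
        "45.64.40.0/22",
        "57.144.0.0/14",
        "66.220.144.0/20",
        "69.63.176.0/20",
    )
]


def in_listed_meta_range(ip: str) -> bool:
    """Return True if ip falls inside any of the CIDR blocks listed above."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in META_RANGES)


if __name__ == "__main__":
    # Print each block with its size, then test two sample addresses.
    for net in META_RANGES:
        print(net, net.num_addresses)
    print(in_listed_meta_range("31.13.64.1"))
    print(in_listed_meta_range("192.0.2.10"))
```

Because the list is partial and address allocations change over time, a miss here does not prove a request came from outside Meta's networks.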
How to read CIDR notation:
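The /28 suffix indicates a block of 16 IP addresses. For example, .112/28 covers all addresses from .112 up to .127. More generally, a /n suffix fixes the first n bits of the address and leaves 2^(32-n) addresses in the block, which is why the /21 listed above contains 2,048. Adding these ranges to your firewall will block the entire range used by Meta AI Crawlers.
📝 Robots.txt Configuration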
Add the following to your robots.txt file to block Meta AI Crawlers:
User-agent: Meta-ExternalAgent
Disallow: /
💡 Important Notes
- Legacy 'facebookexternalhit' may ignore robots.txt for user-shared link previews
- Meta introduced 'noai' and 'noimageai' robots meta tags for granular control
- Verify Meta crawlers by checking that the requesting IP address belongs to AS32934, AS54115, or AS63293
- Meta has increasingly shifted towards its own crawlers instead of relying solely on Common Crawl
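The robots.txt rule shown earlier can be sanity-checked before deployment. The sketch below feeds that rule to Python's standard `urllib.robotparser` and asks whether a given user agent may fetch a URL. The `example.com` URL, the `SomeOtherBot` agent, and the helper name `is_allowed` are placeholders. The standard-library parser's matching is also only an approximation of how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rule from the Robots.txt Configuration section of this page.
ROBOTS_TXT = """\
User-agent: Meta-ExternalAgent
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Parse robots_txt and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and SomeOtherBot are placeholders for illustration.
    print(is_allowed("Meta-ExternalAgent", "https://example.com/page"))
    print(is_allowed("SomeOtherBot", "https://example.com/page"))
```

With this rule in place, only Meta-ExternalAgent is disallowed. As the first note says, facebookexternalhit may not honor robots.txt for link previews, so this check says nothing about that fetcher.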