AI Crawler Tracking

AI companies regularly crawl websites to build and update the knowledge that powers their models. Knowing which AI crawlers visit your site, and which pages they fetch, gives you insight into how your content is being indexed by AI platforms.

Cleotic lets you track AI crawler visits to your website using two complementary methods.

Why track AI crawlers?

Understanding AI crawler activity on your site helps you:

  • Confirm your content is being indexed. If GPTBot visits your blog regularly, ChatGPT likely has access to that content for its responses.
  • Identify gaps in indexing. If certain AI crawlers never visit your site, those models may have limited knowledge of your brand.
  • Prioritise content placement. Pages that AI crawlers visit frequently are more likely to influence AI responses.
  • Correlate crawling with visibility. If crawler visits increase and your visibility scores rise over the same period, that is a good sign your content strategy is working.

Two tracking methods

1. Tracking pixel (JavaScript)

A small JavaScript snippet you add to your website's HTML. When a page loads, it pings Cleotic to record the visit.

Best for: AI-powered browsers that execute JavaScript, such as Arc Search and Perplexity Comet.

Limitation: Most AI training crawlers (GPTBot, ClaudeBot, CCBot, etc.) only fetch HTML and don't run JavaScript. They won't trigger the pixel.

2. Vercel log drain

A server-side integration that sends your website's request logs from Vercel to Cleotic. Every page request is checked against a list of ~140 known AI crawlers.

Best for: Detecting training crawlers like GPTBot, ClaudeBot, CCBot, anthropic-ai, Google-Extended, and PerplexityBot that only fetch HTML.

Requirement: Your website must be deployed on Vercel.
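To illustrate the kind of check the log drain performs, here is a minimal sketch of matching a request's User-Agent header against a list of known crawler tokens. It is not Cleotic's implementation: the function name, the case-insensitive substring rule, and the five-entry list (taken from the crawler names mentioned in this article) are assumptions made for illustration.

```python
from typing import Optional

# Illustrative subset only. The real list is described as ~140 entries.
KNOWN_AI_CRAWLERS = ("GPTBot", "ClaudeBot", "CCBot", "anthropic-ai", "PerplexityBot")


def identify_ai_crawler(user_agent: str) -> Optional[str]:
    """Return the first known crawler token found in the User-Agent, or None.

    Assumption: a case-insensitive substring match is enough to identify
    a crawler. A production matcher may be stricter.
    """
    ua = user_agent.lower()
    for token in KNOWN_AI_CRAWLERS:
        if token.lower() in ua:
            return token
    return None
```

A request whose User-Agent contains "GPTBot/1.0" would be identified as GPTBot, while an ordinary desktop browser User-Agent would return None and be ignored.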

For the most comprehensive tracking, use both methods together. See the Vercel log drain setup guide for detailed instructions on the server-side integration.

Setting up a tracking site

  1. Go to your project and click the Crawlers tab
  2. Click Add Tracking Site
  3. Enter the domain you want to track (e.g., "example.com")
  4. Optionally give it a name (e.g., "Main Website")
  5. Click Save

Once created, click Snippet on the tracking site to get your installation code.

Installing the pixel

  1. Click Snippet on your tracking site
  2. Copy the HTML code from the Tracking pixel section
  3. Paste it into your website's HTML, just before the closing </body> tag
  4. Deploy your site

The pixel starts recording visits from JavaScript-capable AI browsers as soon as the updated site is live.
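After deploying, you may want to confirm that the snippet really ended up before the closing </body> tag. The sketch below is one way to check a page's HTML locally. The function name is hypothetical, and `snippet_marker` stands for any distinctive substring copied from your own snippet (the actual snippet contents are not shown in this article).

```python
import re


def snippet_before_body_close(html: str, snippet_marker: str) -> bool:
    """Return True if snippet_marker appears before the last closing body tag."""
    # Find every closing body tag, ignoring case and stray whitespace.
    closes = list(re.finditer(r"</body\s*>", html, flags=re.IGNORECASE))
    if not closes:
        return False  # No closing body tag at all.
    marker_pos = html.find(snippet_marker)
    return marker_pos != -1 and marker_pos < closes[-1].start()
```

Pass in the HTML of a deployed page (for example, saved from "view source") together with a substring that only occurs in your snippet.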

Setting up the Vercel log drain

See the dedicated Vercel log drain setup guide for step-by-step instructions. The setup takes about 5 minutes.

Managing tracking sites

From the Crawlers tab, you can:

  • Toggle active/paused -- Pause a tracking site to stop recording visits without deleting its configuration. Incoming log drain data is silently dropped while the site is paused.
  • Edit -- Update the domain or name
  • Delete -- Remove the tracking site and its data

One tracking site per domain

Create a separate tracking site for each domain or subdomain you want to monitor. For example, example.com and blog.example.com need separate tracking sites. Cleotic matches incoming requests to tracking sites by domain, so only requests matching the configured domain are recorded.
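The domain matching and pause behaviour described above can be sketched roughly as follows. This is an assumption-laden illustration, not Cleotic's code: the `TrackingSite` type, the normalisation rules (lowercasing, dropping a port and trailing dot), and exact-match lookup are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TrackingSite:
    domain: str
    name: str = ""
    active: bool = True  # False means the site is paused


def normalize_host(host: str) -> str:
    """Lowercase the host and drop any port or trailing dot (hostnames only)."""
    host = host.strip().lower().split(":", 1)[0]
    return host.rstrip(".")


def match_tracking_site(host: str, sites: Dict[str, TrackingSite]) -> Optional[TrackingSite]:
    """Return the active tracking site for this host, or None to drop the request."""
    site = sites.get(normalize_host(host))
    if site is None or not site.active:
        return None  # Unknown domain or paused site: nothing is recorded.
    return site
```

Under exact matching, a request for blog.example.com does not match a tracking site configured for example.com, which is why each subdomain needs its own tracking site.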

What gets tracked

Cleotic only records visits from identified AI crawlers. Normal user traffic from browsers is never stored. Each recorded visit includes:

  • The URL and page path that was visited
  • Which AI crawler was identified (e.g., GPTBot, ClaudeBot)
  • The tracking method that detected it (pixel, beacon, or Vercel drain)
  • The page title at the time of the visit
  • The timestamp
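As a rough picture of what one recorded visit might look like, the sketch below models the fields listed above and skips any request that was not identified as an AI crawler. The class, field names, and method labels are hypothetical. Cleotic's actual schema is not documented here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class CrawlerVisit:
    url: str
    path: str
    crawler: str               # e.g. "GPTBot", "ClaudeBot"
    method: str                # assumed labels: "pixel", "beacon", "vercel_drain"
    page_title: Optional[str]
    timestamp: datetime


def build_visit(url: str, crawler: Optional[str], method: str,
                page_title: Optional[str] = None,
                timestamp: Optional[datetime] = None) -> Optional[CrawlerVisit]:
    """Build a visit record, or return None when no AI crawler was identified."""
    if crawler is None:
        return None  # Normal user traffic is never stored.
    path = urlsplit(url).path or "/"
    when = timestamp or datetime.now(timezone.utc)
    return CrawlerVisit(url, path, crawler, method, page_title, when)
```

Returning None for unidentified traffic keeps the "only AI crawlers are recorded" rule in a single place.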

See Crawler analytics to learn how to read and use this data.
