Vercel Log Drain Setup

Most AI crawlers -- including GPTBot, ClaudeBot, CCBot, anthropic-ai, Google-Extended, and PerplexityBot -- fetch only the HTML of your pages. They don't load images or run JavaScript, which means the Cleotic tracking pixel can't see them.

Vercel log drains give Cleotic a feed of your site's server-side request logs, so every HTML page request gets checked against a known-crawler list and recorded when there's a match. This is the most reliable way to track training crawlers on a Vercel-hosted site.

This guide takes about 5 minutes.

What you'll need

  • A Vercel project with your website deployed on it
  • Owner or Admin role on the Vercel team/account (log drains are team-level)
  • A Cleotic account with a project and at least one tracking site configured for the domain you want to track

Step 1 -- Copy the drain URL from Cleotic

  1. In Cleotic, open your project and go to the Crawlers tab
  2. Find your tracking site in the list and click Snippet
  3. Scroll to the Vercel log drain section
  4. Copy the Drain URL -- it looks like: https://api.cleotic.ai/t/aB3cD4eF5gH6/vercel

Keep the Cleotic tab open -- you'll come back in Step 3.

Step 2 -- Create the log drain in Vercel

  1. Go to vercel.com and sign in
  2. Click your account/team avatar, then Account Settings (or Team Settings for a team project)
  3. In the left sidebar, select Integrations, then Log Drains
  4. Click Create Log Drain
  5. Fill in the form:
    • Delivery Format: JSON (or NDJSON -- both work)
    • Sources: Select Edge Network at minimum. You can also select Static, Lambda, Edge Function, and External.
    • Environments: Production at minimum. Include Preview only if you want to track crawlers hitting preview deployments.
    • Projects: Select the Vercel project(s) serving the domain you're tracking
    • Endpoint: Paste your Drain URL from Step 1
  6. Click Create

Vercel will verify the endpoint and create the drain.

Step 3 -- Copy the signature secret to Cleotic

Vercel generates a signature secret for each log drain. Cleotic needs this to verify that incoming payloads are genuinely from Vercel.

  1. In Vercel's Log Drains list, click Edit on the drain you just created
  2. Find the Signature Secret and copy it
  3. Back in Cleotic, in the Snippet modal's Vercel log drain section, paste the secret into the Signature secret field
  4. Click Save

The badge should update to Configured.
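
For context on why the secret matters: Vercel sends an x-vercel-signature header with each batch, and a receiver can compare it against an HMAC-SHA1 of the raw request body computed with the signature secret. The Python sketch below illustrates that check under those assumptions. It is not Cleotic's actual implementation, and the function name and sample values are hypothetical.

```python
import hashlib
import hmac


def signature_is_valid(raw_body: bytes, header_signature: str, secret: str) -> bool:
    """Return True if the header equals the HMAC-SHA1 hex digest of the raw body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha1).hexdigest()
    # compare_digest runs in constant time, so it does not leak
    # how many leading characters of the signature matched.
    return hmac.compare_digest(expected, header_signature)


# Hypothetical values for illustration only.
secret = "example-signature-secret"
body = b'{"id":"1","source":"edge"}'
good_header = hmac.new(secret.encode("utf-8"), body, hashlib.sha1).hexdigest()

print(signature_is_valid(body, good_header, secret))          # True
print(signature_is_valid(body, good_header, "wrong-secret"))  # False
```

The second call shows why a stale or mistyped secret produces the "401 invalid signature" response described in the troubleshooting section: the digest no longer matches, so the batch is rejected.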

Step 4 -- Verify it's working

Vercel side: Your new drain should show a green status in the Log Drains list. If you see "Failed Delivery Rate" > 0%, see the troubleshooting section below.

Cleotic side: AI crawlers visit sporadically -- it can take anywhere from a few minutes to a few hours to see your first hit. In the Crawlers tab:

  • The "Total visits" counter will start climbing
  • The bar chart will show entries with crawlers like GPTBot, ClaudeBot, and CCBot

Quick test: To trigger a hit immediately, request any page on your site with a crawler user agent:

curl -A "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)" \
  https://your-domain.com/any-page

That request should appear in the Crawlers tab within about 30 seconds.

Troubleshooting

"Failed Delivery Rate" > 0% in Vercel

Open a failed delivery in Vercel to see the response code:

| Code | Meaning | Fix |
|------|---------|-----|
| 401 drain secret not configured | No secret saved in Cleotic | Complete Step 3 to paste the Vercel signature secret |
| 401 invalid signature | Secret doesn't match | Re-copy the secret from Vercel and paste into Cleotic |
| 404 tracking site not found | Wrong drain URL or deleted tracking site | Verify the URL matches what the Cleotic Snippet modal shows |
| 413 request entity too large | Very rare -- oversized batch | Contact support@cleotic.ai |

No crawler visits after 24 hours

  1. Check your domain matches. The tracking site domain in Cleotic must match the Host header Vercel sees. www.example.com and example.com both match, but blog.example.com does not match example.com -- create a separate tracking site for each subdomain.
  2. Check drain environments. Confirm Production is selected.
  3. Check drain sources. Confirm Edge Network is included.
  4. Check for upstream blocking. If your robots.txt disallows AI crawlers, compliant crawlers won't request your pages at all. If Cloudflare WAF or Vercel Bot Protection blocks them, their requests are stopped before they reach your logs.
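
The domain rule in item 1 can be sketched as a small comparison. This is a minimal Python illustration of the behavior described above (www and the bare domain are treated as the same site, other subdomains are not). The function names are hypothetical, and Cleotic's real matching logic may differ.

```python
def normalize_host(host: str) -> str:
    """Lowercase the host and strip a single leading 'www.' label."""
    host = host.strip().lower()
    return host[4:] if host.startswith("www.") else host


def host_matches(tracking_domain: str, host_header: str) -> bool:
    """True when both names refer to the same site once 'www.' is ignored."""
    return normalize_host(tracking_domain) == normalize_host(host_header)


print(host_matches("example.com", "www.example.com"))   # True
print(host_matches("www.example.com", "example.com"))   # True
print(host_matches("example.com", "blog.example.com"))  # False
```

The last line is the case that needs its own tracking site: blog.example.com is a different host, so its requests are filtered out of a tracking site configured for example.com.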

The secret doesn't match after editing

If the drain was recreated or the secret was regenerated in Vercel, the old secret in Cleotic is no longer valid. Re-copy the current secret from Vercel and paste it into Cleotic.

Stopping tracking

Either delete the log drain in Vercel, or toggle the Cleotic tracking site to Paused. Pausing keeps the drain secret configured but silently drops incoming data until you unpause.

Technical details

Cleotic's drain endpoint:

  1. Verifies each batch's HMAC-SHA1 signature using the Vercel-provided secret
  2. Parses each log entry and checks the user agent
  3. Matches against a list of ~140 known AI crawlers (sourced from ai-robots-txt, refreshed periodically)
  4. Filters out entries whose host doesn't match your tracking site's domain
  5. Records matches as crawler visits with hit_type: vercel_drain

Normal browser traffic is never stored. Only identified AI crawler visits are recorded.
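
Leaving signature verification aside, steps 2 to 5 can be sketched roughly as follows. This is an illustrative Python outline, not Cleotic's code: the entry fields (host, path, userAgent), the function names, and the six-name crawler list are simplifying assumptions, and real Vercel log entries carry more fields in a different layout.

```python
import json
from typing import List, Optional

# Tiny stand-in for the real list of ~140 crawlers (names taken from this guide).
KNOWN_CRAWLERS = ["GPTBot", "ClaudeBot", "CCBot", "anthropic-ai",
                  "Google-Extended", "PerplexityBot"]


def parse_batch(raw_body: str) -> List[dict]:
    """Accept either a JSON array or NDJSON (one JSON object per line)."""
    text = raw_body.strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def match_crawler(user_agent: str) -> Optional[str]:
    """Return the first known crawler name found in the user agent, if any."""
    ua = user_agent.lower()
    for name in KNOWN_CRAWLERS:
        if name.lower() in ua:
            return name
    return None


def strip_www(host: str) -> str:
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def crawler_visits(raw_body: str, tracking_domain: str) -> List[dict]:
    visits = []
    for entry in parse_batch(raw_body):
        crawler = match_crawler(entry.get("userAgent", ""))
        if crawler is None:
            continue  # normal browser traffic is dropped, not stored
        if strip_www(entry.get("host", "")) != strip_www(tracking_domain):
            continue  # entry belongs to a different host
        visits.append({"crawler": crawler, "path": entry.get("path"),
                       "hit_type": "vercel_drain"})
    return visits


# Hypothetical NDJSON batch: one crawler hit, one browser hit, one other-host hit.
sample = "\n".join([
    json.dumps({"host": "example.com", "path": "/pricing",
                "userAgent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"}),
    json.dumps({"host": "example.com", "path": "/",
                "userAgent": "Mozilla/5.0 (Macintosh) Safari/605.1.15"}),
    json.dumps({"host": "blog.example.com", "path": "/post", "userAgent": "CCBot/2.0"}),
])
print(crawler_visits(sample, "example.com"))
# [{'crawler': 'GPTBot', 'path': '/pricing', 'hit_type': 'vercel_drain'}]
```

Only the GPTBot request survives: the Safari request matches no crawler, and the CCBot request is for a different host.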

Limits

  • Volume: Each batch is capped at 8 MB (roughly 10,000-20,000 log entries). Contact support if you see 413 responses.
  • Data retention: Crawler visit history depends on your plan tier. Starter plans keep 7 days.
  • Historical data: Log drains only capture future traffic. There's no way to backfill data from before setup.
  • Preview deployments: If you include Preview environments, those visits mix into your stats. Most users leave this off.
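
As a rough check on the volume figure, the entry counts above imply an average log entry of a few hundred bytes. The short Python calculation below assumes decimal megabytes (8,000,000 bytes), which the guide does not specify; the names are hypothetical.

```python
BATCH_CAP_BYTES = 8 * 1000 * 1000  # assumption: "8 MB" means decimal megabytes


def average_entry_bytes(entries_per_batch: int) -> int:
    """Average entry size that would exactly fill the cap at this entry count."""
    return BATCH_CAP_BYTES // entries_per_batch


for entries in (10_000, 20_000):
    print(entries, "entries ->", average_entry_bytes(entries), "bytes each")
# 10000 entries -> 800 bytes each
# 20000 entries -> 400 bytes each
```

In other words, a batch would need unusually large entries or an unusually high entry count to reach the cap, which is consistent with 413 responses being rare.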
