Cloudflare AI Workers
The Cloudflare AI Workers integration lets your product run AI models directly at the edge using Cloudflare's AI Gateway and Workers AI platform. It supports text generation, embeddings, speech recognition, and image tasks with models such as Llama 3, Mistral, Whisper, and Stable Diffusion, all hosted on Cloudflare's global edge network.
With this integration, you can run AI inference close to your users, reduce latency, and remove direct dependencies on third-party cloud APIs (such as OpenAI or Anthropic).
Credentials Needed
To integrate with Cloudflare AI Workers, you need credentials associated with your Cloudflare account and Workers AI project.
Required credentials:
- Cloudflare Account ID
- Cloudflare API Token (with appropriate permissions)
- (Optional) Cloudflare Zone ID (if interacting with specific domain-based workers)
Tokens should be created with least privilege access and stored securely — never hardcoded in frontend apps.
Permissions Needed / API Scopes
The integration requires Workers AI and Account-level permissions to deploy and invoke AI models.
Required Token Scopes
| Permission | Purpose |
|---|---|
| Account.Workers AI Read | Read model details, status, and logs |
| Account.Workers AI Write | Run or deploy models in Workers AI |
| Account.Workers Scripts Read | Access deployed worker details |
| Account.Workers Scripts Write | Deploy or update AI workers |
| Account.Workers Tail Read | View worker logs for debugging (optional) |
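As a sanity check during setup, the required scopes from the table can be compared against the permissions a token was actually granted. The scope strings below mirror the table for readability and are illustrative, not Cloudflare's internal permission identifiers:

```python
# Sketch: check that a token's granted permissions cover the scopes
# this integration requires. Scope strings mirror the table above
# and are illustrative, not Cloudflare's internal permission IDs.

REQUIRED_SCOPES = {
    "Account.Workers AI Read",
    "Account.Workers AI Write",
    "Account.Workers Scripts Read",
    "Account.Workers Scripts Write",
}

def missing_scopes(granted: set[str]) -> set[str]:
    """Return the required scopes that the token does not have."""
    return REQUIRED_SCOPES - granted

# Example: a token that is missing write access to Workers Scripts
granted = {
    "Account.Workers AI Read",
    "Account.Workers AI Write",
    "Account.Workers Scripts Read",
}
print(missing_scopes(granted))  # -> {'Account.Workers Scripts Write'}
```

A check like this is most useful in a setup wizard, where it can tell the user exactly which permission to add before any API call fails.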
Creating Users / Access Tokens
Step 1: Generate an API Token
- Log in to your Cloudflare Dashboard: https://dash.cloudflare.com/profile/api-tokens
- Click Create Token.
- Select Create Custom Token.
- Assign permissions:
  - Workers AI → Read and Write
  - Workers Scripts → Read and Write
- Scope the token to your Account (not global, for safety).
- Click Continue to Summary → Create Token.
- Copy the API Token — it will only be shown once.
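Once the token is copied, it can be checked against Cloudflare's token-verify endpoint before being stored. A minimal sketch using only the standard library (the response-envelope shape is the standard Cloudflare `{"success": ..., "result": ...}` wrapper):

```python
# Sketch: verify a newly created API token via Cloudflare's
# /user/tokens/verify endpoint. Uses only the standard library.
import urllib.request

VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify"

def build_verify_request(token: str) -> urllib.request.Request:
    """Build the GET request that checks whether a token is valid."""
    return urllib.request.Request(
        VERIFY_URL,
        headers={"Authorization": f"Bearer {token}"},
    )

def token_is_active(body: dict) -> bool:
    """Inspect the parsed JSON envelope returned by the verify endpoint."""
    # Cloudflare wraps responses as {"success": bool, "result": {...}}
    return bool(body.get("success")) and body.get("result", {}).get("status") == "active"

# Example envelope for an active token:
body = {"success": True, "result": {"status": "active"}}
print(token_is_active(body))  # -> True
```

To run it for real, pass `build_verify_request(...)` to `urllib.request.urlopen` and feed the parsed JSON to `token_is_active`.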
Test Connectivity
You can test your Cloudflare AI connection against the Workers AI REST API, for example with curl:

```bash
curl -X POST "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/run/@cf/meta/llama-3-8b-instruct" \
  -H "Authorization: Bearer <CLOUDFLARE_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are an assistant that confirms API connectivity."},
      {"role": "user", "content": "Hello from my integration!"}
    ]
  }'
```
If the request returns valid JSON containing a result.response field, your credentials and permissions are configured correctly.
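When wiring this check into your platform, the response handling can be sketched as follows (the envelope shape matches the curl test above; the example payload text is illustrative):

```python
# Sketch: extract the generated text from a Workers AI response body.
# The envelope shape ({"success": ..., "result": {"response": ...}})
# matches the connectivity test above.

def extract_response(body: dict) -> str:
    """Return result.response, or raise if the call was not successful."""
    if not body.get("success", False):
        raise RuntimeError(f"Workers AI call failed: {body.get('errors', [])}")
    return body["result"]["response"]

# Example envelope like the one the connectivity test returns
# (the response text here is illustrative):
ok = {"success": True, "errors": [], "result": {"response": "Connectivity confirmed."}}
print(extract_response(ok))  # -> Connectivity confirmed.
```

Raising on `success: false` keeps credential and permission problems loud instead of silently returning an empty string.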
Save the Results in the Platform and Create Connection
- In your platform's integrations setup, securely store:
  - CLOUDFLARE_ACCOUNT_ID
  - CLOUDFLARE_API_TOKEN
  - (Optional) ZONE_ID (if applicable)
- Label the connector as Cloudflare AI Workers Integration.
- Test the connection by running a simple inference (e.g., text generation with Llama 3).
- Save the verified connection for use across inference or model deployment modules.
Best Practices
- Use Workers AI for low-latency AI inference directly at the edge — ideal for chatbots, summarization, or streaming experiences.
- Prefer Account-level tokens over global API keys — apply least privilege access.
- Rotate tokens regularly and revoke old ones through the Cloudflare Dashboard.
- Store all credentials securely using Cloudflare Secrets or your platform's encrypted vault.
- For sensitive use cases, deploy models inside your own Cloudflare Worker and call them securely via internal APIs.
- Cache frequent inferences using Cloudflare KV or R2 to reduce API load.
- Use Cloudflare Logs and Workers Tail to monitor latency and error metrics.
- Enable AI Gateway to track analytics, rate limits, and cost usage across your AI calls.
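The caching suggestion above can be sketched as keying inference results by a hash of the model name and messages, and reusing cached answers for repeated requests. Here a local dict stands in for Cloudflare KV, and the runner function is a stub; function names are illustrative:

```python
# Sketch of the KV-style caching idea: key inference results by a
# hash of (model, messages) and serve repeats from the cache.
# A local dict stands in for Cloudflare KV here.
import hashlib
import json

cache: dict[str, str] = {}  # stand-in for Cloudflare KV

def cache_key(model: str, messages: list[dict]) -> str:
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_inference(model: str, messages: list[dict], run) -> str:
    """run(model, messages) performs the real Workers AI call."""
    key = cache_key(model, messages)
    if key not in cache:
        cache[key] = run(model, messages)
    return cache[key]

# Example with a stubbed runner that counts real calls:
calls = []
def fake_run(model, messages):
    calls.append(1)
    return "hello"

msgs = [{"role": "user", "content": "Hi"}]
cached_inference("@cf/meta/llama-3-8b-instruct", msgs, fake_run)
cached_inference("@cf/meta/llama-3-8b-instruct", msgs, fake_run)
print(len(calls))  # -> 1 (second call served from the cache)
```

In a real Worker you would swap the dict for KV reads/writes and likely add a TTL, since cached generations can go stale.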