Cloudflare AI Workers
The Cloudflare AI Workers integration lets your product run AI models directly at the edge using Cloudflare's AI Gateway and Workers AI platform. It supports text generation, embeddings, speech recognition, and image tasks with models such as Llama 3, Mistral, Whisper, and Stable Diffusion, all hosted on Cloudflare's global edge network.
With this integration, you can run AI inference close to your users, reduce latency, and remove direct dependencies on third-party cloud APIs (such as OpenAI or Anthropic).
Credentials Needed
To integrate with Cloudflare AI Workers, you need credentials associated with your Cloudflare account and Workers AI project.
Required credentials:
- Cloudflare Account ID
- Cloudflare API Token (with appropriate permissions)
- (Optional) Cloudflare Zone ID (if interacting with specific domain-based workers)
Tokens should be created with least privilege access and stored securely — never hardcoded in frontend apps.
Permissions Needed / API Scopes
The integration requires Workers AI and Account-level permissions to deploy and invoke AI models.
Required Token Scopes
| Permission | Purpose |
|---|---|
| Account.Workers AI Read | Read model details, status, and logs |
| Account.Workers AI Write | Run or deploy models in Workers AI |
| Account.Workers Scripts Read | Access deployed worker details |
| Account.Workers Scripts Write | Deploy or update AI workers |
| Account.Workers Tail Read | View worker logs for debugging (optional) |
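As a sanity check during setup, the required scopes from the table can be compared against the permissions a token was actually granted. The scope strings below mirror the table for readability and are illustrative, not Cloudflare's internal permission identifiers:

```python
# Sketch: check that a token's granted permissions cover the scopes
# this integration requires. Scope strings mirror the table above
# and are illustrative, not Cloudflare's internal permission IDs.

REQUIRED_SCOPES = {
    "Account.Workers AI Read",
    "Account.Workers AI Write",
    "Account.Workers Scripts Read",
    "Account.Workers Scripts Write",
}

def missing_scopes(granted: set[str]) -> set[str]:
    """Return the required scopes that the token does not have."""
    return REQUIRED_SCOPES - granted

# Example: a token that is missing write access to Workers Scripts
granted = {
    "Account.Workers AI Read",
    "Account.Workers AI Write",
    "Account.Workers Scripts Read",
}
print(missing_scopes(granted))  # -> {'Account.Workers Scripts Write'}
```

A check like this is most useful in a setup wizard, where it can tell the user exactly which permission to add before any API call fails.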
Creating Users / Access Tokens
Step 1: Generate an API Token
- Log in to your Cloudflare Dashboard: https://dash.cloudflare.com/profile/api-tokens
- Click Create Token.
- Select Create Custom Token.
- Assign permissions:
  - Workers AI → Read and Write
  - Workers Scripts → Read and Write
- Scope the token to your Account (not global, for safety).
- Click Continue to Summary → Create Token.
- Copy the API Token — it will only be shown once.
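Once the token is copied, it can be checked against Cloudflare's token-verify endpoint before being stored. A minimal sketch using only the standard library (the response-envelope shape is the standard Cloudflare `{"success": ..., "result": ...}` wrapper):

```python
# Sketch: verify a newly created API token via Cloudflare's
# /user/tokens/verify endpoint. Uses only the standard library.
import urllib.request

VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify"

def build_verify_request(token: str) -> urllib.request.Request:
    """Build the GET request that checks whether a token is valid."""
    return urllib.request.Request(
        VERIFY_URL,
        headers={"Authorization": f"Bearer {token}"},
    )

def token_is_active(body: dict) -> bool:
    """Inspect the parsed JSON envelope returned by the verify endpoint."""
    # Cloudflare wraps responses as {"success": bool, "result": {...}}
    return bool(body.get("success")) and body.get("result", {}).get("status") == "active"

# Example envelope for an active token:
body = {"success": True, "result": {"status": "active"}}
print(token_is_active(body))  # -> True
```

To run it for real, pass `build_verify_request(...)` to `urllib.request.urlopen` and feed the parsed JSON to `token_is_active`.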
Test Connectivity
You can test your Cloudflare AI connection against the Workers AI REST API, for example with curl:

```bash
curl -X POST "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/run/@cf/meta/llama-3-8b-instruct" \
  -H "Authorization: Bearer <CLOUDFLARE_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are an assistant that confirms API connectivity."},
      {"role": "user", "content": "Hello from my integration!"}
    ]
  }'
```
If the request returns valid JSON containing a result.response field, your credentials and permissions are configured correctly.
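When wiring this check into your platform, the response handling can be sketched as follows (the envelope shape matches the curl test above; the example payload text is illustrative):

```python
# Sketch: extract the generated text from a Workers AI response body.
# The envelope shape ({"success": ..., "result": {"response": ...}})
# matches the connectivity test above.

def extract_response(body: dict) -> str:
    """Return result.response, or raise if the call was not successful."""
    if not body.get("success", False):
        raise RuntimeError(f"Workers AI call failed: {body.get('errors', [])}")
    return body["result"]["response"]

# Example envelope like the one the connectivity test returns
# (the response text here is illustrative):
ok = {"success": True, "errors": [], "result": {"response": "Connectivity confirmed."}}
print(extract_response(ok))  # -> Connectivity confirmed.
```

Raising on `success: false` keeps credential and permission problems loud instead of silently returning an empty string.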
Save the Results in the Platform and Create Connection
- In your platform's integrations setup, securely store:
  - CLOUDFLARE_ACCOUNT_ID
  - CLOUDFLARE_API_TOKEN
  - (Optional) ZONE_ID (if applicable)
- Label the connector as Cloudflare AI Workers Integration.
- Test the connection by running a simple inference (e.g., text generation with Llama 3).
- Save the verified connection for use across inference or model deployment modules.
Best Practices
- Use Workers AI for low-latency AI inference directly at the edge — ideal for chatbots, summarization, or streaming experiences.
- Prefer Account-level tokens over global API keys — apply least privilege access.
- Rotate tokens regularly and revoke old ones through the Cloudflare Dashboard.
- Store all credentials securely using Cloudflare Secrets or your platform's encrypted vault.
- For sensitive use cases, deploy models inside your own Cloudflare Worker and call them securely via internal APIs.
- Cache frequent inferences using Cloudflare KV or R2 to reduce API load.
- Use Cloudflare Logs and Workers Tail to monitor latency and error metrics.
- Enable AI Gateway to track analytics, rate limits, and cost usage across your AI calls.
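The caching suggestion above can be sketched as keying inference results by a hash of the model name and messages, and reusing cached answers for repeated requests. Here a local dict stands in for Cloudflare KV, and the runner function is a stub; function names are illustrative:

```python
# Sketch of the KV-style caching idea: key inference results by a
# hash of (model, messages) and serve repeats from the cache.
# A local dict stands in for Cloudflare KV here.
import hashlib
import json

cache: dict[str, str] = {}  # stand-in for Cloudflare KV

def cache_key(model: str, messages: list[dict]) -> str:
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_inference(model: str, messages: list[dict], run) -> str:
    """run(model, messages) performs the real Workers AI call."""
    key = cache_key(model, messages)
    if key not in cache:
        cache[key] = run(model, messages)
    return cache[key]

# Example with a stubbed runner that counts real calls:
calls = []
def fake_run(model, messages):
    calls.append(1)
    return "hello"

msgs = [{"role": "user", "content": "Hi"}]
cached_inference("@cf/meta/llama-3-8b-instruct", msgs, fake_run)
cached_inference("@cf/meta/llama-3-8b-instruct", msgs, fake_run)
print(len(calls))  # -> 1 (second call served from the cache)
```

In a real Worker you would swap the dict for KV reads/writes and likely add a TTL, since cached generations can go stale.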