Execution Limits & Quotas - AI Planet Platform

The platform applies limits to keep workflow execution stable and fair across your organization. This page explains the limits you may encounter and how to work with them.

Concurrency limits

To keep the platform responsive, two limits cap how many runs can execute at the same time:

A limit per workflow: how many runs of a single workflow can run concurrently.
A limit per organization: how many runs across all your workflows can run concurrently.

When a concurrency limit is reached, a new run is rejected rather than queued: the trigger returns an error, and you need to retry once a slot frees up. In practice, most workflows finish quickly, so slots free up continuously.
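Because rejected runs are not queued, the retry has to happen on your side. The following is a minimal sketch of that pattern, not the platform's own client. `ConcurrencyLimitError` and the `trigger` callable are hypothetical names: they stand in for whatever error your trigger returns when no slot is free and for whatever call starts your workflow.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ConcurrencyLimitError(Exception):
    """Hypothetical: stands in for the error a trigger returns when no slot is free."""


def run_when_slot_free(
    trigger: Callable[[], T],
    max_attempts: int = 5,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call trigger() until it is accepted or max_attempts is used up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return trigger()
        except ConcurrencyLimitError:
            if attempt == max_attempts:
                raise  # still no free slot; let the caller decide what to do
            sleep(wait_seconds)  # rejected runs are not queued, so wait client-side
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists only so the wait can be swapped out in tests. Bounding the number of attempts keeps a persistently full organization from turning into an endless retry loop.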

Within a single thread, only one execution runs at a time. If you send a new message while a run is in progress, it waits for the previous run in that thread to finish.

API rate limits

Requests to the API are rate limited. If you send requests too quickly, the API responds with a rate-limit error — wait briefly and retry. Build retry handling into any integration that calls the API at volume.
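One common way to "wait briefly and retry" is exponential backoff with jitter. The sketch below makes several assumptions that this page does not confirm: that the rate-limit error is an HTTP 429 response, and that the API may send a `Retry-After` header with a number of seconds. The `send` callable is a hypothetical placeholder for your own request function, returning a `(status, headers, body)` tuple.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# Assumption: the rate-limit error is an HTTP 429 response. The page does not
# state the status code, so adjust this to match what the API actually returns.
RATE_LIMIT_STATUS = 429

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Upper bound of the wait before retry number `attempt` (1-based), doubling each time."""
    return min(cap, base * (2 ** (attempt - 1)))


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> Response:
    """Call send() and retry while it reports a rate-limit status."""
    rng = rng or random.Random()
    response = send()
    for attempt in range(1, max_retries + 1):
        status, headers, _body = response
        if status != RATE_LIMIT_STATUS:
            return response
        # Assumption: a Retry-After header, if present, holds whole seconds.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            # Full jitter: pick a random wait up to the exponential bound.
            delay = rng.uniform(0, backoff_delay(attempt))
        sleep(delay)
        response = send()
    return response  # may still be a rate-limit response after max_retries
```

The random jitter spreads retries from many clients over time, so they are less likely to hit the limit again in the same instant.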

Working within limits

If you regularly hit concurrency limits, stagger when workflows are triggered rather than starting many at once, and keep workflows efficient so runs finish quickly and free up slots.

Watch Tracing to spot runs that are slower than expected.
Stagger bulk work instead of triggering many runs simultaneously.
Retry on rate-limit errors with a short delay in any API integration.
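Staggering bulk work can be sketched as a small helper that spaces out the starts and caps how many runs are in flight. This is an illustration under assumptions, not a platform feature: `start_run` is a hypothetical callable that triggers one run and blocks until it finishes, and `max_in_flight` should be set below whatever concurrency limit applies to your workflow.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def run_staggered(
    items: Iterable[In],
    start_run: Callable[[In], Out],
    max_in_flight: int = 3,
    gap_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Out]:
    """Start one run per item, spaced out and capped at max_in_flight at a time."""
    # The pool size caps concurrency, assuming start_run blocks until its run ends.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = []
        for index, item in enumerate(items):
            if index > 0:
                sleep(gap_seconds)  # spread the starts instead of firing all at once
            futures.append(pool.submit(start_run, item))
        # Results come back in input order; an exception from start_run re-raises here.
        return [future.result() for future in futures]
```

For example, `run_staggered(customer_ids, start_run, max_in_flight=2, gap_seconds=5)` would start a run every five seconds while keeping at most two active, where `customer_ids` is whatever batch you need to process.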

If your usage consistently exceeds the available limits, contact your platform administrator about raising them for your organization.

Next steps

Calling workflows via API

Trigger workflows from your own applications.
