The parallelism limit controls the maximum number of calls that can be executed concurrently.
Unlike rate limiting (which works per time window), parallelism enforces concurrency control with a token-based system.
Configure Parallelism Limit
import { Client } from "@upstash/workflow";

const client = new Client({ token: "<QSTASH_TOKEN>" });

const { workflowRunId } = await client.trigger({
  url: "https://<YOUR_WORKFLOW_ENDPOINT>/<YOUR-WORKFLOW-ROUTE>",
  flowControl: {
    // Calls that share this key are limited together.
    key: "user-signup",
    // At most 10 calls under this key execute concurrently.
    parallelism: 10,
  },
  keepTriggerConfig: true,
});
Example:
If parallelism = 3, at most 3 requests can run concurrently.
When tokens are available, each request acquires one and starts executing.
When all tokens are in use, additional requests do not fail; they are queued in a waitlist.
A request in the waitlist waits for a running request to complete and hand off its token.
Token handoff does not guarantee strict ordering.
A later request in the waitlist may acquire a token before an earlier one.
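The token and waitlist behavior described above can be modeled in a few lines. The following is only an illustrative sketch, not how Upstash implements flow control: `ParallelismGate` and its methods are hypothetical names that do not exist in `@upstash/workflow`. It also hands tokens off in first-in, first-out order purely for simplicity, whereas the real system makes no such ordering guarantee.

```typescript
// Hypothetical in-memory model of a parallelism limit.
// Not part of @upstash/workflow; names are for illustration only.
class ParallelismGate {
  private parallelism: number;
  private running = new Set<string>();
  private waitlist: string[] = [];

  constructor(parallelism: number) {
    this.parallelism = parallelism;
  }

  // A request takes a token if one is free; otherwise it is queued, not failed.
  submit(id: string): "running" | "waiting" {
    if (this.running.size < this.parallelism) {
      this.running.add(id);
      return "running";
    }
    this.waitlist.push(id);
    return "waiting";
  }

  // A finished request hands its token to one waiting request, if any.
  // Assumption: FIFO handoff, chosen for simplicity. The real system
  // does not guarantee strict ordering.
  complete(id: string): string | undefined {
    if (!this.running.delete(id)) {
      throw new Error(`request ${id} is not running`);
    }
    const next = this.waitlist.shift();
    if (next !== undefined) {
      this.running.add(next);
    }
    return next;
  }

  snapshot(): { running: string[]; waiting: string[] } {
    return {
      running: Array.from(this.running),
      waiting: this.waitlist.slice(),
    };
  }
}

// parallelism = 3: the first three requests run, the rest wait.
const gate = new ParallelismGate(3);
const states = ["a", "b", "c", "d", "e"].map((id) => gate.submit(id));
console.log(states); // ["running", "running", "running", "waiting", "waiting"]

// When "b" completes, its token goes to a waiting request.
const handedTo = gate.complete("b");
console.log(handedTo); // "d"
console.log(gate.snapshot()); // running: a, c, d; waiting: e
```

The number of running requests never exceeds the limit, and a token changes hands only when a running request completes. That is the difference from rate limiting, which counts requests per time window regardless of how long each one runs.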