Parallelism - Upstash Documentation

The parallelism limit controls the maximum number of calls that can be executed concurrently. Unlike rate limiting (which works per time window), parallelism enforces concurrency control with a token-based system.

Configure Parallelism

import { Client } from "@upstash/workflow";

const client = new Client({ token: "<QSTASH_TOKEN>" })

const { workflowRunId } = await client.trigger({
  url: "https://<YOUR_WORKFLOW_ENDPOINT>/<YOUR-WORKFLOW-ROUTE>",
  flowControl: {
    key: "user-signup",
    parallelism: 10,
  }
})

Example: If parallelism = 3, at most 3 requests can run concurrently. When tokens are available, each request acquires one and starts execution.

When all tokens are in use, additional requests do not fail; they are queued in a waitlist.

A step in the waitlist waits until a running step completes and hands off its token to a pending request.

Token handoff does not guarantee strict ordering. A later request in the waitlist may acquire a token before an earlier one.
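
The token behavior described above can be pictured with a small in-memory model. This is only a sketch for intuition, not how Upstash implements flow control: the `ParallelismLimiter` class and its `submit` and `complete` methods are hypothetical names, and the waitlist here is served in first-in, first-out order purely to keep the output predictable, whereas the real handoff makes no ordering guarantee.

```typescript
// Illustrative model only; not the Upstash implementation.
type RequestState = "running" | "waiting";

class ParallelismLimiter {
  private parallelism: number;
  private running = new Set<string>();
  private waitlist: string[] = [];

  constructor(parallelism: number) {
    this.parallelism = parallelism;
  }

  // A request acquires a token if one is free; otherwise it joins the waitlist.
  submit(id: string): RequestState {
    if (this.running.size < this.parallelism) {
      this.running.add(id);
      return "running";
    }
    this.waitlist.push(id);
    return "waiting";
  }

  // A finished request hands its token to one pending request, if any.
  // Returns the id that received the token, or undefined if nobody was waiting.
  complete(id: string): string | undefined {
    if (!this.running.delete(id)) {
      throw new Error(`request ${id} is not running`);
    }
    // Assumption: FIFO pick for simplicity; real handoff order is not guaranteed.
    const next = this.waitlist.shift();
    if (next !== undefined) {
      this.running.add(next);
    }
    return next;
  }

  get runningCount(): number {
    return this.running.size;
  }

  get waitingCount(): number {
    return this.waitlist.length;
  }
}

// With parallelism = 3, the first three requests run and the rest wait.
const limiter = new ParallelismLimiter(3);
const states = ["a", "b", "c", "d", "e"].map((id) => limiter.submit(id));
console.log(states); // ["running", "running", "running", "waiting", "waiting"]

// When "b" completes, its token goes to a waiting request.
const handedTo = limiter.complete("b");
console.log(handedTo, limiter.runningCount, limiter.waitingCount); // d 3 1
```

Note that the number of running requests never exceeds the configured limit: a completion either frees a token or passes it directly to a waiting request.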
