
When the HiAPI gateway sends back HTTP 429 Too Many Requests, it is telling you that the key you authenticated with has briefly exceeded its allowed rate or concurrency on https://api.hiapi.ai/v1/tasks (or on the polling endpoint GET /v1/tasks/{taskId}). The response body follows HiAPI's standard error envelope — {"error":{"type":"hiapi_error","code":"...","request_id":"..."}} — and is paired with an HTTP status code of 429. The fix is almost never to email support; it is to add a small retry-and-throttle layer on the client. This guide walks you through what the error means, why it fires, and the exact steps to make it go away.
A typical session looks like this on the wire:
> POST /v1/tasks HTTP/1.1
> Host: api.hiapi.ai
> Authorization: Bearer hi-sk-...
> Content-Type: application/json
< HTTP/1.1 429 Too Many Requests
< Content-Type: application/json
<
{"error":{"type":"hiapi_error","code":"rate_limit_exceeded","request_id":"req_..."}}
Two things to notice:
429, not 401/403. A 401 means the key is wrong or missing. A 429 means the key is good but you are sending too fast. Don't conflate them.
hiapi_error envelope. Some HTTP clients only surface the status code by default; if your code is silently retrying without logging the body, you will lose useful context like request_id.
If you are calling the GPT Image 2, Nano-Banana, flux, Seedance, Wan, or HappyHorse models, all of them ride on the same async endpoint, so they all share the same rate-limit accounting per key.
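The 429 versus 401/403 distinction, and the habit of keeping request_id, can be captured in one small helper. This is a minimal sketch that assumes the error envelope shown above; the function name describe_error and the action labels are illustrative, not part of any HiAPI SDK.

```python
import json


def describe_error(status_code, body_text):
    """Return (action, request_id) for a HiAPI-style error response.

    The action labels are hypothetical; map them to whatever your client does.
    """
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = {}
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    request_id = err.get("request_id")  # keep this for logs and support tickets

    if status_code == 429:
        action = "retry_with_backoff"   # key is fine, traffic is too fast
    elif status_code in (401, 403):
        action = "check_api_key"        # backoff will not help here
    else:
        action = "raise"
    return action, request_id
```

Logging the returned request_id on every 429 keeps the context that a status-code-only retry loop would throw away.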
In our experience, the causes from most to least common are:
1. Calling POST /v1/tasks for each item in parallel with no concurrency cap. Even ten concurrent task creations are enough to trip the limit on a fresh account.
2. Polling GET /v1/tasks/{taskId} too often. A tight while True: poll(); sleep(0.2) loop counts as request volume just like task creation does. Sub-second polling cadence is the #1 cause of "I only created one task and still got 429".
3. Heavy models (seedance-2-0, wan2.7-video/*, happyhorse-1-0) and the @pro image variants count against a stricter concurrent-task ceiling. Submitting many of these at once trips 429 even when your per-minute rate is low.
4. Requests from every environment that shares one HIAPI_API_KEY add up. Each environment looks fine in isolation; together they push the key over the limit.
Work through these in order. Most teams stop being paged after step 3.
Any client that talks to api.hiapi.ai should retry 429 with growing delays — never retry immediately. A safe default:
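import random
import time

import requests


def post_task_with_backoff(body, key, *, max_retries=5):
    """POST a task, retrying on 429 with capped exponential backoff and jitter."""
    delay = 1.0
    for attempt in range(max_retries):
        r = requests.post(
            "https://api.hiapi.ai/v1/tasks",
            headers={"Authorization": f"Bearer {key}", "Content-Type": "application/json"},
            json=body, timeout=30,
        )
        if r.status_code != 429:
            r.raise_for_status()  # surface 401/403/5xx instead of retrying them
            return r.json()
        retry_after = r.headers.get("Retry-After")
        try:
            # Prefer the server's hint when it is a plain number of seconds.
            wait = float(retry_after)
        except (TypeError, ValueError):
            # Header missing or not numeric: fall back to backoff plus jitter.
            wait = delay + random.random()
        time.sleep(min(max(wait, 0.0), 30.0))
        delay = min(delay * 2, 30.0)
    raise RuntimeError("rate-limited after retries")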
The two things this snippet gets right: it caps the delay at 30 seconds, and it honors a Retry-After header if one is present. Even if your specific 429 response does not include Retry-After, the cap-and-jitter strategy is enough to clear most bursts.
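To see what the fallback delays actually look like, the schedule can be pulled out into a generator. This is an illustrative sketch of the same doubling-with-cap arithmetic; backoff_schedule and its rng parameter are hypothetical names added so the jitter can be switched off for inspection.

```python
import random


def backoff_schedule(max_retries=5, base=1.0, cap=30.0, rng=random.random):
    """Yield the waits used when no Retry-After header is present."""
    delay = base
    for _ in range(max_retries):
        # Up to one second of jitter, never more than the cap.
        yield min(delay + rng(), cap)
        delay = min(delay * 2, cap)


# With jitter disabled, the doubling and the cap are easy to read off.
print(list(backoff_schedule(7, rng=lambda: 0.0)))
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

With the default of five retries the waits add up to 31 seconds before jitter, which is a useful number to compare against your own request timeouts.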
A semaphore on the client side is the single most effective change. If you do not know what value to pick, start with 5:
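import asyncio
import httpx

sem = asyncio.Semaphore(5)  # at most 5 task creations in flight

async def create_task(client, body):
    # client is an httpx.AsyncClient assumed to already carry the Authorization header.
    async with sem:
        r = await client.post("https://api.hiapi.ai/v1/tasks", json=body, timeout=30)
        return r.json()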
Five in-flight tasks is comfortable for most accounts. If you still see 429s, drop to 3. Heavy video models often need a separate, lower semaphore (try 2).
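One way to give heavy models their own, lower cap is to keep two semaphores and pick one by model name. The sketch below is an assumption-laden illustration: the prefixes are taken from the model names mentioned in this guide, the "@pro" substring check is a guess at how those variants are named, and TaskLimiter, submit, and send are hypothetical names.

```python
import asyncio

# Assumed markers for "heavy" models; adjust to the names you actually call.
HEAVY_PREFIXES = ("seedance", "wan2.7-video", "happyhorse")


class TaskLimiter:
    """Two independent caps: one for heavy models, one for everything else."""

    def __init__(self, default_limit=5, heavy_limit=2):
        self._default = asyncio.Semaphore(default_limit)
        self._heavy = asyncio.Semaphore(heavy_limit)

    def for_model(self, model):
        name = model.lower()
        heavy = name.startswith(HEAVY_PREFIXES) or "@pro" in name
        return self._heavy if heavy else self._default


async def submit(limiter, body, send):
    """Run send(body), any coroutine that creates the task, under the right cap."""
    async with limiter.for_model(body["model"]):
        return await send(body)
```

Create the TaskLimiter inside your running event loop (for example at the top of your main coroutine) and share that one instance across all workers, so the caps apply to the whole process rather than per call.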
For production workloads, polling GET /v1/tasks/{taskId} is the wrong default. Pass callback.url in the request body and HiAPI will POST the terminal result to you instead:
{
  "model": "Nano-Banana",
  "input": { "prompt": "a calm lake at dawn", "aspect_ratio": "16:9" },
  "callback": { "url": "https://your-domain.com/hiapi/callback", "when": "final" }
}
This converts an O(tasks × polls) request pattern into O(tasks) — usually a 10-50× reduction in API calls and a near-instant cure for polling-induced 429s. Keep GET /v1/tasks/{taskId} for local debugging and low-volume one-shots.
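A quick back-of-the-envelope calculation makes the reduction concrete. The numbers below (100 tasks, 60 seconds per task, a 2-second polling interval) are illustrative assumptions, not measured HiAPI figures.

```python
import math


def request_counts(tasks, task_seconds, poll_interval):
    """Compare total API calls when polling versus when using callbacks."""
    polls_per_task = math.ceil(task_seconds / poll_interval)
    polling_total = tasks * (1 + polls_per_task)  # one create plus N polls per task
    callback_total = tasks                        # one create per task; results are pushed
    return polling_total, callback_total


polling, callback = request_counts(tasks=100, task_seconds=60, poll_interval=2)
print(polling, callback, polling / callback)  # 3100 100 31.0
```

Longer-running tasks or tighter polling intervals push the ratio higher, which is why polling-heavy clients tend to hit 429 first.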
If you cannot use callbacks (offline scripts, local notebooks, fallback reconciliation), do not poll faster than every 2-5 seconds, and add jitter so multiple workers do not lockstep:
import random, time
time.sleep(2 + random.random() * 3) # 2-5s with jitter
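Wrapped into a complete loop, the jittered wait might look like the sketch below. It assumes you supply fetch_status, a hypothetical callable that performs GET /v1/tasks/{taskId} and returns the status string; "success" comes from this guide, while "failed" is an assumed name for the error state and should be replaced with whatever the API actually reports.

```python
import random
import time

TERMINAL = {"success", "failed"}  # "failed" is an assumption, see the note above


def wait_for_task(fetch_status, *, min_s=2.0, max_s=5.0, timeout_s=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() every min_s..max_s seconds until a terminal status."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s}s")
        # A random wait in [min_s, max_s] keeps parallel workers from polling in lockstep.
        sleep(random.uniform(min_s, max_s))
```

The sleep and clock parameters exist only so the loop can be exercised without real delays; in production the defaults are what you want.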
Generate one key per environment — dev, staging, prod — at the HiAPI API keys dashboard. Each environment's burst stays isolated, and you can revoke a leaked key without taking the others down.
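On the client, per-environment keys can be selected with a small lookup that refuses to fall back to a shared key. The variable naming scheme here (HIAPI_API_KEY_DEV, HIAPI_API_KEY_STAGING, HIAPI_API_KEY_PROD) is a hypothetical convention for this sketch, not something HiAPI prescribes.

```python
import os


def api_key_for(env_name, environ=os.environ):
    """Look up the key for one environment, e.g. HIAPI_API_KEY_PROD."""
    var = f"HIAPI_API_KEY_{env_name.upper()}"
    key = environ.get(var)
    if not key:
        # Fail loudly rather than silently reusing another environment's key.
        raise KeyError(f"{var} is not set")
    return key
```

Failing on a missing variable is deliberate: a silent fallback to one shared key would recreate the cross-environment bursting this step is meant to remove.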
Open the HiAPI usage dashboard and confirm whether you are bumping a concurrency or plan limit, or merely a transient burst. Plan headroom is shown at the billing page. If usage is well under your plan but you still see 429, the cause is almost certainly client-side bursting, not the account.
Once the retry layer is in place, a single Nano-Banana task is the cheapest way to confirm everything is wired up. Replace $HIAPI_API_KEY with your key:
curl -s -X POST https://api.hiapi.ai/v1/tasks \
  -H "Authorization: Bearer $HIAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Nano-Banana",
    "input": { "prompt": "a single red apple on a white table", "aspect_ratio": "1:1" }
  }'
A success looks like {"code":200,"data":{"taskId":"tk-hiapi-..."}}. Then poll once:
curl -s -H "Authorization: Bearer $HIAPI_API_KEY" \
  https://api.hiapi.ai/v1/tasks/tk-hiapi-...
Status moves through queued → handling → archiving → success, and the final data.output[0].url is your image. If this single round-trip returns 200 but your batch job still 429s, the problem is concurrency at your client, not your key.
Why am I getting 429 even though I only sent one request?
Almost always because the polling loop is the actual request volume, not the task creation. A 200 ms polling interval on five concurrent in-flight tasks is 25 requests per second by itself. Slow the polling to 2-5 seconds, or move to callback.url.
Does HiAPI always send a Retry-After header?
Do not rely on it. Treat Retry-After as advisory: use its value when present, fall back to exponential backoff (1s, 2s, 4s, …, cap 30s) when not. Either way, never retry without a delay, because that just amplifies the burst.
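The HTTP specification allows Retry-After to be either a number of seconds or an HTTP date, so a defensive parser handles both and returns nothing when the value is unusable. This sketch does not assume which form HiAPI sends; retry_after_seconds is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Return the wait in seconds from a Retry-After value, or None if unusable."""
    if not header_value:
        return None
    value = header_value.strip()
    try:
        return max(0.0, float(value))        # delay-seconds form, e.g. "3"
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

When the function returns None, use the exponential backoff delay instead, and apply the same 30-second cap to whichever value you end up with.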
Is 429 the same as 401 or 403?
No. A 401 or 403 means authentication or authorization failed. Those responses follow the same hiapi_error envelope, e.g. {"error":{"code":"permission_denied","type":"hiapi_error","request_id":"req_..."}}. If you see 401/403, double-check the key value at the API keys dashboard; backoff will not help.
Can I just raise my rate limit?
If your traffic genuinely justifies a higher concurrency limit, check your plan headroom on the billing page and contact support from there. In practice the fixes above resolve >90% of 429 complaints without any limit change.
What is the difference between 429 on /v1/tasks and 429 on /v1/tasks/{taskId}?
The endpoint just tells you what you were doing at the time — both share the same per-key rate budget. Creation 429s point to fan-out (cap concurrency, batch); polling 429s point to tight polling loops (slow cadence or switch to callbacks). See the HiAPI Quickstart and the Authentication doc for the full request/response shape.
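If your logs record the method and path of each rate-limited request, a small tally shows which fix to reach for first. This is an illustrative sketch; classify_429 and the label strings are hypothetical names, and the paths are the two endpoints discussed in this guide.

```python
from collections import Counter


def classify_429(method, path):
    """Label a rate-limited request by what the client was doing."""
    method = method.upper()
    if method == "POST" and path.rstrip("/") == "/v1/tasks":
        return "creation"  # points to fan-out: cap concurrency
    if method == "GET" and path.startswith("/v1/tasks/"):
        return "polling"   # points to tight loops: slow down or use callbacks
    return "other"


# Example: tally a handful of rate-limited requests pulled from logs.
seen = [("POST", "/v1/tasks"), ("GET", "/v1/tasks/tk-1"), ("GET", "/v1/tasks/tk-2")]
tally = Counter(classify_429(m, p) for m, p in seen)
print(tally.most_common(1))  # [('polling', 2)]
```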
429 is a normal, recoverable signal from any production-grade API gateway. Add exponential backoff, cap your client-side concurrency, prefer callbacks over polling, and confirm at the usage dashboard. Once those four things are true, you can stop thinking about 429 entirely.
Key Takeaways