Current Rate Limits
API usage is limited by concurrency (i.e., the number of in-flight requests). Below are the current rate limits for each model.
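Because the limit counts in-flight requests rather than requests per second, a client can stay under it by capping how many calls it has open at once. The following is a minimal sketch, assuming an asyncio-based client; `run_with_concurrency_limit` and `max_in_flight` are hypothetical names for illustration, not part of any official SDK, and the real limit for your model should be taken from the rate limit table.

```python
import asyncio


async def run_with_concurrency_limit(factories, max_in_flight):
    """Run coroutine factories with at most `max_in_flight` active at once.

    `factories` is a list of zero-argument callables that each return a
    coroutine (for example, a function that performs one API request).
    Results are returned in the same order as `factories`.
    """
    # The semaphore holds one permit per allowed in-flight request.
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(factory):
        # Wait for a free permit, run the request, then release the permit.
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in factories))
```

A caller would pass one factory per request, for example `asyncio.run(run_with_concurrency_limit(factories, max_in_flight=5))`, where `5` stands in for whatever concurrency limit applies to the model being used.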
Explanation of Rate Limits
To ensure stable access to GLM-4-Flash during the free trial, requests with context lengths over 8K will be throttled to 1% of the standard concurrency limit.
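To make the 1% rule concrete, the arithmetic can be sketched as below. This is an illustration only: the function name, the reading of "8K" as 8192 tokens, the rounding down, and the floor of one request are all assumptions, since the text above does not say how the service counts or rounds.

```python
# Assumption: "8K" is read as 8192 tokens; the service may count differently.
LONG_CONTEXT_THRESHOLD_TOKENS = 8192


def effective_concurrency(standard_limit, context_tokens):
    """Estimate the concurrency available to a request of a given context length.

    Requests above the threshold get 1% of the standard limit. Rounding down
    and never going below 1 are assumptions made for this sketch.
    """
    if context_tokens <= LONG_CONTEXT_THRESHOLD_TOKENS:
        return standard_limit
    # 1% of the standard limit, using integer division to round down.
    return max(1, standard_limit // 100)
```

Under these assumptions, a hypothetical standard limit of 200 would leave 2 concurrent slots for requests over the threshold, so a client that sends long-context requests may want to queue them separately from short ones.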