We’ve got a temporarily underutilized 64 x AMD MI300X cluster, so instead of letting it sit idle, we’re opening it up for LLM inference.
Running: LLaMA 4 Maverick, DeepSeek V3, R1, and R1-0528. Want another open model? Let us know. We are happy to deploy it.
Prices are around 50% lower than the cheapest OpenRouter endpoints, and they will stay that way through June (maybe July).
The server handles up to 10,000 requests/sec, and we allocate GPUs per model based on demand. So feel free to load-test it, hammer it, or run production traffic. We're collecting no data whatsoever.
cloudrift.ai/inference
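If you want to try it from code, here is a minimal sketch of a chat request. Note the assumptions: the base URL, the auth scheme, and the model id string below are all hypothetical placeholders (many inference providers expose an OpenAI-compatible `/v1/chat/completions` route, but check the docs at the link above for the real endpoint and model names).

```python
import json
import os
import urllib.request

# Assumed base URL and route -- verify against cloudrift.ai/inference docs.
BASE_URL = "https://inference.cloudrift.ai/v1/chat/completions"

# Model id is a guess at the naming convention; the provider's model list
# endpoint (if any) is the source of truth.
payload = {
    "model": "deepseek-ai/DeepSeek-V3",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
    "max_tokens": 16,
}

req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Bearer auth is assumed; key name CLOUDRIFT_API_KEY is made up.
        "Authorization": f"Bearer {os.environ.get('CLOUDRIFT_API_KEY', '')}",
    },
)

# Uncomment to actually send the request (needs a valid key):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

If the API really is OpenAI-compatible, the official `openai` Python client with a custom `base_url` would also work and saves the manual plumbing.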
Full disclosure: I am the founder. We're trying to make good use of this capacity. Let us know if you have ideas on how to utilize the cluster meaningfully. We're happy to hear your feedback.