We're building infer.bid, a real-time matchmaking platform that connects suppliers and consumers of AI inference compute. We're focusing on open-source models first, aiming to tackle high inference costs and GPU scarcity by introducing dynamic pricing through a real-time bidding system.
We’d love your input:
If you use services like OpenRouter, Replicate, or Hugging Face, what are your biggest pain points? Is cost your primary issue, or are you looking for more advanced features such as intelligent model routing, caching, or prompt optimization to save costs?
Would a platform that lets you dynamically bid for inference compute be compelling for your use cases? How important are pricing transparency and the ability to choose your hardware provider directly?
We're trying to better understand different user scenarios and prioritize features that directly address the community's needs. Any insights or feedback would be greatly appreciated!
This is still down the path of running your own infra, which is a choice the inference consumer needs to make based on the setup they have and whether the management overhead is worth it. We are also looking into how to encrypt or otherwise keep prompts private when they are sent to inference providers, so your data stays private. Infer.bid is more about an open marketplace where inference providers compete on price for your business. In the future, your app or business might just need inference, without the hassle of maintaining the infra for it. Instead, you could just consume inference like electricity.
Thank you!