Foundry Local is an on-device AI inference solution offering performance, privacy, customization, and cost advantages.
Performance is optimized using ONNX Runtime and hardware acceleration: Foundry Local automatically selects and downloads the model variant best suited to your hardware (a CUDA build if you have an NVIDIA GPU, an NPU-optimized model for Qualcomm NPUs, and a CPU-optimized model otherwise).
Python and JavaScript SDKs are available.
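Foundry Local serves models through an OpenAI-compatible REST API, so a client can be written with nothing but the standard library. A minimal sketch, with assumptions: the base URL and port are hypothetical (Foundry Local reports its actual endpoint when the service starts), and the model alias `phi-3.5-mini` is illustrative.

```python
import json
import urllib.request

# Hypothetical endpoint: check the base URL Foundry Local reports on startup.
BASE_URL = "http://localhost:5273/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST a chat-completion request to the local endpoint and return the reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the service running, `chat("phi-3.5-mini", "Hello")` would return the model's reply; the official Python and JavaScript SDKs wrap this same endpoint with model management on top.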
If a model is not already available in ONNX, Olive (https://microsoft.github.io/Olive/) can compile existing models in Safetensors or PyTorch format into ONNX.
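An Olive conversion is driven by a JSON workflow config naming the input model and the passes to run. A hedged sketch only: the model path is illustrative and the exact field names and pass options should be checked against the Olive documentation for your installed version.

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "microsoft/Phi-3.5-mini-instruct"
  },
  "passes": {
    "conversion": {
      "type": "OnnxConversion",
      "target_opset": 17
    }
  },
  "output_dir": "models/phi35-onnx"
}
```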
Foundry Local is licensed under the Microsoft Software License Terms.
https://devblogs.microsoft.com/foundry/unlock-instant-on-dev...