The MK1 Blog
Stay up-to-date with the latest updates from MK1
Cut Costs and Accelerate LLM Inference using MK1 Flywheel on Modal
We are thrilled to announce that our LLM inference engine, MK1 Flywheel, is now available through Modal's developer-friendly cloud service.
MK1 Flywheel Unlocks the Full Potential of AMD Instinct for LLM Inference
With the release of our new inference engine, MK1 Flywheel, we are excited to report that the AMD Instinct series can achieve performance comparable to a compute-matched NVIDIA GPU. We've designed MK1 Flywheel for maximum performance on both AMD and NVIDIA chips: our benchmarks demonstrate up to 3.7x higher throughput compared to vLLM.
MK1 Flywheel has the Best Throughput and Latency for LLM Inference on NVIDIA and AMD
MK1 has built an optimized LLM inference runtime, called Flywheel, that delivers higher throughput and lower latency on NVIDIA Ampere, Ada, and Hopper architectures as well as AMD Instinct. Try it on Amazon SageMaker, and engage with us for large-scale deployment.