
Outerport

startup

Hot-swap AI model weights in production

Outerport was founded in 2024 and participated in Y Combinator's Summer 2024 batch. The company addresses a critical infrastructure challenge in AI deployment: the cost and complexity of managing multiple large language models in production.

The company's core technology enables hot-swapping of AI model weights, meaning that different models can be loaded and unloaded from GPU memory dynamically based on incoming requests. This is analogous to how virtual memory works in operating systems: model weights are kept in CPU RAM or on disk and swapped into GPU memory on demand.

This approach has significant implications for AI infrastructure costs. Most deployed LLMs are idle for the majority of the time, yet they must remain loaded in GPU memory to serve requests with low latency. Outerport's technology eliminates this waste by allowing GPU resources to be shared across models, dramatically reducing the hardware required to serve multiple AI applications.

Outerport's solution integrates with existing inference frameworks and is designed to work with popular open-source models. The technology handles the complexity of memory management, weight loading, and request routing, providing a simple API for developers to deploy and manage multiple models.
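The virtual-memory analogy above can be sketched as an LRU cache over GPU-resident models: a fixed number of models stay "on GPU," and requesting a non-resident model evicts the least recently used one back to host memory. The sketch below is a minimal illustration of that idea only; the `ModelPool` class, its method names, and the string stand-ins for weights are all hypothetical and are not Outerport's actual API.

```python
from collections import OrderedDict

class ModelPool:
    """Toy LRU hot-swap pool: at most `capacity` models are resident
    'on GPU'; the least recently used model is evicted back to 'CPU'
    (host RAM / disk) when space runs out. Hypothetical illustration,
    not Outerport's real interface."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.gpu = OrderedDict()  # models currently resident "on GPU" (LRU order)
        self.cpu = {}             # weights parked in host memory

    def register(self, name: str, weights) -> None:
        # New models start out in host memory, not on the GPU.
        self.cpu[name] = weights

    def acquire(self, name: str):
        # Hit: model already resident; mark it most recently used.
        if name in self.gpu:
            self.gpu.move_to_end(name)
            return self.gpu[name]
        # Miss: evict the least recently used model back to CPU if full.
        if len(self.gpu) >= self.capacity:
            evicted, weights = self.gpu.popitem(last=False)
            self.cpu[evicted] = weights
        # "Load" the requested weights into GPU memory.
        self.gpu[name] = self.cpu.pop(name)
        return self.gpu[name]
```

With `capacity=1`, acquiring a second model evicts the first back to host memory, mirroring how idle models need not occupy GPU memory between requests:

```python
pool = ModelPool(capacity=1)
pool.register("llama", "llama-weights")
pool.register("mistral", "mistral-weights")
pool.acquire("llama")    # "llama" loaded onto GPU
pool.acquire("mistral")  # "llama" evicted to CPU, "mistral" loaded
```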

San Francisco, CA · Founded 2024 · 1-10 employees · $500K (YC standard deal) raised
AI Infrastructure · Developer Tools · Cloud AI