Red Hat has signed a definitive agreement to acquire Neural Magic, a US-based provider of software and algorithms that accelerate generative AI (gen AI) inference workloads. Neural Magic’s expertise in inference performance engineering and commitment to open source align with Red Hat’s vision of high-performing AI workloads that directly map to customer-specific use cases and data, anywhere and everywhere across the hybrid cloud.
“AI workloads need to run wherever customer data lives across the hybrid cloud; this makes flexible, standardized and open platforms and tools a necessity, as they enable organizations to select the environments, resources and architectures that best align with their unique operational and data needs,” says Matt Hicks, President and CEO, Red Hat.
While the promise of gen AI dominates much of today’s technology landscape, the large language models (LLMs) underpinning these systems continue to increase in size. As a result, building cost-efficient and reliable LLM services requires significant computing power, energy resources and specialized operational skills. These challenges effectively put the benefits of customized, deployment-ready and more security-conscious AI out of reach for most organizations.
Red Hat intends to address these challenges by making gen AI more accessible to more organizations through the open innovation of vLLM. Originally developed at UC Berkeley, vLLM is a community-driven open source project for open model serving (how gen AI models infer and solve problems), with support for all key model families, advanced inference acceleration research and diverse hardware backends, including AMD GPUs, AWS Neuron, Google TPUs, Intel Gaudi, NVIDIA GPUs and x86 CPUs. Neural Magic’s leadership in the vLLM project, combined with Red Hat’s strong portfolio of hybrid cloud AI technologies, will offer organizations an open pathway to building AI strategies that meet their unique needs, wherever their data lives.
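To make the serving model concrete: vLLM ships an OpenAI-compatible HTTP server that can be launched from the command line. A minimal sketch, assuming vLLM is installed on a machine with a supported accelerator; the Granite model name below is illustrative, not part of this announcement:

```shell
# Install vLLM and launch an OpenAI-compatible server (the model is
# downloaded from Hugging Face on first run; model name is illustrative).
pip install vllm
vllm serve ibm-granite/granite-3.0-8b-instruct

# From another terminal, send a completion request to the local endpoint:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ibm-granite/granite-3.0-8b-instruct", "prompt": "What is hybrid cloud?", "max_tokens": 32}'
```

Because the endpoint follows the OpenAI API shape, existing client libraries can be pointed at it by changing only the base URL.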
Red Hat + Neural Magic: Enabling a future of hybrid cloud-ready gen AI
Neural Magic spun out of MIT in 2018 with the goal of building performant inference software for deep learning. With Neural Magic’s technology and performance engineering expertise, Red Hat aims to accelerate its vision for AI’s future, powered by the Red Hat AI technology portfolio. Built to break through the challenges of wide-scale enterprise AI, the portfolio uses open source innovation to further democratize access to AI’s transformative power via:
- Open source-licensed models, from the 1B to 405B parameter scale, that can run anywhere and everywhere needed across the hybrid cloud – in corporate data centers, on multiple clouds and at the edge;
- Fine-tuning capabilities that enable organizations to more easily customize LLMs to their private data and use cases with a stronger security footprint;
- Inference performance engineering expertise, resulting in greater operational and infrastructure efficiencies; and
- A partner and open source ecosystem and support structures that enable broader customer choice, from LLMs and tooling to certified server hardware and underlying chip architectures.
vLLM leadership to enhance Red Hat AI
Neural Magic uses its expertise in vLLM to build an enterprise-grade inference stack that enables customers to optimize, deploy and scale LLM workloads across hybrid cloud environments with full control over infrastructure choice, security policies and model lifecycle. Neural Magic also conducts model optimization research, builds LLM Compressor (a unified library for optimizing LLMs with state-of-the-art sparsity and quantization algorithms) and maintains a repository of pre-optimized models ready to deploy with vLLM.
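As a toy illustration of the quantization idea behind tools like LLM Compressor (this is not Neural Magic’s actual implementation, and the helper names are hypothetical), symmetric int8 weight quantization stores each weight in one byte plus a shared scale, cutting memory roughly 4x versus 32-bit floats:

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus a scale.

    Toy sketch only; production libraries such as LLM Compressor use far
    more sophisticated, accuracy-aware algorithms.
    """
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # guard all-zero input
    return [max(-128, min(127, round(w / scale))) for w in weights], scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [q * scale for q in quantized]

weights = [0.42, -1.27, 0.05, 0.89]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each quantized weight fits in 1 byte instead of 4 (fp32), at the cost of
# a small rounding error in the restored values.
```

The payoff at inference time is smaller memory footprints and faster memory-bound compute, which is why pre-optimized models can be served more cheaply with vLLM.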
Red Hat AI aims to help customers lower AI’s costs and skill barriers with powerful technologies, including:
- Red Hat Enterprise Linux AI (RHEL AI), a foundation model platform to more seamlessly develop, test and run the IBM Granite family of open source LLMs for enterprise applications on Linux server deployments;
- Red Hat OpenShift AI, an AI platform that provides tools to rapidly develop, train, serve and monitor machine learning models across distributed Kubernetes environments on-site, in the public cloud or at the edge; and
- InstructLab, an approachable open source AI community project created by Red Hat and IBM that enables anyone to shape the future of gen AI via the collaborative improvement of open source-licensed Granite LLMs using InstructLab’s fine-tuning technology.
Neural Magic’s technology leadership in vLLM will enhance Red Hat AI’s ability to support LLM deployments anywhere and everywhere across the hybrid cloud with a ready-made, highly optimized and open inference stack.
The transaction is subject to applicable regulatory reviews and other customary closing conditions.