Nvidia Introduces New Microservices for Enhanced Inferencing Performance

– Nvidia announced NIM, a software platform for deploying custom and pre-trained AI models in production
– NIM aims to create an ecosystem of AI-ready containers for companies looking to speed up their AI roadmaps
– NIM includes support for models from various companies and will be integrated into frameworks like Deepset, LangChain, and LlamaIndex

Nvidia announced the release of Nvidia NIM, a software platform designed to simplify the deployment of custom and pre-trained AI models in production environments. NIM packages models together with optimized inferencing engines into containers and exposes them as microservices. The aim is to help companies accelerate their AI roadmaps by offering curated microservices built on Nvidia hardware as the foundational layer.
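To illustrate the microservice pattern, a deployed NIM container typically exposes a plain HTTP inference endpoint that clients call without touching the model or engine inside. The sketch below is a minimal example, assuming a hypothetical container already running locally on port 8000 with an OpenAI-style chat-completions route and a placeholder model name; the actual endpoint paths and request schema depend on the specific NIM.

```python
import json
import urllib.request

# Hypothetical local NIM deployment; the host, port, route, and model
# name are assumptions for illustration, not confirmed by the announcement.
NIM_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "example-llm",  # placeholder model identifier
    "messages": [{"role": "user", "content": "Summarize what a microservice is."}],
    "max_tokens": 128,
}

request = urllib.request.Request(
    NIM_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# The container handles model loading and optimized inference internally;
# the client only sees a plain HTTP API.
with urllib.request.urlopen(request) as response:
    result = json.load(response)
    print(result["choices"][0]["message"]["content"])
```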

Currently, NIM supports models from various companies and open-source sources, including Google, Microsoft, and Shutterstock. Nvidia is collaborating with Amazon, Google, and Microsoft to make these microservices available on Amazon SageMaker, Google Kubernetes Engine, and Microsoft Azure AI. They will also be integrated into frameworks such as Deepset, LangChain, and LlamaIndex for easier accessibility.
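The point of the framework integrations is that a NIM-hosted model can be swapped into an existing pipeline as a drop-in chat model. The following sketch assumes the langchain-nvidia-ai-endpoints connector package, an NVIDIA_API_KEY environment variable for the hosted endpoint, and a placeholder model name; the exact package and model identifiers should be checked against the framework's documentation.

```python
# Sketch of consuming a NIM-hosted model from LangChain.
# Assumes: pip install langchain-nvidia-ai-endpoints
# and an NVIDIA_API_KEY environment variable; the model
# name below is a placeholder.
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="example-chat-model")

# The object behaves like any other LangChain chat model,
# so it can be dropped into existing chains unchanged.
response = llm.invoke("What does an inference microservice do?")
print(response.content)
```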

The inference engines used by Nvidia for these microservices include the Triton Inference Server, TensorRT, and TensorRT-LLM. Nvidia microservices available through NIM include Riva for speech and translation models, cuOpt for routing optimization, and the Earth-2 model for weather and climate simulations. The company plans to add more capabilities over time, such as making the Nvidia RAG LLM operator available as a NIM to simplify building AI chatbots; the general pattern is sketched below.
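To make the retrieval-augmented generation (RAG) chatbot idea concrete, the sketch below shows the general pattern in plain Python: embed documents, retrieve the closest match to the query by cosine similarity, and prepend it to the prompt. The embed() and generate() functions here are toy stand-ins so the example runs on its own; in a NIM-based deployment they would call embedding and LLM microservices. This illustrates the pattern only and is not a description of Nvidia's RAG LLM operator itself.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" standing in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

documents = [
    "NIM packages models and inference engines into containers.",
    "Riva provides speech and translation models.",
    "cuOpt handles routing optimization.",
]

def retrieve(query: str) -> str:
    # Pick the document most similar to the query.
    q = embed(query)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def generate(prompt: str) -> str:
    # Stand-in for an LLM call; a real chatbot would send this
    # prompt to an inference endpoint.
    return f"[LLM response to prompt: {prompt!r}]"

query = "What does Riva do?"
context = retrieve(query)
print(generate(f"Context: {context}\nQuestion: {query}"))
```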

Nvidia also highlighted some current users of NIM, including Box, Cloudera, Cohesity, Datastax, Dropbox, and NetApp. Jensen Huang, founder and CEO of Nvidia, emphasized that established enterprises hold valuable data which, with NIM, they can turn into AI microservices, transforming themselves into AI companies.
