Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference
AWS Machine Learning - AI
DECEMBER 2, 2024
Finally, we delve into the supported frameworks, with a focus on LMI, PyTorch, Hugging Face TGI, and NVIDIA Triton, and conclude by discussing how this feature fits into our broader efforts to enhance machine learning (ML) workloads on AWS.
Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and SageMaker Inference.