Artificial Intelligence and Scalability

Extend large language models powered by Amazon SageMaker AI using Model Context Protocol

AWS Machine Learning - AI

MAY 1, 2025

Implementing MCP requires scalable infrastructure to host the MCP servers, plus infrastructure to host the large language model (LLM) that performs actions using the tools those servers implement. We dive deep into the MCP architecture later in this post.

Multi-LLM routing strategies for generative AI applications on AWS

Revolutionizing data management: Trends driving security, scalability, and governance in 2025

AI in action: How enterprises are scaling AI for real business impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

EXL’s Insurance LLM transforms claims and underwriting

AI in action: Stories of how enterprises are transforming and modernizing

AI dominates Gartner’s 2025 predictions

The key to operational AI: Modern data architecture

MLOps 101: The Foundation for Your AI Strategy

The AI Future According to Google Cloud Next ’25: My Interesting Finds

Top 11 LLM Tools That Ensure Smooth LLM Operations

IT leaders see big business potential in small AI models

Building a Scalable ML Pipeline and API in AWS

Comprehensive data management for AI: The next-gen data management engine that will drive AI to new heights

Faster, Better, Cheaper: How to Measure the Business Impact of LLMs

CAIOs are stepping out from the CIO’s shadow

Amazon Bedrock Marketplace now includes NVIDIA models: Introducing NVIDIA Nemotron-4 NIM microservices

From legacy to lakehouse: Centralizing insurance data with Delta Lake

Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Gartner projects major IT spending increases for 2025

John Snow Labs Releases Generative AI Lab 7.0 to Help Domain Experts Evaluate and Improve LLM Applications and Conduct HCC Coding Reviews

Amazon Bedrock Prompt Optimization Drives LLM Applications Innovation for Yuewen Group

Model customization, RAG, or both: A case study with Amazon Nova

Serving LLMs using vLLM and Amazon EC2 instances with AWS AI chips

9 IT skills where expertise pays the most

The Power of Small LLMs in Healthcare: A RAG Framework Alternative to Large Language Models

Scaling AI talent: An AI apprenticeship model that works

EBSCOlearning scales assessment generation for their online learning content with generative AI

Revolutionize trip planning with Amazon Bedrock and Amazon Location Service

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

Reimagine application modernisation with the power of generative AI

AI brings order to observability disorder

Automate Amazon Bedrock batch inference: Building a scalable and efficient pipeline

Build and deploy a UI for your generative AI applications with AWS and Python

AI market evolution: Data and infrastructure transformation through AI

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

Eye On AI: As Big Money Rolls Into Data Centers, Startup Investment Gains

How DPG Media uses Amazon Bedrock and Amazon Transcribe to enhance video metadata with AI-powered pipelines

FloQast builds an AI-powered accounting transformation solution with Anthropic’s Claude 3 on Amazon Bedrock

CIO hiring on the rise: How to land a top tech exec role in 2025

Host concurrent LLMs with LoRAX

Dubai and the UAE partner with Google to reshape the digital future
