We will deep dive into the MCP architecture later in this post. For an MCP implementation, you need a scalable infrastructure to host these servers and an infrastructure to host the large language model (LLM), which will perform actions with the tools implemented by the MCP server.
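As a rough sketch of what one of these servers can look like, the following uses the official MCP Python SDK; the server name and the single add tool are illustrative assumptions, not details from the post.

```python
# Minimal MCP server sketch using the official Python SDK ("mcp" package).
# The server name and tool are hypothetical examples.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers and return the sum."""
    return a + b

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, so an LLM host can call the tool
```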
Recent research shows that 67% of enterprises are using generative AI to create new content and data based on learned patterns; 50% are using predictive AI, which employs machine learning (ML) algorithms to forecast future events; and 45% are using deep learning, a subset of ML that powers both generative and predictive models.
From data masking technologies that ensure unparalleled privacy to cloud-native innovations driving scalability, these trends highlight how enterprises can balance innovation with accountability. With machine learning, these processes can be refined over time and anomalies can be predicted before they arise.
Data architecture definition: Data architecture describes the structure of an organization's logical and physical data assets, and data management resources, according to The Open Group Architecture Framework (TOGAF). An organization's data architecture is the purview of data architects. Ensure security and access controls.
Organizations are increasingly using multiple large language models (LLMs) when building generative AI applications. Although an individual LLM can be highly capable, it might not optimally address a wide range of use cases or meet diverse performance requirements.
It also supports the newly announced Agent2Agent (A2A) protocol, which Google is positioning as an open, secure standard for agent-to-agent collaboration, driven by a large community of technology, platform, and service partners. Native Multi-Agent Architecture: Build scalable applications by composing specialized agents in a hierarchy.
Generative and agentic artificial intelligence (AI) are paving the way for this evolution. AI practitioners and industry leaders discussed these trends, shared best practices, and provided real-world use cases during EXL's recent virtual event, AI in Action: Driving the Shift to Scalable AI. The EXLerate.AI
LLMs, or large language models, are deep learning models trained on vast amounts of linguistic data so they can understand and respond in natural language (human-like text). The underlying transformer architecture comprises neural network layers arranged as an encoder and a decoder.
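As a toy illustration of that encoder-decoder structure, here is a sketch built on PyTorch's stock nn.Transformer; all dimensions are arbitrary assumptions.

```python
# Toy encoder-decoder transformer using PyTorch's built-in module.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # (source length, batch size, embedding dim)
tgt = torch.rand(20, 32, 512)  # (target length, batch size, embedding dim)

out = model(src, tgt)  # encoder reads src; decoder attends to it while generating tgt
print(out.shape)       # torch.Size([20, 32, 512])
```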
All industries and modern applications are undergoing rapid transformation powered by advances in accelerated computing, deep learning, and artificial intelligence. The next phase of this transformation requires an intelligent data infrastructure that can bring AI closer to enterprise data.
This is where the Delta Lakehouse architecture truly shines. Approach: Implementing lakehouse architecture is a three-phase journey, with each stage demanding dedicated focus and independent treatment. Step 2: Transformation (using ELT and the Medallion Architecture). Bronze layer: Keep it raw.
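As a hedged sketch of what "keep it raw" in the bronze layer can look like in practice, here is a minimal PySpark/Delta example; the S3 paths, column names, and dedup key are hypothetical.

```python
# Minimal Medallion-style ELT sketch: bronze stays raw, silver gets cleaned.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze layer: land source data exactly as received (no transformation).
raw = spark.read.json("s3://lake/raw/events/")          # hypothetical path
raw.write.format("delta").mode("append").save("s3://lake/bronze/events/")

# Silver layer: deduplicate and fix types on the way out of bronze.
silver = (spark.read.format("delta").load("s3://lake/bronze/events/")
          .dropDuplicates(["event_id"])                  # hypothetical key
          .withColumn("event_ts", F.to_timestamp("event_ts")))
silver.write.format("delta").mode("overwrite").save("s3://lake/silver/events/")
```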
With rapid progress in the fields of machine learning (ML) and artificial intelligence (AI), it is important to deploy AI/ML models efficiently in production environments. The downstream architecture ensures scalability, cost efficiency, and real-time access to applications.
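One common way to make such a model available to applications is behind a small HTTP service; the sketch below assumes a scikit-learn-style artifact saved as model.joblib, both of which are hypothetical.

```python
# Minimal model-serving sketch: a FastAPI endpoint wrapping a trained model.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # hypothetical pre-trained artifact

class Features(BaseModel):
    values: list[float]  # one row of input features

@app.post("/predict")
def predict(features: Features):
    # scikit-learn-style predict over a single example
    prediction = model.predict([features.values])[0]
    return {"prediction": float(prediction)}
```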
But the increased use of intelligent tools in recent years since the arrival of generative AI has begun to cement the CAIO role as a key tech executive position across a wide range of sectors. The role of artificial intelligence is very closely tied to generating efficiencies on an ongoing basis, as well as to driving continuous adoption.
Jenga builder: Enterprise architects piece together both reusable and replaceable components and solutions, enabling responsive (adaptable, resilient) architectures that accelerate time-to-market without disrupting other components or the architecture overall (e.g., compromising quality, structure, integrity, or goals).
Large language models (LLMs) have revolutionized the field of natural language processing (NLP), improving tasks such as language translation, text summarization, and sentiment analysis. Monitoring the performance and behavior of LLMs is a critical task for ensuring their safety and effectiveness.
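As one hedged illustration of what such monitoring can look like at the application level, the wrapper below records latency and response size around each model call; the fake_llm stub and metric fields are assumptions, not the post's design.

```python
# Sketch of lightweight LLM monitoring: time each call and record sizes.
import time

metrics = []  # in a real system this would go to a metrics backend

def monitored(llm_call):
    def wrapper(prompt: str) -> str:
        start = time.perf_counter()
        output = llm_call(prompt)
        metrics.append({
            "latency_s": round(time.perf_counter() - start, 4),
            "prompt_chars": len(prompt),
            "output_chars": len(output),
        })
        return output
    return wrapper

@monitored
def fake_llm(prompt: str) -> str:
    return "stub response"  # placeholder for a real model invocation

fake_llm("Summarize this paragraph.")
print(metrics)
```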
Organizations can use these models securely, and for models that are compatible with the Amazon Bedrock Converse API, you can use the robust toolkit of Amazon Bedrock, including Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and Amazon Bedrock Flows.
The myriad potential of GenAI enables enterprises to simplify coding and facilitate more intelligent and automated system operations. By leveraging large language models and platforms like Azure OpenAI, for example, organisations can transform outdated code into modern, customised frameworks that support advanced features.
In this post, we explore the new Container Caching feature for SageMaker inference, which addresses the challenges of deploying and scaling large language models (LLMs). You'll learn about the key benefits of Container Caching, including faster scaling, improved resource utilization, and potential cost savings.
National Laboratory has implemented an AI-driven document processing platform that integrates named entity recognition (NER) and large language models (LLMs) on Amazon SageMaker AI. In this post, we discuss how you can build an AI-powered document processing platform with open source NER and LLMs on SageMaker.
In this blog post, we discuss how Prompt Optimization improves the performance of large language models (LLMs) for intelligent text processing tasks at Yuewen Group. Evolution from Traditional NLP to LLMs in Intelligent Text Processing: Yuewen Group leverages AI for intelligent analysis of extensive web novel texts.
Although batch inference offers numerous benefits, it's limited to 10 batch inference jobs submitted per model per Region. To address this limit and enhance your use of batch inference, we've developed a scalable solution using AWS Lambda and Amazon DynamoDB. This automatically deletes the deployed stack.
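A hedged sketch of that queueing pattern follows: a Lambda handler keeps requested jobs in a DynamoDB table and submits more only while fewer than 10 are in flight. The table name, item fields, and job parameters are illustrative assumptions, not the post's exact design.

```python
# Sketch: throttle Bedrock batch inference submissions via a DynamoDB queue.
import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource("dynamodb")
bedrock = boto3.client("bedrock")
table = dynamodb.Table("batch-inference-queue")  # hypothetical table

MAX_ACTIVE_JOBS = 10  # the per-model, per-Region quota mentioned above

def handler(event, context):
    active = table.scan(FilterExpression=Attr("status").eq("InProgress"))["Items"]
    pending = table.scan(FilterExpression=Attr("status").eq("Pending"))["Items"]
    for job in pending[: MAX_ACTIVE_JOBS - len(active)]:
        resp = bedrock.create_model_invocation_job(
            jobName=job["jobName"],
            modelId=job["modelId"],
            roleArn=job["roleArn"],
            inputDataConfig=job["inputDataConfig"],
            outputDataConfig=job["outputDataConfig"],
        )
        # "status" is a DynamoDB reserved word, hence the name placeholder.
        table.update_item(
            Key={"jobName": job["jobName"]},
            UpdateExpression="SET #s = :s, jobArn = :a",
            ExpressionAttributeNames={"#s": "status"},
            ExpressionAttributeValues={":s": "InProgress", ":a": resp["jobArn"]},
        )
```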
Digital tools are the lifeblood of today's enterprises, but the complexity of hybrid cloud architectures, involving thousands of containers, microservices, and applications, frustrates operational leaders trying to optimize business outcomes. Artificial intelligence has contributed to the complexity.
Add to this the escalating costs of maintaining legacy systems, which often act as bottlenecks for scalability. The latter option had emerged as a compelling solution, offering the promise of enhanced agility, reduced operational costs, and seamless scalability. Scalability. Architecture complexity. Legacy infrastructure.
Traditionally, building frontend and backend applications has required knowledge of web development frameworks and infrastructure management, which can be daunting for those with expertise primarily in data science and machine learning. The Streamlit application will now display a button labeled Get LLM Response.
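Here is a minimal sketch of the button the post refers to, with a hypothetical call_llm stand-in for whatever backend invocation the full walkthrough wires up:

```python
# Minimal Streamlit sketch: a text box plus a "Get LLM Response" button.
import streamlit as st

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: swap in a real model invocation here.
    return f"(model response to: {prompt})"

prompt = st.text_area("Enter your prompt")
if st.button("Get LLM Response"):
    st.write(call_llm(prompt))
```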
This innovative service goes beyond traditional trip planning methods, offering real-time interaction through a chat-based interface and maintaining scalability, reliability, and data security through AWS native services. Architecture: The following figure shows the architecture of the solution. Here is an example from LangChain.
Traditional neural network models like RNNs and LSTMs, and more modern transformer-based models like BERT, require costly fine-tuning on labeled data for every custom entity type used in NER. By using an LLM's broad linguistic understanding, you can instead perform NER on the fly for any specified entity type.
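As a hedged sketch of that on-the-fly approach, the example below asks a model for entities of an arbitrary type through the Amazon Bedrock Converse API; the model ID and prompt wording are assumptions.

```python
# Sketch: prompt-based NER for any entity type, no fine-tuning required.
import boto3

client = boto3.client("bedrock-runtime")

def extract_entities(text: str, entity_type: str) -> str:
    prompt = (f"List every {entity_type} mentioned in the text below, "
              f"one per line, with no extra commentary.\n\nText: {text}")
    response = client.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]

print(extract_entities("Acme Corp opened an office in Berlin.", "organization"))
```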
DeepSeek-R1, developed by AI startup DeepSeek AI, is an advanced large language model (LLM) distinguished by its innovative, multi-stage training process. Instead of relying solely on traditional pre-training and fine-tuning, DeepSeek-R1 integrates reinforcement learning to achieve more refined outputs.
It's AI that's not just scalable but, because it's in the platform, also secure, governed, and enterprise-trusted. Through the control tower, customers can govern and secure AI agents, models, and workflows from a single pane of glass. ServiceNow said it expects the new model to be available in Q2 this year.
Out-of-the-box models often lack the specific knowledge required for certain domains or organizational terminologies. To address this, businesses are turning to custom fine-tuned models, also known as domain-specific large language models (LLMs). The following diagram is the solution architecture.
CIOs who bring real credibility to the conversation understand that AI is an output of well-architected, well-managed, scalable data platforms, an operating model, and a governance model. CIOs have shared that in every meeting, people are enamored with AI and gen AI.
To achieve these goals, the AWS Well-Architected Framework provides comprehensive guidance for building and improving cloud architectures. The solution incorporates the following key features: using a Retrieval Augmented Generation (RAG) architecture, the system generates context-aware, detailed assessments.
The introduction of Amazon Nova models represents a significant advancement in the field of AI, offering new opportunities for large language model (LLM) optimization. In this post, we demonstrate how to effectively perform model customization and RAG with Amazon Nova models as a baseline.
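To make the RAG baseline concrete, here is a deliberately tiny, self-contained sketch of the pattern: retrieve the most relevant snippet, then prepend it to the prompt. The word-overlap scoring is a naive stand-in for real embeddings, and the assembled prompt would be sent to a model such as Amazon Nova (an assumption, not the post's code).

```python
# Minimal RAG sketch: naive retrieval plus prompt assembly.
documents = [
    "Amazon Nova models are accessed through Amazon Bedrock.",
    "The bronze layer of a lakehouse stores raw, unprocessed data.",
]

def retrieve(question: str) -> str:
    # Naive relevance score: shared lowercase words (real systems use embeddings).
    words = set(question.lower().split())
    return max(documents, key=lambda d: len(words & set(d.lower().split())))

def build_prompt(question: str) -> str:
    return f"Context: {retrieve(question)}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("How do I access Amazon Nova models?"))
```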
With advancements in AI technology, the time is right to address such complexities with large language models (LLMs). Amazon Bedrock has helped democratize access to LLMs, which have been challenging to host and manage. The following diagram illustrates the architecture using AWS services.
As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics. The following were some initial challenges in automation: Language diversity – The services host both Dutch and English shows.
At Dataiku Everyday AI events in Dallas, Toronto, London, Berlin, and Dubai this past fall, we talked about an architecture paradigm for LLM-powered applications: an LLM Mesh. What actually is an LLM Mesh? How does it help organizations scale up the development and delivery of LLM-powered applications?
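As one hedged interpretation of the idea, an LLM Mesh puts a single interface in front of many model backends so applications never hard-code a provider; the classes below are illustrative assumptions, not Dataiku's implementation.

```python
# Sketch of an LLM Mesh-style abstraction: one gateway, many backends.
from abc import ABC, abstractmethod

class LLMBackend(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OpenAIBackend(LLMBackend):
    def complete(self, prompt: str) -> str:
        return "(OpenAI completion)"   # placeholder for a real API call

class BedrockBackend(LLMBackend):
    def complete(self, prompt: str) -> str:
        return "(Bedrock completion)"  # placeholder for a real API call

class LLMMesh:
    """Central gateway: a natural place for routing, logging, and cost control."""
    def __init__(self, backends: dict, default: str):
        self.backends, self.default = backends, default

    def complete(self, prompt: str, model: str = None) -> str:
        backend = self.backends.get(model or self.default)
        return backend.complete(prompt)

mesh = LLMMesh({"openai": OpenAIBackend(), "bedrock": BedrockBackend()},
               default="bedrock")
print(mesh.complete("Summarize this contract."))
```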
At an MIT event moderated by Lan Guan, CAIO at Accenture: "98% of business leaders say they want to adopt AI, right, but a lot of them just don't know how to do it," claimed Guan, who is currently working with a large airline in Saudi Arabia, a large pharmaceutical company, and a high-tech company to implement generative AI blueprints in-house.
This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call. Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.
For instance, an e-commerce platform leveraging artificial intelligence and data analytics to tailor customer recommendations enhances user experience and revenue generation. These metrics might include operational cost savings, improved system reliability, or enhanced scalability.
Advancements in multimodal artificial intelligence (AI), where agents can understand and generate not just text but also images, audio, and video, will further broaden their applications. This post discusses agentic AI-driven architecture and ways of implementing it.
Have you ever imagined how artificial intelligence has changed our lives and the way businesses function? The rise of AI models, such as foundation models and LLMs, which offer massive automation and creativity, has made this possible. What are LLMs? Ultimately, they increase performance and versatility.
Amazon Web Services (AWS) is committed to supporting the development of cutting-edge generative artificial intelligence (AI) technologies by companies and organizations across the globe. This led them to adopt a curriculum learning approach that gradually introduced increasingly complex data to their model.
AI and machine learning are poised to drive innovation across multiple sectors, particularly government, healthcare, and finance. AI and machine learning evolution: Lalchandani anticipates a significant evolution in AI and machine learning by 2025, with these technologies becoming increasingly embedded across various sectors.
By leveraging genAI assistants and large language models, AI search can interpret a user request and deliver results in a business context. Look for an open ecosystem that integrates with all the major AI foundation models and supports your own models so existing investments aren't wasted.
Booking.com, one of the world's leading digital travel services, is using AWS to power emerging generative AI technology at scale, creating personalized customer experiences while achieving greater scalability and efficiency in its operations. "One of the things we really like about AWS's approach to generative AI is choice."
Are you using artificial intelligence (AI) to do the same things you've always done, just more efficiently? If so, you're only scratching the surface. EXL executives and AI practitioners discussed the technology's full potential during the company's recent virtual event, AI in Action: Driving the Shift to Scalable AI. The EXLerate.AI