Architecture, Artificial Inteligence and AWS

Multi-LLM routing strategies for generative AI applications on AWS

AWS Machine Learning - AI

APRIL 9, 2025

Organizations are increasingly using multiple large language models (LLMs) when building generative AI applications. Although an individual LLM can be highly capable, it might not optimally address a wide range of use cases or meet diverse performance requirements.

Artificial Inteligence

Artificial Inteligence Generative AI AWS Applications

Build and deploy a UI for your generative AI applications with AWS and Python

AWS Machine Learning - AI

NOVEMBER 6, 2024

Traditionally, building frontend and backend applications has required knowledge of web development frameworks and infrastructure management, which can be daunting for those with expertise primarily in data science and machine learning. Access to Amazon Bedrock foundation models is not granted by default.

Generative AI

Generative AI AWS Artificial Inteligence Applications

AI in action: Stories of how enterprises are transforming and modernizing

CIO

MARCH 20, 2025

Generative and agentic artificial intelligence (AI) are paving the way for this evolution. Built on top of EXLerate.AI, EXLs AI orchestration platform, and Amazon Web Services (AWS), Code Harbor eliminates redundant code and optimizes performance, reducing manual assessment, conversion and testing effort by 60% to 80%.

Artificial Inteligence

Artificial Inteligence Enterprise Insurance Artificial Intelligence

Webinars

Going Beyond Chatbots: Connecting AI to Your Tools, Systems, & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

MORE WEBINARS

Introducing AWS MCP Servers for code assistants (Part 1)

AWS Machine Learning - AI

APRIL 1, 2025

Were excited to announce the open source release of AWS MCP Servers for code assistants a suite of specialized Model Context Protocol (MCP) servers that bring Amazon Web Services (AWS) best practices directly to your development workflow. This post is the first in a series covering AWS MCP Servers.

AWS

AWS Software Review Knowledge Base Artificial Inteligence

Building a Scalable ML Pipeline and API in AWS

Dzone - DevOps

MARCH 28, 2025

With rapid progress in the fields of machine learning (ML) and artificial intelligence (AI), it is important to deploy the AI/ML model efficiently in production environments. The architecture downstream ensures scalability, cost efficiency, and real-time access to applications.

Scalability

Scalability Artificial Inteligence AWS Artificial Intelligence

Accelerate AWS Well-Architected reviews with Generative AI

AWS Machine Learning - AI

MARCH 4, 2025

To achieve these goals, the AWS Well-Architected Framework provides comprehensive guidance for building and improving cloud architectures. This allows teams to focus more on implementing improvements and optimizing AWS infrastructure. This systematic approach leads to more reliable and standardized evaluations.

Generative AI

Generative AI Technical Review Software Review Systems Review

United Airlines sets its flight plan for gen AI success

CIO

DECEMBER 20, 2024

With the core architectural backbone of the airlines gen AI roadmap in place, including United Data Hub and an AI and ML platform dubbed Mars, Birnbaum has released a handful of models into production use for employees and customers alike.

Airlines

Airlines Generative AI Artificial Inteligence Weak Development Team

Techniques and approaches for monitoring large language models on AWS

AWS Machine Learning - AI

FEBRUARY 26, 2024

Large Language Models (LLMs) have revolutionized the field of natural language processing (NLP), improving tasks such as language translation, text summarization, and sentiment analysis. Monitoring the performance and behavior of LLMs is a critical task for ensuring their safety and effectiveness.

Artificial Inteligence

Artificial Inteligence AWS Lambda Metrics

Build a multi-tenant generative AI environment for your enterprise on AWS

AWS Machine Learning - AI

NOVEMBER 7, 2024

It also uses a number of other AWS services such as Amazon API Gateway , AWS Lambda , and Amazon SageMaker. You can use AWS services such as Application Load Balancer to implement this approach. It consists of one or more components depending on the number of FM providers and number and types of custom models used.

Generative AI

Generative AI AWS Enterprise Artificial Inteligence

Using Amazon Q Business with AWS HealthScribe to gain insights from patient consultations

AWS Machine Learning - AI

OCTOBER 17, 2024

With the advent of generative AI and machine learning, new opportunities for enhancement became available for different industries and processes. AWS HealthScribe combines speech recognition and generative AI trained specifically for healthcare documentation to accelerate clinical documentation and enhance the consultation experience.

AWS

AWS Artificial Inteligence Generative AI Machine Learning

Integrate foundation models into your code with Amazon Bedrock

AWS Machine Learning - AI

NOVEMBER 6, 2024

The rise of large language models (LLMs) and foundation models (FMs) has revolutionized the field of natural language processing (NLP) and artificial intelligence (AI). Development environment – Set up an integrated development environment (IDE) with your preferred coding language and tools.

Software Review

Software Review Artificial Inteligence Generative AI AWS

Model customization, RAG, or both: A case study with Amazon Nova

AWS Machine Learning - AI

APRIL 10, 2025

The introduction of Amazon Nova models represent a significant advancement in the field of AI, offering new opportunities for large language model (LLM) optimization. In this post, we demonstrate how to effectively perform model customization and RAG with Amazon Nova models as a baseline.

Case Study

Case Study Artificial Inteligence Study Generative AI

PEFT fine tuning of Llama 3 on SageMaker HyperPod with AWS Trainium

AWS Machine Learning - AI

DECEMBER 24, 2024

Training large language models (LLMs) models has become a significant expense for businesses. For many use cases, companies are looking to use LLM foundation models (FM) with their domain-specific data. To learn more about Trainium chips and the Neuron SDK, see Welcome to AWS Neuron.

AWS

AWS Artificial Inteligence Generative AI Training

Revolutionize trip planning with Amazon Bedrock and Amazon Location Service

AWS Machine Learning - AI

NOVEMBER 14, 2024

This is where AWS and generative AI can revolutionize the way we plan and prepare for our next adventure. With the significant developments in the field of generative AI , intelligent applications powered by foundation models (FMs) can help users map out an itinerary through an intuitive natural conversation interface.

Artificial Inteligence

Artificial Inteligence Generative AI Travel AWS

Amazon Bedrock Marketplace now includes NVIDIA models: Introducing NVIDIA Nemotron-4 NIM microservices

AWS Machine Learning - AI

DECEMBER 4, 2024

At AWS re:Invent 2024, we are excited to introduce Amazon Bedrock Marketplace. This a revolutionary new capability within Amazon Bedrock that serves as a centralized hub for discovering, testing, and implementing foundation models (FMs). Prior to joining AWS, Dr. Li held data science roles in the financial and retail industries.

Artificial Inteligence

Artificial Inteligence Microservices Generative AI AWS

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

AWS Machine Learning - AI

APRIL 23, 2025

National Laboratory has implemented an AI-driven document processing platform that integrates named entity recognition (NER) and large language models (LLMs) on Amazon SageMaker AI. In this post, we discuss how you can build an AI-powered document processing platform with open source NER and LLMs on SageMaker.

Artificial Inteligence

Artificial Inteligence Open Source AWS Serverless

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning - AI

NOVEMBER 15, 2024

Principal wanted to use existing internal FAQs, documentation, and unstructured data and build an intelligent chatbot that could provide quick access to the right information for different roles. Principal also used the AWS open source repository Lex Web UI to build a frontend chat interface with Principal branding.

Generative AI

Generative AI AWS Groups Artificial Inteligence

Harness the power of MCP servers with Amazon Bedrock Agents

AWS Machine Learning - AI

APRIL 1, 2025

AI agents extend large language models (LLMs) by interacting with external systems, executing complex workflows, and maintaining contextual awareness across operations. This gives you an AI agent that can transform the way you manage your AWS spend.

Generative AI

Generative AI AWS Artificial Inteligence Software Review

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

AWS Machine Learning - AI

DECEMBER 2, 2024

Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. It supports a wide range of popular open source LLMs, making it a popular choice for diverse AI applications.

Generative AI

Generative AI Artificial Inteligence Machine Learning AWS

Streamline RAG applications with intelligent metadata filtering using Amazon Bedrock

AWS Machine Learning - AI

NOVEMBER 20, 2024

The effectiveness of RAG heavily depends on the quality of context provided to the large language model (LLM), which is typically retrieved from vector stores based on user queries. The relevance of this context directly impacts the model’s ability to generate accurate and contextually appropriate responses.

Artificial Inteligence

Artificial Inteligence Applications Knowledge Base Generative AI

How DPG Media uses Amazon Bedrock and Amazon Transcribe to enhance video metadata with AI-powered pipelines

AWS Machine Learning - AI

OCTOBER 16, 2024

The following were some initial challenges in automation: Language diversity – The services host both Dutch and English shows. Some local shows feature Flemish dialects, which can be difficult for some large language models (LLMs) to understand. The secondary LLM is used to evaluate the summaries on a large scale.

Media

Media Video Artificial Inteligence Generative AI

Enable Amazon Bedrock cross-Region inference in multi-account environments

AWS Machine Learning - AI

MARCH 27, 2025

Amazon Bedrock cross-Region inference capability that provides organizations with flexibility to access foundation models (FMs) across AWS Regions while maintaining optimal performance and availability. We provide practical examples for both SCP modifications and AWS Control Tower implementations.

Artificial Inteligence

Artificial Inteligence AWS Technical Review Policies

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning - AI

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases: AWS-specific knowledge search With Amazon Q Business, weve made internal data sources as well as public AWS content available in Field Advisors index.

AWS

AWS Generative AI Technical Review Artificial Inteligence

Unlocking Japanese LLMs with AWS Trainium: Innovators Showcase from the AWS LLM Development Support Program

AWS Machine Learning - AI

JULY 31, 2024

Amazon Web Services (AWS) is committed to supporting the development of cutting-edge generative artificial intelligence (AI) technologies by companies and organizations across the globe. Let’s dive in and explore how these organizations are transforming what’s possible with generative AI on AWS.

Artificial Inteligence

Artificial Inteligence AWS Programming Innovation

Transcribe, translate, and summarize live streams in your browser with AWS AI and generative AI services

AWS Machine Learning - AI

NOVEMBER 13, 2024

However, as the reach of live streams expands globally, language barriers and accessibility challenges have emerged, limiting the ability of viewers to fully comprehend and participate in these immersive experiences. The following diagram illustrates the architecture of the application.

Generative AI

Generative AI AWS Lambda Authentication

Create a generative AI–powered custom Google Chat application using Amazon Bedrock

AWS Machine Learning - AI

OCTOBER 31, 2024

AWS offers powerful generative AI services , including Amazon Bedrock , which allows organizations to create tailored use cases such as AI chat-based assistants that give answers based on knowledge contained in the customers’ documents, and much more. The following figure illustrates the high-level design of the solution.

Generative AI

Generative AI Lambda Applications AWS

Use zero-shot large language models on Amazon Bedrock for custom named entity recognition

AWS Machine Learning - AI

JUNE 18, 2024

Some examples include extracting players and positions in an NFL game summary, products mentioned in an AWS keynote transcript, or key names from an article on a favorite tech company. This process must be repeated for every new document and entity type, making it impractical for processing large volumes of documents at scale.

Artificial Inteligence

Artificial Inteligence Generative AI AWS Machine Learning

Host concurrent LLMs with LoRAX

AWS Machine Learning - AI

APRIL 16, 2025

Out-of-the-box models often lack the specific knowledge required for certain domains or organizational terminologies. To address this, businesses are turning to custom fine-tuned models, also known as domain-specific large language models (LLMs). Why LoRAX for LoRA deployment on AWS?

Artificial Inteligence

Artificial Inteligence Generative AI AWS Storage

Efficiently train models with large sequence lengths using Amazon SageMaker model parallel

AWS Machine Learning - AI

NOVEMBER 27, 2024

Large language models (LLMs) have witnessed an unprecedented surge in popularity, with customers increasingly using publicly available models such as Llama, Stable Diffusion, and Mistral. With activations being partitioned along the sequence dimension, we need to consider how our model’s computations are affected.

Training

Training Artificial Inteligence AWS Machine Learning

Amazon Bedrock Prompt Optimization Drives LLM Applications Innovation for Yuewen Group

AWS Machine Learning - AI

APRIL 21, 2025

In this blog post, we discuss how Prompt Optimization improves the performance of large language models (LLMs) for intelligent text processing task in Yuewen Group. Evolution from Traditional NLP to LLM in Intelligent Text Processing Yuewen Group leverages AI for intelligent analysis of extensive web novel texts.

Artificial Inteligence

Artificial Inteligence Groups Applications Innovation

Reduce ML training costs with Amazon SageMaker HyperPod

AWS Machine Learning - AI

APRIL 10, 2025

The failed instance also needs to be isolated and terminated manually, either through the AWS Management Console , AWS Command Line Interface (AWS CLI), or tools like kubectl or eksctl. Frontier model builders can further enhance model performance using built-in ML tools within SageMaker HyperPod.

Training

Training Artificial Inteligence Hardware Systems Review

Revolutionizing clinical trials with the power of voice and AI

AWS Machine Learning - AI

MARCH 18, 2025

This is where the integration of cutting-edge technologies, such as audio-to-text translation and large language models (LLMs), holds the potential to revolutionize the way patients receive, process, and act on vital medical information. These insights can include: Potential adverse event detection and reporting.

Artificial Inteligence

Artificial Inteligence Technical Review Healthcare Systems Review

Build a video insights and summarization engine using generative AI with Amazon Bedrock

AWS Machine Learning - AI

OCTOBER 29, 2024

This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call. This post provides guidance on how you can create a video insights and summarization engine using AWS AI/ML services.

Generative AI

Generative AI Video Engineering Artificial Inteligence

Stability AI backs effort to bring machine learning to biomed

TechCrunch

NOVEMBER 4, 2022

Called OpenBioML , the endeavor’s first projects will focus on machine learning-based approaches to DNA sequencing, protein folding and computational biochemistry. Stability AI’s ethically questionable decisions to date aside, machine learning in medicine is a minefield. Predicting protein structures.

Artificial Inteligence

Artificial Inteligence Machine Learning Biotech Training

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

AWS Machine Learning - AI

MARCH 13, 2025

DeepSeek-R1 , developed by AI startup DeepSeek AI , is an advanced large language model (LLM) distinguished by its innovative, multi-stage training process. Instead of relying solely on traditional pre-training and fine-tuning, DeepSeek-R1 integrates reinforcement learning to achieve more refined outputs.

Artificial Inteligence

Artificial Inteligence AWS Machine Learning Load Balancer

The future of data: A 5-pillar approach to modern data management

CIO

DECEMBER 11, 2024

Digital transformation started creating a digital presence of everything we do in our lives, and artificial intelligence (AI) and machine learning (ML) advancements in the past decade dramatically altered the data landscape. The choice of vendors should align with the broader cloud or on-premises strategy.

Data

Data Technical Review Software Review Weak Development Team

Track, allocate, and manage your generative AI cost and usage with Amazon Bedrock

AWS Machine Learning - AI

NOVEMBER 1, 2024

To address these challenges, Amazon Bedrock has launched a capability that organization can use to tag on-demand models and monitor associated costs. Organizations can now label all Amazon Bedrock models with AWS cost allocation tags , aligning usage to specific organizational taxonomies such as cost centers, business units, and applications.

Generative AI

Generative AI AWS Artificial Inteligence Budget

Deploy DeepSeek-R1 Distilled Llama models in Amazon Bedrock

AWS Machine Learning - AI

JANUARY 29, 2025

Their DeepSeek-R1 models represent a family of large language models (LLMs) designed to handle a wide range of tasks, from code generation to general reasoning, while maintaining competitive performance and efficiency. The resulting distilled models, such as DeepSeek-R1-Distill-Llama-8B (from base model Llama-3.1-8B

Generative AI

Generative AI Artificial Inteligence AWS Technical Review

Creating asynchronous AI agents with Amazon Bedrock

AWS Machine Learning - AI

MARCH 13, 2025

Advancements in multimodal artificial intelligence (AI), where agents can understand and generate not just text but also images, audio, and video, will further broaden their applications. This post will discuss agentic AI driven architecture and ways of implementing.

Artificial Inteligence

Artificial Inteligence Lambda Travel Generative AI

Can serverless fix fintech’s scaling problem?

CIO

FEBRUARY 11, 2025

The built-in elasticity in serverless computing architecture makes it particularly appealing for unpredictable workloads and amplifies developers productivity by letting developers focus on writing code and optimizing application design industry benchmarks , providing additional justification for this hypothesis. Architecture complexity.

Serverless

Serverless Architecture Microservices Scalability

Amazon Bedrock Flows is now generally available with enhanced safety and traceability

AWS Machine Learning - AI

NOVEMBER 22, 2024

Seamless integration of latest foundation models (FMs), Prompts, Agents, Knowledge Bases, Guardrails, and other AWS services. Configure any auxiliary AWS services needed for your customer service workflow (for example, Amazon DynamoDB for order history). Flexibility to define the workflow based on your business logic.

Generative AI

Generative AI Artificial Inteligence Knowledge Base AWS

Build a reverse image search engine with Amazon Titan Multimodal Embeddings in Amazon Bedrock and AWS managed services

AWS Machine Learning - AI

NOVEMBER 13, 2024

The Amazon Bedrock single API access, regardless of the models you choose, gives you the flexibility to use different FMs and upgrade to the latest model versions with minimal code changes. Amazon Titan FMs provide customers with a breadth of high-performing image, multimodal, and text model choices, through a fully managed API.

AWS

AWS Engineering Serverless eCommerce

Google’s AI innovations at Cloud Next 2025: What CIOs need to know

CIO

APRIL 15, 2025

During his one hour forty minute-keynote, Thomas Kurian, CEO of Google Cloud showcased updates around most of the companys offerings, including new large language models (LLMs) , a new AI accelerator chip, new open source frameworks around agents, and updates to its data analytics, databases, and productivity tools and services among others.

Cloud

Cloud Innovation Artificial Inteligence Google Cloud

Reduce conversational AI response time through inference at the edge with AWS Local Zones

AWS Machine Learning - AI

MARCH 3, 2025

Hybrid architecture with AWS Local Zones To minimize the impact of network latency on TTFT for users regardless of their locations, a hybrid architecture can be implemented by extending AWS services from commercial Regions to edge locations closer to end users. Next, create a subnet inside each Local Zone.

AWS

AWS Artificial Inteligence Technical Review Systems Review

Multi-LLM routing strategies for generative AI applications on AWS

Build and deploy a UI for your generative AI applications with AWS and Python

Webinars

Trending Sources

AI in action: Stories of how enterprises are transforming and modernizing

Webinars

Introducing AWS MCP Servers for code assistants (Part 1)

Building a Scalable ML Pipeline and API in AWS

Accelerate AWS Well-Architected reviews with Generative AI

United Airlines sets its flight plan for gen AI success

Techniques and approaches for monitoring large language models on AWS

Build a multi-tenant generative AI environment for your enterprise on AWS

Using Amazon Q Business with AWS HealthScribe to gain insights from patient consultations

Integrate foundation models into your code with Amazon Bedrock

Model customization, RAG, or both: A case study with Amazon Nova

PEFT fine tuning of Llama 3 on SageMaker HyperPod with AWS Trainium

Revolutionize trip planning with Amazon Bedrock and Amazon Location Service

Amazon Bedrock Marketplace now includes NVIDIA models: Introducing NVIDIA Nemotron-4 NIM microservices

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

Harness the power of MCP servers with Amazon Bedrock Agents

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Streamline RAG applications with intelligent metadata filtering using Amazon Bedrock

How DPG Media uses Amazon Bedrock and Amazon Transcribe to enhance video metadata with AI-powered pipelines

Enable Amazon Bedrock cross-Region inference in multi-account environments

How AWS sales uses Amazon Q Business for customer engagement

Unlocking Japanese LLMs with AWS Trainium: Innovators Showcase from the AWS LLM Development Support Program

Transcribe, translate, and summarize live streams in your browser with AWS AI and generative AI services

Create a generative AI–powered custom Google Chat application using Amazon Bedrock

Use zero-shot large language models on Amazon Bedrock for custom named entity recognition

Host concurrent LLMs with LoRAX

Efficiently train models with large sequence lengths using Amazon SageMaker model parallel

Amazon Bedrock Prompt Optimization Drives LLM Applications Innovation for Yuewen Group

Reduce ML training costs with Amazon SageMaker HyperPod

Revolutionizing clinical trials with the power of voice and AI

Build a video insights and summarization engine using generative AI with Amazon Bedrock

Stability AI backs effort to bring machine learning to biomed

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

The future of data: A 5-pillar approach to modern data management

Track, allocate, and manage your generative AI cost and usage with Amazon Bedrock

Deploy DeepSeek-R1 Distilled Llama models in Amazon Bedrock

Creating asynchronous AI agents with Amazon Bedrock

Can serverless fix fintech’s scaling problem?

Amazon Bedrock Flows is now generally available with enhanced safety and traceability

Build a reverse image search engine with Amazon Titan Multimodal Embeddings in Amazon Bedrock and AWS managed services

Google’s AI innovations at Cloud Next 2025: What CIOs need to know

Reduce conversational AI response time through inference at the edge with AWS Local Zones

Stay Connected