Traditionally, building frontend and backend applications has required knowledge of web development frameworks and infrastructure management, which can be daunting for those with expertise primarily in data science and machine learning. In the Amazon Bedrock console, choose the us-east-1 AWS Region from the top right corner, then choose Manage model access.
Refer to Supported Regions and models for batch inference for the currently supported AWS Regions and models. To address this consideration and enhance your use of batch inference, we’ve developed a scalable solution using AWS Lambda and Amazon DynamoDB. The DynamoDB table stores information such as job ID, status, creation time, and other metadata.
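As a minimal sketch of that tracking pattern, the snippet below writes one job record to DynamoDB with boto3; the table name and attribute schema are assumptions for illustration, not taken from the post.

```python
import time
import uuid

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("bedrock-batch-jobs")  # hypothetical table name


def record_job(job_arn: str, status: str = "Submitted") -> str:
    """Store batch inference job metadata so downstream Lambdas can track it."""
    job_id = str(uuid.uuid4())
    table.put_item(
        Item={
            "job_id": job_id,  # assumed partition key
            "job_arn": job_arn,
            "status": status,
            "created_at": int(time.time()),
        }
    )
    return job_id
```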
To achieve these goals, the AWS Well-Architected Framework provides comprehensive guidance for building and improving cloud architectures. This allows teams to focus more on implementing improvements and optimizing AWS infrastructure. This systematic approach leads to more reliable and standardized evaluations.
AWS offers powerful generative AI services, including Amazon Bedrock, which allows organizations to create tailored use cases such as AI chat-based assistants that give answers based on knowledge contained in the customers’ documents, and much more. The following figure illustrates the high-level design of the solution.
Recently, we’ve been witnessing the rapid development and evolution of generative AI applications, with observability and evaluation emerging as critical aspects for developers, data scientists, and stakeholders. In this post, we set up a custom solution for observability and evaluation of Amazon Bedrock applications.
Unmanaged cloud resources, human error, misconfigurations, and the increasing sophistication of cyber threats, including those from AI-powered applications, create vulnerabilities that can expose sensitive data and disrupt business operations. Enhance security posture: proactively identify and mitigate threats to your AWS infrastructure.
While organizations continue to discover the powerful applications of generative AI, adoption is often slowed down by team silos and bespoke workflows. It also uses a number of other AWS services such as Amazon API Gateway, AWS Lambda, and Amazon SageMaker.
Cloud providers have recognized the need to offer model inference through an API call, significantly streamlining the implementation of AI within applications. AWS Step Functions is a fully managed service that makes it easier to coordinate the components of distributed applications and microservices using visual workflows.
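As a hedged illustration of kicking off such a coordinated workflow from Python, the snippet below starts a Step Functions execution with boto3; the state machine ARN and input payload are placeholders, not taken from the post.

```python
import json

import boto3

sfn = boto3.client("stepfunctions")

# Start an execution of an existing state machine; the ARN and input
# shape below are illustrative placeholders.
response = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:InferencePipeline",
    input=json.dumps({"document_key": "uploads/sample.pdf"}),
)
print(response["executionArn"])
```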
There is an increasing need for scalable, reliable, and cost-effective solutions to deploy and serve these models. AWS Trainium and AWS Inferentia based instances, combined with Amazon Elastic Kubernetes Service (Amazon EKS), provide a performant and low-cost framework to run LLMs efficiently in a containerized environment.
Amazon Web Services (AWS) is the most widely used cloud platform today. Central to cloud strategies across nearly every industry, AWS skills are in high demand as organizations look to make the most of the platform’s wide range of offerings. Job listings: 80,650. Year-over-year increase: 1%. Total resumes: 66,497,945.
With the QnABot on AWS (QnABot), integrated with Microsoft Entra ID access controls, Principal launched an intelligent self-service solution rooted in generative AI. Principal also used the AWS open source repository Lex Web UI to build a frontend chat interface with Principal branding.
With demand for generative AI applications surging across projects and multiple lines of business, accurately allocating and tracking spend becomes more complex. Without a scalable approach to controlling costs, organizations risk unbudgeted usage and cost overruns.
Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases. AWS-specific knowledge search: with Amazon Q Business, we’ve made internal data sources as well as public AWS content available in Field Advisor’s index.
Customers need better accuracy to take generative AI applications into production. Lettria, an AWS Partner, demonstrated that integrating graph-based structures into RAG workflows improves answer precision by up to 35% compared to vector-only retrieval methods. By modeling data as a graph, you capture more of the context and intent.
Add to this the escalating costs of maintaining legacy systems, which often act as bottlenecks for scalability. Migrating to the cloud emerged as a compelling solution, offering the promise of enhanced agility, reduced operational costs, and seamless scalability.
Organizations building and deploying AI applications, particularly those using large language models (LLMs) with Retrieval Augmented Generation (RAG) systems, face a significant challenge: how to evaluate AI outputs effectively throughout the application lifecycle.
The challenge: Enabling self-service cloud governance at scale Hearst undertook a comprehensive governance transformation for their Amazon Web Services (AWS) infrastructure. The CCoE implemented AWS Organizations across a substantial number of business units.
Generative artificial intelligence (AI) has gained significant momentum with organizations actively exploring its potential applications. As successful proof-of-concepts transition into production, organizations are increasingly in need of enterprise scalable solutions.
This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call. Organizations typically can’t predict their call patterns, so the solution relies on AWS serverless services to scale during busy times.
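One piece of that pipeline, sentiment for a call transcript, can be sketched with Amazon Comprehend via boto3; whether the original engine uses Comprehend or a generative model for sentiment is not stated, so treat this as an illustrative stand-in.

```python
import boto3

comprehend = boto3.client("comprehend")


def sentiment_for_call(transcript: str) -> str:
    """Return the dominant sentiment (POSITIVE/NEGATIVE/NEUTRAL/MIXED)."""
    # detect_sentiment accepts up to 5,000 bytes of UTF-8 text, so a long
    # transcript is truncated here for simplicity.
    result = comprehend.detect_sentiment(
        Text=transcript[:5000],
        LanguageCode="en",
    )
    return result["Sentiment"]
```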
ComfyUI is an open source, node-based application that empowers users to generate images, videos, and audio using advanced AI models, offering a highly customizable workflow for creative projects. Start with 28 denoising steps to balance image quality and generation time. For the guidance scale (CFG), set it between 3.5–4.5.
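Applied programmatically, those two settings map onto a sampler node when driving ComfyUI through its local HTTP API; in the sketch below, the workflow file and the node ID "3" are hypothetical assumptions about an exported API-format workflow.

```python
import json
import urllib.request

# Load a workflow previously exported from ComfyUI in API format
# (filename is an assumption).
with open("workflow_api.json") as f:
    workflow = json.load(f)

# Node "3" is assumed to be the KSampler node in this workflow.
workflow["3"]["inputs"]["steps"] = 28  # denoising steps from the post
workflow["3"]["inputs"]["cfg"] = 4.0   # guidance scale in the 3.5-4.5 range

# Submit the modified workflow to a locally running ComfyUI server.
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read())
```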
As part of MMTech’s unifying strategy, Beswick chose to retire the data centers and form an “enterprisewide architecture organization” with a set of standards and base layers to develop applications and workloads that would run on the cloud, with AWS as the firm’s primary cloud provider. The biggest challenge is data.
Although the principles discussed are applicable across various industries, we use an automotive parts retailer as our primary example throughout this post. A web application serves as the frontend interface through which users interact with the Car Parts Agent and initiate parts lookup requests.
In this post, we explore how you can use Amazon Q Business, the AWS generative AI-powered assistant, to build a centralized knowledge base for your organization, unifying structured and unstructured datasets from different sources to accelerate decision-making and drive productivity. We use IAM Identity Center as the SAML 2.0-aligned identity provider.
As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics. Irina Radu is a Prototyping Engagement Manager, part of AWS EMEA Prototyping and Cloud Engineering.
It integrates with existing applications and includes key Amazon Bedrock features like foundation models (FMs), prompts, knowledge bases, agents, flows, evaluation, and guardrails. Solution overview Amazon Bedrock provides a governed collaborative environment to build and share generative AI applications within SageMaker Unified Studio.
This blog post discusses an end-to-end ML pipeline on Amazon SageMaker that leverages serverless computing, event-trigger-based data processing, and external API integrations. The downstream architecture ensures scalability, cost efficiency, and real-time access to applications.
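A minimal sketch of the event-triggered step, assuming an S3 put event invokes a Lambda function (the handler name and the downstream call are illustrative, not the post’s actual pipeline):

```python
import json
import urllib.parse


def handler(event, context):
    """Triggered by an S3 put event; forwards each new object key downstream."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        print(f"New object: s3://{bucket}/{key}")
        # This is where the pipeline might start a SageMaker processing job
        # or call an external API with the new object's location.
    return {"statusCode": 200, "body": json.dumps("ok")}
```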
In this post, we provide a step-by-step guide with the building blocks needed for creating a Streamlit application to process and review invoices from multiple vendors. Streamlit is an open source framework for data scientists to efficiently create interactive web-based data applications in pure Python. To get started, install Python 3.7.
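To make the idea concrete, here is a minimal Streamlit sketch of an invoice upload page; the page title, widget labels, and processing stub are illustrative assumptions, not the post’s actual application.

```python
import streamlit as st

st.title("Invoice Review")  # hypothetical page title

uploaded = st.file_uploader("Upload an invoice (PDF)", type=["pdf"])
if uploaded is not None:
    st.write(f"Received {uploaded.name} ({uploaded.size} bytes)")
    # In the full solution the document would be sent to a model for
    # extraction at this point; here we only confirm the upload.
    if st.button("Process invoice"):
        st.success("Processing would start here.")
```

Running `streamlit run app.py` serves the page locally and reruns the script on each interaction, which is what makes this style practical for data scientists working in pure Python.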
Today, we’re excited to announce the general availability of Amazon Bedrock Data Automation, a powerful, fully managed feature within Amazon Bedrock that automates the generation of useful insights from unstructured multimodal content such as documents, images, audio, and video for your AI-powered applications.
This is where AWS and generative AI can revolutionize the way we plan and prepare for our next adventure. With the significant developments in the field of generative AI, intelligent applications powered by foundation models (FMs) can help users map out an itinerary through an intuitive natural conversation interface.
Facing increasing demand and complexity, CIOs manage a complex portfolio spanning data centers, enterprise applications, edge computing, and mobile solutions, resulting in a surge of apps generating data that requires analysis. Enterprise IT struggles to keep up with siloed technologies while ensuring security, compliance, and cost management.
Generative AI can revolutionize organizations by enabling the creation of innovative applications that offer enhanced customer and employee experiences. LOBs have autonomy over their AI workflows, models, and data within their respective AWS accounts.
In the current digital environment, migration to the cloud has emerged as an essential tactic for companies aiming to boost scalability, enhance operational efficiency, and reinforce resilience. A step-by-step AWS migration checklist helps structure that move. Mobilunity has helped businesses worldwide hire dedicated development teams for more than 14 years.
Advancements in multimodal artificial intelligence (AI), where agents can understand and generate not just text but also images, audio, and video, will further broaden their applications. Conversely, asynchronous event-driven systems offer greater flexibility and scalability through their distributed nature.
Cloud modernization has become a prominent topic for organizations, and AWS plays a crucial role in helping them modernize their IT infrastructure, applications, and services. Overall, discussions on AWS modernization are focused on security, faster releases, efficiency, and steps towards GenAI and improved innovation.
Generative AI applications driven by foundation models (FMs) are delivering significant business value for organizations in customer experience, productivity, process optimization, and innovation. In this post, we explore different approaches you can take when building applications that use generative AI.
Unlike Terraform, which uses HCL, Pulumi enables you to define infrastructure using Python, making it easier for developers to integrate infrastructure with application code. Multi-cloud and multi-language support: deploy across AWS, Azure, and Google Cloud with Python, TypeScript, Go, or .NET.
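A minimal sketch of that Python-native style, assuming the pulumi and pulumi_aws packages are installed and using an arbitrary resource name:

```python
import pulumi
import pulumi_aws as aws

# Define an S3 bucket in ordinary Python instead of HCL; the logical
# name "app-artifacts" is arbitrary.
bucket = aws.s3.Bucket("app-artifacts")

# Expose the physical bucket name as a stack output.
pulumi.export("bucket_name", bucket.id)
```

Because this is plain Python, loops, functions, and application constants can shape the infrastructure directly, which is the integration advantage the paragraph above describes.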
Amazon Bedrock cross-Region inference is a capability that provides organizations with the flexibility to access foundation models (FMs) across AWS Regions while maintaining optimal performance and availability. We provide practical examples for both SCP modifications and AWS Control Tower implementations.
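To show what invoking through a cross-Region inference profile looks like in code, the hedged sketch below calls the Bedrock Converse API with a geography-prefixed profile ID instead of a single-Region model ID; the specific model profile used here is an assumption.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# A "us." inference profile ID routes the request across US Regions;
# the exact profile below is illustrative.
response = bedrock.converse(
    modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```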
Customers are building generative AI applications using large language models (LLMs) and other foundation models (FMs), which enhance customer experiences, transform operations, improve employee productivity, and create new revenue channels. FMs and the applications built around them represent extremely valuable investments for our customers.
Introduction: Integrating GitHub Actions for Continuous Integration and Continuous Deployment (CI/CD) in AWS Lambda deployments is a modern approach to automating the software development lifecycle. After this, open AWS Lambda and create a function using Python with the default settings. In our case, we are using ap-south-1.
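As an illustrative fragment of the deploy stage such a workflow might run after packaging the code, the boto3 call below updates the function’s code in ap-south-1; the function name and archive name are hypothetical.

```python
import boto3

# One deploy step a GitHub Actions job could execute after zipping the
# source; region matches the post's ap-south-1 example.
client = boto3.client("lambda", region_name="ap-south-1")

with open("function.zip", "rb") as f:
    client.update_function_code(
        FunctionName="my-function",  # hypothetical function name
        ZipFile=f.read(),
    )
```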
AWS or other providers? The Capgemini-AWS partnership journey Capgemini has spent the last 15 years partnering with AWS to answer these types of questions. Our journey has evolved from basic cloud migrations to cutting-edge AI implementations, earning us recognition as AWS’s Global AI/ML Partner of the Year for 2023.
Deploy Secure Public Web Endpoints: welcome to Building Resilient Public Networking on AWS, our comprehensive blog series on advanced networking strategies tailored for regional evacuation, failover, and robust disaster recovery. We laid the groundwork for understanding the essentials that underpin the forthcoming discussions.
AWS was delighted to present to and connect with over 18,000 in-person and 267,000 virtual attendees at NVIDIA GTC, a global artificial intelligence (AI) conference that took place in March 2024 in San Jose, California, returning to a hybrid, in-person experience for the first time since 2019.
Although weather information is accessible through multiple channels, businesses that heavily rely on meteorological data require robust and scalable solutions to effectively manage and use these critical insights and reduce manual processes. Complete the following steps: Download the front-end code AWS-Amplify-Frontend.zip from GitHub.