AWS, Reference and Scalability

Automate Amazon Bedrock batch inference: Building a scalable and efficient pipeline

AWS Machine Learning - AI

OCTOBER 29, 2024

Refer to Supported Regions and models for batch inference for current supporting AWS Regions and models. To address this consideration and enhance your use of batch inference, we’ve developed a scalable solution using AWS Lambda and Amazon DynamoDB. Access to your selected models hosted on Amazon Bedrock.

Scalability

Scalability Lambda Generative AI AWS

Multi-LLM routing strategies for generative AI applications on AWS

AWS Machine Learning - AI

APRIL 9, 2025

Software-as-a-service (SaaS) applications with tenant tiering SaaS applications are often architected to provide different pricing and experiences to a spectrum of customer profiles, referred to as tiers. The user prompt is then routed to the LLM associated with the task category of the reference prompt that has the closest match.

Artificial Inteligence

Artificial Inteligence Generative AI AWS Applications

Accelerate AWS Well-Architected reviews with Generative AI

AWS Machine Learning - AI

MARCH 4, 2025

To achieve these goals, the AWS Well-Architected Framework provides comprehensive guidance for building and improving cloud architectures. This allows teams to focus more on implementing improvements and optimizing AWS infrastructure. This scalability allows for more frequent and comprehensive reviews.

Generative AI

Generative AI Technical Review Software Review Systems Review

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Orchestrate generative AI workflows with Amazon Bedrock and AWS Step Functions

AWS Machine Learning - AI

NOVEMBER 22, 2024

This post discusses how to use AWS Step Functions to efficiently coordinate multi-step generative AI workflows, such as parallelizing API calls to Amazon Bedrock to quickly gather answers to lists of submitted questions.

Generative AI

Generative AI AWS Technical Review Backup

Build a multi-tenant generative AI environment for your enterprise on AWS

AWS Machine Learning - AI

NOVEMBER 7, 2024

It also uses a number of other AWS services such as Amazon API Gateway , AWS Lambda , and Amazon SageMaker. Shared components refer to the functionality and features shared by all tenants. You can use AWS services such as Application Load Balancer to implement this approach. API Gateway also provides a WebSocket API.

Generative AI

Generative AI AWS Enterprise Artificial Inteligence

Deploy Meta Llama 3.1-8B on AWS Inferentia using Amazon EKS and vLLM

AWS Machine Learning - AI

NOVEMBER 26, 2024

there is an increasing need for scalable, reliable, and cost-effective solutions to deploy and serve these models. AWS Trainium and AWS Inferentia based instances, combined with Amazon Elastic Kubernetes Service (Amazon EKS), provide a performant and low cost framework to run LLMs efficiently in a containerized environment.

AWS

AWS Load Balancer Software Review Artificial Inteligence

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning - AI

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases: AWS-specific knowledge search With Amazon Q Business, weve made internal data sources as well as public AWS content available in Field Advisors index.

AWS

AWS Generative AI Technical Review Artificial Inteligence

Understanding prompt engineering: Unlock the creative potential of Stability AI models on AWS

AWS Machine Learning - AI

NOVEMBER 14, 2024

Large Medium – This refers to the material or technique used in creating the artwork. This might involve incorporating additional data such as reference images or rough sketches as conditioning inputs alongside your text prompts. You can provide extensive details, such as the gender of a character, their clothing, and the setting.

Engineering

Engineering AWS 3D Generative AI

Serving LLMs using vLLM and Amazon EC2 instances with AWS AI chips

AWS Machine Learning - AI

NOVEMBER 26, 2024

Using vLLM on AWS Trainium and Inferentia makes it possible to host LLMs for high performance inference and scalability. Deploy vLLM on AWS Trainium and Inferentia EC2 instances In these sections, you will be guided through using vLLM on an AWS Inferentia EC2 instance to deploy Meta’s newest Llama 3.2 You will use inf2.xlarge

Artificial Inteligence

Artificial Inteligence AWS Artificial Intelligence Generative AI

Unleash the power of generative AI with Amazon Q Business: How CCoEs can scale cloud governance best practices and drive innovation

AWS Machine Learning - AI

NOVEMBER 6, 2024

This solution can serve as a valuable reference for other organizations looking to scale their cloud governance and enable their CCoE teams to drive greater impact. The challenge: Enabling self-service cloud governance at scale Hearst undertook a comprehensive governance transformation for their Amazon Web Services (AWS) infrastructure.

Generative AI

Generative AI Government Technical Review Innovation

Create a generative AI–powered custom Google Chat application using Amazon Bedrock

AWS Machine Learning - AI

OCTOBER 31, 2024

AWS offers powerful generative AI services , including Amazon Bedrock , which allows organizations to create tailored use cases such as AI chat-based assistants that give answers based on knowledge contained in the customers’ documents, and much more. The following figure illustrates the high-level design of the solution.

Generative AI

Generative AI Lambda Applications AWS

Model customization, RAG, or both: A case study with Amazon Nova

AWS Machine Learning - AI

APRIL 10, 2025

Model customization refers to adapting a pre-trained language model to better fit specific tasks, domains, or datasets. Solution overview To evaluate the effectiveness of RAG compared to model customization, we designed a comprehensive testing framework using a set of AWS-specific questions.

Case Study

Case Study Artificial Inteligence Study Generative AI

Enhance customer support with Amazon Bedrock Agents by integrating enterprise data APIs

AWS Machine Learning - AI

NOVEMBER 7, 2024

Developer tools The solution also uses the following developer tools: AWS Powertools for Lambda – This is a suite of utilities for Lambda functions that generates OpenAPI schemas from your Lambda function code. After deployment, the AWS CDK CLI will output the web application URL. Python 3.9 or later Node.js

Lambda

Lambda Enterprise Automotive Knowledge Base

Pixtral Large is now available in Amazon Bedrock

AWS Machine Learning - AI

APRIL 10, 2025

With this launch, you can now access Mistrals frontier-class multimodal model to build, experiment, and responsibly scale your generative AI ideas on AWS. AWS is the first major cloud provider to deliver Pixtral Large as a fully managed, serverless model. Additionally, Pixtral Large supports the Converse API and tool usage.

Generative AI

Generative AI AWS Technical Review Artificial Inteligence

Best Practices for IaC using AWS CloudFormation

Perficient

MARCH 11, 2025

IaC enables developers to define infrastructure configurations using code, ensuring consistency, automation, and scalability. AWS CloudFormation, a key service in the AWS ecosystem, simplifies IaC by allowing users to easily model and set up AWS resources. Why Use AWS CloudFormation? Example: 3.

AWS

AWS Software Review Systems Review Policies

Empower your generative AI application with a comprehensive custom observability solution

AWS Machine Learning - AI

OCTOBER 29, 2024

Observability refers to the ability to understand the internal state and behavior of a system by analyzing its outputs, logs, and metrics. Security – The solution uses AWS services and adheres to AWS Cloud Security best practices so your data remains within your AWS account.

Generative AI

Generative AI Applications AWS Knowledge Base

Enable Amazon Bedrock cross-Region inference in multi-account environments

AWS Machine Learning - AI

MARCH 27, 2025

Amazon Bedrock cross-Region inference capability that provides organizations with flexibility to access foundation models (FMs) across AWS Regions while maintaining optimal performance and availability. We provide practical examples for both SCP modifications and AWS Control Tower implementations.

Artificial Inteligence

Artificial Inteligence AWS Technical Review Policies

How DPG Media uses Amazon Bedrock and Amazon Transcribe to enhance video metadata with AI-powered pipelines

AWS Machine Learning - AI

OCTOBER 16, 2024

As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics. To evaluate the metadata quality, the team used reference-free LLM metrics, inspired by LangSmith.

Media

Media Video Artificial Inteligence Generative AI

Generative AI operating models in enterprise organizations with Amazon Bedrock

AWS Machine Learning - AI

JANUARY 29, 2025

Large organizations often have many business units with multiple lines of business (LOBs), with a central governing entity, and typically use AWS Organizations with an Amazon Web Services (AWS) multi-account strategy. LOBs have autonomy over their AI workflows, models, and data within their respective AWS accounts.

Generative AI

Generative AI Organization Enterprise Artificial Inteligence

Unify structured data in Amazon Aurora and unstructured data in Amazon S3 for insights using Amazon Q

AWS Machine Learning - AI

NOVEMBER 20, 2024

In this post, we explore how you can use Amazon Q Business , the AWS generative AI-powered assistant, to build a centralized knowledge base for your organization, unifying structured and unstructured datasets from different sources to accelerate decision-making and drive productivity.

Data

Data AWS Groups Knowledge Base

Navigating the cloud maze: A 5-phase approach to optimizing cloud strategies

CIO

JANUARY 7, 2025

In todays fast-paced digital landscape, the cloud has emerged as a cornerstone of modern business infrastructure, offering unparalleled scalability, agility, and cost-efficiency. First, cloud provisioning through automation is better in AWS CloudFormation and Azure Azure Resource Manager compared to the other cloud providers.

Cloud

Cloud Strategy Architecture Policies

Generative AI-powered game design: Accelerating early development with Stability AI models on Amazon Bedrock

AWS Machine Learning - AI

MARCH 26, 2025

Use the us-west-2 AWS Region to run this demo. Prerequisites This notebook is designed to run on AWS, using Amazon Bedrock for both Anthropics Claude 3 Sonnet and Stability AI model access. Make sure you have the following set up before moving forward: An AWS account. An Amazon SageMaker domain. Access to Stability AIs SD3.5

Generative AI

Generative AI Games Development AWS

Amazon S3 Reference for the Cloud Practitioner

Linux Academy

JANUARY 22, 2020

If you’re studying for the AWS Cloud Practitioner exam, there are a few Amazon S3 (Simple Storage Service) facts that you should know and understand. This post will guide you through how to utilize S3 in AWS environments, for the correct use cases. Objects are what AWS calls the files stored in S3. Using S3 Buckets in Regions.

Cloud

Cloud Storage Backup AWS

Getting started with computer use in Amazon Bedrock Agents

AWS Machine Learning - AI

MARCH 14, 2025

The computer use agent demo powered by Amazon Bedrock Agents provides the following benefits: Secure execution environment Execution of computer use tools in a sandbox environment with limited access to the AWS ecosystem and the web. Prerequisites AWS Command Line Interface (CLI), follow instructions here. Require Python 3.11

AWS

AWS Generative AI Linux Groups

Deploy DeepSeek-R1 Distilled Llama models in Amazon Bedrock

AWS Machine Learning - AI

JANUARY 29, 2025

In this post, we explore how to deploy distilled versions of DeepSeek-R1 with Amazon Bedrock Custom Model Import, making them accessible to organizations looking to use state-of-the-art AI capabilities within the secure and scalable AWS infrastructure at an effective cost. You can monitor costs with AWS Cost Explorer.

Generative AI

Generative AI Artificial Inteligence AWS Technical Review

Benchmarking Amazon Nova and GPT-4o models with FloTorch

AWS Machine Learning - AI

MARCH 11, 2025

OpenAI launched GPT-4o in May 2024, and Amazon introduced Amazon Nova models at AWS re:Invent in December 2024. is helping enterprise customers design and manage agentic workflows in a secure and scalable manner. Interested users are invited to try out FloTorch from AWS Marketplace or from GitHub. About FloTorch FloTorch.ai

Artificial Inteligence

Artificial Inteligence Knowledge Base Comparison Generative AI

Create generative AI agents that interact with your companies’ systems in a few clicks using Amazon Bedrock in Amazon SageMaker Unified Studio

AWS Machine Learning - AI

MARCH 20, 2025

Users can access these AI capabilities through their organizations single sign-on (SSO), collaborate with team members, and refine AI applications without needing AWS Management Console access. The workflow is as follows: The user logs into SageMaker Unified Studio using their organizations SSO from AWS IAM Identity Center.

Generative AI

Generative AI Systems Review System Lambda

Automate invoice processing with Streamlit and Amazon Bedrock

AWS Machine Learning - AI

NOVEMBER 14, 2024

Prerequisites To perform this solution, complete the following: Create and activate an AWS account. Make sure your AWS credentials are configured correctly. This tutorial assumes you have the necessary AWS Identity and Access Management (IAM) permissions. If you’re new to Amazon EC2, refer to the Amazon EC2 User Guide.

Software Review

Software Review Technical Review AWS Artificial Inteligence

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

AWS Machine Learning - AI

DECEMBER 2, 2024

Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. This feature is only supported when using inference components.

Generative AI

Generative AI Artificial Inteligence Machine Learning AWS

Host concurrent LLMs with LoRAX

AWS Machine Learning - AI

APRIL 16, 2025

This challenge is further compounded by concerns over scalability and cost-effectiveness. Why LoRAX for LoRA deployment on AWS? The surge in popularity of fine-tuning LLMs has given rise to multiple inference container methods for deploying LoRA adapters on AWS. Two prominent approaches among our customers are LoRAX and vLLM.

Artificial Inteligence

Artificial Inteligence Generative AI AWS Storage

12 AI predictions for 2025

CIO

DECEMBER 30, 2024

In these uses case, we have enough reference implementations to point to and say, Theres value to be had here.' Weve seen so many reference implementations, and weve done so many reference implementations, that were going to see massive adoption.

Fractional CTO

Fractional CTO Software Development CTO Coach Architecture

Mastering AWS IaC with Pulumi and Python – Part 2

Perficient

APRIL 4, 2025

In Part 1 of this series, we learned about the importance of AWS and Pulumi. Now, lets explore the demo part in this practical session, which will create a service on AWS VPC by using Pulumi. AdministratorAccess or a custom policy). us-east-1) Output format (e.g., us-east-1) Output format (e.g., py to create a VPC: Step 3.

AWS

AWS Google Cloud Azure Policies

Modernizing on AWS: Strategies, Benefits, and Partnerships with Xebia

Xebia

MAY 21, 2024

Cloud modernization has become a prominent topic for organizations, and AWS plays a crucial role in helping them modernize their IT infrastructure, applications, and services. Overall, discussions on AWS modernization are focused on security, faster releases, efficiency, and steps towards GenAI and improved innovation.

AWS

AWS Strategy Serverless Microservices

The AWS Cloud Migration Checklist Every Business Needs for a Smooth Transition

Mobilunity

DECEMBER 26, 2024

In the current digital environment, migration to the cloud has emerged as an essential tactic for companies aiming to boost scalability, enhance operational efficiency, and reinforce resilience. Get AWS developers A step-by-step AWS migration checklist Mobilunity helps hiring dedicated development teams to businesses worldwide for 14+ years.

AWS

AWS Cloud Weak Development Team DevOps

AWS at NVIDIA GTC 2024: Accelerate innovation with generative AI on AWS

AWS Machine Learning - AI

APRIL 11, 2024

AWS was delighted to present to and connect with over 18,000 in-person and 267,000 virtual attendees at NVIDIA GTC, a global artificial intelligence (AI) conference that took place March 2024 in San Jose, California, returning to a hybrid, in-person experience for the first time since 2019.

Generative AI

Generative AI AWS Artificial Inteligence Innovation

High-performance computing on AWS

Xebia

AUGUST 29, 2023

How does High-Performance Computing on AWS differ from regular computing? HPC services on AWS Compute Technically you could design and build your own HPC cluster on AWS, it will work but you will spend time on plumbing and undifferentiated heavy lifting. AWS has two services to support your HPC workload.

AWS

AWS Performance Storage Linux

Building scalable, secure, and reliable RAG applications using Knowledge Bases for Amazon Bedrock

AWS Machine Learning - AI

APRIL 23, 2024

As successful proof-of-concepts transition into production, organizations are increasingly in need of enterprise scalable solutions. The AWS Well-Architected Framework provides best practices and guidelines for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud.

Knowledge Base

Knowledge Base Scalability Applications Generative AI

Building a virtual meteorologist using Amazon Bedrock Agents

AWS Machine Learning - AI

FEBRUARY 11, 2025

Although weather information is accessible through multiple channels, businesses that heavily rely on meteorological data require robust and scalable solutions to effectively manage and use these critical insights and reduce manual processes. Complete the following steps: Download the front-end code AWS-Amplify-Frontend.zip from GitHub.

Virtualization

Virtualization Lambda AWS Authentication

Building Resilient Public Networking on AWS: Part 2

Xebia

JANUARY 18, 2024

Deploy Secure Public Web Endpoints Welcome to Building Resilient Public Networking on AWS—our comprehensive blog series on advanced networking strategies tailored for regional evacuation, failover, and robust disaster recovery. We laid the groundwork for understanding the essentials that underpin the forthcoming discussions.

AWS

AWS Network Load Balancer Software Review

Unleashing the multimodal power of Amazon Bedrock Data Automation to transform unstructured data into actionable insights

AWS Machine Learning - AI

MARCH 20, 2025

With Amazon Bedrock Data Automation, enterprises can accelerate AI adoption and develop solutions that are secure, scalable, and responsible. Cross-Region inference enables seamless management of unplanned traffic bursts by using compute across different AWS Regions. For example, a request made in the US stays within Regions in the US.

Data

Data Generative AI Artificial Inteligence Compliance

Techniques and approaches for monitoring large language models on AWS

AWS Machine Learning - AI

FEBRUARY 26, 2024

Our proposed architecture provides a scalable and customizable solution for online LLM monitoring, enabling teams to tailor your monitoring solution to your specific use cases and requirements. Through AWS Step Functions orchestration, the function calls Amazon Comprehend to detect the sentiment and toxicity.

Artificial Inteligence

Artificial Inteligence AWS Lambda Metrics

Mastering Multi-Cloud with Cloudera: Strategic Data & AI Deployments Across Clouds

Cloudera

JANUARY 7, 2025

While multi-cloud generally refers to the use of multiple cloud providers, hybrid encompasses both cloud and on-premises integrations, as well as multi-cloud setups. A prominent public health organization integrated data from multiple regional health entities within a hybrid multi-cloud environment (AWS, Azure, and on-premise).

Cloud

Cloud Data Scalability Compliance

A secure approach to generative AI with AWS

AWS Machine Learning - AI

APRIL 16, 2024

At AWS, our top priority is safeguarding the security and confidentiality of our customers’ workloads. With the AWS Nitro System , we delivered a first-of-its-kind innovation on behalf of our customers. The Nitro System is an unparalleled computing backbone for AWS, with security and performance at its core.

Generative AI

Generative AI AWS Artificial Inteligence Infrastructure

Customize DeepSeek-R1 distilled models using Amazon SageMaker HyperPod recipes – Part 1

AWS Machine Learning - AI

MARCH 3, 2025

These recipes include a training stack validated by Amazon Web Services (AWS) , which removes the tedious work of experimenting with different model configurations, minimizing the time it takes for iterative evaluation and testing. You can run these recipes using SageMaker HyperPod or as SageMaker training jobs.

Artificial Inteligence

Artificial Inteligence Generative AI AWS Training

Automate Amazon Bedrock batch inference: Building a scalable and efficient pipeline

Multi-LLM routing strategies for generative AI applications on AWS

Accelerate AWS Well-Architected reviews with Generative AI

Webinars

Orchestrate generative AI workflows with Amazon Bedrock and AWS Step Functions

Build a multi-tenant generative AI environment for your enterprise on AWS

Deploy Meta Llama 3.1-8B on AWS Inferentia using Amazon EKS and vLLM

How AWS sales uses Amazon Q Business for customer engagement

Understanding prompt engineering: Unlock the creative potential of Stability AI models on AWS

Serving LLMs using vLLM and Amazon EC2 instances with AWS AI chips

Unleash the power of generative AI with Amazon Q Business: How CCoEs can scale cloud governance best practices and drive innovation

Create a generative AI–powered custom Google Chat application using Amazon Bedrock

Model customization, RAG, or both: A case study with Amazon Nova

Enhance customer support with Amazon Bedrock Agents by integrating enterprise data APIs

Pixtral Large is now available in Amazon Bedrock

Best Practices for IaC using AWS CloudFormation

Empower your generative AI application with a comprehensive custom observability solution

Enable Amazon Bedrock cross-Region inference in multi-account environments

How DPG Media uses Amazon Bedrock and Amazon Transcribe to enhance video metadata with AI-powered pipelines

Generative AI operating models in enterprise organizations with Amazon Bedrock

Unify structured data in Amazon Aurora and unstructured data in Amazon S3 for insights using Amazon Q

Navigating the cloud maze: A 5-phase approach to optimizing cloud strategies

Generative AI-powered game design: Accelerating early development with Stability AI models on Amazon Bedrock

Amazon S3 Reference for the Cloud Practitioner

Getting started with computer use in Amazon Bedrock Agents

Deploy DeepSeek-R1 Distilled Llama models in Amazon Bedrock

Benchmarking Amazon Nova and GPT-4o models with FloTorch

Create generative AI agents that interact with your companies’ systems in a few clicks using Amazon Bedrock in Amazon SageMaker Unified Studio

Automate invoice processing with Streamlit and Amazon Bedrock

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Host concurrent LLMs with LoRAX

12 AI predictions for 2025

Mastering AWS IaC with Pulumi and Python – Part 2

Modernizing on AWS: Strategies, Benefits, and Partnerships with Xebia

The AWS Cloud Migration Checklist Every Business Needs for a Smooth Transition

AWS at NVIDIA GTC 2024: Accelerate innovation with generative AI on AWS

High-performance computing on AWS

Building scalable, secure, and reliable RAG applications using Knowledge Bases for Amazon Bedrock

Building a virtual meteorologist using Amazon Bedrock Agents

Building Resilient Public Networking on AWS: Part 2

Unleashing the multimodal power of Amazon Bedrock Data Automation to transform unstructured data into actionable insights

Techniques and approaches for monitoring large language models on AWS

Mastering Multi-Cloud with Cloudera: Strategic Data & AI Deployments Across Clouds

A secure approach to generative AI with AWS

Customize DeepSeek-R1 distilled models using Amazon SageMaker HyperPod recipes – Part 1

Stay Connected