Region Evacuation with a static anycast IP approach: Welcome back to our comprehensive "Building Resilient Public Networking on AWS" blog series, where we delve into advanced networking strategies for regional evacuation, failover, and robust disaster recovery. Find the detailed guide here.
To achieve these goals, the AWS Well-Architected Framework provides comprehensive guidance for building and improving cloud architectures. This allows teams to focus more on implementing improvements and optimizing AWS infrastructure. This systematic approach leads to more reliable and standardized evaluations.
Although an individual LLM can be highly capable, it might not optimally address a wide range of use cases or meet diverse performance requirements. For instance, more complex questions might require the application to summarize a lengthy dissertation by performing deeper analysis, comparison, and evaluation of the research results.
AWS provides a powerful set of tools and services that simplify the process of building and deploying generative AI applications, even for those with limited experience in frontend and backend development. The AWS deployment architecture makes sure the Python application is hosted and accessible from the internet to authenticated users.
It also uses a number of other AWS services such as Amazon API Gateway, AWS Lambda, and Amazon SageMaker. You can use AWS services such as Application Load Balancer to implement this approach.
However, companies are discovering that performing full fine-tuning for these models with their data isn't cost-effective. In addition to cost, performing fine-tuning for LLMs at scale presents significant technical challenges. To learn more about Trainium chips and the Neuron SDK, see Welcome to AWS Neuron.
Recognizing this need, we have developed a Chrome extension that harnesses the power of AWS AI and generative AI services, including Amazon Bedrock, an AWS managed service to build and scale generative AI applications with foundation models (FMs). The following diagram illustrates the architecture of the application.
We're excited to announce the open source release of AWS MCP Servers for code assistants, a suite of specialized Model Context Protocol (MCP) servers that bring Amazon Web Services (AWS) best practices directly to your development workflow. This post is the first in a series covering AWS MCP Servers.
During re:Invent 2023, we launched AWS HealthScribe, a HIPAA-eligible service that empowers healthcare software vendors to build clinical applications that use speech recognition and generative AI to automatically create preliminary clinician documentation. Features include speaker role identification (clinician or patient).
With the QnABot on AWS (QnABot), integrated with Microsoft Azure Entra ID access controls, Principal launched an intelligent self-service solution rooted in generative AI. Principal also used the AWS open source repository Lex Web UI to build a frontend chat interface with Principal branding.
Cloud architects are responsible for managing the cloud computing architecture in an organization, especially as cloud technologies grow increasingly complex. At organizations that have already completed their cloud adoption, cloud architects help maintain, oversee, troubleshoot, and optimize cloud architecture over time.
Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases, including AWS-specific knowledge search: with Amazon Q Business, we've made internal data sources as well as public AWS content available in Field Advisor's index.
In the era of generative AI, new large language models (LLMs) are continually emerging, each with unique capabilities, architectures, and optimizations. Among these, Amazon Nova foundation models (FMs) deliver frontier intelligence and industry-leading cost-performance, available exclusively on Amazon Bedrock.
Amazon Titan FMs provide customers with a breadth of high-performing image, multimodal, and text model choices through a fully managed API. The following diagram illustrates the solution architecture. The steps of the solution include: Upload data to Amazon S3: store the product images in Amazon Simple Storage Service (Amazon S3).
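As a minimal sketch of that first step, the following Python snippet uploads a local product image to S3 with boto3; the bucket name, object key, and file path are placeholders for illustration, not values from the original solution.

```python
import boto3

# Hypothetical bucket name; replace with the bucket created for your deployment.
BUCKET = "product-catalog-images"

s3 = boto3.client("s3")

def upload_product_image(local_path: str, key: str) -> None:
    """Upload a single product image so later pipeline steps can reference it by S3 key."""
    s3.upload_file(local_path, BUCKET, key)

# Example usage with placeholder paths.
upload_product_image("images/sku-1234.jpg", "products/sku-1234.jpg")
```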
Technology leaders in the financial services sector constantly struggle with the daily challenges of balancing cost, performance, and security; the constant demand for high availability means that even a minor system outage could lead to significant financial and reputational losses. Other challenges include architecture complexity and legacy infrastructure.
However, as companies expand their operations and adopt multi-cloud architectures, they are faced with an invisible but powerful challenge: Data gravity. While centralizing data can improve performance and security, it can also lead to inefficiencies, increased costs and limitations on cloud mobility.
AWS offers powerful generative AI services , including Amazon Bedrock , which allows organizations to create tailored use cases such as AI chat-based assistants that give answers based on knowledge contained in the customers’ documents, and much more. In the following sections, we explain how to deploy this architecture.
Their DeepSeek-R1 models represent a family of large language models (LLMs) designed to handle a wide range of tasks, from code generation to general reasoning, while maintaining competitive performance and efficiency. The distilled variants (for example, those based on Llama 70B-Instruct) offer different trade-offs between performance and resource requirements.
Refer to Supported Regions and models for batch inference for the AWS Regions and models that are currently supported. To address this consideration and enhance your use of batch inference, we've developed a scalable solution using AWS Lambda and Amazon DynamoDB. Amazon S3 invokes the {stack_name}-create-batch-queue-{AWS-Region} Lambda function.
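The excerpt doesn't include the handler code, but a minimal sketch of what an S3-triggered create-batch-queue Lambda function might look like is shown below, assuming the stack passes the DynamoDB table name via an environment variable; the table name, attribute names, and status values are hypothetical, not taken from the original solution.

```python
import json
import os
import boto3

# Assumed environment variable set by the deployment stack; name is illustrative.
TABLE_NAME = os.environ.get("BATCH_QUEUE_TABLE", "batch-inference-queue")

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)

def handler(event, context):
    """Triggered by an S3 upload; records each new input file as a pending batch-inference job."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        table.put_item(
            Item={
                "job_id": f"{bucket}/{key}",        # simple illustrative key
                "status": "PENDING",                # downstream pollers pick up PENDING jobs
                "input_s3_uri": f"s3://{bucket}/{key}",
            }
        )
    return {"statusCode": 200, "body": json.dumps("queued")}
```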
AWS App Studio is a generative AI-powered service that uses natural language to build business applications, empowering a new set of builders to create applications in minutes. Cross-instance import and export enables straightforward, self-service migration of App Studio applications across AWS Regions and AWS accounts.
Cloudera is committed to providing the most optimal architecture for data processing, advanced analytics, and AI while advancing our customers’ cloud journeys. Together, Cloudera and AWS empower businesses to optimize performance for data processing, analytics, and AI while minimizing their resource consumption and carbon footprint.
Cost-performance optimizations via a new chip: one of the major updates announced last week was Google's seventh-generation Tensor Processing Unit (TPU) chip, Ironwood, targeted at accelerating AI workloads, especially inferencing.
How does High-Performance Computing on AWS differ from regular computing? For this, HPC brings massive parallel computing, cluster and workload managers, and high-performance components to the table. AWS has two services to support your HPC workloads. However, some tasks are very complex and require a different approach.
Hybrid architecture with AWS Local Zones To minimize the impact of network latency on TTFT for users regardless of their locations, a hybrid architecture can be implemented by extending AWS services from commercial Regions to edge locations closer to end users. Next, create a subnet inside each Local Zone.
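As a sketch of that subnet step, the boto3 call below creates a subnet in each Local Zone by passing the Local Zone name where an Availability Zone name would normally go; the VPC ID, CIDR blocks, and zone names are placeholders to substitute with your own values.

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical VPC and Local Zone layout; replace with your own VPC ID, zones, and CIDRs.
VPC_ID = "vpc-0123456789abcdef0"
LOCAL_ZONE_CIDRS = {
    "us-east-1-atl-1a": "10.0.64.0/24",
    "us-east-1-mia-1a": "10.0.65.0/24",
}

for zone_name, cidr in LOCAL_ZONE_CIDRS.items():
    # A Local Zone name is accepted wherever an Availability Zone name is expected.
    response = ec2.create_subnet(
        VpcId=VPC_ID,
        CidrBlock=cidr,
        AvailabilityZone=zone_name,
    )
    print(zone_name, response["Subnet"]["SubnetId"])
```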
Amazon Bedrock offers a serverless experience so you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using AWS tools without having to manage infrastructure. Monitoring – Monitors system performance and user activity to maintain operational reliability and efficiency.
At Data Reply and AWS, we are committed to helping organizations embrace the transformative opportunities generative AI presents, while fostering the safe, responsible, and trustworthy development of AI systems. The following diagram illustrates the solution architecture.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
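To illustrate the single-API experience, here is a minimal boto3 sketch that sends one message through the Bedrock Converse API; the model ID and prompt are arbitrary examples, and the chosen Region and model access must already be enabled in your account.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# Model ID is an assumption for illustration; any text model enabled in your account works.
response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Summarize the key ideas of retrieval-augmented generation."}]}
    ],
)

# The assistant's reply text is nested under output -> message -> content.
print(response["output"]["message"]["content"][0]["text"])
```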
Amazon Bedrock cross-Region inference is a capability that provides organizations with the flexibility to access foundation models (FMs) across AWS Regions while maintaining optimal performance and availability. We provide practical examples for both SCP modifications and AWS Control Tower implementations.
It prevents vendor lock-in, provides leverage for negotiation, enables business flexibility in strategy execution when complicated architectures or regional security and legal-compliance limitations arise, and promotes portability from an application architecture perspective.
In this post, we demonstrate how to effectively perform model customization and RAG with Amazon Nova models as a baseline. Fine-tuning is one such technique, which helps in injecting task-specific or domain-specific knowledge for improving model performance.
This capability enables Anthropic's Claude models to identify what's on a screen, understand the context of UI elements, and recognize actions that should be performed, such as clicking buttons, typing text, scrolling, and navigating between applications. The following diagram illustrates the solution architecture. Requires Python 3.11.
The company says it can achieve PhD-level performance in challenging benchmark tests in physics, chemistry, and biology. Agents will begin replacing services: software has evolved from big, monolithic systems running on mainframes, to desktop apps, to distributed, service-based architectures, web applications, and mobile apps.
Built on top of EXLerate.AI, EXL's AI orchestration platform, and Amazon Web Services (AWS), Code Harbor eliminates redundant code and optimizes performance, reducing manual assessment, conversion, and testing effort by 60% to 80%.
Organizations can now label all Amazon Bedrock models with AWS cost allocation tags , aligning usage to specific organizational taxonomies such as cost centers, business units, and applications. By assigning AWS cost allocation tags, the organization can effectively monitor and track their Bedrock spend patterns.
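A hedged sketch of how such tags might be applied programmatically is shown below, using the Resource Groups Tagging API; the resource ARN and the tag keys (CostCenter, BusinessUnit, Application) are hypothetical examples, not values from the original post.

```python
import boto3

tagging = boto3.client("resourcegroupstaggingapi")

# Hypothetical ARN of a Bedrock resource (for example, an application inference profile).
RESOURCE_ARN = (
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/example-profile"
)

# Cost allocation tags aligned to an illustrative organizational taxonomy.
tagging.tag_resources(
    ResourceARNList=[RESOURCE_ARN],
    Tags={
        "CostCenter": "CC-1234",
        "BusinessUnit": "Claims",
        "Application": "claims-assistant",
    },
)
```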
This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call. Organizations typically can’t predict their call patterns, so the solution relies on AWS serverless services to scale during busy times.
We discuss the unique challenges MaestroQA overcame and how they use AWS to build new features, drive customer insights, and reduce operational inefficiencies. MaestroQA integrated Amazon Bedrock into their existing architecture using Amazon Elastic Container Service (Amazon ECS).
To maximize performance and optimize training, organizations frequently need to employ advanced distributed training strategies. In a transformer architecture, such layers are the embedding layers and the multilayer perceptron (MLP) layers. Context parallelism is supported for Llama (including prior Llama models) and Mistral model architectures.
For some content, additional screening is performed to generate subtitles and captions. The general architecture of the metadata pipeline consists of two primary steps: Generate transcriptions of audio tracks: use speech recognition models to generate accurate transcripts of the audio content.
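The excerpt doesn't name the speech recognition service, but assuming Amazon Transcribe is used for the transcription step, a minimal sketch would look like the following; the job name, S3 URIs, and media format are placeholders for illustration.

```python
import boto3

transcribe = boto3.client("transcribe")

# Hypothetical job name, input object, and output bucket; substitute your own values.
transcribe.start_transcription_job(
    TranscriptionJobName="episode-1234-audio",
    Media={"MediaFileUri": "s3://media-input-bucket/episode-1234.mp4"},
    MediaFormat="mp4",
    LanguageCode="en-US",
    OutputBucketName="media-transcripts-bucket",
)

# Poll the job status until the transcript is written to the output bucket.
status = transcribe.get_transcription_job(TranscriptionJobName="episode-1234-audio")
print(status["TranscriptionJob"]["TranscriptionJobStatus"])
```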
Seamless integration of the latest foundation models (FMs), Prompts, Agents, Knowledge Bases, Guardrails, and other AWS services. They lack visibility into performance bottlenecks affecting customer experience. Configure any auxiliary AWS services needed for your customer service workflow (for example, Amazon DynamoDB for order history, as sketched below).
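As an illustration of one such auxiliary service, the snippet below sketches a DynamoDB table for order history; the table name and key schema are assumptions made for this example, not the schema from the original workflow.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Hypothetical order-history table keyed by customer and order; adjust to your data model.
dynamodb.create_table(
    TableName="order-history",
    AttributeDefinitions=[
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "customer_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "order_id", "KeyType": "RANGE"},     # sort key
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand capacity for unpredictable call patterns
)
```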
At AWS re:Invent 2024, we are excited to introduce Amazon Bedrock Marketplace. Nemotron-4 15B, with its impressive 15-billion-parameter architecture trained on 8 trillion text tokens, brings powerful multilingual and coding capabilities to Amazon Bedrock.
By emphasizing immediate cost-cutting, FinOps often encourages behaviors that compromise long-term goals such as performance, availability, scalability, and sustainability. The result was an architecture with compromised availability. This lack of engagement results in inertia and minimal progress.
Because Amazon Bedrock is serverless, you don’t have to manage infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with. AWS Prototyping developed an AWS Cloud Development Kit (AWS CDK) stack for deployment following AWS best practices.
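The prototyping stack itself isn't reproduced in the excerpt; the sketch below only shows the shape of a minimal AWS CDK (Python) stack with a single illustrative resource, assuming CDK v2. The stack name and bucket are placeholders, not the resources from the actual deployment.

```python
from aws_cdk import App, Stack, aws_s3 as s3
from constructs import Construct

class GenAiPrototypeStack(Stack):
    """Minimal illustrative stack; a real deployment defines many more resources."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Illustrative resource only: a versioned bucket for application assets.
        s3.Bucket(self, "AppAssets", versioned=True)

app = App()
GenAiPrototypeStack(app, "GenAiPrototypeStack")
app.synth()
```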
Enhancing AWS Support Engineering efficiency The AWS Support Engineering team faced the daunting task of manually sifting through numerous tools, internal sources, and AWS public documentation to find solutions for customer inquiries. Then we introduce the solution deployment using three AWS CloudFormation templates.
BQA reviews the performance of all education and training institutions, including schools, universities, and vocational institutes, thereby promoting the professional advancement of the nation's human capital. The following diagram illustrates the solution architecture.