Add to this the escalating costs of maintaining legacy systems, which often act as bottlenecks for scalability. The latter option had emerged as a compelling solution, offering the promise of enhanced agility, reduced operational costs, and seamless scalability.
With QnABot on AWS (QnABot), integrated with Microsoft Entra ID access controls, Principal launched an intelligent self-service solution rooted in generative AI. Principal also used the AWS open source repository Lex Web UI to build a frontend chat interface with Principal branding.
It also uses a number of other AWS services, such as Amazon API Gateway, AWS Lambda, and Amazon SageMaker. You can use AWS services such as Application Load Balancer to implement this approach. On AWS, you can use the fully managed Amazon Bedrock Agents or tools of your choice, such as LangChain agents or LlamaIndex agents.
There is an increasing need for scalable, reliable, and cost-effective solutions to deploy and serve these models. AWS Trainium and AWS Inferentia based instances, combined with Amazon Elastic Kubernetes Service (Amazon EKS), provide a performant, low-cost framework to run LLMs efficiently in a containerized environment.
David Copland, from QARC, and Scott Harding, a person living with aphasia, used AWS services to develop WordFinder, a mobile, cloud-based solution that helps individuals with aphasia increase their independence through the use of AWS generative AI technology. The following diagram illustrates the solution architecture on AWS.
Without a scalable approach to controlling costs, organizations risk unbudgeted usage and cost overruns. Organizations can now label all Amazon Bedrock models with AWS cost allocation tags, aligning usage to specific organizational taxonomies such as cost centers, business units, and applications.
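As a rough sketch of what that labeling can look like with the AWS SDK for Python (boto3), usage is routed through an application inference profile that carries the tags; the profile name, model ARN, and tag values below are illustrative placeholders, not values from the original post.

```python
import boto3

bedrock = boto3.client("bedrock")

# Create an application inference profile wrapping a foundation model so that
# invocations through it can be attributed to a cost center (names illustrative).
profile = bedrock.create_inference_profile(
    inferenceProfileName="marketing-summarization",  # placeholder name
    modelSource={
        "copyFrom": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-lite-v1:0"
    },
    tags=[
        {"key": "CostCenter", "value": "CC-1234"},      # placeholder taxonomy
        {"key": "BusinessUnit", "value": "Marketing"},
    ],
)
print(profile["inferenceProfileArn"])
```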
As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics. Word information lost (WIL) – This metric quantifies the amount of information lost due to transcription errors.
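For illustration, WIL can be computed with the open source jiwer library; the reference and hypothesis transcripts below are made-up examples.

```python
# pip install jiwer
import jiwer

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"

# Word Information Lost = 1 - (H/N1) * (H/N2), where H is the number of
# correctly recognized words, N1 the reference length, N2 the hypothesis length.
wil = jiwer.wil(reference, hypothesis)
print(f"WIL: {wil:.3f}")  # 0.0 = no information lost, 1.0 = everything lost
```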
Observability refers to the ability to understand the internal state and behavior of a system by analyzing its outputs, logs, and metrics. Security – The solution uses AWS services and adheres to AWS Cloud Security best practices so your data remains within your AWS account.
Solution overview – To evaluate the effectiveness of RAG compared to model customization, we designed a comprehensive testing framework using a set of AWS-specific questions. Our study used Amazon Nova Micro and Amazon Nova Lite as baseline FMs and tested their performance across different configurations. To do so, we create a knowledge base.
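A minimal sketch of querying such a knowledge base through the Bedrock RetrieveAndGenerate API with boto3; the knowledge base ID, model ARN, and question are placeholder assumptions.

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

# Ask a question against an existing knowledge base (IDs are placeholders).
response = client.retrieve_and_generate(
    input={"text": "What instance types does SageMaker support for inference?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-lite-v1:0",
        },
    },
)
print(response["output"]["text"])  # grounded answer; citations are also returned
```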
At Data Reply and AWS, we are committed to helping organizations embrace the transformative opportunities generative AI presents, while fostering the safe, responsible, and trustworthy development of AI systems. Post-authentication, users access the UI Layer, a gateway to the Red Teaming Playground built on AWS Amplify and React.
Although automated metrics are fast and cost-effective, they can only evaluate the correctness of an AI response, without capturing other evaluation dimensions or providing explanations of why an answer is problematic. Human evaluation, although thorough, is time-consuming and expensive at scale.
In this post, we explore how to deploy distilled versions of DeepSeek-R1 with Amazon Bedrock Custom Model Import, making them accessible to organizations looking to use state-of-the-art AI capabilities within the secure and scalable AWS infrastructure at an effective cost. Review the model response and metrics provided.
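Once an import job completes, the imported model is invoked through the Bedrock runtime by its model ARN. A hedged sketch follows; the ARN is a placeholder, and the request body format depends on the model family you imported.

```python
import json
import boto3

runtime = boto3.client("bedrock-runtime")

# Invoke an imported custom model by ARN (placeholder). The body schema here
# assumes a Llama-style prompt interface; adjust to your imported model.
response = runtime.invoke_model(
    modelId="arn:aws:bedrock:us-east-1:111122223333:imported-model/abcd1234",
    body=json.dumps({
        "prompt": "Explain chain-of-thought reasoning briefly.",
        "max_gen_len": 256,
    }),
)
print(json.loads(response["body"].read()))
```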
This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call. All of this data is centralized and can be used to improve metrics in scenarios such as sales or call centers.
The Asure team was manually analyzing thousands of call transcripts to uncover themes and trends, a process that lacked scalability. Our partnership with AWS and our commitment to be early adopters of innovative technologies like Amazon Bedrock underscore our dedication to making advanced HCM technology accessible for businesses of any size.
We discuss the unique challenges MaestroQA overcame and how they use AWS to build new features, drive customer insights, and reduce operational inefficiencies. Amazon Bedrock's broad choice of FMs from leading AI companies, along with its scalability and security features, made it an ideal solution for MaestroQA.
The collaboration between BQA and AWS was facilitated through the Cloud Innovation Center (CIC) program, a joint initiative by AWS, Tamkeen, and leading universities in Bahrain, including Bahrain Polytechnic and University of Bahrain. The extracted text data is placed into another SQS queue for the next processing step.
SageMaker Unified Studio combines various AWS services, including Amazon Bedrock, Amazon SageMaker, Amazon Redshift, AWS Glue, Amazon Athena, and Amazon Managed Workflows for Apache Airflow (MWAA), into a comprehensive data and AI development platform. Navigate to the AWS Secrets Manager console and find the secret -api-keys.
In today's fast-paced digital landscape, the cloud has emerged as a cornerstone of modern business infrastructure, offering unparalleled scalability, agility, and cost-efficiency. First, cloud provisioning through automation is better with AWS CloudFormation and Azure Resource Manager compared to the other cloud providers.
This surge is driven by the rapid expansion of cloud computing and artificial intelligence, both of which are reshaping industries and enabling unprecedented scalability and innovation. GreenOps incorporates financial, environmental, and operational metrics, ensuring a balanced strategy that aligns with broader organizational goals.
We recommend referring to Submit a model distillation job in Amazon Bedrock in the official AWS documentation for the most up-to-date and comprehensive information. You can track these job status details in both the AWS Management Console and the AWS SDK. Prior to joining AWS, he obtained his Ph.D. David received an M.S.
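Since distillation runs as a model customization job, SDK-side status tracking might look roughly like the following boto3 loop; the job identifier is a placeholder.

```python
import time
import boto3

bedrock = boto3.client("bedrock")

# Poll a distillation (model customization) job until it reaches a terminal state.
while True:
    job = bedrock.get_model_customization_job(jobIdentifier="my-distillation-job")
    status = job["status"]
    print(f"Job status: {status}")
    if status in ("Completed", "Failed", "Stopped"):
        break
    time.sleep(60)  # avoid hammering the API while the job runs
```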
Large organizations often have many business units with multiple lines of business (LOBs), with a central governing entity, and typically use AWS Organizations with an Amazon Web Services (AWS) multi-account strategy. LOBs have autonomy over their AI workflows, models, and data within their respective AWS accounts.
OpenAI launched GPT-4o in May 2024, and Amazon introduced Amazon Nova models at AWS re:Invent in December 2024. How do Amazon Nova Micro and Amazon Nova Lite perform against GPT-4o mini in these same metrics? Vector database FloTorch selected Amazon OpenSearch Service as a vector database for its high-performance metrics.
Our proposed architecture provides a scalable and customizable solution for online LLM monitoring, enabling teams to tailor the monitoring solution to their specific use cases and requirements. Overview of solution – The first thing to consider is that different metrics require different computation considerations.
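As one illustration of the computation side, a per-request latency metric can be published to Amazon CloudWatch for dashboards and alarms; the namespace, metric name, and dimension below are assumptions for this sketch, not the architecture's actual names.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def record_llm_latency(model_id: str, start_s: float, end_s: float) -> None:
    """Publish one request's latency so it can be graphed and alarmed on."""
    cloudwatch.put_metric_data(
        Namespace="LLM/Monitoring",  # illustrative namespace
        MetricData=[{
            "MetricName": "InvocationLatency",
            "Dimensions": [{"Name": "ModelId", "Value": model_id}],
            "Value": (end_s - start_s) * 1000.0,
            "Unit": "Milliseconds",
        }],
    )
```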
Amazon SageMaker AI provides a managed way to deploy TGI-optimized models, offering deep integration with Hugging Face's inference stack for scalable and cost-efficient LLM deployment. Optimizing these metrics directly enhances user experience, system reliability, and deployment feasibility at scale. xlarge across all metrics.
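A minimal sketch of such a TGI deployment with the SageMaker Python SDK; the Hugging Face model ID and instance type are placeholders, not recommendations from the post.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role exists

# Wrap a Hugging Face hub model in the TGI (text-generation-inference) container.
model = HuggingFaceModel(
    role=role,
    image_uri=get_huggingface_llm_image_uri("huggingface"),
    env={
        "HF_MODEL_ID": "my-org/my-llm",  # placeholder hub model ID
        "SM_NUM_GPUS": "1",
    },
)
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")
print(predictor.predict({"inputs": "Hello", "parameters": {"max_new_tokens": 64}}))
```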
Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. To run this benchmark, we use sub-minute metrics to detect the need for scaling. The following table summarizes our setup.
In the current digital environment, migration to the cloud has emerged as an essential tactic for companies aiming to boost scalability, enhance operational efficiency, and reinforce resilience. Mobilunity has helped businesses worldwide hire dedicated development teams for 14+ years.
With Amazon Bedrock Data Automation, enterprises can accelerate AI adoption and develop solutions that are secure, scalable, and responsible. Cross-Region inference enables seamless management of unplanned traffic bursts by using compute across different AWS Regions. For example, a request made in the US stays within Regions in the US.
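In practice, cross-Region inference is selected by passing a geography-prefixed inference profile ID as the model ID; a small sketch using the Converse API follows, with an illustrative model ID.

```python
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# The "us." prefix selects a US cross-Region inference profile, so traffic
# bursts are served by capacity across US Regions (model ID is illustrative).
response = runtime.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize cross-Region inference in one sentence."}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```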
With this launch, you can now access Mistral's frontier-class multimodal model to build, experiment, and responsibly scale your generative AI ideas on AWS. AWS is the first major cloud provider to deliver Pixtral Large as a fully managed, serverless model. Take a look at the Mistral-on-AWS repo.
The cloud, particularly Amazon Web Services (AWS), has made storing vast amounts of data simpler than ever before. S3 Storage – Undoubtedly, anyone who uses AWS will inevitably encounter S3, one of the platform’s most popular storage services. The following table gives you an overview of AWS storage costs.
As successful proof-of-concepts transition into production, organizations are increasingly in need of enterprise scalable solutions. The AWS Well-Architected Framework provides best practices and guidelines for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud.
An AWS Batch job reads these documents, chunks them into smaller slices, then creates embeddings of the text chunks using the Amazon Titan Text Embeddings model through Amazon Bedrock and stores them in an Amazon OpenSearch Service vector database. In the future, Verisk intends to use the Amazon Titan Embeddings V2 model.
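A simplified sketch of that chunk-embed-index step; the chunking strategy, model ID, index name, and OpenSearch endpoint are assumptions, and the target index would need a k-NN vector mapping (auth is omitted for brevity).

```python
import json
import boto3
from opensearchpy import OpenSearch  # pip install opensearch-py

bedrock = boto3.client("bedrock-runtime")
opensearch = OpenSearch(
    hosts=[{"host": "search-my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)

def chunk(text: str, size: int = 1000) -> list[str]:
    # Naive fixed-size chunking; real pipelines typically split on semantic
    # boundaries and overlap adjacent chunks.
    return [text[i:i + size] for i in range(0, len(text), size)]

def index_document(doc_id: str, text: str) -> None:
    for i, piece in enumerate(chunk(text)):
        resp = bedrock.invoke_model(
            modelId="amazon.titan-embed-text-v1",
            body=json.dumps({"inputText": piece}),
        )
        embedding = json.loads(resp["body"].read())["embedding"]
        opensearch.index(
            index="doc-chunks",
            id=f"{doc_id}-{i}",
            body={"text": piece, "vector": embedding},
        )
```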
Because Amazon Bedrock is serverless, you don’t have to manage infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with. AWS Prototyping developed an AWS Cloud Development Kit (AWS CDK) stack for deployment following AWS best practices.
At AWS, we are transforming our seller and customer journeys by using generative artificial intelligence (AI) across the sales lifecycle. It will be able to answer questions, generate content, and facilitate bidirectional interactions, all while continuously using internal AWS and external data to deliver timely, personalized insights.
This is where AWS and generative AI can revolutionize the way we plan and prepare for our next adventure. This innovative service goes beyond traditional trip planning methods, offering real-time interaction through a chat-based interface and maintaining scalability, reliability, and data security through AWS native services.
By integrating this model with Amazon SageMaker AI, you can benefit from AWS's scalable infrastructure while maintaining high-quality language model capabilities. Solution overview – You can use DeepSeek's distilled models within the AWS managed machine learning (ML) infrastructure. For details, refer to Create an AWS account.
In his latest TC+ post, Michael Perez, director of growth and data at VC firm M13, shares five questions he uses to devise pricing strategy frameworks, along with three value metrics and a detailed measurement plan for GTM strategy. How your company can adopt a usage-based business model like AWS. Here’s why.
DataJunction: Unifying Experimentation and Analytics (Yian Shang, Anh Le) – At Netflix, like in many organizations, creating and using metrics is often more complex than it should be. DJ acts as a central store where metric definitions can live and evolve. As an example, imagine an analyst wanting to create a Total Streaming Hours metric.
These recipes include a training stack validated by Amazon Web Services (AWS), which removes the tedious work of experimenting with different model configurations, minimizing the time it takes for iterative evaluation and testing. Alternatively, you can also use AWS Systems Manager and run a command like the following to start the session.
You marked your calendars, you booked your hotel, and you even purchased the airfare. Yes, the AWS re:Invent season is upon us and, as always, the place to be is Las Vegas! Generative AI is at the heart of the AWS Village this year. And last but not least (and always fun!) are the sessions dedicated to AWS DeepRacer!
Prerequisites – To build the solution yourself, you need an AWS account with an AWS Identity and Access Management (IAM) role that has permissions to manage resources created as part of the solution (for example, AmazonSageMakerFullAccess and AmazonS3FullAccess).
Some observability platforms are approaching AWS levels of pricing complexity these days. Get your free copy of Charity’s Cost Crisis in Metrics Tooling whitepaper. Metrics-heavy shops are used to blaming custom metrics for their cost spikes, and for good reason. The answer, of course, is it’s complicated.
Kimball says the company wasn’t necessarily looking to raise, although he knew that it would continue to need more cash on the balance sheet to run with giant competitors like Oracle, AWS and the other big cloud vendors, along with a slew of other database startups. Series D as scalable database resonates.
To accelerate iteration and innovation in this field, sufficient computing resources and a scalable platform are essential. SageMaker HyperPod provides several key features and advantages in the scalable training architecture. The SageMaker option thus offers a powerful, efficient, and scalable solution for video generation workflows.
With deterministic evaluation processes such as the Factual Knowledge and QA Accuracy metrics of FMEval, ground truth generation and evaluation metric implementation are tightly coupled. To learn more about FMEval, see Evaluate large language models for quality and responsibility.
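For a per-sample flavor of that coupling, here is a hedged sketch with the open source FMEval library; the target and model outputs, and the `<OR>` delimiter for alternative acceptable answers, are illustrative.

```python
# pip install fmeval
from fmeval.eval_algorithms.factual_knowledge import (
    FactualKnowledge,
    FactualKnowledgeConfig,
)

# Score one model answer against ground truth; "<OR>" separates acceptable
# alternative target answers.
algo = FactualKnowledge(FactualKnowledgeConfig(target_output_delimiter="<OR>"))
scores = algo.evaluate_sample(
    target_output="Paris<OR>the city of Paris",
    model_output="The capital of France is Paris.",
)
print(scores)  # list of EvalScore objects with a 0/1 factual-knowledge score
```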