Refer to Supported Regions and models for batch inference for the AWS Regions and models that currently support it. To address this consideration and enhance your use of batch inference, we've developed a scalable solution using AWS Lambda and Amazon DynamoDB. You will need an AWS Region from the list of batch inference supported Regions for Amazon Bedrock.
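To make the batch inference step concrete, here is a minimal sketch of submitting a job with the boto3 Bedrock client; the Region, role ARN, model ID, and S3 URIs are placeholder assumptions, not values from the source.

```python
import boto3

# Assumes the chosen Region supports Amazon Bedrock batch inference and the
# IAM role below can read/write the (placeholder) S3 buckets.
bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_model_invocation_job(
    jobName="my-batch-inference-job",
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://my-input-bucket/prompts/"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://my-output-bucket/results/"}
    },
)
print(response["jobArn"])
```

A Lambda function could wrap this call and record the returned job ARN in DynamoDB to track job status at scale.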
Thinking refers to an internal reasoning process that uses the first output tokens, allowing the model to solve more complex tasks. Native Multi-Agent Architecture: build scalable applications by composing specialized agents in a hierarchy. Built-in Evaluation: systematically assess agent performance.
“AI deployment will also allow for enhanced productivity and increased span of control by automating and scheduling tasks, reporting and performance monitoring for the remaining workforce which allows remaining managers to focus on more strategic, scalable and value-added activities.”
Although an individual LLM can be highly capable, it might not optimally address a wide range of use cases or meet diverse performance requirements. In contrast, more complex questions might require the application to summarize a lengthy dissertation by performing deeper analysis, comparison, and evaluation of the research results.
The company says it can achieve PhD-level performance in challenging benchmark tests in physics, chemistry, and biology. "In these use cases, we have enough reference implementations to point to and say, 'There's value to be had here.' If it goes through all of those gates, only then do you let the agent do it autonomously," says Hodjat.
Building applications from individual components that each perform a discrete function helps you scale more easily and change applications more quickly. Inline mapping The inline map functionality allows you to perform parallel processing of array elements within a single Step Functions state machine execution.
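As a sketch of the inline map pattern, the following Amazon States Language definition (expressed as a Python dict) processes each element of an input array in parallel within a single Step Functions execution; the state names and the items path are illustrative assumptions.

```python
import json

# Illustrative ASL definition: the Map state with an INLINE item processor
# fans out over $.items inside the same state machine execution.
definition = {
    "StartAt": "ProcessItems",
    "States": {
        "ProcessItems": {
            "Type": "Map",
            "ItemsPath": "$.items",
            "ItemProcessor": {
                "ProcessorConfig": {"Mode": "INLINE"},
                "StartAt": "HandleItem",
                "States": {
                    "HandleItem": {
                        "Type": "Pass",  # replace with a Task state in practice
                        "End": True,
                    }
                },
            },
            "End": True,
        }
    },
}

print(json.dumps(definition, indent=2))
```

This definition could be passed to the Step Functions CreateStateMachine API; each array element is handled by its own iteration of the inner workflow.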
Tech roles are rarely performed in isolation. Example: A candidate might perform well in a calm, structured interview environment but struggle to collaborate effectively in high-pressure, real-world scenarios like product launches or tight deadlines. Why do interpersonal skills matter in tech hiring?
In this post, we demonstrate how to effectively perform model customization and RAG with Amazon Nova models as a baseline. Model customization refers to adapting a pre-trained language model to better fit specific tasks, domains, or datasets. Optimized for cost-effective performance, they are trained on data in over 200 languages.
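As an illustration of programmatic model customization, the sketch below starts a fine-tuning job with the boto3 Bedrock client; the base model identifier, S3 URIs, role ARN, and hyperparameters are placeholder assumptions rather than values from the post.

```python
import boto3

bedrock = boto3.client("bedrock")

# Start a model customization (fine-tuning) job; every name and URI here is a placeholder.
response = bedrock.create_model_customization_job(
    jobName="nova-customization-job",
    customModelName="my-custom-nova",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.nova-lite-v1:0",  # assumed base model identifier
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    hyperParameters={"epochCount": "2", "learningRate": "0.00001"},
)
print(response["jobArn"])
```

The training data would be a JSONL file of prompt/completion pairs; RAG can then be layered on top of the customized model through a knowledge base.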
A recent evaluation conducted by FloTorch compared the performance of Amazon Nova models with OpenAI's GPT-4o. Amazon Nova is a new generation of state-of-the-art foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance. Hemant Joshi, CTO, FloTorch.ai. Each provisioned node was an r7g.4xlarge instance.
While multi-cloud generally refers to the use of multiple cloud providers, hybrid encompasses both cloud and on-premises integrations, as well as multi-cloud setups. The scalable cloud infrastructure optimized costs, reduced customer churn, and enhanced marketing efficiency through improved customer segmentation and retention models.
For generative AI models requiring multiple instances to handle high-throughput inference requests, this added significant overhead to the total scaling time, potentially impacting application performance during traffic spikes. We ran 5+ scaling simulations and observed consistent performance with low variations across trials.
The following figure illustrates the performance of DeepSeek-R1 compared to other state-of-the-art models on standard benchmark tests, such as MATH-500, MMLU, and more. To learn more about Hugging Face TGI support on Amazon SageMaker AI, refer to this announcement post and this documentation on deploying models to Amazon SageMaker AI.
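A minimal sketch of deploying a distilled DeepSeek-R1 variant with the Hugging Face TGI container on SageMaker follows; the model ID, container version, and instance type are assumptions and may need adjusting for your account.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role is available

# TGI (LLM) container image; the version string is an assumption.
image_uri = get_huggingface_llm_image_uri("huggingface", version="2.0.2")

model = HuggingFaceModel(
    image_uri=image_uri,
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",  # placeholder model ID
        "SM_NUM_GPUS": "1",
    },
    role=role,
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")
print(predictor.predict({"inputs": "Solve: 2x + 3 = 11", "parameters": {"max_new_tokens": 128}}))
```

Larger distilled variants would need a bigger instance type and more GPUs per replica.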
Governance in the context of generative AI refers to the frameworks, policies, and processes that streamline the responsible development, deployment, and use of these technologies. For a comprehensive read about vector store and embeddings, you can refer to The role of vector databases in generative AI applications.
In today's fast-paced digital landscape, the cloud has emerged as a cornerstone of modern business infrastructure, offering unparalleled scalability, agility, and cost-efficiency. Cracking this aspect of cloud optimization is the most critical piece for enterprises to strike gold with the scalability of AI solutions.
We demonstrate how to harness the power of LLMs to build an intelligent, scalable system that analyzes architecture documents and generates insightful recommendations based on AWS Well-Architected best practices. This scalability allows for more frequent and comprehensive reviews.
Amazon Bedrock Model Distillation is generally available, and it addresses the fundamental challenge many organizations face when deploying generative AI: how to maintain high performance while reducing costs and latency. This provides optimal performance by maintaining the same structure the model was trained on.
This is particularly beneficial for tasks like automatically processing receipts or invoices, where it can perform calculations and context-aware evaluations, streamlining processes such as expense tracking or financial analysis. It can effortlessly identify trends, anomalies, and key data points within graphical visualizations.
Alex Tabor, Paul Ascher, and Juan Pascual met on the engineering team of Peixe Urbano, a company Tabor co-founded and referred to as a “Groupon for Brazil.” Tuna is on a mission to “fine tune” the payments space in Latin America and has raised two seed rounds totaling $3 million, led by Canary and by Atlantico.
Shared components refer to the functionality and features shared by all tenants. If it leads to better performance, your existing default prompt in the application is overridden with the new one. Refer to Perform AI prompt-chaining with Amazon Bedrock for more details. This logic sits in a hybrid search component.
In this post, we explore advanced prompt engineering techniques that can enhance the performance of these models and facilitate the creation of compelling imagery through text-to-image transformations. Medium – this refers to the material or technique used in creating the artwork. A photo of a (red:1.2) …
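To make the weighting notation concrete, here is a hedged sketch that sends a weighted prompt to a Stability SDXL model on Amazon Bedrock. The model ID, request fields, and the full prompt text are assumptions; the (red:1.2) form is a common attention-weighting convention, and whether a given model honors it depends on the model.

```python
import base64
import json

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Hypothetical prompt: "(red:1.2)" emphasizes the token "red" by a factor of 1.2.
prompt = "A photo of a (red:1.2) vintage sports car on a coastal road, golden hour, 35mm"

body = {
    "text_prompts": [{"text": prompt, "weight": 1.0}],
    "cfg_scale": 7,
    "steps": 30,
}

response = bedrock_runtime.invoke_model(
    modelId="stability.stable-diffusion-xl-v1",  # assumed model ID
    body=json.dumps(body),
)
result = json.loads(response["body"].read())
image_bytes = base64.b64decode(result["artifacts"][0]["base64"])

with open("output.png", "wb") as f:
    f.write(image_bytes)
```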
As successful proof-of-concepts transition into production, organizations are increasingly in need of enterprise scalable solutions. For details on all the fields and providing configuration of various vector stores supported by Knowledge Bases for Amazon Bedrock, refer to AWS::Bedrock::KnowledgeBase.
How does High-Performance Computing on AWS differ from regular computing? For this, HPC brings massive parallel computing, cluster and workload managers, and high-performance components to the table. It provides a powerful and scalable platform for executing large-scale batch jobs with minimal setup and management overhead.
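For example, one common way to run large-scale batch workloads on AWS is to submit jobs to AWS Batch; the sketch below shows a minimal submission with boto3, where the queue, job definition, and command are placeholder assumptions.

```python
import boto3

batch = boto3.client("batch")

# Submit a containerized job to an existing job queue; all names are placeholders.
response = batch.submit_job(
    jobName="hpc-simulation-001",
    jobQueue="my-hpc-job-queue",
    jobDefinition="my-hpc-job-definition",
    containerOverrides={
        "command": ["python", "run_simulation.py", "--steps", "1000"],
    },
)
print(response["jobId"])
```

The service handles scheduling and scaling of the underlying compute, which is what keeps the setup and management overhead low.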
There is an increasing need for scalable, reliable, and cost-effective solutions to deploy and serve these models. AWS Trainium and AWS Inferentia based instances, combined with Amazon Elastic Kubernetes Service (Amazon EKS), provide a performant and low-cost framework to run LLMs efficiently in a containerized environment.
These models are tailored to perform specialized tasks within specific domains or micro-domains. This challenge is further compounded by concerns over scalability and cost-effectiveness. They can host the different variants on a single EC2 instance instead of a fleet of model endpoints, saving costs without impacting performance.
An agent uses a function call to invoke an external tool (like an API or database) to perform specific actions or retrieve information it doesn't possess internally. These tools are integrated as an API call inside the agent itself, leading to challenges in scaling and tool reuse across an enterprise.
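To make the function-calling pattern concrete, here is a minimal, library-agnostic sketch of an agent-side dispatcher that executes a tool call requested by a model; the tool registry and the JSON shape of the model output are illustrative assumptions, not a specific framework's API.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tool registry: maps tool names the model may request to local functions.
def get_weather(city: str) -> Dict[str, Any]:
    # A real agent would call an external API here; this returns a stub result.
    return {"city": city, "forecast": "sunny", "high_c": 24}

TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"get_weather": get_weather}

def handle_tool_call(model_output: str) -> str:
    """Parse a (hypothetical) JSON tool call emitted by the model and execute it."""
    call = json.loads(model_output)  # e.g. {"tool": "get_weather", "arguments": {"city": "Berlin"}}
    tool = TOOLS[call["tool"]]
    result = tool(**call["arguments"])
    # The serialized result would normally be fed back to the model as a tool message.
    return json.dumps(result)

print(handle_tool_call('{"tool": "get_weather", "arguments": {"city": "Berlin"}}'))
```

Centralizing such a registry behind a shared service is one way to address the scaling and tool-reuse challenges mentioned above.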
Types of Workflows: Types of workflows refer to the method or structure of task execution, while categories of workflows refer to the purpose or context in which they are used. They define the order in which tasks are performed. Manual Workflows: These are processes that require human intervention at each step.
Their DeepSeek-R1 models represent a family of large language models (LLMs) designed to handle a wide range of tasks, from code generation to general reasoning, while maintaining competitive performance and efficiency. 70B-Instruct) offer different trade-offs between performance and resource requirements.
The agents also automatically call APIs to perform actions and access knowledge bases to provide additional information. Effective agent instructions are crucial for optimizing the performance of AI-powered assistants. For more information, refer to the PowerTools documentation on Amazon Bedrock Agents.
Sovereign AI refers to a national or regional effort to develop and control artificial intelligence (AI) systems, independent of the large non-EU foreign private tech platforms that currently dominate the field. This includes high-performance computing (GPUs), data centers, and energy.
If you don’t have an AWS account, refer to How do I create and activate a new Amazon Web Services account? If you don’t have an existing knowledge base, refer to Create an Amazon Bedrock knowledge base. Performance optimization The serverless architecture used in this post provides a scalable solution out of the box.
We present the reinforcement learning process and the benchmarking results to demonstrate the LLM performance improvement. You can refer to further explanations in the following resource: ARS GEN 10.0/05.01.02. Design Criteria & Appendices/Performance Package AR Sortable Design Criteria v20.1.1.pdf
Give each secret a clear name, as you'll use these names to reference them in Synapse. Add a Linked Service to the pipeline that references the Key Vault. When setting up a linked service for these sources, reference the names of the secrets stored in Key Vault instead of hard-coding the credentials.
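The same principle, referencing secrets by name rather than embedding credentials, is shown in this minimal Python sketch using the Azure Key Vault SDK; the vault URL and secret name are placeholders, and in Synapse itself the secret name would instead be referenced from the Key Vault linked service.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Placeholder vault URL and secret name; no credentials are hard-coded here.
vault_url = "https://my-key-vault.vault.azure.net"
client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())

connection_string = client.get_secret("warehouse-connection-string").value
print("Retrieved secret:", "***" if connection_string else "missing")
```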
They are committed to enhancing the performance and capabilities of AI models, with a particular focus on large language models (LLMs) for use with Einstein product offerings. LMI containers are a set of high-performance Docker containers purpose-built for LLM inference. When the team initially deployed CodeGen 2.5,
To accelerate iteration and innovation in this field, sufficient computing resources and a scalable platform are essential. With these capabilities, customers are adopting SageMaker HyperPod as their innovation platform for more resilient and performant model training, enabling them to build state-of-the-art models faster.
Asure anticipated that generative AI could help contact center leaders understand their teams' support performance, identify gaps and pain points in their products, and recognize the most effective strategies for training customer support representatives using call transcripts. For example, Anthropic's Claude Sonnet 3.5
For some content, additional screening is performed to generate subtitles and captions. As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics.
Observability refers to the ability to understand the internal state and behavior of a system by analyzing its outputs, logs, and metrics. For a detailed breakdown of the features and implementation specifics, refer to the comprehensive documentation in the GitHub repository.
It arrives alongside the announcement of SAP’s Open Reference Architecture project as part of the EU’s IPCEI-CIS initiative. Organizations are choosing these platforms based on effective cost, performance, and scalability.”
AWS Prototyping successfully delivered a scalable prototype, which solved CBRE’s business problem with a high accuracy rate (over 95%) and supported reuse of embeddings for similar NLQs, and an API gateway for integration into CBRE’s dashboards. CBRE, in parallel, completed UAT testing to confirm it performed as expected.
Similarly, when an incident occurs in IT, the responding team must provide a precise, documented history for future reference and troubleshooting. In his current role, he partners with AWS customers to design and implement scalable, secure, and cost-effective solutions on the AWS platform. Anthropic's Claude 3.5
The model demonstrates improved performance in image quality, typography, and complex prompt understanding. Finally, use the generated images as reference material for 3D artists to create fully realized game environments. For instructions, refer to Clean up Amazon SageMaker notebook instance resources.
Key features of the release include: Customizable project templates for LLM output evaluation with support for HTML content, including hyperlinks to references. Two modes are supported: individual and side-by-side response evaluation. Inter-Annotator Agreement (IAA) charts are also available for those projects.
As these AI technologies become more sophisticated and widely adopted, maintaining consistent quality and performance becomes increasingly complex. For applications requiring high-performance content generation with lower latency and cost, model distillation can be an effective way to create a generator model, for example.
But when the size of a dbt project grows and the number of developers increases, an automated approach is often the only scalable way forward. What other checks can dbt-bouncer perform? check_exposure_based_on_view ensures exposures are not based on views, as this may result in poor performance for data consumers.