Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference
DECEMBER 2, 2024
Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. The implementation of Container Caching for running Llama 3.1