Architecture, AWS and Training

Reduce ML training costs with Amazon SageMaker HyperPod

AWS Machine Learning - AI

APRIL 10, 2025

Training a frontier model is highly compute-intensive, requiring a distributed system of hundreds, or thousands, of accelerated instances running for several weeks or months to complete a single job. For example, pre-training the Llama 3 70B model with 15 trillion training tokens took 6.5 During the training of Llama 3.1

Training

Training Artificial Intelligence Hardware Systems Review

Accelerate AWS Well-Architected reviews with Generative AI

AWS Machine Learning - AI

MARCH 4, 2025

To achieve these goals, the AWS Well-Architected Framework provides comprehensive guidance for building and improving cloud architectures. This allows teams to focus more on implementing improvements and optimizing AWS infrastructure. This systematic approach leads to more reliable and standardized evaluations.

Generative AI

Generative AI Technical Review Software Review Systems Review

Efficiently train models with large sequence lengths using Amazon SageMaker model parallel

AWS Machine Learning - AI

NOVEMBER 27, 2024

Across diverse industries—including healthcare, finance, and marketing—organizations are now engaged in pre-training and fine-tuning these increasingly larger LLMs, which often boast billions of parameters and larger input sequence length. This approach reduces memory pressure and enables efficient training of large models.

Training

Training Artificial Intelligence AWS Machine Learning

Multi-LLM routing strategies for generative AI applications on AWS

AWS Machine Learning - AI

APRIL 9, 2025

The Pro tier, however, would require a highly customized LLM that has been trained on specific data and terminology, enabling it to assist with intricate tasks like drafting complex legal documents. This architecture workflow includes the following steps: A user submits a question through a web or mobile application. 70B and 8B.

Artificial Intelligence

Artificial Intelligence Generative AI AWS Applications

PEFT fine tuning of Llama 3 on SageMaker HyperPod with AWS Trainium

AWS Machine Learning - AI

DECEMBER 24, 2024

Training large language models (LLMs) has become a significant expense for businesses. PEFT is a set of techniques designed to adapt pre-trained LLMs to specific tasks while minimizing the number of parameters that need to be updated.

AWS

AWS Artificial Intelligence Generative AI Training

Introducing AWS MCP Servers for code assistants (Part 1)

AWS Machine Learning - AI

APRIL 1, 2025

We're excited to announce the open source release of AWS MCP Servers for code assistants, a suite of specialized Model Context Protocol (MCP) servers that bring Amazon Web Services (AWS) best practices directly to your development workflow. This post is the first in a series covering AWS MCP Servers.

AWS

AWS Software Review Knowledge Base Artificial Intelligence

Using Amazon Q Business with AWS HealthScribe to gain insights from patient consultations

AWS Machine Learning - AI

OCTOBER 17, 2024

During re:Invent 2023, we launched AWS HealthScribe, a HIPAA-eligible service that empowers healthcare software vendors to build their clinical applications to use speech recognition and generative AI to automatically create preliminary clinician documentation.

AWS

AWS Artificial Intelligence Generative AI Machine Learning

Build a multi-tenant generative AI environment for your enterprise on AWS

AWS Machine Learning - AI

NOVEMBER 7, 2024

It also uses a number of other AWS services such as Amazon API Gateway, AWS Lambda, and Amazon SageMaker. You can use AWS services such as Application Load Balancer to implement this approach. You can also bring your own customized models and deploy them to Amazon Bedrock for supported architectures.

Generative AI

Generative AI AWS Enterprise Artificial Intelligence

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning - AI

NOVEMBER 15, 2024

With the QnABot on AWS (QnABot), integrated with Microsoft Azure Entra ID access controls, Principal launched an intelligent self-service solution rooted in generative AI. Principal also used the AWS open source repository Lex Web UI to build a frontend chat interface with Principal branding.

Generative AI

Generative AI AWS Groups Artificial Intelligence

IT leaders: What’s the gameplan as tech badly outpaces talent?

CIO

MARCH 13, 2025

He’s seeing the need for professionals who can not only navigate the technology itself, but also manage increasing complexities around its surrounding architectures, data sets, infrastructure, applications, and overall security. “Or bring in a consulting company that can help you build and train at the same time,” he adds.

Part-Time VPE

Part-Time VPE Weak Development Team Fractional VPE Fractional CTO

Build a reverse image search engine with Amazon Titan Multimodal Embeddings in Amazon Bedrock and AWS managed services

AWS Machine Learning - AI

NOVEMBER 13, 2024

The following diagram illustrates the solution architecture: The steps of the solution include: Upload data to Amazon S3: Store the product images in Amazon Simple Storage Service (Amazon S3). The AWS Command Line Interface (AWS CLI) installed on your machine to upload the dataset to Amazon S3.

AWS

AWS Engineering Serverless eCommerce

Navigating the cloud maze: A 5-phase approach to optimizing cloud strategies

CIO

JANUARY 7, 2025

It prevents vendor lock-in, gives a lever for strong negotiation, enables business flexibility in strategy execution where complicated architecture or regional security and legal compliance limitations arise, and promotes portability from an application architecture perspective.

Cloud

Cloud Strategy Architecture Policies

Integrate foundation models into your code with Amazon Bedrock

AWS Machine Learning - AI

NOVEMBER 6, 2024

These powerful models, trained on vast amounts of data, can generate human-like text, answer questions, and even engage in creative writing tasks. However, training and deploying such models from scratch is a complex and resource-intensive process, often requiring specialized expertise and significant computational resources.

Software Review

Software Review Artificial Intelligence Generative AI AWS

Responsible AI in action: How Data Reply red teaming supports generative AI safety on AWS

AWS Machine Learning - AI

APRIL 29, 2025

At Data Reply and AWS, we are committed to helping organizations embrace the transformative opportunities generative AI presents, while fostering the safe, responsible, and trustworthy development of AI systems. These potential vulnerabilities could be exploited by adversaries through various threat vectors.

Generative AI

Generative AI Weak Development Team AWS Artificial Intelligence

Model customization, RAG, or both: A case study with Amazon Nova

AWS Machine Learning - AI

APRIL 10, 2025

Demystifying RAG and model customization: RAG is a technique to enhance the capability of pre-trained models by allowing the model access to external domain-specific data sources. Unlike fine-tuning, in RAG, the model doesn't undergo any training and the model weights aren't updated to learn the domain knowledge.
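As a rough illustration of the idea in the excerpt above, the following sketch retrieves the most relevant passage from an external document list and places it in the prompt, leaving the model untouched. It is a toy under stated assumptions: the function names (`score`, `retrieve`, `build_prompt`), the word-overlap scoring, and the sample documents are all hypothetical and are not taken from the linked post; production systems typically use embedding-based search instead.

```python
def score(query: str, doc: str) -> int:
    """Count distinct lowercase words shared by the query and the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents with the highest word overlap with the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]


def build_prompt(query: str, docs: list[str], k: int = 1) -> str:
    """Prepend retrieved context to the question; no model weights change."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical domain-specific sources standing in for an external data store.
docs = [
    "The warranty period is two years.",
    "Shipping takes five days.",
]
prompt = build_prompt("How long is the warranty period?", docs)
print(prompt)
```

The resulting prompt would then be sent to an unchanged pre-trained model, which is the contrast with fine-tuning that the excerpt draws.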

Case Study

Case Study Artificial Intelligence Study Generative AI

Amazon Bedrock Model Distillation: Boost function calling accuracy while reducing cost and latency

AWS Machine Learning - AI

APRIL 30, 2025

This advancement makes sophisticated agent architectures more accessible and economically viable across a broader range of applications and scales of deployment. We recommend referring to “Submit a model distillation job in Amazon Bedrock” in the official AWS documentation for the most up-to-date and comprehensive information.

Artificial Intelligence

Artificial Intelligence Generative AI AWS Training

Extend large language models powered by Amazon SageMaker AI using Model Context Protocol

AWS Machine Learning - AI

MAY 1, 2025

We will deep dive into the MCP architecture later in this post. Using a client-server architecture (as illustrated in the following screenshot), MCP helps developers expose their data through lightweight MCP servers while building AI applications as MCP clients that connect to these servers.

Artificial Intelligence

Artificial Intelligence AWS Architecture Generative AI

12 AI predictions for 2025

CIO

DECEMBER 30, 2024

Plus, they can be more easily trained on a company’s own data, so Upwork is starting to embrace this shift, training its own small language models on more than 20 years of interactions and behaviors on its platform. Agents can be more loosely coupled than services, making these architectures more flexible, resilient and smart.

Fractional CTO

Fractional CTO Software Development CTO Coach Architecture

Marsh McLennan IT reorg lays foundation for gen AI

CIO

NOVEMBER 1, 2024

As part of MMTech’s unifying strategy, Beswick chose to retire the data centers and form an “enterprisewide architecture organization” with a set of standards and base layers to develop applications and workloads that would run on the cloud, with AWS as the firm’s primary cloud provider.

Generative AI

Generative AI Technical Advisors Insurance Weak Development Team

Amazon Bedrock Marketplace now includes NVIDIA models: Introducing NVIDIA Nemotron-4 NIM microservices

AWS Machine Learning - AI

DECEMBER 4, 2024

At AWS re:Invent 2024, we are excited to introduce Amazon Bedrock Marketplace. Nemotron-4 15B, with its impressive 15-billion-parameter architecture trained on 8 trillion text tokens, brings powerful multilingual and coding capabilities to Amazon Bedrock.

Artificial Intelligence

Artificial Intelligence Microservices Generative AI AWS

Customize DeepSeek-R1 distilled models using Amazon SageMaker HyperPod recipes – Part 1

AWS Machine Learning - AI

MARCH 3, 2025

Tuning model architecture requires technical expertise, training and fine-tuning parameters, and managing distributed training infrastructure, among others. These recipes are processed through the HyperPod recipe launcher, which serves as the orchestration layer responsible for launching a job on the corresponding architecture.

Artificial Intelligence

Artificial Intelligence Generative AI AWS Training

How BQA streamlines education quality reporting using Amazon Bedrock

AWS Machine Learning - AI

JANUARY 13, 2025

The Education and Training Quality Authority (BQA) plays a critical role in improving the quality of education and training services in the Kingdom of Bahrain. BQA oversees a comprehensive quality assurance process, which includes setting performance standards and conducting objective reviews of education and training institutions.

Education

Education Report Technical Review Generative AI

Why GreenOps will succeed where FinOps is failing

CIO

FEBRUARY 4, 2025

The result was a compromised availability architecture. For example, the database team we worked with in an organization new to the cloud launched all the AWS RDS database servers from dev through production, incurring a $600K a month cloud bill nine months before the scheduled production launch. Long-term value creation.

Sustainability

Sustainability Technical Review Architecture Fractional CTO

Automate emails for task management using Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Guardrails

AWS Machine Learning - AI

NOVEMBER 19, 2024

Amazon Bedrock offers a serverless experience so you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using AWS tools without having to manage infrastructure. The following diagram provides a detailed view of the architecture to enhance email support using generative AI.

Knowledge Base

Knowledge Base Generative AI Technical Review Lambda

Deploy DeepSeek-R1 Distilled Llama models in Amazon Bedrock

AWS Machine Learning - AI

JANUARY 29, 2025

In this post, we explore how to deploy distilled versions of DeepSeek-R1 with Amazon Bedrock Custom Model Import, making them accessible to organizations looking to use state-of-the-art AI capabilities within the secure and scalable AWS infrastructure at an effective cost. 8B ) and DeepSeek-R1-Distill-Llama-70B (from base model Llama-3.3-70B-Instruct

Generative AI

Generative AI Artificial Intelligence AWS Technical Review

Reduce conversational AI response time through inference at the edge with AWS Local Zones

AWS Machine Learning - AI

MARCH 3, 2025

Hybrid architecture with AWS Local Zones: To minimize the impact of network latency on TTFT for users regardless of their locations, a hybrid architecture can be implemented by extending AWS services from commercial Regions to edge locations closer to end users. Next, create a subnet inside each Local Zone.

AWS

AWS Artificial Intelligence Technical Review Systems Review

The future of data: A 5-pillar approach to modern data management

CIO

DECEMBER 11, 2024

Organizations must decide on their hosting provider, whether it be an on-prem setup, cloud solutions like AWS, GCP, Azure or specialized data platform providers such as Snowflake and Databricks. Not my original quote, but a cardinal sin of cloud-native data architecture is copying data from one location to another.

Data

Data Technical Review Software Review Weak Development Team

Host concurrent LLMs with LoRAX

AWS Machine Learning - AI

APRIL 16, 2025

LoRA is a technique for efficiently adapting large pre-trained language models to new tasks or domains by introducing small trainable weight matrices, called adapters, within each linear layer of the pre-trained model. Why LoRAX for LoRA deployment on AWS? The following diagram is the solution architecture.
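To make the adapter idea in the excerpt above concrete, here is a minimal pure-Python sketch of one common LoRA formulation: the frozen weight `W` is combined with the product of two small trainable matrices `B` and `A`, scaled by `alpha / r`. The function names, the tiny matrices, and the 4096-wide layer used for the parameter count are illustrative assumptions, not details from the linked post or from LoRAX.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen weight, shape d_out x d_in
    a: trainable adapter, shape r x d_in
    b: trainable adapter, shape d_out x r
    """
    r = len(a)
    delta = matmul(b, a)  # low-rank update, shape d_out x d_in
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def adapter_params(d_out, d_in, r):
    """Number of trainable values in the two adapter matrices."""
    return r * (d_in + d_out)


# Hypothetical 2x2 layer with a rank-1 adapter.
w = [[1, 0], [0, 1]]
a = [[1, 2]]    # r x d_in
b = [[1], [0]]  # d_out x r
print(lora_effective_weight(w, a, b, alpha=1))

# Trainable adapter values versus the full weight for an assumed 4096 x 4096 layer.
print(adapter_params(4096, 4096, 8), 4096 * 4096)
```

Because only `A` and `B` differ between fine-tuned variants in this formulation, many adapters can share one copy of the frozen base weights.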

Artificial Intelligence

Artificial Intelligence Generative AI AWS Storage

Frugality meets Accuracy: Cost-efficient training of GPT NeoX and Pythia models with AWS Trainium

AWS Machine Learning - AI

DECEMBER 12, 2023

A generative pre-trained transformer (GPT) uses causal autoregressive updates to make predictions. These model architectures have demonstrated strong performance on a variety of tasks such as speech recognition, text generation, and question answering. Both are decoder models following a similar architectural design to Chat GPT3.

AWS

AWS Training Artificial Intelligence Meeting

What is enterprise architecture? A framework for transformation

CIO

NOVEMBER 23, 2022

Enterprise architecture definition: Enterprise architecture (EA) is the practice of analyzing, designing, planning, and implementing enterprise analysis to successfully execute on business strategies. Making it easier to evaluate existing architecture against long-term goals.

Architecture

Architecture Enterprise Agile Artificial Intelligence

Generative AI operating models in enterprise organizations with Amazon Bedrock

AWS Machine Learning - AI

JANUARY 29, 2025

Large organizations often have many business units with multiple lines of business (LOBs), with a central governing entity, and typically use AWS Organizations with an Amazon Web Services (AWS) multi-account strategy. In this post, we evaluate different generative AI operating model architectures that could be adopted.

Generative AI

Generative AI Organization Enterprise Artificial Intelligence

Build a video insights and summarization engine using generative AI with Amazon Bedrock

AWS Machine Learning - AI

OCTOBER 29, 2024

This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call. Organizations typically can’t predict their call patterns, so the solution relies on AWS serverless services to scale during busy times.

Generative AI

Generative AI Video Engineering Artificial Intelligence

Overcoming the 6 barriers to IT modernization

CIO

NOVEMBER 26, 2024

For instance, Capital One successfully transitioned from mainframe systems to a cloud-first strategy by gradually migrating critical applications to Amazon Web Services (AWS). It adopted a microservices architecture to decouple legacy components, allowing for incremental updates without disrupting the entire system.

Weak Development Team

Weak Development Team Compliance Culture Budget

Simple guide to training Llama 2 with AWS Trainium on Amazon SageMaker

AWS Machine Learning - AI

MAY 1, 2024

Llama2 by Meta is an example of an LLM offered by AWS. Llama 2 is an auto-regressive language model that uses an optimized transformer architecture and is intended for commercial and research use in English. It comes in a range of parameter sizes—7 billion, 13 billion, and 70 billion—as well as pre-trained and fine-tuned variations.

AWS

AWS Training Artificial Intelligence Generative AI

Salesforce bets big on Saudi Arabia with 500M USD AI investment

CIO

FEBRUARY 11, 2025

A key component of this expansion is the introduction of Hyperforce, Salesforce’s next-generation platform architecture, to Saudi Arabia. Delivered in partnership with Amazon Web Services (AWS), Hyperforce will enable Salesforce customers to run workloads locally while adhering to regulatory requirements.

AWS

AWS Architecture Training Innovation

Revolutionizing customer service: MaestroQA’s integration with Amazon Bedrock for actionable insight

AWS Machine Learning - AI

MARCH 13, 2025

We discuss the unique challenges MaestroQA overcame and how they use AWS to build new features, drive customer insights, and reduce operational inefficiencies. MaestroQA integrated Amazon Bedrock into their existing architecture using Amazon Elastic Container Service (Amazon ECS).

Generative AI

Generative AI CTO Coach AWS Artificial Intelligence

Creating asynchronous AI agents with Amazon Bedrock

AWS Machine Learning - AI

MARCH 13, 2025

This post will discuss agentic AI-driven architecture and ways of implementing it. Agentic AI architecture: Agentic AI architecture is a shift in process automation through autonomous agents towards the capabilities of AI, with the purpose of imitating cognitive abilities and enhancing the actions of traditional autonomous agents.

Artificial Intelligence

Artificial Intelligence Lambda Travel Generative AI

Generative AI-powered game design: Accelerating early development with Stability AI models on Amazon Bedrock

AWS Machine Learning - AI

MARCH 26, 2025

Its improved architecture, based on the Multimodal Diffusion Transformer (MMDiT), combines multiple pre-trained text encoders for enhanced text understanding and uses QK-normalization to improve training stability. Use the us-west-2 AWS Region to run this demo. An Amazon SageMaker domain. Access to Stability AI’s SD3.5

Generative AI

Generative AI Games Development AWS

Unlocking Japanese LLMs with AWS Trainium: Innovators Showcase from the AWS LLM Development Support Program

AWS Machine Learning - AI

JULY 31, 2024

Amazon Web Services (AWS) is committed to supporting the development of cutting-edge generative artificial intelligence (AI) technologies by companies and organizations across the globe. Let’s dive in and explore how these organizations are transforming what’s possible with generative AI on AWS.

Artificial Intelligence

Artificial Intelligence AWS Programming Innovation

Enhance speech synthesis and video generation models with RLHF using audio and video segmentation in Amazon SageMaker

AWS Machine Learning - AI

NOVEMBER 21, 2024

As large language models (LLMs) increasingly integrate more multimedia capabilities, human feedback becomes even more critical in training them to generate rich, multi-modal content that aligns with human quality standards. The path to creating effective AI models for audio and video generation presents several distinct challenges.

Video

Video Lambda AWS Generative AI

The AWS Cloud Migration Checklist Every Business Needs for a Smooth Transition

Mobilunity

DECEMBER 26, 2024

For medium to large businesses with outdated systems or on-premises infrastructure, transitioning to AWS can revolutionize their IT operations and enhance their capacity to respond to evolving market needs. AWS migration isn’t just about moving data; it requires careful planning and execution. Need to hire skilled engineers?

AWS

AWS Cloud Weak Development Team DevOps

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

AWS Machine Learning - AI

MARCH 13, 2025

DeepSeek-R1, developed by AI startup DeepSeek AI, is an advanced large language model (LLM) distinguished by its innovative, multi-stage training process. Instead of relying solely on traditional pre-training and fine-tuning, DeepSeek-R1 integrates reinforcement learning to achieve more refined outputs.

Artificial Intelligence

Artificial Intelligence AWS Machine Learning Load Balancer

Unleashing the multimodal power of Amazon Bedrock Data Automation to transform unstructured data into actionable insights

AWS Machine Learning - AI

MARCH 20, 2025

Cross-Region inference enables seamless management of unplanned traffic bursts by using compute across different AWS Regions. Amazon Bedrock Data Automation optimizes for available AWS Regional capacity by automatically routing across regions within the same geographic area to maximize throughput at no additional cost.

Data

Data Generative AI Artificial Intelligence Compliance
