Architecture, Data Engineering and Performance

What is data architecture? A framework to manage data

CIO

DECEMBER 20, 2024

Data architecture definition: Data architecture describes the structure of an organization’s logical and physical data assets, and data management resources, according to The Open Group Architecture Framework (TOGAF). An organization’s data architecture is the purview of data architects.

Architecture

Architecture Data Fractional CTO Technical Review

From legacy to lakehouse: Centralizing insurance data with Delta Lake

CIO

APRIL 23, 2025

This is where Delta Lakehouse architecture truly shines. Specifically, within the insurance industry, where data is the lifeblood of innovation and operational effectiveness, embracing such a transformative approach is essential for staying agile, secure and competitive. This unified view makes it easier to manage and access your data.

Insurance

Insurance Artificial Intelligence Data Architecture

Comprehensive data management for AI: The next-gen data management engine that will drive AI to new heights

CIO

NOVEMBER 19, 2024

The challenges of integrating data with AI workflows: When I speak with our customers, the challenges they talk about involve integrating their data and their enterprise AI workflows. The core of their problem is applying AI technology to the data they already have, whether in the cloud, on their premises, or more likely both.

Artificial Intelligence

Artificial Intelligence Engineering Data Storage

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration


Fundamentals of Data Engineering

Xebia

JANUARY 19, 2023

The following is a review of the book Fundamentals of Data Engineering by Joe Reis and Matt Housley, published by O’Reilly in June of 2022, and some takeaway lessons. This book is as good for a project manager or any other non-technical role as it is for a computer science student or a data engineer.

Data Engineering

Data Engineering Engineering Data Technical Review

Ready to transform how your IT organization drives business outcomes with AIOps?

CIO

JANUARY 3, 2025

Because of the adoption of containers, microservices architectures, and CI/CD pipelines, these environments are increasingly complex and noisy. These changes can cause many more unexpected performance and availability issues.

Organization

Organization Artificial Intelligence DevOps

What is Data Engineering: Explaining Data Pipeline, Data Warehouse, and Data Engineer Role

Altexsoft

JUNE 25, 2019

If we look at the hierarchy of needs in data science implementations, we’ll see that the next step after gathering your data for analysis is data engineering. This discipline is not to be underestimated, as it enables effective data storing and reliable data flow while taking charge of the infrastructure.

Data Engineering

Data Engineering Engineering Data Artificial Intelligence

Cloudera Data Engineering 2021 Year End Review

Cloudera

DECEMBER 21, 2021

Since the release of Cloudera Data Engineering (CDE) more than a year ago, our number one goal was operationalizing Spark pipelines at scale with first-class tooling designed to streamline automation and observability. Performance boost with Spark 3.1. Modernizing pipelines. With the release of Spark 3.1

Data Engineering

Data Engineering Technical Review Software Review Engineering

Cloudera and AWS Partner to Deliver Cost-Efficient and Sustainable Infrastructure for AI and Analytics

Cloudera

DECEMBER 2, 2024

Cloudera is committed to providing the most optimal architecture for data processing, advanced analytics, and AI while advancing our customers’ cloud journeys. Together, Cloudera and AWS empower businesses to optimize performance for data processing, analytics, and AI while minimizing their resource consumption and carbon footprint.

Sustainability

Sustainability AWS Analytics Infrastructure

Revolutionizing customer service: MaestroQA’s integration with Amazon Bedrock for actionable insight

AWS Machine Learning - AI

MARCH 13, 2025

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, such as AI21 Labs, Anthropic, Cohere, Meta, Mistral, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI.

Generative AI

Generative AI CTO Coach AWS Artificial Intelligence

Data Scientist vs Data Engineer: Differences and Why You Need Both

Altexsoft

OCTOBER 30, 2021

If you’re an executive who has a hard time understanding the underlying processes of data science and gets confused by the terminology, keep reading. We will try to answer your questions and explain how two critical data jobs are different and where they overlap. Data science vs data engineering.

Data Engineering

Data Engineering Engineering Data Artificial Intelligence

The Modern Data Lakehouse: An Architectural Innovation

Cloudera

SEPTEMBER 9, 2022

The promise of a modern data lakehouse architecture. Imagine having self-service access to all business data, anywhere it may be, and being able to explore it all at once. Imagine quickly answering burning business questions nearly instantly, without waiting for data to be found, shared, and ingested.

Architecture

Architecture Innovation Data Open Source

A Recap of the Data Engineering Open Forum at Netflix

Netflix Tech

JUNE 20, 2024

A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024. At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale.

Data Engineering

Data Engineering Engineering Data Generative AI

1. Streamlining Membership Data Engineering at Netflix with Psyberg

Netflix Tech

NOVEMBER 14, 2023

By Abhinaya Shetty, Bharath Mummadisetty. At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions. It also becomes inefficient as the data scale increases.

Data Engineering

Data Engineering Engineering Data Systems Review

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning - AI

NOVEMBER 15, 2024

Amazon Q Business is a generative AI-powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. It empowers employees to be more creative, data-driven, efficient, prepared, and productive.

Generative AI

Generative AI AWS Groups Artificial Intelligence

Fueling the Future of GenAI with NiFi: Cloudera DataFlow 2.9 Delivers Enhanced Efficiency and Adaptability

Cloudera

DECEMBER 4, 2024

Accelerating GenAI with Powerful New Capabilities: Cloudera DataFlow 2.9 introduces new features specifically designed to fuel GenAI initiatives. New AI Processors: Harness the power of cutting-edge AI models with new processors that simplify integration and streamline data preparation for GenAI applications.

Metrics

Metrics Generative AI Open Source Data Engineering

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

In this blog, I will demonstrate the value of Cloudera DataFlow (CDF), the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP), as a data integration and democratization fabric. Introduction to the Data Mesh Architecture and its Required Capabilities.

Architecture

Architecture Data Security Technical Review

Cloudera and Snowflake Partner to Deliver the Most Comprehensive Open Data Lakehouse

Cloudera

OCTOBER 23, 2024

In August, we wrote about how in a future where distributed data architectures are inevitable, unifying and managing operational and business metadata is critical to successfully maximizing the value of data, analytics, and AI.

Data

Data Analytics Systems Review Architecture

What is DataOps? Collaborative, cross-functional analytics

CIO

DECEMBER 22, 2022

DataOps (data operations) is an agile, process-oriented methodology for developing and delivering analytics. It brings together DevOps teams with data engineers and data scientists to provide the tools, processes, and organizational structures to support the data-focused enterprise. What is DataOps?

Analytics

Analytics Data Engineering Artificial Intelligence Machine Learning

How Automatic Liquid Clustering Supports Databricks FinOps at Scale

Perficient

MARCH 13, 2025

In this case, Liquid Clustering addresses the data management and query optimization aspects of cost control so simply and elegantly that I’m happy to take my hands off the controls. This made intuitive sense to me as an early Spark developer, and I had deep knowledge of both architectures.

Data Engineering

Data Engineering Government Engineering Data

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak Nabu™

Cloudera

OCTOBER 11, 2021

Modak, a leading provider of modern data engineering solutions, is now a certified solution partner with Cloudera. Customers can now seamlessly automate migration to Cloudera’s Hybrid Data Platform — Cloudera Data Platform (CDP) to dynamically auto-scale cloud services with Cloudera Data Engineering (CDE) integration with Modak Nabu.

Data Engineering

Data Engineering Engineering Data Cloud

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

AWS Machine Learning - AI

APRIL 23, 2025

Designed with a serverless, cost-optimized architecture, the platform provisions SageMaker endpoints dynamically, providing efficient resource utilization while maintaining scalability. The following diagram illustrates the solution architecture. Key architectural decisions drive both performance and cost optimization.

Artificial Intelligence

Artificial Intelligence Open Source AWS Serverless

Tools for generating deep neural networks with efficient network architectures

O'Reilly Media - Ideas

DECEMBER 6, 2018

As the use of machine learning and analytics becomes more widespread, we’re beginning to see tools that enable data scientists and data engineers to scale and tackle many more problems and maintain more systems.

Network

Network Architecture Tools Artificial Intelligence

Porsche Carrera Cup Brasil gets real-time data boost

CIO

MAY 21, 2024

In the annual Porsche Carrera Cup Brasil, data is essential to keep drivers safe and sustain optimal performance of race cars. Until recently, getting at and analyzing that essential data was a laborious affair that could take hours, and only once the race was over. “You can monitor and act on the data and you can set thresholds.”

Data

Data Azure Engineering Analytics

Big Data Engineer: Role, Responsibilities, and Job Description

Altexsoft

AUGUST 25, 2020

That’s why a data specialist with big data skills is one of the most sought-after IT candidates. Data Engineering positions have grown by half and they typically require big data skills. Data engineering vs big data engineering. Big data processing. Maintaining data pipeline.

Big Data

Big Data Data Engineering Engineering Data

SAP and Databricks: Better Together

Perficient

FEBRUARY 13, 2025

Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help to solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint in many large enterprises. However, SAP has an opaque data model.

Government

Government Open Source Machine Learning Artificial Intelligence

Cloudera Data Engineering – Integration steps to leverage spark on Kubernetes

Cloudera

APRIL 14, 2021

What is Cloudera Data Engineering (CDE)? Cloudera Data Engineering is a serverless service for Cloudera Data Platform (CDP) that allows you to submit jobs to auto-scaling virtual clusters. Refer to the following Cloudera blog to understand the full potential of Cloudera Data Engineering.

Data Engineering

Data Engineering Engineering Data Serverless

What is Data Engineer: Role Description, Responsibilities, Skills, and Background

Altexsoft

APRIL 22, 2020

So, along with data scientists who create algorithms, there are data engineers, the architects of data platforms. In this article we’ll explain what a data engineer is, the field of their responsibilities, skill sets, and general role description. What is a data engineer?

Data Engineering

Data Engineering Engineering Artificial Intelligence Data

Cloudera Data Warehouse Demonstrates Best-in-Class Cloud-Native Price-Performance

Cloudera

JANUARY 15, 2021

With the ability to quickly provision on-demand and the lower fixed and administrative costs, the costs of operating a cloud data warehouse are driven mostly by the price-performance of the specific data warehouse platform. CDW is one of several managed services that comprise the broader Cloudera Data Platform (CDP).

Performance

Performance Cloud Data Storage

Snowflake Best Practices for Data Engineering

Perficient

FEBRUARY 13, 2023

Introduction: We often end up creating a problem while working on data. So, here are a few best practices for data engineering using Snowflake: 1. Transform. Each data model has its own advantages and storing intermediate step results has significant architectural advantages.

Data Engineering

Data Engineering Engineering Data Storage

CIOs take note: Platform engineering teams are the future core of IT orgs

CIO

JUNE 19, 2024

They may also ensure consistency in terms of processes, architecture, security, and technical governance. “Our platform engineering teams, which support more than 200 applications, have innovated around automation,” says Bob Simms, former director of enterprise infrastructure delivery at the US Patent and Trademark Office (USPTO).

Weak Development Team

Weak Development Team Engineering UI/UX Software Development

Building a Scalable Search Architecture

Confluent

JUNE 18, 2019

As soon as the number of data points involved in your search feature increases, typically we’ll introduce a broker in between all the involved components. This architectural pattern provides several benefits: Better scalability by allowing multiple data producers and consumers to run in parallel.

Scalability

Scalability Architecture Machine Learning Artificial Intelligence

Building a vision for real-time artificial intelligence

CIO

APRIL 12, 2023

After walking his executive team through the data hops, flows, integrations, and processing across different ingestion software, databases, and analytical platforms, they were shocked by the complexity of their current data architecture and technology stack. About George Trujillo: George is principal data strategist at DataStax.

Artificial Intelligence

Artificial Intelligence Machine Learning Agile

Your technology architecture and engineering organization should coevolve as your startup grows

Abhishek Tiwari

FEBRUARY 26, 2020

The evolution of your technology architecture should depend on the size, culture, and skill set of your engineering organization. There are no hard-and-fast rules to figure out the interdependency between technology architecture and engineering organization, but below is what I think can really work well for a product startup.

Architecture

Architecture MVC Engineering Technology

5 key areas for tech leaders to watch in 2020

O'Reilly Media - Ideas

FEBRUARY 18, 2020

This year’s growth in Python usage was buoyed by its increasing popularity among data scientists and machine learning (ML) and artificial intelligence (AI) engineers. Software architecture, infrastructure, and operations are each changing rapidly. Trends in software architecture, infrastructure, and operations.

Technical Review

Technical Review Microservices Data Engineering Architecture

You still don’t need a feature store

Xebia

MARCH 13, 2025

Please have a look at this blog post on machine learning serving architectures if you do not know the difference. Let’s say you are a Data Scientist working in a model development environment. You have complete access to all historical data. As a result, your model will perform worse at serving time than at training time.

Training

Training Artificial Intelligence Machine Learning Data

The rise of the data lakehouse: A new era of data value

CIO

AUGUST 18, 2022

Previously, Walgreens was attempting to perform that task with its data lake but faced two significant obstacles: cost and time. Those challenges are well-known to many organizations as they have sought to obtain analytical knowledge from their vast amounts of data. You can intuitively query the data from the data lake.

Data

Data Technical Advisors Technical Review Artificial Intelligence

Firebolt, a data warehouse startup, raises $100M at a $1.4B valuation for faster, cheaper analytics on large data sets

TechCrunch

JANUARY 26, 2022

Firebolt’s pitch is that it has built a SQL-based architecture that handles this challenge better than anything that has come before it, using new techniques in compression that can connect data lakes and result in smaller cloud capacity requirements, resulting in lower costs and better performance, up to 182 times faster than that of other data (..)

Analytics

Analytics Data Big Data Business Intelligence

10 highest-paying IT jobs

CIO

APRIL 27, 2023

The demand for specialized skills has boosted salaries in cybersecurity, data, engineering, development, and program management. Solutions architect: Solutions architects are responsible for building, developing, and implementing systems architecture within an organization, ensuring that they meet business or customer needs.

Technical Review

Technical Review Software Review Systems Review Software Engineering

Databand raises $14.5M led by Accel for its data pipeline observability tools

TechCrunch

DECEMBER 1, 2020

DevOps continues to get a lot of attention as a wave of companies develop more sophisticated tools to help developers manage increasingly complex architectures and workloads. “Users didn’t know how to organize their tools and systems to produce reliable data products.”

Tools

Tools Data Weak Development Team Big Data

Deletion Vectors in Delta Live Tables: Identifying and Remediating Compliance Risks

Perficient

MARCH 27, 2025

This could provide both cost savings and performance improvements. Our Databricks Practice holds FinOps as a core architectural tenet, but sometimes compliance overrules cost savings. With a soft delete, rows are marked in deletion vectors rather than physically removed, which is a performance boost.

Compliance

Compliance Systems Review Policies Storage

Avoiding Metadata Contention in Unity Catalog

Perficient

APRIL 7, 2025

Metadata contention in Unity Catalog can occur in high-throughput Databricks environments, slowing down user queries and impacting performance across the platform. Our FinOps strategy shifts left on performance. This means that every time you execute CREATE OR REPLACE TABLE, you are back to step one for performance optimization.

Performance

Performance Software Review Systems Review Exercises

Snowflake and Capgemini powering data and AI at scale

Capgemini

NOVEMBER 21, 2024

Capgemini, October 13, 2020. Organizations slowed by legacy information architectures are modernizing their data and BI estates to achieve significant incremental value with relatively small capital investments. This evolution is also being driven by many industry factors.

Data

Data Government Innovation Architecture

AWS App Studio introduces a prebuilt solutions catalog and cross-instance Import and Export

AWS Machine Learning - AI

APRIL 1, 2025

With App Studio, technical professionals such as IT project managers, data engineers, enterprise architects, and solution architects can quickly develop applications tailored to their organization’s needs without requiring deep software development skills. Outside of work, Hao enjoys international traveling, exercising, and streaming.

AWS

AWS Software Review Technical Review Generative AI

Enterprise Data Warehouse: Concepts, Architecture, and Components

Altexsoft

OCTOBER 24, 2019

We will define how enterprise warehouses are different from the usual ones, what types of data warehouses exist, and how they work. The focus of this material is to provide information about the business value of each architectural and conceptual approach to building a warehouse. What is an Enterprise Data Warehouse?

Architecture

Architecture Enterprise Data Technical Review
