Applications, Data Engineering and Storage

What is data architecture? A framework to manage data

CIO

DECEMBER 20, 2024

Data architecture definition Data architecture describes the structure of an organizations logical and physical data assets, and data management resources, according to The Open Group Architecture Framework (TOGAF). An organizations data architecture is the purview of data architects. Cloud storage.

Architecture

Architecture Data Fractional CTO Technical Review

Comprehensive data management for AI: The next-gen data management engine that will drive AI to new heights

CIO

NOVEMBER 19, 2024

All industries and modern applications are undergoing rapid transformation powered by advances in accelerated computing, deep learning, and artificial intelligence. The next phase of this transformation requires an intelligent data infrastructure that can bring AI closer to enterprise data. Imagine that you’re a data engineer.

Artificial Inteligence

Artificial Inteligence Engineering Data Storage

What is a data engineer? An analytics role in high demand

CIO

SEPTEMBER 14, 2023

What is a data engineer? Data engineers design, build, and optimize systems for data collection, storage, access, and analytics at scale. They create data pipelines that convert raw data into formats usable by data scientists, data-centric applications, and other data consumers.

Data Engineering

Data Engineering Analytics Engineering Data

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

What is a data engineer? An analytics role in high demand

CIO

AUGUST 9, 2022

What is a data engineer? Data engineers design, build, and optimize systems for data collection, storage, access, and analytics at scale. They create data pipelines used by data scientists, data-centric applications, and other data consumers. The data engineer role.

Data Engineering

Data Engineering Analytics Engineering Data

Fundamentals of Data Engineering

Xebia

JANUARY 19, 2023

The following is a review of the book Fundamentals of Data Engineering by Joe Reis and Matt Housley, published by O’Reilly in June of 2022, and some takeaway lessons. This book is as good for a project manager or any other non-technical role as it is for a computer science student or a data engineer.

Data Engineering

Data Engineering Engineering Data Technical Review

See clearly, spend wisely: The power of data platform observability

Xebia

DECEMBER 23, 2024

A lack of monitoring might result in idle clusters running longer than necessary, overly broad data queries consuming excessive compute resources, or unexpected storage costs due to unoptimized data retention. Once the decision is made, inefficiencies can be categorized into two primary areas: compute and storage.

Data

Data Storage Culture Resources

See clearly, spend wisely: The power of data platform observability

Xebia

DECEMBER 23, 2024

A lack of monitoring might result in idle clusters running longer than necessary, overly broad data queries consuming excessive compute resources, or unexpected storage costs due to unoptimized data retention. Once the decision is made, inefficiencies can be categorized into two primary areas: compute and storage.

Data

Data Storage Culture Resources

Ducklake: A journey to integrate DuckDB with Unity Catalog

Xebia

OCTOBER 18, 2024

Dbt is a popular tool for transforming data in a data warehouse or data lake. It enables data engineers and analysts to write modular SQL transformations, with built-in support for data testing and documentation. This makes dbt a natural choice for the Ducklake setup.

Open Source

Open Source AWS Government Technical Review

What is Data Engineering: Explaining Data Pipeline, Data Warehouse, and Data Engineer Role

Altexsoft

JUNE 25, 2019

If we look at the hierarchy of needs in data science implementations, we’ll see that the next step after gathering your data for analysis is data engineering. This discipline is not to be underestimated, as it enables effective data storing and reliable data flow while taking charge of the infrastructure.

Data Engineering

Data Engineering Engineering Data Artificial Inteligence

Make the leap to Hybrid with Cloudera Data Engineering

Cloudera

FEBRUARY 14, 2022

When we introduced Cloudera Data Engineering (CDE) in the Public Cloud in 2020 it was a culmination of many years of working alongside companies as they deployed Apache Spark based ETL workloads at scale. It’s no longer driven by data volumes, but containerization, separation of storage and compute, and democratization of analytics.

Data Engineering

Data Engineering Engineering Data Storage

Cloudera Data Engineering 2021 Year End Review

Cloudera

DECEMBER 21, 2021

Since the release of Cloudera Data Engineering (CDE) more than a year ago , our number one goal was operationalizing Spark pipelines at scale with first class tooling designed to streamline automation and observability. Securing and scaling storage. Test Drive CDP Pubic Cloud.

Data Engineering

Data Engineering Technical Review Software Review Engineering

Build your gen AI–based text-to-SQL application using RAG, powered by Amazon Bedrock (Claude 3 Sonnet and Amazon Titan for embedding)

AWS Machine Learning - AI

MARCH 18, 2025

Today, generative AI can help bridge this knowledge gap for nontechnical users to generate SQL queries by using a text-to-SQL application. This application allows users to ask questions in natural language and then generates a SQL query for the users request. This can be overwhelming for nontechnical users who lack proficiency in SQL.

Artificial Inteligence

Artificial Inteligence Applications Generative AI Off-The-Shelf

What is a data architect? Skills, salaries, and how to become a data framework master

CIO

OCTOBER 13, 2023

Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes. Application data architect: The application data architect designs and implements data models for specific software applications.

Data

Data Data Engineering Database Administration Artificial Inteligence

Top 10 Highest Paying IT Jobs in India

The Crazy Programmer

NOVEMBER 6, 2021

A cloud architect has a profound understanding of storage, servers, analytics, and many more. Big Data Engineer. Another highest-paying job skill in the IT sector is big data engineering. And as a big data engineer, you need to work around the big data sets of the applications.

Artificial Inteligence

Artificial Inteligence Blockchain Software Review Artificial Intelligence

Data Scientist vs Data Engineer: Differences and Why You Need Both

Altexsoft

OCTOBER 30, 2021

If you’re an executive who has a hard time understanding the underlying processes of data science and get confused with terminology, keep reading. We will try to answer your questions and explain how two critical data jobs are different and where they overlap. Data science vs data engineering.

Data Engineering

Data Engineering Engineering Data Machine Learning

Unify structured data in Amazon Aurora and unstructured data in Amazon S3 for insights using Amazon Q

AWS Machine Learning - AI

NOVEMBER 20, 2024

The solution combines data from an Amazon Aurora MySQL-Compatible Edition database and data stored in an Amazon Simple Storage Service (Amazon S3) bucket. Solution overview Amazon Q Business is a fully managed, generative AI-powered assistant that helps enterprises unlock the value of their data and knowledge.

Data

Data AWS Groups Knowledge Base

Introducing CDP Data Engineering: Purpose Built Tooling For Accelerating Data Pipelines

Cloudera

SEPTEMBER 17, 2020

With growing disparate data across everything from edge devices to individual lines of business needing to be consolidated, curated, and delivered for downstream consumption, it’s no wonder that data engineering has become the most in-demand role across businesses — growing at an estimated rate of 50% year over year.

Data Engineering

Data Engineering Engineering Data Tools

Integrating Key Vault Secrets with Azure Synapse Analytics

Apiumhub

DECEMBER 9, 2024

Azure Key Vault Secrets offers a centralized and secure storage alternative for API keys, passwords, certificates, and other sensitive statistics. Azure Key Vault is a cloud service that provides secure storage and access to confidential information such as passwords, API keys, and connection strings. What is Azure Key Vault Secret?

Azure

Azure Analytics Storage Machine Learning

How companies around the world apply machine learning

O'Reilly Media - Data

APRIL 3, 2018

Data Science and Machine Learning sessions will cover tools, techniques, and case studies. This year, we have many sessions on managing and deploying models to production, and applications of deep learning in enterprise applications. Stream Processing and Real-time Applications sessions. Data platforms.

Machine Learning

Machine Learning Artificial Inteligence Company Case Study

How Much Should I Be Spending On Observability?

Honeycomb

APRIL 23, 2025

download Model-specific cost drivers: the pillars model vs consolidated storage model (observability 2.0) All of the observability companies founded post-2020 have been built using a very different approach: a single consolidated storage engine, backed by a columnar store. and observability 2.0. understandably). moving forward.

Weak Development Team

Weak Development Team Metrics Storage Engineering

Is the modern data stack just old wine in a new bottle?

TechCrunch

NOVEMBER 4, 2022

I know this because I used to be a data engineer and built extract-transform-load (ETL) data pipelines for this type of offer optimization. Part of my job involved unpacking encrypted data feeds, removing rows or columns that had missing data, and mapping the fields to our internal data models.

Data

Data Storage Analytics Data Engineering

Cloudera Operational Database application development concepts

Cloudera

FEBRUARY 9, 2021

In this blog post, we’ll look at both Apache HBase and Apache Phoenix concepts relevant to developing applications for Cloudera Operational Database. But first, let’s look at the different form factors in which Cloudera Operational Database is available to developers: Public cloud: CDP Data Hub Operational Database template .

Applications

Applications Development Storage Database Administration

Inferencing holds the clues to AI puzzles

CIO

APRIL 10, 2024

As with many data-hungry workloads, the instinct is to offload LLM applications into a public cloud, whose strengths include speedy time-to-market and scalability. Without data, Holmes’ argument proceeds, one can twist facts to suit their theories, rather than use theories to suit facts.

Artificial Inteligence

Artificial Inteligence Generative AI Storage Artificial Intelligence

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

AWS Machine Learning - AI

APRIL 23, 2025

Solution overview The NER & LLM Gen AI Application is a document processing solution built on AWS that combines NER and LLMs to automate document analysis at scale. Multiple specialized Amazon Simple Storage Service Buckets (Amazon S3 Bucket) store different types of outputs. Click here to open the AWS console and follow along.

Artificial Inteligence

Artificial Inteligence Open Source AWS Serverless

Simplifying machine learning lifecycle management

O'Reilly Media - Data

AUGUST 16, 2018

Today’s data science and data engineering teams work with a variety of machine learning libraries, data ingestion, and data storage technologies. Risk and compliance considerations mean that the ability to reproduce machine learning workflows is essential to meet audits in certain application domains.

Machine Learning

Machine Learning Artificial Inteligence Data Engineering Storage

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak NabuTM

Cloudera

OCTOBER 11, 2021

Modak, a leading provider of modern data engineering solutions, is now a certified solution partner with Cloudera. Customers can now seamlessly automate migration to Cloudera’s Hybrid Data Platform — Cloudera Data Platform (CDP) to dynamically auto-scale cloud services with Cloudera Data Engineering (CDE) integration with Modak Nabu.

Data Engineering

Data Engineering Engineering Data Cloud

Data Engineers of Netflix?—?Interview with Pallavi Phadnis

Netflix Tech

OCTOBER 28, 2021

Data Engineers of Netflix?—?Interview Interview with Pallavi Phadnis This post is part of our “ Data Engineers of Netflix ” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Data Engineering

Data Engineering Engineering Data Software Engineering

Big Data Engineer: Role, Responsibilities, and Job Description

Altexsoft

AUGUST 25, 2020

That’s why a data specialist with big data skills is one of the most sought-after IT candidates. Data Engineering positions have grown by half and they typically require big data skills. Data engineering vs big data engineering. This greatly increases data processing capabilities.

Big Data

Big Data Data Engineering Engineering Data

Revolutionizing customer service: MaestroQA’s integration with Amazon Bedrock for actionable insight

AWS Machine Learning - AI

MARCH 13, 2025

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, such as AI21 Labs, Anthropic, Cohere, Meta, Mistral, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI.

Generative AI

Generative AI CTO Coach AWS Artificial Inteligence

Optimizing Cloudera Data Engineering Autoscaling Performance

Cloudera

SEPTEMBER 2, 2021

The shift to cloud has been accelerating, and with it, a push to modernize data pipelines that fuel key applications. That is why cloud native solutions which take advantage of the capabilities such as disaggregated storage & compute, elasticity, and containerization are more paramount than ever.

Data Engineering

Data Engineering Performance Engineering Data

Optimizing data warehouse storage

Netflix Tech

DECEMBER 21, 2020

At this scale, we can gain a significant amount of performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands into our warehouse. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage

Storage Data Resources Data Engineering

Enabling Multi-User Fine-Grained Access Control for Cloud Storage in CDP

Cloudera

SEPTEMBER 10, 2021

Shared Data Experience ( SDX ) on Cloudera Data Platform ( CDP ) enables centralized data access control and audit for workloads in the Enterprise Data Cloud. The public cloud (CDP-PC) editions default to using cloud storage (S3 for AWS, ADLS-gen2 for Azure).

Storage

Storage Cloud Azure Pharmaceuticals

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning - AI

NOVEMBER 15, 2024

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI.

Generative AI

Generative AI AWS Groups Artificial Inteligence

Analytics Maturity Model: Levels, Technologies, and Applications

Altexsoft

DECEMBER 9, 2020

We will describe each level from the following perspectives: differences on the operational level; analytics tools companies use to manage and analyze data; business intelligence applications in real life; challenges to overcome and key changes that lead to transition. the specialists, tools, and applications of Descriptive analytics.

Analytics

Analytics Technical Review Technology Applications

Matillion raises $150M at a $1.5B valuation for its low-code approach to integrating disparate data sources

TechCrunch

SEPTEMBER 15, 2021

The company currently has “hundreds” of large enterprise customers, including Western Union, FOX, Sony, Slack, National Grid, Peet’s Coffee and Cisco for projects ranging from business intelligence and visualization through to artificial intelligence and machine learning applications.

Artificial Inteligence

Artificial Inteligence Data Weak Development Team Artificial Intelligence

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 3: Productionization of ML models

Cloudera

JANUARY 20, 2021

In this last installment, we’ll discuss a demo application that uses PySpark.ML to make a classification model based off of training data stored in both Cloudera’s Operational Database (powered by Apache HBase) and Apache HDFS. Afterwards, this model is then scored and served through a simple Web Application. Serving The Model

Machine Learning

Machine Learning Artificial Inteligence Applications Data

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 1: The Set-Up & Basics

Cloudera

JANUARY 6, 2021

Python is used extensively among Data Engineers and Data Scientists to solve all sorts of problems from ETL/ELT pipelines to building machine learning models. Apache HBase is an effective data storage system for many workflows but accessing this data specifically through Python can be a struggle.

Machine Learning

Machine Learning Artificial Inteligence Data Applications

CIOs take note: Platform engineering teams are the future core of IT orgs

CIO

JUNE 19, 2024

BSH’s previous infrastructure and operations teams, which supported the European appliance manufacturer’s application development groups, simply acted as suppliers of infrastructure services for the software development organizations. If we have a particular type of outage, our observability tool can also restart the application.”

Weak Development Team

Weak Development Team Engineering UI/UX Software Development

Databand raises $14.5M led by Accel for its data pipeline observability tools

TechCrunch

DECEMBER 1, 2020

And as data workloads continue to grow in size and use, they continue to become ever more complex. On top of that, today there are a wide range of applications and platforms that a typical organization will use to manage source material, storage, usage and so on.

Tools

Tools Data Weak Development Team Big Data

The 10 most in-demand IT jobs in finance

CIO

SEPTEMBER 2, 2022

In the finance industry, software engineers are often tasked with assisting in the technical front-end strategy, writing code, contributing to open-source projects, and helping the company deliver customer-facing services. Data engineer.

Software Engineering

Software Engineering Data Engineering DevOps AWS

The 10 most in-demand IT jobs in finance

CIO

AUGUST 31, 2022

In the finance industry, software engineers are often tasked with assisting in the technical front-end strategy, writing code, contributing to open-source projects, and helping the company deliver customer-facing services. Data engineer.

Software Engineering

Software Engineering Data Engineering DevOps AWS

Hire Big Data Engineer: Salaries, Stack and Roles

Mobilunity

AUGUST 3, 2021

The cloud offers excellent scalability, while graph databases offer the ability to display incredible amounts of data in a way that makes analytics efficient and effective. Who is Big Data Engineer? Big Data requires a unique engineering approach. Big Data Engineer vs Data Scientist.

Big Data

Big Data Data Engineering Engineering Data

Enhancing the Business Strategy with Data Engineering Solutions

Trigent

JUNE 20, 2022

To do this, they are constantly looking to partner with experts who can guide them on what to do with that data. This is where data engineering services providers come into play. Data engineering consulting is an inclusive term that encompasses multiple processes and business functions.

Data Engineering

Data Engineering Engineering Data Strategy

Unlocking the Power of AI with a Real-Time Data Strategy

CIO

FEBRUARY 14, 2023

Streaming data technologies unlock the ability to capture insights and take instant action on data that’s flowing into your organization; they’re a building block for developing applications that can respond in real-time to user actions, security threats, or other events. The application will contain ML mathematical algorithms.

Artificial Inteligence

Artificial Inteligence Strategy Data Machine Learning

What is data architecture? A framework to manage data

Comprehensive data management for AI: The next-gen data management engine that will drive AI to new heights

Webinars

Trending Sources

What is a data engineer? An analytics role in high demand

Webinars

What is a data engineer? An analytics role in high demand

Fundamentals of Data Engineering

See clearly, spend wisely: The power of data platform observability

See clearly, spend wisely: The power of data platform observability

Ducklake: A journey to integrate DuckDB with Unity Catalog

What is Data Engineering: Explaining Data Pipeline, Data Warehouse, and Data Engineer Role

Make the leap to Hybrid with Cloudera Data Engineering

Cloudera Data Engineering 2021 Year End Review

Build your gen AI–based text-to-SQL application using RAG, powered by Amazon Bedrock (Claude 3 Sonnet and Amazon Titan for embedding)

What is a data architect? Skills, salaries, and how to become a data framework master

Top 10 Highest Paying IT Jobs in India

Data Scientist vs Data Engineer: Differences and Why You Need Both

Unify structured data in Amazon Aurora and unstructured data in Amazon S3 for insights using Amazon Q

Introducing CDP Data Engineering: Purpose Built Tooling For Accelerating Data Pipelines

Integrating Key Vault Secrets with Azure Synapse Analytics

How companies around the world apply machine learning

How Much Should I Be Spending On Observability?

Is the modern data stack just old wine in a new bottle?

Cloudera Operational Database application development concepts

Inferencing holds the clues to AI puzzles

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

Simplifying machine learning lifecycle management

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak NabuTM

Data Engineers of Netflix?—?Interview with Pallavi Phadnis

Big Data Engineer: Role, Responsibilities, and Job Description

Revolutionizing customer service: MaestroQA’s integration with Amazon Bedrock for actionable insight

Optimizing Cloudera Data Engineering Autoscaling Performance

Optimizing data warehouse storage

Enabling Multi-User Fine-Grained Access Control for Cloud Storage in CDP

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

Analytics Maturity Model: Levels, Technologies, and Applications

Matillion raises $150M at a $1.5B valuation for its low-code approach to integrating disparate data sources

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 3: Productionization of ML models

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 1: The Set-Up & Basics

CIOs take note: Platform engineering teams are the future core of IT orgs

Databand raises $14.5M led by Accel for its data pipeline observability tools

The 10 most in-demand IT jobs in finance

The 10 most in-demand IT jobs in finance

Hire Big Data Engineer: Salaries, Stack and Roles

Enhancing the Business Strategy with Data Engineering Solutions

Unlocking the Power of AI with a Real-Time Data Strategy

Stay Connected