Ducklake: A journey to integrate DuckDB with Unity Catalog

Xebia

dbt is a popular tool for transforming data in a data warehouse or data lake. It enables data engineers and analysts to write modular SQL transformations, with built-in support for data testing and documentation. Jaffle Shop Demo: To demonstrate our setup, we’ll use the jaffle_shop example.

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 3: Productionization of ML models

Cloudera

In this last installment, we’ll discuss a demo application that uses PySpark.ML to build a classification model from training data stored in both Cloudera’s Operational Database (powered by Apache HBase) and Apache HDFS. In this demo, half of the training data is stored in HDFS and the other half in an HBase table.

Trending Sources

What I have been working on: Modal

Erik Bernhardsson

We've been focusing a lot on machine learning recently, in particular model inference. Stable Diffusion is obviously the coolest thing right now, but we also support a wide range of other things: using OpenAI's Whisper model for transcription, Dreambooth, object detection (with a webcam demo!). I will be posting a lot more about it!

CTO Coach 242

Machine Learning with Python, Jupyter, KSQL and TensorFlow

Confluent

This blog post focuses on how the Kafka ecosystem can help solve the impedance mismatch between data scientists, data engineers and production engineers. For now, we’ll focus on Kafka.

Forget the Rules, Listen to the Data

Hu's Place - HitachiVantara

For this reason, many financial institutions are converting their fraud detection systems to machine learning and advanced analytics and letting the data detect fraudulent activity. This will require another product for data governance. Data Preparation: Data integration that is intuitive and powerful.

Data 90

Monitoring dbt model and test executions using Elementary Data

Xebia

This dashboard is a single HTML file that includes all the required data as a base64-encoded JSON string. You can let Elementary automatically upload this dashboard file to object storage such as GCS, S3, or Azure Blob.

packages:
  - package: elementary-data/elementary
    version: 0.13.1

Testing 130

Digital Transformation is a Data Journey From Edge to Insight

Cloudera

STEP 3: Monitor data throughput from each factory. With all the data now flowing into individual Kafka streams, a data architect is monitoring data throughput from each factory as well as adjusting compute and storage resources needed to make sure that each factory has the required throughput to send data into the platform.

Data 110