Datafold raises seed from NEA to keep improving the lives of data engineers

TechCrunch

Data engineering is one of those new disciplines that has gone from buzzword to mission-critical in just a few years. As data has exploded, so has the challenge of doing this key work, which is why a new set of tools has arrived to make data engineering easier, faster and better than ever.

Ducklake: A journey to integrate DuckDB with Unity Catalog

Xebia

dbt is a popular tool for transforming data in a data warehouse or data lake. It enables data engineers and analysts to write modular SQL transformations, with built-in support for data testing and documentation. Jaffle Shop demo: to demonstrate our setup, we’ll use the jaffle_shop example.

The best way to start an AI project? Don’t think about the models

TechCrunch

For example, a factory that wishes to embed smart fault inspection on a production assembly line will be able to demo the AI project pretty fast by using a single camera on a machine for a few minutes. However, it will require many months or even years to bring the value the AI provides in the demo across the finish line.

Simplify your workflow deployment with Databricks Asset Bundles: Part I

Xebia

Databricks Asset Bundles: How. We’ll work on a demo use case to show the power of bundles. You must build a data ingestion app. In our demo it will contain the .whl files related to our Python wheel package being deployed. Alternatively, you could deploy manually, but that is error-prone and hard to maintain long-term.

Hightouch raises $2.1M to help businesses get more value from their data warehouses

TechCrunch

There’s no industry term for that yet, but we really believe that that’s the future of where data engineering is going. Hightouch originally raised its round after its participation in the Y Combinator demo day but decided not to disclose it until it felt like it had found the right product/market fit.

No-code business intelligence service y42 raises $2.9M seed round

TechCrunch

Given his background, it’s maybe no surprise that y42’s focus is on making life easier for data engineers and, at the same time, putting the power of these platforms in the hands of business analysts. y42 is a powerful single source of truth for data experts and non-data experts alike.

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 3: Productionization of ML models

Cloudera

In this last installment, we’ll discuss a demo application that uses PySpark.ML to build a classification model based on training data stored in both Cloudera’s Operational Database (powered by Apache HBase) and Apache HDFS. In this demo, half of the training data is stored in HDFS and the other half is stored in an HBase table.