Data Engineering, Development and Open Source

Prisma raises $40M for its open source ‘Rosetta stone’ for database languages

TechCrunch

MAY 3, 2022

When it comes to building databases and other backend software development, different organizations and developers do not always speak the same language. Its open-source-based Prisma ORM, launched last year, now has more than 150,000 developers using it for Node.js

Open Source

Open Source Programming Tools Development

The future of data: A 5-pillar approach to modern data management

CIO

DECEMBER 11, 2024

This approach is repeatable, minimizes dependence on manual controls, harnesses technology and AI for data management and integrates seamlessly into the digital product development process. Operational errors because of manual management of data platforms can be extremely costly in the long run.

Data

Data Technical Review Software Review Weak Development Team

Ducklake: A journey to integrate DuckDB with Unity Catalog

Xebia

OCTOBER 18, 2024

This summer, Databricks announced the open-sourcing of Unity Catalog. In this post, we’ll dive into how you can integrate DuckDB with the open-source Unity Catalog, walking you through our hands-on experience, sharing the setup process, and exploring both the opportunities and challenges of combining these two technologies.

Open Source

Open Source AWS Government Technical Review

Heartex raises $25M for its AI-focused, open source data labeling platform

TechCrunch

MAY 18, 2022

Heartex, a startup that bills itself as an “open source” platform for data labeling, today announced that it landed $25 million in a Series A funding round led by Redpoint Ventures. Software developers Malyuk, Maxim Tkachenko, and Nikolay Lyubimov co-founded Heartex in 2019.

Open Source

Open Source Weak Development Team Data Artificial Intelligence

How FiveStars re-engineered its data engineering stack

CIO

JANUARY 17, 2023

It shows in his reluctance to run his own servers but it’s perhaps most obvious in his attitude to data engineering, where he’s nearing the end of a five-year journey to automate or outsource much of the mundane maintenance work and focus internal resources on data analysis. “It’s not a good use of our time either.”

Data Engineering

Data Engineering Engineering Data CTO Coach

Airbyte raises $5.2M for its open-source data integration platform

TechCrunch

MARCH 2, 2021

Airbyte, an open-source data integration platform, today announced that it has raised $5.2 million. The company was co-founded by Michel Tricot, the former director of engineering and head of integrations at LiveRamp and RideOS, and John Lafleur, a serial entrepreneur who focuses on developer tools and B2B services.

Open Source

Open Source Data B2B Recruiting

IDC chief research officer: GenAI, from experimentation to adoption

CIO

DECEMBER 19, 2024

As the chief research officer at IDC, I lead a global team of analysts who develop research and provide advice to help our clients navigate the technology landscape. Fast forward to 2024, and our data shows that organizations have conducted an average of 37 proofs of concept, but only about five have moved into production.

Artificial Intelligence

Artificial Intelligence Research Enterprise

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

AWS Machine Learning - AI

APRIL 23, 2025

This approach supports the broader goal of digital transformation, making sure that archival data can be effectively used for research, policy development, and institutional knowledge retention. In this post, we discuss how you can build an AI-powered document processing platform with open source NER and LLMs on SageMaker.

Artificial Intelligence

Artificial Intelligence Open Source AWS Serverless

What is data architecture? A framework to manage data

CIO

DECEMBER 20, 2024

Data streaming is data flowing continuously from a source to a destination for processing and analysis in real-time or near real-time. A container orchestration system, such as open-source Kubernetes, is often used to automate software deployment, scaling, and management. The Open Group Architecture Framework.

Architecture

Architecture Data Fractional CTO Technical Review

Iterative raises $20M for its MLOps platform

TechCrunch

JUNE 2, 2021

Iterative, an open-source startup that is building an enterprise AI platform to help companies operationalize their models, today announced that it has raised a $20 million Series A round led by 468 Capital and Mesosphere co-founder Florian Leibert. He noted that the industry has changed quite a bit since then.

Artificial Intelligence

Artificial Intelligence Machine Learning Open Source Data Engineering

Why generic marketing approaches don’t work on software developers

TechCrunch

OCTOBER 7, 2021

“Most of the technical content published misses the mark with developers. I think we can all do a better job,” author and developer marketing expert Adam DuVander says. DuVander was recommended to us by Karl Hughes, the CEO of Draft.dev, which specializes in content production for developer-focused companies.

Weak Development Team

Weak Development Team Software Development Marketing Technical Advisors

CloudQuery raises $15M to demystify your cloud infrastructure setup

TechCrunch

JUNE 22, 2022

CloudQuery CEO and co-founder Yevgeny Pats helped launch the startup because he needed a tool to give him visibility into his cloud infrastructure resources, and he couldn’t find one on the open market. He built his own SQL-based tool to help understand exactly what resources he was using, based on data engineering best practices.

Infrastructure

Infrastructure Cloud Open Source Data Engineering

Are you ready for MLOps? 🫵

Xebia

FEBRUARY 28, 2025

Gartner reported that, on average, only 54% of AI models move from pilot to production; many AI models never reach production at all. These days, data science is no longer a new domain by any means. In 2019 alone, Data Scientist job postings on Indeed rose by 256% [2]. First let’s throw in a statistic.

Technical Review

Technical Review Weak Development Team Artificial Intelligence Machine Learning

Comprehensive data management for AI: The next-gen data management engine that will drive AI to new heights

CIO

NOVEMBER 19, 2024

The core of their problem is applying AI technology to the data they already have, whether in the cloud, on their premises, or more likely both. Imagine that you’re a data engineer. You build your model, but the history and context of the data you used are lost, so there is no way to trace your model back to the source.

Artificial Intelligence

Artificial Intelligence Engineering Data Storage

DBeaver takes $6M seed investment to build on growing popularity

TechCrunch

APRIL 11, 2023

When DBeaver creator Serge Rider began building an open source database admin tool in 2013, he probably had no idea that 10 years later, it would boast more than 8 million users. “So actually anyone who needs to work with data can use DBeaver,” she told TechCrunch.

Open Source

Open Source Database Administration Artificial Intelligence Machine Learning

Maintaining conventions in dbt projects with dbt-bouncer

Xebia

NOVEMBER 21, 2024

Regardless of location, documentation is a great starting point: writing down the outcome of discussions allows new developers to quickly get up to speed. But when a dbt project grows and the number of developers increases, an automated approach is often the only scalable way forward. repos: - repo: [link] rev: v2.0.6

Weak Development Team

Weak Development Team Testing Analytics Engineering

Astronomer ready for its next mission after Datakin acquisition, $213M Series C

TechCrunch

MARCH 23, 2022

At that time, the scrappy data analytics company had scooped up $3.5 million in funding to develop its tool for what happens after you’ve collected a bunch of data, namely assembling and organizing it so the data can be analyzed. Data collection isn’t the problem: It’s what companies are doing with it.

Open Source

Open Source Data Engineering Strategic Planning Analytics

Rill wants to rethink BI dashboards with embedded database and instant UX

TechCrunch

AUGUST 4, 2022

While at Metamarkets, the company built a database, based on the open source Apache Druid project. Most BI tools are thin applications with no data engine of their own, and only as fast as the database they sit atop. The company also recently released a second product called Rill Developer, which is open source.

Open Source

Open Source Metrics Enterprise Business Intelligence

What is Data Engineering: Explaining Data Pipeline, Data Warehouse, and Data Engineer Role

Altexsoft

JUNE 25, 2019

If we look at the hierarchy of needs in data science implementations, we’ll see that the next step after gathering your data for analysis is data engineering. This discipline is not to be underestimated, as it enables effective data storing and reliable data flow while taking charge of the infrastructure.

Data Engineering

Data Engineering Engineering Data Artificial Intelligence

Airbyte launches a hosted version of its integration platform

TechCrunch

OCTOBER 12, 2021

Airbyte, the well-funded open source data integration startup, always made it easy for data teams to set up their ELT (extract, load and transform) pipelines, but until now, that meant self-hosting and managing the service, with all the complications that come with that.

Open Source

Open Source Virtualization Data Engineering Enterprise

Union.ai raises $10M to simplify AI and ML workflow orchestration

TechCrunch

APRIL 12, 2022

Union.ai, a startup emerging from stealth with a commercial version of the open source AI orchestration platform Flyte, today announced that it raised $10 million in a round contributed by NEA and “select” angel investors. “We need to bridge both these worlds in a structured and repeatable way.”

Artificial Intelligence

Artificial Intelligence Machine Learning Open Source Biotech

Fueling the Future of GenAI with NiFi: Cloudera DataFlow 2.9 Delivers Enhanced Efficiency and Adaptability

Cloudera

DECEMBER 4, 2024

Cloudera DataFlow 2.9 delivers on this need, providing enhancements that streamline development, boost efficiency, and empower organizations to build cutting-edge GenAI solutions. This release underscores Cloudera’s unwavering commitment to Apache NiFi and its vibrant open-source community.

Metrics

Metrics Generative AI Open Source Data Engineering

Capital Group invests big in talent development

CIO

JULY 29, 2022

That focus includes not only the firm’s customer-facing strategies but also its commitment to investing in the development of its employees, a strategy that is paying off, as evidenced by Capital Group’s No. “The TREx program gave me the space to learn, develop, and customize an experience for my career development,” she says.

Groups

Groups Security Development Programming

A Recap of the Data Engineering Open Forum at Netflix

Netflix Tech

JUNE 20, 2024

A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024. Netflix is not the only place where data engineers are solving challenging problems with creative solutions.

Data Engineering

Data Engineering Engineering Data Generative AI

thatDot launches Quine, a streaming graph engine

TechCrunch

FEBRUARY 23, 2022

Portland, Oregon-based startup thatDot, which focuses on streaming event processing, today announced the launch of Quine, a new MIT-licensed open source project for data engineers that combines event streaming with graph data to create what the company calls a “streaming graph.”

Engineering

Engineering Open Source Big Data Fintech

What is data science? Transforming data into value

CIO

APRIL 22, 2022

Data science is a method for gleaning insights from structured and unstructured data using approaches ranging from statistical analysis to machine learning. Data science gives the data collected by an organization a purpose. Data science vs. data analytics. Data science tools.

Data

Data Artificial Intelligence Machine Learning Analytics

No-code business intelligence service y42 raises $2.9M seed round

TechCrunch

MARCH 22, 2021

Like similar startups, y42 extends the idea of the data warehouse, which was traditionally used for analytics, and helps businesses operationalize this data. At the core of the service is a lot of open source and the company, for example, contributes to GitLab’s Meltano platform for building data pipelines.

Business Intelligence

Business Intelligence Software Review B2B Analytics

Thinking of building your own AI agents? Don’t do it, advisors say

CIO

SEPTEMBER 19, 2024

Goldcast, a software developer focused on video marketing, has experimented with a dozen open-source AI models to assist with various tasks, says Lauren Creedon, head of product at the company. The company isn’t building its own discrete AI models but is instead harnessing the power of these open-source AIs.

CTO Coach

CTO Coach Artificial Intelligence Open Source Fractional CTO

Meroxa raises $15M Series A for its real-time data platform

TechCrunch

APRIL 13, 2021

Brown and Hamidi met during their time at Heroku, where Brown was a director of product management and Hamidi a lead software engineer. But while Heroku made it very easy for developers to publish their web apps, there wasn’t anything comparable in the highly fragmented database space.

Data

Data Software Engineering Open Source Engineering

Data Engineers of Netflix: Interview with Pallavi Phadnis

Netflix Tech

OCTOBER 28, 2021

This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Data Engineering

Data Engineering Engineering Data Software Engineering

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning - AI

NOVEMBER 15, 2024

Principal sought to develop natural language processing (NLP) and question-answering capabilities to accurately query and summarize this unstructured data at scale. The solution: Principal AI Generative Experience with QnABot. Principal began its development of an AI assistant by using the core question-answering capabilities in QnABot.

Generative AI

Generative AI AWS Groups Artificial Intelligence

How Much Should I Be Spending On Observability?

Honeycomb

APRIL 23, 2025

In short, observability costs are spiking because we’re gathering more signals and more data to describe our increasingly complex systems, and the telemetry data itself has gone from being an operational concern that only a few people care about to being an integral part of the development process, something everyone has to care about.

Weak Development Team

Weak Development Team Metrics Storage Engineering

Integrate VSCode With Databricks To Build and Run Data Engineering Pipelines and Models

Dzone - DevOps

NOVEMBER 7, 2023

Databricks is a cloud-based platform designed to simplify the process of building data engineering pipelines and developing machine learning models.

Data Engineering

Data Engineering Engineering Artificial Intelligence Machine Learning

Tecton raises $100M, proving that the MLOps market is still hot

TechCrunch

JULY 12, 2022

“But building data pipelines to generate these features is hard, requires significant data engineering manpower, and can add weeks or months to project delivery times,” Del Balso told TechCrunch in an email interview. Systems use features to make their predictions. “We are still in the early innings of MLOps.”

Artificial Intelligence

Artificial Intelligence Machine Learning Marketing Data Engineering

SAP and Databricks: Better Together

Perficient

FEBRUARY 13, 2025

Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help to solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint in many large enterprises. However, SAP has an opaque data model.

Government

Government Open Source Machine Learning Artificial Intelligence

Predibase exits stealth with a low-code platform for building AI models

TechCrunch

MAY 10, 2022

But Piero Molino, the co-founder of AI development platform Predibase, says that inadequate tooling often exacerbates them. “As a result, most machine learning tasks in an organization are bottlenecked on an oversubscribed centralized data science team,” Molino told TechCrunch via email.

Artificial Intelligence

Artificial Intelligence Machine Learning Off-The-Shelf Training

Databand raises $14.5M led by Accel for its data pipeline observability tools

TechCrunch

DECEMBER 1, 2020

DevOps continues to get a lot of attention as a wave of companies develop more sophisticated tools to help developers manage increasingly complex architectures and workloads. “Users didn’t know how to organize their tools and systems to produce reliable data products.”

Tools

Tools Data Weak Development Team Big Data

RudderStack raises $56M for its customer data platform

TechCrunch

FEBRUARY 2, 2022

“We still see segments quite a bit in our competitive deals, but we have an extremely high win rate whenever the buyer persona is developers,” he said. We are thrilled to lead this round and join Souymadeb and his team as they build an amazing customer data platform and company.”

Data

Data Artificial Intelligence Machine Learning Architecture

The 10 most in-demand IT jobs in finance

CIO

SEPTEMBER 2, 2022

The US financial services industry has fully embraced a move to the cloud, driving a demand for tech skills such as AWS and automation, as well as Python for data analytics, Java for developing consumer-facing apps, and SQL for database work. Back-end software engineer.

Software Engineering

Software Engineering Data Engineering DevOps AWS

Why Reinvent the Wheel? The Challenges of DIY Open Source Analytics Platforms

Cloudera

JULY 24, 2023

In their effort to reduce their technology spend, some organizations that leverage open source projects for advanced analytics often consider either building and maintaining their own runtime with the required data processing engines or retaining older, now obsolete, versions of legacy Cloudera runtimes (CDH or HDP).

Open Source

Open Source Analytics Software Review Metrics

Data observability startup Metaplane lands investment from YC, others

TechCrunch

JANUARY 10, 2023

But it’s not deterring Metaplane, a data observability startup founded by MIT graduate Kevin Hu (CEO), former HubSpot engineer Peter Casinelli and ex-Appcues developer Guru Mahendran in 2020. “Every day, executives are making decisions based on data that is incorrect.”

Data

Data Software Review Technical Review Systems Review

Revolutionizing customer service: MaestroQA’s integration with Amazon Bedrock for actionable insight

AWS Machine Learning - AI

MARCH 13, 2025

When customers receive incoming calls at their call centers, MaestroQA employs its proprietary transcription technology, built by enhancing open source transcription models, to transcribe the conversations. Consequently, MaestroQA had to develop a solution capable of scaling to meet their clients’ extensive needs.

Generative AI

Generative AI CTO Coach AWS Artificial Intelligence

Specialized tools for machine learning development and model governance are becoming essential

O'Reilly Media - Ideas

APRIL 2, 2019

About 10 months ago, Databricks announced MLflow, a new open source project for managing machine learning development (full disclosure: Ben Lorica is an advisor to Databricks). We thought that given the lack of clear open source alternatives, MLflow had a decent chance of gaining traction, and this has proven to be the case.

Artificial Intelligence

Artificial Intelligence Machine Learning Government Tools
