Open Source and Training

Cost, security, and flexibility: the business case for open source gen AI

CIO

DECEMBER 11, 2024

To solve the problem, the company turned to gen AI and decided to use both commercial and open source models. “With security, many commercial providers use their customers’ data to train their models,” says Ringdahl. “So we augment with open source,” he says. It’s possible to opt out, but there are caveats.

Open Source

Open Source Artificial Intelligence Technical Review Software Review

Meta offers Llama AI to US government for national security

CIO

NOVEMBER 5, 2024

The move relaxes Meta’s acceptable use policy restricting what others can do with the large language models it develops, and brings Llama ever so slightly closer to the generally accepted definition of open-source AI. As long as Meta keeps the training data confidential, CIOs need not be concerned about data privacy and security.

Government

Government Open Source Artificial Intelligence Training

OpenAI open-sources Whisper, a multilingual speech recognition system

TechCrunch

SEPTEMBER 22, 2022

Speech recognition remains a challenging problem in AI and machine learning. In a step toward solving it, OpenAI today open-sourced Whisper, an automatic speech recognition system that the company claims enables “robust” transcription in multiple languages as well as translation from those languages into English.

Open Source

Open Source System Artificial Intelligence Machine Learning

AI coding agents come with legal risk

CIO

NOVEMBER 8, 2024

Media outlets and entertainers have already filed several AI copyright cases in US courts, with plaintiffs accusing AI vendors of using their material to train AI models or copying their material in outputs, notes Jeffrey Gluck, a lawyer at IP-focused law firm Panitch Schwarze. How was the AI trained?

Software Review

Software Review Open Source Technical Review Artificial Intelligence

Speckle snags $5.5M seed to build open source platform for 3D drawings

TechCrunch

APRIL 14, 2022

The founders of Speckle, an early-stage startup based in London, are both trained architects and engineers, probably a rare combination. They wanted to make it easier by building an open source platform to exchange and collaborate on these files. The two founders began looking at this problem in 2015.

Open Source

Open Source 3D Construction Architecture

How we dodged risks and raised millions for our open-source machine language startup

TechCrunch

APRIL 9, 2021

Jorge Torres is CEO and co-founder of MindsDB, an open source AI layer for existing databases. Adam Carrigan is a co-founder and COO of MindsDB. Open-source software gave birth to a slew of useful software in recent years.

Open Source

Open Source Artificial Intelligence Machine Learning Linux

Qdrant, an open source vector database startup, wants to help AI developers leverage unstructured data

TechCrunch

APRIL 19, 2023

For many, ChatGPT and the generative AI hype train signal the arrival of artificial intelligence into the mainstream. Just last year, a similar proposition to Qdrant called Pinecone nabbed $28 million, though Zayarni considers Qdrant’s open source foundation a major selling point for would-be customers.

Open Source

Open Source Artificial Intelligence Development Data

Together raises $20M to build open source generative AI models

TechCrunch

MAY 15, 2023

With Together, Prakash, Zhang, Re and Liang are seeking to create open source generative AI models and services that, in their words, “help organizations incorporate AI into their production applications.” The number of open source models both from community groups and large labs grows by the day, practically.

Open Source

Open Source Generative AI ChatGPT Hardware

10 things to watch out for with open source gen AI

CIO

MAY 15, 2024

Even if you don’t have the training data or programming chops, you can take your favorite open source model, tweak it, and release it under a new name. According to Stanford’s AI Index Report, released in April, 149 foundation models were released in 2023, two-thirds of them open source.

Open Source

Open Source Weak Development Team Artificial Intelligence Training

New physics sim trains robots 430,000 times faster than reality

Ooda Loop

DECEMBER 20, 2024

On Thursday, a large group of university and private industry researchers unveiled Genesis, a new open source computer simulation system that lets robots practice tasks in simulated reality 430,000 times faster than in the real world. Researchers can also use an AI agent to generate 3D physics simulations from text prompts.

Training

Training 3D Open Source Research

Hugging Face and ServiceNow launch BigCode, a project to open source code-generating AI systems

TechCrunch

SEPTEMBER 27, 2022

But so far, only a handful of such AI systems have been made freely available to the public and open sourced — reflecting the commercial incentives of the companies building them. billion parameters) — using ServiceNow’s in-house graphics card cluster.

Open Source

Open Source Generative AI Artificial Intelligence System

Iterative launches MLEM, an open-source tool to simplify ML model deployment

TechCrunch

JUNE 1, 2022

MLOps platform Iterative, which announced a $20 million Series A round almost exactly a year ago, today launched MLEM, an open-source git-based machine learning model management and deployment tool. For highly regulated industries, a system like this also offers a single source of truth for figuring out the lineage of a given model.

Open Source

Open Source Tools Artificial Intelligence Machine Learning

The EU’s AI Act could have a chilling effect on open source efforts, experts warn

TechCrunch

SEPTEMBER 6, 2022

The nonpartisan think tank Brookings this week published a piece decrying the bloc’s regulation of open source AI, arguing it would create legal liability for general-purpose AI systems while simultaneously undermining their development. “In the end, the [E.U.’s]

Open Source

Open Source Artificial Intelligence Innovation Government

Reduce ML training costs with Amazon SageMaker HyperPod

AWS Machine Learning - AI

APRIL 10, 2025

Training a frontier model is highly compute-intensive, requiring a distributed system of hundreds, or thousands, of accelerated instances running for several weeks or months to complete a single job. For example, pre-training the Llama 3 70B model with 15 trillion training tokens took 6.5 During the training of Llama 3.1

Training

Training Artificial Intelligence Hardware Systems Review

Getting the most out of open source without sacrificing security

CIO

NOVEMBER 13, 2023

Open source has seen a great deal of momentum among mainframers, making collaboration easier and providing greater transparency. But for all of its benefits, open source is not without risks. By its very nature, open-source code is accessible to whoever wants to see it—including potential attackers.

Open Source

Open Source Survey Software Report

Meet Crowd.dev, an open source user-led growth platform for fostering developer communities

TechCrunch

NOVEMBER 1, 2022

Now, another new company has entered the community-led growth fray with a slightly different approach to the existing players, one focused on developer communities and with open source at its core. The open source factor. Transitioning to an open source platform may hold other benefits, too. million ($2.2

Open Source

Open Source Meeting Development Software Review

TechCrunch+ roundup: Minimizing M&A mayhem, cybersecurity PM checklist, open source AI

TechCrunch

MAY 12, 2023

Given the tremendous barrier to entry, is it worth considering whether open source foundation models could level the playing field and also address concerns about privacy and bias?

Open Source

Open Source Software Review Windows Metrics

Heartex raises $25M for its AI-focused, open source data labeling platform

TechCrunch

MAY 18, 2022

Heartex, a startup that bills itself as an “open source” platform for data labeling, today announced that it landed $25 million in a Series A funding round led by Redpoint Ventures. When asked, Heartex says that it doesn’t collect any customer data and open sources the core of its labeling platform for inspection.

Open Source

Open Source Weak Development Team Data Artificial Intelligence

Google expands program to help train the formerly incarcerated

TechCrunch

JUNE 1, 2022

Last April, Google launched Grow with Google Career Readiness for Reentry, a program created in partnership with nonprofits to offer job readiness and digital skills training for formerly incarcerated individuals. Meanwhile, Google.org, Google’s charitable arm, will provide $4.25

Training

Training Programming Nonprofit Technical Review

Together lands $102.5M investment to grow its cloud for training generative AI

TechCrunch

NOVEMBER 29, 2023

Generative AI companies continue to raise huge amounts of capital to fuel their commercial — and, in some cases, open source — ambitions. See Together, a startup creating open source generative AI and AI model development infrastructure, which today announced that it closed a $102.5

Generative AI

Generative AI Training Open Source Cloud

This startup wants to train art-generating AI strictly on licensed images

TechCrunch

APRIL 13, 2023

Two companies behind popular AI art tools, Midjourney and Stability AI, are entangled in a legal case that alleges they infringed on the rights of millions of artists by training their tools on web-scraped images. Bria isn’t the only venture exploring a revenue-sharing business model for generative AI.

Generative AI

Generative AI Training Open Source Enterprise

Patients may suffer from hallucinations of AI medical transcription tools

CIO

OCTOBER 28, 2024

With over 4.2 million downloads on the open-source AI platform HuggingFace in the past month, Whisper has become one of the most popular speech recognition models. In these cases, the AI sometimes fabricated unrelated phrases, such as “Thank you for watching!” — likely due to its training on a large dataset of YouTube videos.

Tools

Tools Study Healthcare Research

Artificial Intelligence in practice

CIO

NOVEMBER 1, 2024

With those tools involved, users can build new AI models on relatively low-powered machines, saving heavy-duty units for the compute-intensive process of model training. In other cases, the model might scan and process open-source data.

Artificial Intelligence

Artificial Intelligence Open Source Machine Learning

V7 snaps up $33M to automate training data for computer vision AI models

TechCrunch

NOVEMBER 28, 2022

It’s only as good as the models and data used to train it, so there is a need for sourcing and ingesting ever-larger data troves. But annotating and manipulating that training data takes a lot of time and money, slowing down the work or overall effectiveness, and maybe both. V7’s specific USP is automation.

Training

Training Data Technical Review Artificial Intelligence

Explosion snags $6M on $120M valuation to expand machine learning platform

TechCrunch

SEPTEMBER 2, 2021

Explosion, a company that has combined an open source machine learning library with a set of commercial developer tools, announced a $6 million Series A today on a $120 million valuation. Since then, that open source project has been downloaded over 40 million times.

Artificial Intelligence

Artificial Intelligence Machine Learning Open Source Training

Cybersecurity Snapshot: OpenSSF Unveils Framework for Securing Open Source Projects, While IT-ISAC Says AI Makes Ransomware Stealthier

Tenable

FEBRUARY 28, 2025

Check out a new framework for better securing open source projects. Plus, learn how AI is making ransomware harder to detect and mitigate. 1 - New cybersecurity framework for open source projects: Here’s the latest industry effort aimed at boosting open-source software security.

Open Source

Open Source Software Review Systems Review Groups

Navigating the future of national tech independence with sovereign AI

CIO

MARCH 31, 2025

There are two main considerations associated with the fundamentals of sovereign AI: 1) Control of the algorithms and the data on the basis of which the AI is trained and developed; and 2) the sovereignty of the infrastructure on which the AI resides and operates.

Technical Review

Technical Review Artificial Intelligence Compliance Open Source

Comprehensive data management for AI: The next-gen data management engine that will drive AI to new heights

CIO

NOVEMBER 19, 2024

You pull an open-source large language model (LLM) to train on your corporate data so that the marketing team can build better assets, and the customer service team can provide customer-facing chatbots. And all of that data is stored on premises, but your training is taking place on the cloud where your GPUs live.

Artificial Intelligence

Artificial Intelligence Engineering Data Storage

Stability AI, the startup behind Stable Diffusion, raises $101M

TechCrunch

OCTOBER 17, 2022

Stability AI, the company funding the development of open source music- and image-generating systems like Dance Diffusion and Stable Diffusion, today announced that it raised $101 million in a funding round led by Coatue and Lightspeed Venture Partners with participation from O’Shaughnessy Ventures LLC.

Open Source

Open Source Training AWS System

5 Things To Look For When Evaluating AI Startups

Crunchbase News

NOVEMBER 13, 2024

Different ways to customize an LLM include fine-tuning an off-the-shelf model or building a custom one using an open-source LLM like Meta’s Llama. Vertical-specific training data: Does the startup have access to a large volume of proprietary, vertical-specific data to train its LLMs?

Artificial Intelligence

Artificial Intelligence UI/UX Off-The-Shelf Generative AI

Spawning lays out plans for letting creators opt out of generative AI training

TechCrunch

MAY 3, 2023

The legal spats between artists and the companies training AI on their artwork show no sign of abating. Generative AI models “learn” to create art, code and more by “training” on sample images and text, usually scraped indiscriminately from the web. By late April, that figure had eclipsed 1 billion.

Generative AI

Generative AI Training Software Review Technical Review

Emerge Career’s tech-forward job training lets incarcerated folks hit the road on release

TechCrunch

AUGUST 16, 2022

Reentering society after years in prison is difficult for many reasons, among which perhaps the most prosaic is simply that it’s hard to get a job — and what training and transition programs exist are far from sufficient. “There’s already money for training, but it’s under-utilized,” explained Saruhashi.

Training

Training Video Budget Education

Activeloop snags $5M seed to build streaming database for AI applications

TechCrunch

NOVEMBER 2, 2021

“[We are] introducing a database for AI, specifically a storage layer that helps to very efficiently store the data and then stream this to machine learning applications or training models to do computer vision, audio processing, NLP (natural language processing) and so on,” Buniatyan explained.

Artificial Intelligence

Artificial Intelligence Part-Time VPE Applications Open Source

Stability AI releases ChatGPT-like language models

TechCrunch

APRIL 19, 2023

Stability AI, the startup behind the generative AI art tool Stable Diffusion, today open-sourced a suite of text-generating AI models intended to go head to head with systems like OpenAI’s GPT-4. But Stability AI claims it created a custom training set that expands the size of the standard Pile by 3x. make up) facts.

ChatGPT

ChatGPT Open Source Generative AI Artificial Intelligence

Radar Trends to Watch: November 2024

O'Reilly Media - Ideas

NOVEMBER 5, 2024

Google is open-sourcing SynthID, a system for watermarking text so AI-generated documents can be traced to the LLM that generated them. Unlike many of Mistral’s previous small models, these are not open source. This model is based on the open source Llama, and it’s relatively small (70B parameters).

Artificial Intelligence

Artificial Intelligence Trends Software Review Open Source

Webiny nabs $3.5M seed to build serverless development framework on top of serverless CMS

TechCrunch

AUGUST 18, 2021

Webiny, an early-stage startup that launched in 2019 with an open-source, serverless CMS, had also developed a framework to help build the CMS, and found that customers were also interested in that to help build their own serverless apps. Webiny announces $348K seed to build open-source serverless CMS.

Serverless

Serverless Weak Development Team Open Source Development

Cybersecurity Snapshot: Memory Bugs Pervasive in Open Source SW, While Car Dealership Chaos Persists After Ransomware Attack

Tenable

JUNE 28, 2024

Check out why memory vulnerabilities are widespread in open source projects. The agencies analyzed 172 projects that the Open Source Security Foundation has identified as being critically important in the open source ecosystem. And learn how confidential data from U.S. And much more!

Security

Security Open Source Chemicals Artificial Intelligence

9 Best AI Tools for Programming Assistance in 2024

The Crazy Programmer

JUNE 14, 2024

It uses OpenAI’s Codex, a language model trained on a vast amount of code from public repositories on GitHub. Cons: Privacy Concerns: Since it is trained on public repositories, there may be concerns about code privacy and intellectual property. Open Source: Being open-source, it is freely available for use and customization.

Programming

Programming Tools Software Review Artificial Intelligence

German startup Kern AI nabs seed funding for modular NLP development platform

TechCrunch

FEBRUARY 16, 2023

Natural language processing (NLP), while hardly a new discipline, has catapulted into the public consciousness these past few months thanks in large part to the generative AI hype train that is ChatGPT. The company also says that its basic open source incarnation has been used by data scientists at companies such as Samsung and DocuSign.

Development

Development Open Source ChatGPT Training

Top 11 LLM Tools That Ensure Smooth LLM Operations

Openxcell

JANUARY 20, 2025

LLMs, or large language models, are deep learning models trained on vast amounts of linguistic data so that they can understand and respond in natural, human-like language. It is an open-source model that offers extensive fine-tuning capabilities using reinforcement learning (based on human response).

Artificial Intelligence

Artificial Intelligence Tools Open Source Architecture

5 ways to improve mental health for software developers

TechCrunch

NOVEMBER 11, 2021

Lorna Mitchell is head of Developer Relations at Aiven, a software company that combines the best open source technologies with cloud infrastructure. Some companies offer generous training budgets or time off. Many developers give much of their time to open source projects. Lessons from open source.

Software Development

Software Development Weak Development Team Software Development

AI-tool maker Seldon raises £7.1M Series A from AlbionVC and Cambridge Innovation Capital

TechCrunch

NOVEMBER 17, 2020

Key to its success is that its open-source project Seldon Core has more than 700,000 models deployed to date, drastically reducing friction for users deploying ML models. “Seldon has been able to build an impressive open-source community and add immediate productivity value to some of the world’s leading companies.”

Innovation

Innovation Artificial Intelligence Machine Learning Tools

12 AI predictions for 2025

CIO

DECEMBER 30, 2024

We’ve also seen the emergence of agentic AI, multi-modal AI, reasoning AI, and open-source AI projects that rival those of the biggest commercial vendors. Developers must comply by the start of 2026, meaning they’ll have a little over a year to put systems in place to track the provenance of their training data.

Fractional CTO

Fractional CTO Software Development CTO Coach Architecture

Onehouse is building a neutral data lake integration layer on top of Apache Hudi

TechCrunch

FEBRUARY 2, 2023

Onehouse emerged last year with a cloud data lake product built on top of the open source Apache Hudi project. Company founder and CEO Vinoth Chandar came up with the idea for Hudi while he was an engineer at Uber in 2016, and eventually decided to start a company based on the open source project.

Open Source

Open Source Data Engineering Cloud
