Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Updated Nov 10, 2025 - Python
A light-weight, flexible, and expressive statistical data testing library
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
Lightweight, extensible data validation library for Python
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Automatically find issues in image datasets and practice data-centric computer vision.
The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.
Data validation toolkit for assessing and monitoring data quality.
The Open Data Editor (ODE) is a no-code application to explore and validate tabular data in a simple way. Forever free and open source project powered by the Frictionless Framework.
A dead simple Python string validation library.
Typical: Fast, simple, & correct data-validation using Python 3 typing.
Open Source Data Quality Monitoring.
A collaborative framework for annotating medical datasets using crowdsourcing.
Accelerates migrations to Databricks by automating key migration activities
A simple and easy to use Data Validation library for Python.
pydantic --> zod data models
A declarative PySpark framework for row- and aggregate-level data quality validation.
Snowflake database, schema, and warehouse provisioning with access roles; generation and provisioning of functional roles; and a Snowflake source export, cloning, and data tie-out tool.
Generate and validate random data with fordev 🎲
IATI datastore powered by Apache Solr. Automatically extracts and parses the IATI XML files referenced in the IATI Registry, refreshed every 3 hours. IATI is a global initiative to improve the transparency of development and humanitarian resources and their results for addressing poverty and crises.