Machine Learning on Blockchain: Predictions, Oracles, and Data Markets
Updated February 2026 · 12 min read
Running a machine learning model on a blockchain sounds like a terrible idea, and honestly, for most use cases, it is. Blockchains are slow, expensive, and not designed for matrix multiplication. But there's a growing set of applications where you genuinely need ML outputs that are verifiable, tamper-proof, and trustless. And that's where things get interesting.
This page covers the three big intersections of ML and blockchain: on-chain inference, AI oracles, and decentralized data marketplaces. Each one is solving a different problem, and they're all at different stages of maturity.
The Problem: Why Put ML on a Blockchain?
Off-chain ML is fast and cheap. So why bother with blockchain? Because sometimes you need to prove that an ML model produced a specific output from a specific input, without trusting anyone.
Real examples where this matters:
- DeFi lending: A credit scoring model determines your loan terms. You want to verify the model wasn't biased or manipulated.
- Prediction markets: An ML model resolves a market. Participants need proof the model actually ran the computation it claims.
- Insurance: An AI model assesses a claim. You want a tamper-proof record of the decision and the data it used.
- Content moderation: A DAO uses an AI model to flag content. Community members need to verify the model's decisions are consistent.
The common thread: trustless verification of ML outputs. If you don't need trustlessness, just run your model on AWS and call it a day.
Approaches to On-Chain ML
zkML (Zero-Knowledge Machine Learning)
This is the most technically ambitious approach. zkML uses zero-knowledge proofs to verify that an ML model was run correctly without revealing the model weights or the full input data.
How it works: you run the ML inference off-chain, generate a ZK proof that the computation was done correctly, and submit only the proof on-chain. The blockchain verifies the proof without re-running the computation. It's like showing your work on a math test, except the proof is cryptographic.
Projects working on this:
- EZKL: Open-source toolkit for converting ML models (from PyTorch/ONNX) into ZK circuits. Currently handles models up to a few million parameters. Not big enough for LLMs, but works for classifiers, regression models, and small neural networks.
- Modulus Labs: Building infrastructure for verifiable AI on Ethereum. Their Remainder system lets smart contracts verify ML outputs. Raised $6.3M from Variant and 1kx.
- Giza: Converts ONNX models to verifiable computations on Starknet. Focused on DeFi applications where model outputs affect financial decisions.
- Ritual: Building a "superchain for AI." Their Infernet system lets smart contracts call off-chain AI models with on-chain verification. Raised $25M from Archetype. Token expected 2026.
The limitation: ZK proofs for ML are computationally expensive. Proving a simple image classifier can take minutes. Proving something like GPT-4 would take years with current technology. zkML works for small models today and will scale as ZK hardware improves.
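One ingredient of that cost: ZK circuits work on integers in a finite field, not floating-point numbers, so a model has to be quantized before it can be proven. The sketch below is a toy illustration of that step for a single dense layer, assuming a simple fixed-point scheme. The `SCALE` value, the helper names, and the layer values are made up for illustration, and this is not how EZKL or any other toolkit actually implements it.

```python
SCALE = 2 ** 8  # hypothetical fixed-point scale: 8 fractional bits


def quantize(x: float) -> int:
    """Map a float to a fixed-point integer (value * SCALE, rounded)."""
    return round(x * SCALE)


def dense_float(weights, bias, inputs):
    """Reference dense layer in ordinary floating point."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def dense_fixed(weights, bias, inputs):
    """Same layer using only integer arithmetic, as a circuit would need."""
    q_in = [quantize(x) for x in inputs]
    out = []
    for row, b in zip(weights, bias):
        # Each product carries a factor of SCALE * SCALE.
        acc = sum(quantize(w) * x for w, x in zip(row, q_in))
        # Rescale the bias so it matches the SCALE * SCALE accumulator.
        acc += quantize(b) * SCALE
        out.append(acc)
    return out  # integers scaled by SCALE ** 2


# Illustrative values only.
weights = [[0.5, -0.25], [1.5, 0.75]]
bias = [0.1, -0.2]
inputs = [2.0, 4.0]

float_out = dense_float(weights, bias, inputs)
fixed_out = [v / SCALE ** 2 for v in dense_fixed(weights, bias, inputs)]
max_error = max(abs(f - q) for f, q in zip(float_out, fixed_out))
```

The quantized result tracks the float result closely but not exactly, and every multiplication becomes a constraint the prover has to satisfy, which is part of why proof cost grows with model size.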
Optimistic ML Verification
Instead of cryptographically proving every computation, optimistic approaches assume the ML output is correct unless someone challenges it. This is similar to how optimistic rollups (like Optimism and Arbitrum) handle transactions.
An ML result is posted on-chain with a bond. If anyone thinks it's wrong, they can challenge it during a dispute period. If the challenge succeeds, the original poster loses their bond. This is much cheaper than ZK proofs, but the trade-off is latency: a result can't be treated as final until the dispute period ends.
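The bond-and-challenge flow can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not any specific protocol's contract: the `Claim` fields, the 100-block `DISPUTE_PERIOD`, and the `recomputed_output` argument (a stand-in for whatever dispute-resolution process establishes the true result) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

DISPUTE_PERIOD = 100  # hypothetical dispute window, measured in blocks


@dataclass
class Claim:
    poster: str
    output: int          # the ML result being asserted
    bond: int            # amount the poster forfeits if proven wrong
    posted_at: int       # block number when the claim was posted
    challenger: Optional[str] = None
    status: str = "pending"  # pending -> finalized | slashed


def challenge(claim: Claim, challenger: str, now: int,
              recomputed_output: int) -> None:
    """Dispute a pending claim inside the dispute window.

    `recomputed_output` stands in for the outcome of the protocol's
    dispute-resolution process (an assumption of this sketch).
    """
    if claim.status != "pending":
        raise ValueError("claim is already resolved")
    if now >= claim.posted_at + DISPUTE_PERIOD:
        raise ValueError("dispute period is over")
    if recomputed_output != claim.output:
        # Successful challenge: the poster loses the bond.
        claim.status = "slashed"
        claim.challenger = challenger
    # A failed challenge leaves the claim pending.


def finalize(claim: Claim, now: int) -> bool:
    """Mark an unchallenged claim final once the window has passed."""
    if claim.status == "pending" and now >= claim.posted_at + DISPUTE_PERIOD:
        claim.status = "finalized"
    return claim.status == "finalized"
```

Note that `finalize` returns `False` for any block inside the window, which is the latency cost in concrete form: consumers have to wait out the full period even when nobody disputes the result.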
Federated and Ensemble Approaches
A third approach relies on redundancy: multiple nodes run the same model (or different models) on the same input, and the network takes the consensus result. Bittensor's subnet architecture is an example of this. No single node can manipulate the output because it would be outvoted by honest nodes.
This works well for inference but not for training. And it requires significant redundant computation, making it expensive.
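As a rough illustration of consensus over redundant inference, the sketch below accepts a classification only when at least two-thirds of nodes agree. The two-thirds threshold and the function name are assumptions for this example, not Bittensor's actual mechanism.

```python
from collections import Counter
from typing import Optional


def consensus(outputs: list[str]) -> Optional[str]:
    """Return the label reported by at least 2/3 of nodes, else None."""
    if not outputs:
        return None
    label, votes = Counter(outputs).most_common(1)[0]
    # Integer comparison avoids floating-point edge cases at exactly 2/3.
    if votes * 3 >= 2 * len(outputs):
        return label
    return None


# Three honest nodes outvote one manipulated result.
agreed = consensus(["cat", "cat", "cat", "dog"])
# A split network produces no result.
split = consensus(["cat", "dog"])
```

Every node in `outputs` ran the full inference, so the cost of one answer is the cost of the model multiplied by the number of nodes.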
AI Oracles
Traditional oracles like Chainlink relay real-world data to smart contracts: price feeds, weather data, sports scores. AI oracles go further: they run ML models to generate predictions, classifications, or insights that smart contracts can use.
How AI Oracles Work
- A smart contract requests an AI prediction (e.g., "what's the sentiment of crypto Twitter right now?")
- Off-chain AI oracle nodes run the ML model on the requested data
- Multiple nodes run the same query and their results are aggregated
- The consensus result is posted on-chain with a proof or attestation
- The smart contract uses the result to make decisions (e.g., adjust lending parameters)
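The aggregation and consumption steps above can be sketched as follows. This is a hedged illustration, not any oracle network's real API: the quorum of three, the median rule, the `adjust_collateral_ratio` consumer, and the attestation (a plain SHA-256 hash standing in for a real signature or proof) are all assumptions.

```python
import hashlib
import json
from statistics import median

MIN_RESPONSES = 3  # hypothetical quorum of oracle nodes


def aggregate(query: str, node_results: dict[str, float]) -> dict:
    """Combine per-node scores into one value plus a stand-in attestation."""
    if len(node_results) < MIN_RESPONSES:
        raise ValueError("not enough node responses")
    # The median ignores a minority of outlier or dishonest nodes.
    value = median(node_results.values())
    payload = {"query": query, "value": value, "nodes": sorted(node_results)}
    # Plain hash of the payload; a real system would use signatures or proofs.
    digest = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    return {**payload, "attestation": digest}


def adjust_collateral_ratio(base: float, sentiment: float) -> float:
    """Hypothetical consumer: require more collateral when sentiment < 0."""
    return base + 0.25 * max(0.0, -sentiment)


# Sentiment scores in [-1, 1]; node n3 reports an outlier.
report = aggregate("crypto twitter sentiment",
                   {"n1": 0.2, "n2": 0.25, "n3": -0.9})
ratio = adjust_collateral_ratio(1.5, report["value"])
```

Using the median rather than the mean means the outlier from `n3` does not move the reported value at all.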
Who's Building AI Oracles
- Chainlink Functions: Lets smart contracts call external APIs and computation, including AI models. Not AI-specific, but it's the most battle-tested oracle infrastructure.
- Ora (formerly HyperOracle): Focuses on verifiable AI inference for blockchain. Uses optimistic verification for ML results.
- Allora Network: Decentralized AI inference network specifically designed to provide ML predictions to smart contracts. Uses a topic-based architecture where different models compete on prediction accuracy.
- Autonolas' oracle services: Olas agents can function as AI oracles, running models and posting results on-chain.
Data Marketplaces
AI models need data. Good data is expensive and hard to find. Blockchain-based data marketplaces let providers monetize datasets while maintaining control over how they're used.
Ocean Protocol
Ocean is the biggest player here. Founded by Trent McConaghy (who also co-founded BigchainDB), Ocean lets data providers publish datasets and set pricing, licensing, and access terms using smart contracts.
The killer feature is compute-to-data: instead of downloading a dataset, you send your ML model to the data. The model trains on the data without the data ever leaving the provider's server. This is huge for sensitive data like medical records, financial data, or proprietary research.
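The idea can be shown with a toy model, assuming a provider that runs a submitted training function locally and returns only the fitted parameters. The `DataProvider` class and `fit_line` job are hypothetical names for this sketch and do not reflect Ocean's actual interfaces.

```python
from typing import Callable

Row = tuple[float, float]


class DataProvider:
    """Holds a private dataset and runs submitted algorithms against it."""

    def __init__(self, rows: list[Row]):
        self._rows = rows  # never handed back to callers directly

    def run_job(self, algorithm: Callable[[list[Row]], dict]) -> dict:
        """Run the buyer's algorithm next to the data; return its output."""
        result = algorithm(list(self._rows))
        # Token output check: only numeric parameters may leave.
        if not all(isinstance(v, (int, float)) for v in result.values()):
            raise ValueError("job may only return numeric parameters")
        return result


def fit_line(rows: list[Row]) -> dict:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


# The buyer sends `fit_line` to the data and gets back two numbers.
provider = DataProvider([(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)])
model = provider.run_job(fit_line)
```

The output check here is only a gesture. A real deployment would need sandboxing and stricter output controls, since a malicious algorithm could encode raw rows into the numbers it returns.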
Ocean also runs "Predictoor," a prediction market where data scientists stake OCEAN tokens on price predictions. It's like Bittensor's approach but focused specifically on financial predictions, and stakers lose tokens when they're wrong.
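To make the staking incentive concrete, here is a simplified settlement sketch. It assumes a binary up/down prediction where losing stakes are split among correct stakers in proportion to their stake; this payout rule is an assumption for illustration and is not Predictoor's actual formula.

```python
Stakes = dict[str, tuple[bool, float]]  # name -> (predicts_up, amount)


def aggregate_prediction(stakes: Stakes) -> float:
    """Stake-weighted share of tokens betting that the price goes up."""
    total = sum(amount for _, amount in stakes.values())
    up = sum(amount for says_up, amount in stakes.values() if says_up)
    return up / total


def settle(stakes: Stakes, went_up: bool) -> dict[str, float]:
    """Return each staker's balance after the outcome is known."""
    winners = {name: amount
               for name, (says_up, amount) in stakes.items()
               if says_up == went_up}
    winning_total = sum(winners.values())
    if winning_total == 0:
        # Nobody was right: this toy version simply refunds everyone.
        return {name: amount for name, (_, amount) in stakes.items()}
    losing_pool = sum(amount for says_up, amount in stakes.values()
                      if says_up != went_up)
    payouts = {name: 0.0 for name in stakes}
    for name, amount in winners.items():
        # Stake back, plus a pro-rata share of the losing pool.
        payouts[name] = amount + losing_pool * amount / winning_total
    return payouts


stakes = {"a": (True, 30.0), "b": (True, 10.0), "c": (False, 20.0)}
signal = aggregate_prediction(stakes)
balances = settle(stakes, went_up=True)
```

Because wrong predictions cost tokens, the stake-weighted `signal` reflects how much participants are willing to risk on each direction rather than a simple head count.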
Other Data Marketplace Projects
- Grass: A decentralized web scraping network where users earn tokens by letting the network route requests through their browser. The scraped data feeds AI training. Raised $3.5M and has millions of active nodes.
- Masa: Focuses on personal data. Users earn MASA tokens for contributing browsing data, social media data, and other personal information for AI training. Privacy-preserving via ZK proofs.
- Vana: Data DAO platform where users collectively own and monetize their data. r/datadao on Reddit was their first experiment.
- Numerai: Data scientists compete to build the best financial prediction models using encrypted data. The tournament model has been running since 2017 and paid out millions.
Prediction Markets and ML
Prediction markets are one of the most natural applications of on-chain ML. Markets like Polymarket and Augur let people bet on outcomes. ML models can:
- Provide initial probability estimates for new markets
- Help resolve disputes about market outcomes
- Detect manipulation and wash trading
- Aggregate information from multiple sources to improve market accuracy
Allora Network is specifically built for this, running decentralized ML inference to provide prediction feeds that smart contracts consume. Think of it as a prediction-focused AI oracle.
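One way to picture models competing on accuracy and being aggregated into a single feed is to weight each model by its track record. The sketch below uses the Brier score (mean squared error of probability forecasts) and inverse-score weighting. Both choices, and all names and numbers, are assumptions for illustration rather than Allora's actual algorithm.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error of probability forecasts; lower is better."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def combine(current: dict[str, float],
            history: dict[str, list[float]],
            outcomes: list[int]) -> float:
    """Blend current forecasts, weighting models by past accuracy."""
    # Small epsilon avoids division by zero for a perfect track record.
    weights = {name: 1.0 / (brier_score(history[name], outcomes) + 1e-9)
               for name in current}
    total = sum(weights.values())
    return sum(weights[name] * current[name] for name in current) / total


# Past forecasts for two resolved markets (outcomes: yes, then no).
history = {"good": [0.9, 0.1], "bad": [0.5, 0.5]}
outcomes = [1, 0]

# Current forecasts for a new market.
current = {"good": 0.8, "bad": 0.2}
blended = combine(current, history, outcomes)
```

The blended estimate lands close to the forecast from the model with the better record, while the weaker model still contributes a little.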
What's Realistic Today vs. What's Hype
| Application | Status | Notes |
|---|---|---|
| zkML for small models | Working | Classifiers, regression models work now |
| zkML for large models | 2-3 years out | Needs ZK hardware acceleration |
| AI oracles | Early production | Working but limited adoption |
| Data marketplaces | Working, low volume | Ocean has product-market fit for niche use cases |
| On-chain LLM inference | Not feasible | Way too expensive for blockchain execution |
| ML prediction markets | Growing | Allora and Ocean Predictoor are live |
Frequently Asked Questions
What is on-chain machine learning?
It means either running ML inference directly on a blockchain or using cryptographic proofs to verify ML computations done off-chain. It includes zkML, AI oracles, and on-chain prediction models. The goal is trustless verification of AI outputs.
What is an AI oracle?
A blockchain oracle that uses ML models to provide predictions or insights to smart contracts. Instead of just relaying data, AI oracles generate new information by running models on inputs and posting verified results on-chain.
What is Ocean Protocol used for?
Ocean Protocol is a decentralized data marketplace. Data providers publish datasets, AI developers buy access, and the compute-to-data feature lets models train on data without the data leaving the provider's server. It joined the ASI Alliance with Fetch.ai and SingularityNET in 2024, though the Ocean Protocol Foundation announced its withdrawal from the alliance in October 2025.