Perturb
Decentralized Adversarial Robustness Network — built on Bittensor.
Abstract
We propose a decentralized adversarial robustness network built on Bittensor. Miners compete to find adversarial examples — imperceptible input perturbations that cause state-of-the-art image classifiers to fail. Validators construct and verify challenges using a real AI model and an LLM-backed semantic checker, then score responses based on perturbation minimality and response speed. On-chain weights are assigned periodically to miners who accumulate a sufficient history of verified results.
The result is the world's first financially incentivized, continuously improving adversarial testing infrastructure. The network produces two commercially valuable outputs — an adversarial training dataset and on-chain model robustness certificates — addressing a $1.43B market growing at 26.1% annually.
Executive Summary
Every AI model deployed in production carries a hidden vulnerability: adversarial examples. An imperceptible change to a single input can cause a medical imaging model to misclassify a tumor, an autonomous vehicle to ignore a stop sign, or a fraud detection system to approve a fraudulent transaction. The tooling to systematically discover these vulnerabilities before deployment is fragmented, expensive, and, critically, static: it does not improve over time.
Perturb changes this. Built on Bittensor, Perturb is a decentralized adversarial robustness network where miners compete to find adversarial examples — inputs that fool AI models while remaining imperceptible to humans. Every challenge is verified by an LLM-backed semantic check. Every attack is scored with mathematical precision. The network gets stronger every single day.
The result is the world's first financially incentivized, continuously improving adversarial testing infrastructure — something no centralized service can replicate.
The Problem We Solve
AI Models Are Brittle
Modern AI models achieve remarkable accuracy on clean test sets, yet remain highly vulnerable to adversarial perturbations: mathematically crafted modifications to input data that are imperceptible to humans but cause confident misclassification. A standard ResNet-50, which reaches roughly 76% top-1 (about 93% top-5) accuracy on ImageNet, can be fooled by changing fewer than 0.3% of an image's pixels. An EfficientNet-B5 scoring 83.6% top-1 accuracy can misclassify a tabby cat as a fire truck under a perturbation invisible to a human observer.
This is not a theoretical concern. Adversarial attacks have been demonstrated against:
Medical imaging classifiers used for cancer detection and diagnostic triage
Facial recognition systems controlling physical access to secure facilities
Autonomous vehicle perception systems for object and sign recognition
Content moderation models protecting platforms from harmful material
Financial fraud detection systems processing millions of transactions daily
The Market Opportunity
Three powerful forces are converging to create an urgent, large, and underserved market for adversarial robustness testing:
Regulatory Mandates: The EU AI Act classifies AI systems in healthcare, autonomous vehicles, critical infrastructure, and hiring as high-risk, requiring conformity assessments that cover accuracy, robustness, and cybersecurity before deployment. Penalties reach €35 million or 7% of worldwide annual turnover for the most serious violations, and €15 million or 3% for breaches of most other obligations, including those on high-risk systems.
Enterprise Procurement Requirements: Large enterprises increasingly require AI vendors to demonstrate robustness certifications before purchase. A verified robustness certificate is becoming table stakes for selling AI products into regulated industries.
AI Proliferation: As AI moves from research to production across every industry, the attack surface grows exponentially. Every new model deployment is a new vulnerability. Organizations deploying AI without systematic adversarial testing are accepting unknown, unquantified risk.
Why Existing Solutions Fall Short
| Solution | Type | Critical Limitation |
|---|---|---|
| IBM ART / Foolbox / CleverHans | Open-source libraries | Requires deep ML expertise. No managed service. Never improves. |
| Manual red teaming firms | Human service | $50K–$200K per engagement. Weeks to complete. Cannot scale. |
| Internal security teams | In-house | Most organizations lack adversarial ML expertise. |
| Perturb | Decentralized network | Self-improving. Competitive. Scalable. LLM-verified. On-chain certificates. |
No existing solution provides a competitive, financially incentivized, continuously improving approach to adversarial testing. Perturb is the first.
Introducing Perturb
What Perturb Does
Perturb is a Bittensor subnet that incentivizes a global network of miners to find adversarial examples — images that fool AI classifiers while remaining visually indistinguishable from the original. The network produces two commercially valuable outputs:
Adversarial Training Dataset
A continuously growing, LLM-verified dataset of adversarial examples. Sold via subscription to AI teams doing adversarial training — the most effective known defense. Gets better every day as miners improve.
Short-Term Revenue
Model Robustness Certificates
On-demand adversarial evaluation reports with on-chain cryptographic proof of testing. Essential for EU AI Act compliance and enterprise AI procurement. Sold as a tiered subscription service.
Long-Term Revenue
Why Bittensor
Competitive Improvement: TAO emissions reward the best-performing miners. Applied to adversarial attacks, this creates the first financially incentivized adversarial research network. Miners earn real money for finding better attacks — driving continuous improvement no salaried team can match.
Verification Asymmetry: Finding an adversarial example is computationally hard. Verifying one is cheap: run the model, check the output with an LLM, measure the perturbation norm. This asymmetry makes Perturb's incentive mechanism clean, objective, and manipulation-resistant.
Decentralized Trust: On-chain records of adversarial evaluations create cryptographically verifiable proof of robustness testing — more credible than any centralized company's self-reported certificate, directly relevant to regulatory bodies seeking auditable compliance records.
Technical Architecture
System Overview
Perturb operates on a challenge-response loop between validators and miners. Validators construct verified challenges, distribute them to a randomly selected pool of miners, and score responses using LLM semantic verification, perturbation minimality, and response speed.
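The loop described above can be sketched roughly as follows. This is a minimal illustration, not the subnet's validator: `run_round`, `send_challenge`, and `score_response` are hypothetical names, the miner transport is replaced by plain callables, and scoring a missing reply as zero is an assumption.

```python
import random
from typing import Callable, Dict, List, Optional


def run_round(
    miner_uids: List[int],
    k: int,
    challenge: dict,
    send_challenge: Callable[[int, dict], Optional[dict]],
    score_response: Callable[[dict, dict], float],
    rng: Optional[random.Random] = None,
) -> Dict[int, float]:
    """Send one challenge to k randomly chosen miners and score the replies."""
    rng = rng or random.Random()
    # Sample without replacement; never ask for more miners than exist.
    selected = rng.sample(miner_uids, min(k, len(miner_uids)))
    scores: Dict[int, float] = {}
    for uid in selected:
        response = send_challenge(uid, challenge)
        # Assumption: a missing reply (timeout, network error) scores zero.
        scores[uid] = 0.0 if response is None else score_response(challenge, response)
    return scores
```

Passing the transport and the scorer in as callables keeps the selection logic testable without a running network.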
Validator: Challenge Pipeline
The validator constructs each challenge through a verified pipeline ensuring every challenge sent to miners is semantically clean and unambiguous:
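validator / build_challenge.py

    import random

    # LABEL_CONSTANTS, IMAGENET_CLASSES, image_hosting_api, efficientnet_b5,
    # preprocess, llm_verify_label_match, and Challenge come from the validator codebase.

    def build_challenge() -> Challenge:
        # Pick a target label, then fetch images until the model's clean
        # prediction semantically matches that label.
        label = random.choice(LABEL_CONSTANTS)
        image = image_hosting_api.fetch(mode="random", label=label)
        while True:
            raw_output = efficientnet_b5(preprocess(image))
            predicted = raw_output.argmax().item()
            label_str = IMAGENET_CLASSES[predicted]
            if llm_verify_label_match(label, label_str):
                break
            # Clean prediction disagrees with the label: try another image.
            image = image_hosting_api.fetch(mode="random", label=label)
        return Challenge(
            model="efficientnet_b5",
            image=image,
            true_label=label,
            true_label_str=label_str,
        )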
Validator: LLM Verification
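Perturb uses a lightweight LLM for semantic verification at two points: during challenge construction and during response scoring. This handles closely related classes, such as tabby cat vs Egyptian cat, that integer class-ID comparison treats as unrelated. An exact-ID check would reject an otherwise clean challenge image, and would count a flip between two such classes as a successful attack even though the prediction still matches the true label semantically.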
| Parameter | Value | Description |
|---|---|---|
| PERTURB_LLM_ENDPOINT_MODEL | Qwen2.5-1.5B-Instruct | Model name sent in validator challenge payload |
| Ollama default model | qwen2.5:1.5b-instruct | Actual model name for local Ollama deployment |
| Verification task | Semantic label matching | Does predicted output semantically match true label? |
| Used at | Challenge build + Response score | Both stages share the same LLM endpoint |
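One way the semantic check could be structured is sketched below. The prompt wording, the `build_prompt` helper, and the extra `ask_llm` argument are assumptions made for illustration; `ask_llm` stands in for whatever client talks to the configured LLM endpoint, so the sketch makes no claim about the validator's real prompt or transport.

```python
from typing import Callable


def build_prompt(true_label: str, predicted_label: str) -> str:
    """Ask a yes/no question about whether two class names mean the same thing."""
    return (
        "Do these two image class names refer to the same kind of object? "
        "Answer with a single word, yes or no.\n"
        f"Label A: {true_label}\n"
        f"Label B: {predicted_label}\n"
    )


def llm_verify_label_match(
    true_label: str,
    predicted_label: str,
    ask_llm: Callable[[str], str],
) -> bool:
    """Return True when the LLM judges the labels to be a semantic match."""
    # Identical strings need no LLM call.
    if true_label.strip().lower() == predicted_label.strip().lower():
        return True
    answer = ask_llm(build_prompt(true_label, predicted_label))
    # Treat anything other than a leading "yes" as a non-match.
    return answer.strip().lower().startswith("yes")
```

Defaulting to "no match" on an unclear answer is a conservative choice during challenge construction, since it only causes another image to be fetched.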
Challenge Format
Complete payload sent from validator to all K selected miners:
challenge payload (JSON)

    {
      "task_id": "string",
      "model": "efficientnet_b5",
      "llm_model_hint": "Qwen2.5-1.5B-Instruct",
      "image": "base64_encoded_RGB_image",
      "true_label": "string // e.g. 'tabby_cat'",
      "constraints": {
        "norm": "Linf",
        "epsilon": 0.06,
        "min_delta": 0.002,
        "max_delta": 0.12,
        "pixel_range": [0.0, 1.0]
      },
      "scoring_weights": { "perturbation": 0.7, "speed": 0.3 },
      "timeout_ms": 60000
    }
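A miner might read the constraints out of this payload along the following lines. This is a sketch under assumptions: the `Constraints` class, `parse_constraints`, and the consistency check are illustrative and not part of the published protocol; the allowed norm window follows the thresholds used by the scoring gates.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Constraints:
    norm: str
    epsilon: float
    min_delta: float
    max_delta: float
    pixel_range: Tuple[float, float]

    def norm_window(self) -> Tuple[float, float]:
        """Allowed perturbation norms: [min_delta, min(epsilon, max_delta)]."""
        return self.min_delta, min(self.epsilon, self.max_delta)


def parse_constraints(payload_json: str) -> Constraints:
    """Extract and sanity-check the constraints object from a challenge payload."""
    raw = json.loads(payload_json)["constraints"]
    constraints = Constraints(
        norm=raw["norm"],
        epsilon=float(raw["epsilon"]),
        min_delta=float(raw["min_delta"]),
        max_delta=float(raw["max_delta"]),
        pixel_range=(float(raw["pixel_range"][0]), float(raw["pixel_range"][1])),
    )
    low, high = constraints.norm_window()
    # Assumption: only Linf is supported, and the window must be non-empty.
    if constraints.norm != "Linf" or not 0.0 <= low < high:
        raise ValueError("unsupported or inconsistent constraints")
    return constraints


example = json.dumps({
    "task_id": "demo",
    "constraints": {"norm": "Linf", "epsilon": 0.06, "min_delta": 0.002,
                    "max_delta": 0.12, "pixel_range": [0.0, 1.0]},
})
print(parse_constraints(example).norm_window())  # (0.002, 0.06)
```

With the launch values, `epsilon` is smaller than `max_delta`, so the effective upper bound on the perturbation norm is 0.06.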
Miner: Response
Miners return only the perturbed image. The attack method and its parameters stay proprietary and are the miner's competitive edge: the network evaluates results, not methods. Perturb provides a working default miner implementation so new participants can join immediately; sophisticated miners replace it with optimized strategies to compete for higher emission shares.

miner response (JSON)

    { "task_id": "string", "perturbed_image": "base64_encoded_RGB_image" }
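As a rough idea of what a baseline attack step looks like, the sketch below implements one signed-gradient update with projection back onto the Linf budget, which is the building block of FGSM and PGD. It is not the default miner shipped with Perturb. The function name `pgd_step` is hypothetical, images are flattened lists of floats, and the gradient of the model's loss with respect to the pixels is assumed to be supplied by the caller, for example from an autograd framework.

```python
from typing import List, Sequence


def pgd_step(
    original: Sequence[float],
    current: Sequence[float],
    gradient: Sequence[float],
    step: float,
    epsilon: float,
    pixel_min: float = 0.0,
    pixel_max: float = 1.0,
) -> List[float]:
    """One signed-gradient step, projected onto the Linf ball around `original`."""
    perturbed = []
    for x0, x, g in zip(original, current, gradient):
        sign = (g > 0) - (g < 0)                 # -1, 0, or +1
        moved = x + step * sign                  # move along the gradient sign
        # Keep the total change from the original within epsilon.
        moved = min(max(moved, x0 - epsilon), x0 + epsilon)
        # Keep the result a valid pixel value.
        perturbed.append(min(max(moved, pixel_min), pixel_max))
    return perturbed
```

Calling it once with `current` equal to `original` gives a single FGSM-style step; calling it repeatedly with a fresh gradient each time gives a PGD-style loop. Because the score rewards small norms, a miner would typically stop as soon as the prediction flips rather than spend the whole budget.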
Scoring: Per-Challenge
Each miner response passes through strict verification gates. Any hard-fail returns 0.0 immediately:
| Condition | Threshold | Result |
|---|---|---|
| Invalid image, wrong shape, or out-of-range pixels | Any violation | score = 0.0 |
| Perturbation norm below minimum | norm < 0.002 | score = 0.0 |
| Perturbation norm above maximum | norm > min(ε, 0.12) | score = 0.0 |
| Perturbed prediction still matches true_label semantically | LLM check | score = 0.0 |
| All checks pass | — | proceed to formula |
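The gates in the table could be applied in order as in the sketch below. The function names and the returned reason strings are hypothetical, images are treated as flattened lists of floats, and the semantic check is passed in as a precomputed boolean rather than calling an LLM.

```python
from typing import Optional, Sequence


def linf_norm(original: Sequence[float], perturbed: Sequence[float]) -> float:
    """Largest absolute per-pixel change between two flattened images."""
    return max(abs(p - o) for o, p in zip(original, perturbed))


def failed_gate(
    original: Sequence[float],
    perturbed: Sequence[float],
    still_matches_true_label: bool,
    epsilon: float = 0.06,
    min_delta: float = 0.002,
    max_delta: float = 0.12,
    pixel_range: tuple = (0.0, 1.0),
) -> Optional[str]:
    """Return the name of the first failed gate, or None if all gates pass."""
    low, high = pixel_range
    if len(perturbed) != len(original) or not perturbed:
        return "invalid_shape"
    # The negated range test also rejects NaN values.
    if any(not (low <= p <= high) for p in perturbed):
        return "pixel_out_of_range"
    norm = linf_norm(original, perturbed)
    if norm < min_delta:
        return "norm_below_minimum"
    if norm > min(epsilon, max_delta):
        return "norm_above_maximum"
    if still_matches_true_label:
        return "not_adversarial"
    return None
```

Running the cheap structural and norm checks first means the LLM call is only needed for responses that are otherwise valid.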
Scoring formula for responses that pass all verification gates:
    perturbation_ratio = norm / epsilon
    perturbation_score = 1.0 - min(perturbation_ratio, 1.0)
    speed_score = 1.0 - min(response_time_ms / timeout_ms, 1.0)
    score = 0.7 * perturbation_score + 0.3 * speed_score
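A worked instance of the formula may help. The function below restates it with the launch defaults from the challenge payload; the name `challenge_score` and the keyword defaults are chosen here for illustration.

```python
def challenge_score(
    norm: float,
    response_time_ms: float,
    epsilon: float = 0.06,
    timeout_ms: float = 60_000,
    w_perturbation: float = 0.7,
    w_speed: float = 0.3,
) -> float:
    """Score a response that has already passed every verification gate."""
    perturbation_score = 1.0 - min(norm / epsilon, 1.0)
    speed_score = 1.0 - min(response_time_ms / timeout_ms, 1.0)
    return w_perturbation * perturbation_score + w_speed * speed_score


# Half the budget used, answered in a quarter of the timeout:
# 0.7 * 0.5 + 0.3 * 0.75
print(round(challenge_score(norm=0.03, response_time_ms=15_000), 3))  # 0.575
```

Two consequences follow from the gate thresholds. A response that uses the full budget (norm equal to epsilon) earns only the speed term, and because the norm must be at least 0.002, the perturbation term cannot exceed 1 - 0.002/0.06, roughly 0.967.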
Scoring: On-Chain Weight Setting
On-chain weights blend each miner's historical average score with a rank-based emission allocation. Only miners with processed_count > 100 are eligible. The emission schedule is differentiated by rank — top miners receive a disproportionately larger share, with inverse-rank decay for all lower positions.
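    import numpy as np

    GAMMA = 0.7  # weight of historical average score vs. rank-based emission

    eligible = [uid for uid in miners if processed_count[uid] > 100]
    N = len(eligible)
    avg_raw = {uid: np.mean(last_100_scores[uid]) for uid in eligible}
    # Rank eligible miners by average score, best first: ranked[0] is rank 1
    ranked = sorted(eligible, key=avg_raw.get, reverse=True)

    emission = {}  # keyed by rank k (1-based)
    # Top 3: fixed emission percentages
    for k, share in zip(range(1, N + 1), (0.50, 0.30, 0.10)):
        emission[k] = share
    # Ranks 4-10: inverse-rank decay within 5% pool
    mid_total = sum(1 / j for j in range(4, 11))
    for k in range(4, min(N, 10) + 1):
        emission[k] = 0.05 * (1 / k) / mid_total
    # Ranks 11+: inverse-rank decay within remaining 5% pool
    tail_total = sum(1 / j for j in range(11, N + 1))
    for k in range(11, N + 1):
        emission[k] = 0.05 * (1 / k) / tail_total

    # normalize() is the subnet's own helper; it is not defined in this document
    scores = np.array([avg_raw[uid] for uid in ranked])
    shares = np.array([emission[k] for k in range(1, N + 1)])
    raw = GAMMA * normalize(scores) + (1 - GAMMA) * normalize(shares)
    weights = raw / raw.sum()  # weights[i] belongs to ranked[i]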
| Rank | Emission Share | Formula |
|---|---|---|
| 1st | 50% | Fixed — winner takes 50% of miner emission |
| 2nd | 30% | Fixed |
| 3rd | 10% | Fixed |
| 4th – 10th | 5% shared | emission(k) = 0.05 × (1/k) / Σj=4..10(1/j) — rank 4 earns more than rank 10 |
| 11th+ | 5% shared | emission(k) = 0.05 × (1/k) / Σj=11..N(1/j) — decay continues with network growth |
| Ineligible | 0% | processed_count ≤ 100 |
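The emission schedule in the table can be checked numerically with a short sketch. The function name `emission_schedule` is hypothetical, and the code covers only the rank-based shares, not the blend with average scores.

```python
from typing import List


def emission_schedule(n: int) -> List[float]:
    """Emission share for ranks 1..n under the schedule in the table above."""
    top = [0.50, 0.30, 0.10]
    mid_total = sum(1 / j for j in range(4, 11))
    tail_total = sum(1 / j for j in range(11, n + 1))
    shares = []
    for k in range(1, n + 1):
        if k <= 3:
            shares.append(top[k - 1])
        elif k <= 10:
            shares.append(0.05 * (1 / k) / mid_total)
        else:
            shares.append(0.05 * (1 / k) / tail_total)
    return shares


shares = emission_schedule(20)
print(round(sum(shares), 6))   # 1.0
print(round(shares[3], 4))     # rank 4: 0.0114
print(round(shares[9], 4))     # rank 10: 0.0046
```

Two properties are worth noting. With ten or fewer eligible miners the tail pool is unassigned, so the shares sum to at most 0.95 before the final renormalization. With only a few miners beyond rank 10, the 5% tail pool is split among very few ranks, so rank 11 can receive a larger share than rank 10; for example, with 12 eligible miners rank 11 gets about 2.6% against about 0.46% for rank 10. The schedule only becomes monotonic once the tail is long enough.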
Phase 1 Model: EfficientNet-B5
Perturb launches with EfficientNet-B5 as the sole target model — a deliberate choice prioritising network stability, miner onboarding, and validation credibility over premature complexity.
| Attribute | Value |
|---|---|
| Model | EfficientNet-B5 |
| Parameters | 30.4M |
| ImageNet Top-1 | 83.6% |
| timm name | efficientnet_b5 |
| Input size | 224 × 224 × 3 (RGB); the published 83.6% top-1 figure was measured at the model's native 456 × 456 resolution |
| Normalization mean | [0.485, 0.456, 0.406] |
| Normalization std | [0.229, 0.224, 0.225] |
| Output | [batch, 1000] logits |
| GPU requirement | RTX 3080 minimum for PGD-40 attacks |
EfficientNet-B5 sits at the ideal intersection of attack difficulty, hardware accessibility, and research credibility. It is challenging enough that basic FGSM attacks perform poorly — requiring miners to implement stronger methods — but not so large that participation requires data center hardware.
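The normalization constants in the table matter for attackers because they rescale the perturbation budget. The sketch below assumes the Linf constraint is measured in pixel space, on values in [0, 1] before normalization; the helper names are hypothetical.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map one RGB pixel in [0, 1] to the model's normalized input space."""
    return tuple((value - m) / s for value, m, s in zip(rgb, MEAN, STD))


def epsilon_in_normalized_space(epsilon):
    """Per-channel size of a pixel-space Linf budget after normalization."""
    return tuple(epsilon / s for s in STD)


print([round(e, 3) for e in epsilon_in_normalized_space(0.06)])  # [0.262, 0.268, 0.267]
```

An attack that works directly on normalized tensors would need to convert back to pixel space before measuring its norm, otherwise it may overshoot or underuse the budget.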
Expansion Roadmap
| Phase | Models Added | New Capabilities |
|---|---|---|
| Phase 1 — Launch | EfficientNet-B5 | Image classification, Linf norm, LLM verification, full scoring pipeline |
| Phase 2 — Month 2–3 | + ConvNeXt-Small, ViT-Small, Swin-Tiny | Architecture diversity: CNN vs Transformer vs Hybrid |
| Phase 3 — Month 4–6 | + EfficientNetV2-M, NFNet-F0, ResNeXt-101 | Mid-range GPU tier, stronger attack difficulty |
| Phase 4 — Month 6+ | + LLM text classification models | NLP attacks: word substitution, prompt injection |
| Phase 5 — Year 2 | + Vision models >1B params | Extreme tier: CLIP ViT-G, EVA-Giant, InternViT-6B |
Revenue Model
Adversarial Training Dataset — Short-Term Revenue
Adversarial training — retraining models on adversarial examples — is the most effective known defense against adversarial attacks. Perturb generates this data continuously as a byproduct of its core operation, creating a dataset that improves every single day. Each entry includes: original image, adversarial image, model name, true label, constraint parameters, perturbation norm, LLM verification status, attack score, and timestamp.
| Tier | Volume | Frequency | Target Customer |
|---|---|---|---|
| Research | 100K examples/month | Weekly | Academic labs, AI safety organizations |
| Professional | 1M examples/month | Daily | AI startups, ML engineering teams |
| Enterprise | Unlimited | Real-time | Large enterprises, frontier AI labs |
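A dataset entry with the fields listed above could be represented as in the sketch below. The class name, field names, and the JSON Lines serialization are assumptions for illustration; the published dataset schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetEntry:
    original_image: str       # base64, as in the challenge payload
    adversarial_image: str    # base64, as in the miner response
    model: str
    true_label: str
    constraints: dict
    perturbation_norm: float
    llm_verified: bool
    attack_score: float
    timestamp: str            # ISO 8601, an assumed convention

    def to_json_line(self) -> str:
        """Serialize the entry as one line of JSON with stable key order."""
        return json.dumps(asdict(self), sort_keys=True)


entry = DatasetEntry(
    original_image="<base64>",
    adversarial_image="<base64>",
    model="efficientnet_b5",
    true_label="tabby_cat",
    constraints={"norm": "Linf", "epsilon": 0.06},
    perturbation_norm=0.03,
    llm_verified=True,
    attack_score=0.575,
    timestamp="2026-01-01T00:00:00Z",
)
print(entry.to_json_line())
```

One record per line keeps the dataset easy to stream and append to, which suits the daily and real-time delivery tiers.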
Model Robustness Testing Service — Long-Term Revenue
Organizations submit a model for evaluation and receive a comprehensive robustness report generated by directing the full miner network at the target model. Each report includes:
Overall robustness score (0.0–1.0) benchmarked against industry standards
Attack success rate across epsilon budgets and image categories
Worst-case adversarial examples, visualized and downloadable
LLM-verified semantic failure analysis — not just pixel statistics
On-chain cryptographic certificate of evaluation — immutable and auditable
Comparison against published AutoAttack benchmarks for the same architecture
| Tier | Models | Evaluation Depth | Frequency |
|---|---|---|---|
| Starter | 1 model | Standard suite | Monthly |
| Growth | 5 models | Extended suite | Weekly |
| Enterprise | Unlimited | Full suite + custom scope | Continuous |
Go-To-Market Strategy
Target Customers
AI Startups Selling to Enterprise
Enterprise procurement teams now require adversarial robustness certifications before purchase. A Perturb certificate can unblock deals worth orders of magnitude more than the subscription cost.
Regulated Industry Deployments
EU AI Act compliance requires conformity assessments for high-risk AI systems. Perturb provides on-chain proof of robustness testing — immutable, auditable, and defensible to regulators.
AI Research Labs
Standardized, reproducible, LLM-verified robustness benchmarks citable in academic publications. The public leaderboard becomes a recognized reference benchmark in the adversarial ML research community.
AI Safety Organizations
Organizations working on AI safety need large, diverse, high-quality adversarial example datasets for research into robustness and defenses. The dataset subscription provides this at a fraction of the cost of generating equivalent data in-house.
Phased Launch Plan
Competitive Moat
Self-Improving Attack Quality: Every day miners compete, the network gets better. The dataset becomes more valuable. The certificates become more credible. No centralized service has this compounding property.
LLM-Verified Semantic Precision: Unlike systems comparing integer class IDs, Perturb uses LLM semantic verification ensuring attacks are genuinely meaningful — producing higher-quality data and more credible certificates.
On-Chain Immutability: Robustness certificates on Bittensor cannot be altered, backdated, or selectively disclosed — categorically different from any vendor's self-reported compliance documentation.
Architecture Diversity: As Perturb adds target models, miners who specialize build irreplaceable expertise. A miner optimizing attacks for six months outperforms any general-purpose tool.
Market Analysis
| Metric | Value | Notes |
|---|---|---|
| AI Red Teaming Market (2024) | $1.43 billion | Industry research, 2024 |
| AI Red Teaming Market (2033) | $11.61 billion | Projected at 26.1% CAGR |
| CAGR (2025–2033) | 26.1% | Driven by regulatory mandates |
| EU AI Act max fine | €35M or 7% of worldwide turnover | Top penalty tier; most other breaches are capped at €15M or 3% |
| Fastest growing segment | Adversarial attack simulation | Perturb's exact category |
| Key verticals | Healthcare, BFSI, Government, Automotive | Primary enterprise targets |
| Bittensor active subnets | 128 (expanding to 256) | As of April 2026 |
| Competing subnets in this space | 0 | No existing subnet covers adversarial ML |
Perturb enters a $1.43B market growing at 26.1% annually, with zero competing Bittensor subnets in this category and increasing regulatory tailwinds globally.
Conclusion
AI models are increasingly embedded in decisions that affect human safety, financial stability, and personal rights. Yet the vast majority of these models are deployed without systematic adversarial robustness testing — not because organizations don't care, but because the tooling to do so at scale simply doesn't exist.
Perturb changes this. By applying Bittensor's competitive incentive mechanism to adversarial example generation — and adding LLM-backed semantic verification to ensure quality — Perturb creates a network that gets measurably better every day. Miners are financially motivated to become the world's best adversarial attack researchers. Validators verify results with mathematical precision. The network produces two commercially valuable outputs that address a real, growing, regulatory-driven market.
The validation mechanism rests on checks that are cheap to run and hard to game: a model forward pass, a norm measurement, and a semantic label check. The market is large and accelerating. The Bittensor architecture provides an advantage that is difficult for a centralized competitor to replicate. And with LLM-verified semantic scoring as a differentiator, Perturb aims to produce higher-quality adversarial data than existing tools.
For technical documentation, validator setup, and miner onboarding, visit perturbai.io or the official GitHub repository.