What is Data Availability?
Data availability (DA) is the guarantee that the full set of data behind a block or batch—including all transaction details needed to reconstruct the state—is accessible to anyone who wants to verify it. In the context of rollups, data availability is the most critical security property: if the transaction data for a rollup batch is not available, nobody can verify whether the state transition claimed by the rollup operator is correct, and users cannot force-exit their funds.
The concept is often illustrated by the “data availability problem”: how can a light client (or any participant who does not download the entire block) be sure that all the data in a block is actually available, without downloading the entire block? If a block producer publishes only a block header with a state root but makes the underlying transaction data unavailable, the network cannot verify the state transition. This is not just theoretical—it was the fundamental weakness of Plasma, the pre-rollup scaling solution, where unavailable data could trap user funds permanently.
Data availability has become one of the most active areas of research and development in Web3. The emergence of purpose-built DA layers (Celestia, EigenDA, Avail) and Ethereum’s own DA scaling efforts (EIP-4844, PeerDAS, full danksharding) reflect the industry’s recognition that DA is the bottleneck layer in the modular stack. As rollups handle more and more execution, the demand for cheap, available block space grows proportionally—and the DA layer must scale to meet that demand.
How It Works
The Data Availability Problem
The core challenge is this: a block producer could publish a valid block header (including a Merkle root that commits to the transactions) but withhold the actual transaction data. Other nodes cannot verify the transactions without the data, and light clients cannot even detect that the data is missing (they can’t distinguish “data unavailable” from “they just haven’t received it yet”).
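To make the withholding scenario concrete, here is a minimal sketch of a header that commits to transactions through a Merkle root. The hashing scheme (SHA-256 with `0x00`/`0x01` domain prefixes, duplicating the last node on odd levels) and the `header` layout are illustrative assumptions, not the format of any particular chain.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Binary Merkle root; duplicates the last node on odd-sized levels."""
    if not leaves:
        return sha256(b"")
    level = [sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical block: three transactions and a header that commits to them.
transactions = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
header = {"height": 1, "tx_root": merkle_root(transactions)}

# A full node that holds the data can check it against the header.
assert merkle_root(transactions) == header["tx_root"]

# A light client that only holds `header` sees a 32-byte commitment.
# The root is identical whether or not `transactions` is ever released,
# so the header alone says nothing about availability.
print(header["tx_root"].hex())
```

The commitment binds the producer to one specific data set, but it cannot show that anyone else can actually fetch that data, which is the gap that erasure coding and sampling are meant to close.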
Solutions to the DA problem use erasure coding combined with data availability sampling (DAS):
- Erasure coding: Before publishing, the block producer extends the original data (2×) using Reed-Solomon encoding. If the original block is N chunks, the erasure-coded version is 2N chunks, and any N of the 2N chunks are sufficient to reconstruct the full data. Up to 50% data loss is therefore tolerable.
- Data Availability Sampling (DAS): Light clients randomly sample a small number of chunks from the erasure-coded data (e.g., 20–50 chunks out of potentially 16,384). If any sampled chunk is unavailable, the client rejects the block as unavailable. To prevent reconstruction, a producer must withhold more than half of the chunks, so each random sample then succeeds with probability below 50%. Thirty successful samples in a row would happen with probability below 2^-30 (about one in a billion), so sampling just 30 random chunks gives far more than 99.999% confidence that at least half of the data is available (and therefore the full block can be reconstructed).
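The "any N of 2N chunks" property can be sketched with a toy Reed-Solomon code built from polynomial interpolation. This is a simplified illustration under stated assumptions: the prime modulus `P`, the use of integer chunks, and the helper names `extend` and `reconstruct` are all hypothetical choices, and production systems use different fields and much faster algorithms.

```python
P = 2**31 - 1  # a prime modulus chosen for illustration only


def lagrange_eval(points: list[tuple[int, int]], x: int) -> int:
    """Evaluate at x the unique polynomial of degree < len(points) through `points` (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def extend(data: list[int]) -> list[int]:
    """Treat data[i] as the polynomial's value at x=i and append values at x=n..2n-1."""
    n = len(data)
    points = list(enumerate(data))
    return data + [lagrange_eval(points, x) for x in range(n, 2 * n)]


def reconstruct(known: dict[int, int], n: int) -> list[int]:
    """Recover the n original chunks from any n surviving (index, value) pairs."""
    if len(known) < n:
        raise ValueError("need at least n chunks to reconstruct")
    points = list(known.items())[:n]
    return [lagrange_eval(points, x) for x in range(n)]


data = [3, 1, 4, 1]                      # N = 4 original chunks (each < P)
coded = extend(data)                     # 2N = 8 chunks
survivors = {i: coded[i] for i in (1, 4, 6, 7)}  # half of the chunks are lost
assert reconstruct(survivors, len(data)) == data
```

Because N points determine a polynomial of degree below N, it does not matter which N chunks survive; losing three of the four original chunks, as in the usage above, is still recoverable.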
This combination allows light clients to verify data availability with minimal bandwidth, enabling massive block sizes without requiring every participant to download full blocks.
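The sampling confidence figures can be checked with a few lines of arithmetic. The sketch below assumes the 1D, 2× extension described above, where a producer must withhold more than half of the chunks to block reconstruction; the function names are illustrative.

```python
import math


def fooled_probability(samples: int) -> float:
    """Upper bound on the chance that every sample succeeds although
    more than half of the extended chunks are being withheld."""
    return 0.5 ** samples


def exact_fooled_probability(total_chunks: int, samples: int) -> float:
    """Same quantity for sampling without replacement, when the producer
    withholds the minimum needed: half of the chunks plus one."""
    available = total_chunks // 2 - 1
    return math.comb(available, samples) / math.comb(total_chunks, samples)


def samples_needed(confidence: float) -> int:
    """Smallest number of samples whose bound reaches the target confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(0.5))


print(fooled_probability(30))               # about 9.3e-10
print(exact_fooled_probability(16_384, 30)) # slightly smaller than the bound
print(samples_needed(0.99999))
```

Note that the bound does not depend on the block size: the number of samples needed for a given confidence stays constant as blocks grow, which is why sampling scales so well.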
EIP-4844: Proto-Danksharding
EIP-4844 (implemented as part of the Dencun upgrade on March 13, 2024) introduced “blob-carrying transactions” to Ethereum. This was the first major DA-focused upgrade. Key specifications:
- Each block can carry up to 6 blobs at launch (a target of 3 with a maximum of 6; both limits were raised in later upgrades)
- Each blob is 128 KB of data (4,096 field elements × 32 bytes each)
- Blobs are separate from calldata—they are not accessible to the EVM and are not stored in the state trie
- Blobs are pruned from full nodes after approximately 18 days (4,096 epochs, or 131,072 slots)
- Blob gas uses a separate fee market from regular gas, with its own base fee that adjusts based on blob demand
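- Blob count per block: target 3 and max 6 at launch (384 KB and 768 KB); the Pectra upgrade (EIP-7691, May 2025) raised this to target 6 and max 9 (768 KB and 1,152 KB), and PeerDAS is intended to allow further increases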
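The blob size and the separate fee market can be illustrated with the arithmetic from the EIP-4844 specification. The constants below are the launch values from that specification; the helper names `next_excess_blob_gas` and `blob_base_fee` and the simulation loop are a simplified sketch rather than client code.

```python
# Launch-time constants from the EIP-4844 specification.
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32
GAS_PER_BLOB = 2**17
TARGET_BLOB_GAS_PER_BLOCK = 3 * GAS_PER_BLOB
MIN_BASE_FEE_PER_BLOB_GAS = 1  # wei
BLOB_BASE_FEE_UPDATE_FRACTION = 3338477

BLOB_BYTES = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT  # 131072 bytes = 128 KiB


def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator)."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator


def next_excess_blob_gas(parent_excess: int, parent_blob_gas_used: int) -> int:
    """Excess accumulates whenever a block uses more than the target."""
    return max(0, parent_excess + parent_blob_gas_used - TARGET_BLOB_GAS_PER_BLOCK)


def blob_base_fee(excess_blob_gas: int) -> int:
    """Blob base fee in wei per unit of blob gas."""
    return fake_exponential(
        MIN_BASE_FEE_PER_BLOB_GAS, excess_blob_gas, BLOB_BASE_FEE_UPDATE_FRACTION
    )


# Simulate 100 consecutive full blocks (6 blobs each), starting with no excess.
excess = 0
for _ in range(100):
    excess = next_excess_blob_gas(excess, 6 * GAS_PER_BLOB)
print(BLOB_BYTES, excess, blob_base_fee(excess))
```

Blocks at the target leave the fee unchanged, while sustained demand above the target raises it exponentially, independently of the regular gas base fee.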
The impact was dramatic. Before EIP-4844, posting 1 MB of rollup data to Ethereum L1 cost ~$50–200 in calldata gas fees. Afterward, posting the same amount of data as blobs cost ~$1–5, a 50–100× reduction. This translated directly into L2 fee reductions: average Arbitrum transactions dropped from ~$0.10–0.50 to ~$0.001–0.01.
Data Availability Layers
Several purpose-built DA layers have emerged to provide alternatives to Ethereum’s native blob space:
Celestia (mainnet October 2023) was the first standalone DA layer. It uses Tendermint consensus with DAS to provide cheap, scalable data availability. Its key innovations include:
- Namespaced Merkle Trees (NMTs): Allow multiple rollups to share block space with namespace isolation
- DAS light nodes: Anyone can run a DAS light node on a laptop (2 GB RAM, 500 MB storage) to verify Celestia’s DA
- Square-based block layout: Data is arranged in a 2D “square” (e.g., 128×128) and erasure-coded along both dimensions, enabling efficient 2D sampling
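The namespace isolation idea behind NMTs can be sketched as a Merkle tree whose nodes also carry the minimum and maximum namespace found beneath them. This is a simplified illustration: the 8-byte integer namespaces, the hash layout, and the helper names are assumptions made for the example and do not match Celestia's actual encoding.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    min_ns: int    # smallest namespace in this subtree
    max_ns: int    # largest namespace in this subtree
    digest: bytes


def _encode(node: Node) -> bytes:
    return node.min_ns.to_bytes(8, "big") + node.max_ns.to_bytes(8, "big") + node.digest


def leaf(ns: int, data: bytes) -> Node:
    digest = hashlib.sha256(b"\x00" + ns.to_bytes(8, "big") + data).digest()
    return Node(ns, ns, digest)


def parent(left: Node, right: Node) -> Node:
    if left.max_ns > right.min_ns:
        raise ValueError("leaves must be sorted by namespace")
    digest = hashlib.sha256(b"\x01" + _encode(left) + _encode(right)).digest()
    return Node(left.min_ns, right.max_ns, digest)


def build(leaves: list[Node]) -> Node:
    """Build the tree bottom-up; an unpaired node is carried up unchanged."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = leaves
    while len(level) > 1:
        nxt = [parent(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Hypothetical shares from three rollups, each identified by a namespace ID.
shares = [(2, b"rollup-b batch 7"), (1, b"rollup-a batch 0"),
          (5, b"rollup-c batch 3"), (1, b"rollup-a batch 1")]
leaves = [leaf(ns, data) for ns, data in sorted(shares)]
root = build(leaves)
left_subtree = parent(leaves[0], leaves[1])
print((root.min_ns, root.max_ns), (left_subtree.min_ns, left_subtree.max_ns))
```

Because every node advertises its namespace range, a rollup reading namespace 2 can skip the left subtree (range 1 to 1) entirely and still verify that it received all of its own shares.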
EigenDA (by EigenLayer) uses restaked ETH to provide DA services without running a separate blockchain. Operators backed by stake that already secures Ethereum can opt in to verify and store data for rollups. This approach draws on Ethereum’s massive validator set (~900,000) for DA security without requiring new token incentives, although only operators who opt in actually serve data.
Avail (launched mainnet July 2024) uses a Kademlia-based DAS network and a “data availability attestation” mechanism. It supports up to 1 MB per block in its initial launch with plans to scale significantly. Avail was originally part of Polygon (Polygon Avail) before spinning out as an independent project.
DA Cost Comparison (as of early 2025)
| DA Layer | Cost per MB | Minimum Retention | DAS Support | Block Size |
|---|---|---|---|---|
| Ethereum blobs | $0.50–5.00 | 18 days | No (full node download) | ~0.75 MB per block (6 blobs max) |
| Celestia | $0.01–0.10 | Permanent (pruning optional) | Yes | ~2 MB per block |
| EigenDA | $0.001–0.01 | Configurable | Planned | Scalable |
| Avail | $0.01–0.05 | Permanent | Yes | ~1 MB per block |
| Ethereum calldata | $50–200 | Permanent | N/A | ~120 KB per block |
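
As a worked example of what these per-MB ranges mean in practice, the sketch below multiplies them by a daily data volume. The price ranges are copied from the table above; the 500 MB/day volume is a hypothetical figure chosen only for illustration.

```python
# (low, high) cost per MB in USD, taken from the comparison table.
COST_PER_MB_USD = {
    "Ethereum blobs": (0.50, 5.00),
    "Celestia": (0.01, 0.10),
    "EigenDA": (0.001, 0.01),
    "Avail": (0.01, 0.05),
    "Ethereum calldata": (50.0, 200.0),
}


def daily_cost_range(layer: str, mb_per_day: float) -> tuple[float, float]:
    """Return the (low, high) daily DA cost in USD for a given data volume."""
    low, high = COST_PER_MB_USD[layer]
    return low * mb_per_day, high * mb_per_day


MB_PER_DAY = 500  # hypothetical rollup posting 500 MB of batch data per day
for layer in COST_PER_MB_USD:
    low, high = daily_cost_range(layer, MB_PER_DAY)
    print(f"{layer:18s} ${low:>10,.2f} to ${high:>10,.2f} per day")
```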
Real-World Examples
Arbitrum One migrated from calldata to blobs immediately after EIP-4844, reducing its L1 posting costs by 90%+. Before the upgrade, Arbitrum’s daily L1 posting cost was ~$100,000–300,000. After, it dropped to ~$10,000–30,000. These savings were passed on to users through lower L2 gas fees.
Base (Coinbase’s L2) also adopted EIP-4844 blobs immediately. The cost reduction enabled several high-volume social applications (like Friend.tech) to operate economically. Base processes over 1 million transactions per day, with blob DA costs averaging ~$5,000–15,000 per day.
Eclipse is a Solana Virtual Machine (SVM) rollup that uses Celestia as its DA layer. This combination provides Solana’s execution speed and parallelism with Celestia’s cheap DA. Eclipse’s architecture demonstrates that rollups can mix and match execution environments (SVM) with DA layers (Celestia) from different ecosystems.
Fuel is a modular execution layer that uses Ethereum for DA and settlement. Its UTXO-based execution model (rather than account-based like the EVM) can process transactions in parallel without the need for sequential ordering, achieving higher throughput for specific use cases.
Key Risks / Considerations
- DA layer failure: If the DA layer goes down or loses data, all dependent rollups are affected. Celestia experienced a brief outage in December 2023 when validators failed to reach consensus, temporarily halting data publication for all dependent rollups.
- Data pruning: Ethereum blobs are deleted after ~18 days. If a rollup user needs to force-exit after 18 days, the data may no longer be available from L1 full nodes. Rollups must ensure they (or their users) store the data long-term, or rely on third-party archival services.
- DA centralization: DAS only works if enough light nodes participate in real time. If too few nodes run DAS, the sampling guarantees break down. The “DA sampling assumption” requires enough honest nodes sampling at all times that their samples collectively cover the data needed to reconstruct the block; a single honest sampler is not sufficient.
- Cross-DA migration: Moving a rollup from one DA layer to another is complex and potentially risky. Contracts and state roots committed to one DA layer cannot easily be migrated to another.
- Regulatory ambiguity: DA providers that store and replicate transaction data may face different regulatory classifications than pure settlement or execution providers.
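For the data pruning risk in particular, an operator needs to know when a blob leaves the window in which L1 nodes are required to serve it. The sketch below derives that window from the consensus-layer retention parameter (4,096 epochs of 32 twelve-second slots, about 18.2 days); the function names and the two-day safety margin are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS = 4096

# 4096 * 32 * 12 = 1,572,864 seconds, roughly 18.2 days.
RETENTION = timedelta(
    seconds=MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
)


def blob_expiry(posted_at: datetime) -> datetime:
    """Earliest time at which L1 nodes may prune the blob."""
    return posted_at + RETENTION


def still_served(posted_at: datetime, now: datetime) -> bool:
    """True while the blob is inside the guaranteed retention window."""
    return now < blob_expiry(posted_at)


def archive_deadline(posted_at: datetime,
                     safety_margin: timedelta = timedelta(days=2)) -> datetime:
    """Hypothetical policy: copy the blob to long-term storage before this time."""
    return blob_expiry(posted_at) - safety_margin


posted = datetime(2024, 3, 13, 14, 0, tzinfo=timezone.utc)
print(blob_expiry(posted), archive_deadline(posted))
```

Nodes may keep blobs longer than the minimum, so `still_served` returning False means only that availability from L1 is no longer guaranteed.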
Comparison Table: DA Layer Features
| Feature | Ethereum Blobs | Celestia | EigenDA | Avail |
|---|---|---|---|---|
| Settlement | Yes (L1) | No (separate) | Via ETH restaking | No (separate) |
| DAS | Future (PeerDAS) | Yes (production) | Planned | Yes |
| Max block size | ~0.75 MB (6 blobs, expanding) | ~2 MB (expanding) | Configurable | ~1 MB (expanding) |
| Data retention | 18 days | Permanent | Configurable | Permanent |
| Trust assumption | ETH validators | Celestia validators | Restaked ETH validators | Avail validators |
Frequently Asked Questions
Q: Why is data availability important for rollups? A: Without data availability, users cannot verify that the rollup’s state transitions are correct, and they cannot force-exit their funds. In an optimistic rollup, if the sequencer publishes a fraudulent state root, someone needs the transaction data to generate a fraud proof. In a ZK rollup, the validity proof rules out an invalid state root, but users still need the data to reconstruct the state and prove their own balances. If the data is unavailable, funds are effectively locked.
Q: What’s the difference between DA and storage? A: DA guarantees that data is accessible to network participants for verification purposes (typically for a limited time, e.g., 18 days for Ethereum blobs). Storage (like IPFS, Arweave, or Filecoin) is for long-term persistence. They serve different purposes—DA is for liveness and security, storage is for permanence.
Q: Can I use multiple DA layers simultaneously? A: Yes. Some rollups (especially those building on the OP Stack or Polygon CDK) are designed to support pluggable DA layers. A rollup could post its batches to Ethereum blobs for maximum security and keep a cheaper layer such as Celestia as a redundant copy or as a fallback when blob fees spike. Switching to the cheapest option at any given time also changes the trust assumptions, since each batch is only as secure as the layer that holds it.
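A pluggable-DA policy of this kind could look roughly like the sketch below. Everything here is hypothetical: the `Quote` type, the layer names, the prices, and the rule "use the preferred layer unless it is unavailable or above a price cap" are illustrative assumptions, not the behavior of the OP Stack or Polygon CDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    layer: str
    usd_per_mb: float
    available: bool


def choose_da(quotes: list[Quote], preferred: str, max_usd_per_mb: float) -> Quote:
    """Pick the preferred layer if it is up and under the price cap,
    otherwise fall back to the cheapest layer that is currently available."""
    usable = [q for q in quotes if q.available]
    if not usable:
        raise RuntimeError("no DA layer is currently available")
    for q in usable:
        if q.layer == preferred and q.usd_per_mb <= max_usd_per_mb:
            return q
    return min(usable, key=lambda q: q.usd_per_mb)


normal = [Quote("ethereum-blobs", 2.00, True), Quote("celestia", 0.05, True)]
spike = [Quote("ethereum-blobs", 6.00, True), Quote("celestia", 0.05, True)]

print(choose_da(normal, "ethereum-blobs", 5.00).layer)
print(choose_da(spike, "ethereum-blobs", 5.00).layer)
```

A real deployment would also have to record which layer each batch went to, since verifiers need to know where to fetch the data and which trust assumptions apply.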
Q: What happens if blob data is pruned and I need it later? A: After 18 days, blob data is pruned from Ethereum full nodes. Rollup operators, indexers, and archive services are expected to store the data long-term. If no one has stored the data and a dispute arises after pruning, the rollup’s security guarantees may be compromised. This is an active area of concern for rollup designers.