Binary Storage

Oak Chain stores large binaries in IPFS and keeps only their CID references in Oak. Validators store the CID (46 bytes as a base58 CIDv0 string), never the binary itself.

Why This Matters

Binary handling drives cost, latency, and trust. Mixing binary blobs into consensus storage loads validators with data they do not need and blurs who is responsible for storing what.

What You'll Prove

  • You can keep consensus state lean by storing CID references in Oak.
  • You can host binaries independently while preserving verifiable integrity.
  • You can serve media through edge infrastructure without losing provenance.

Next Action

Follow the upload and retrieval flow below to publish one binary and verify its CID-backed integrity end-to-end.

The Truth → Provenance → Edge Flow

This is the beautiful insight: Oak Chain is the source of truth (CIDs), while binaries flow through author storage to edge CDNs.

How It Works

  1. Oak Chain stores the CID (content-addressed hash). This is the truth.
  2. Author hosts the binary (IPFS, Azure Blob, or pinning service). This is provenance.
  3. Edge CDN caches the binary globally (Cloudflare R2, Fastly). This is delivery.
  4. User can verify binary integrity by recomputing the CID from the downloaded bytes and comparing it to the CID stored in Oak Chain.
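
The verification step can be sketched in a few lines of Node.js. This is a minimal sketch under a narrow assumption: the binary was added as a single raw block with a CIDv1 and SHA-256 (for example `ipfs add --cid-version=1 --raw-leaves` on a file small enough to fit in one chunk), so the CID is a direct hash of the file bytes. For default `Qm...` CIDs and for larger files the CID covers a UnixFS DAG, not the raw bytes, and the practical check is to re-run `ipfs add --only-hash` and compare. The function names `rawSha256Cid` and `matchesCid` are illustrative and not part of Oak Chain.

```javascript
const { createHash } = require('node:crypto');

// RFC 4648 base32 alphabet in lowercase, as used by the multibase "b" prefix.
const BASE32 = 'abcdefghijklmnopqrstuvwxyz234567';

// Encode bytes as unpadded lowercase base32.
function base32Encode(bytes) {
  let bits = 0;     // bit buffer
  let bitCount = 0; // number of valid bits currently in the buffer
  let out = '';
  for (const byte of bytes) {
    bits = (bits << 8) | byte;
    bitCount += 8;
    while (bitCount >= 5) {
      out += BASE32[(bits >>> (bitCount - 5)) & 31];
      bitCount -= 5;
    }
    bits &= (1 << bitCount) - 1; // keep only the leftover bits
  }
  if (bitCount > 0) out += BASE32[(bits << (5 - bitCount)) & 31];
  return out;
}

// Build a CIDv1 for a single raw block hashed with SHA-256.
// Header bytes: 0x01 = CID version 1, 0x55 = raw codec,
// 0x12 = sha2-256 multihash code, 0x20 = digest length (32 bytes).
function rawSha256Cid(bytes) {
  const digest = createHash('sha256').update(bytes).digest();
  const header = Buffer.from([0x01, 0x55, 0x12, 0x20]);
  return 'b' + base32Encode(Buffer.concat([header, digest]));
}

// Compare the recomputed CID with the one read from Oak.
function matchesCid(bytes, expectedCid) {
  return rawSha256Cid(bytes) === expectedCid;
}
```

A reader would call `matchesCid(downloadedBytes, cidFromOak)` after fetching the binary and reject the download if it returns `false`.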

Why This Is Beautiful

Property | Benefit
Immutable provenance | The CID in Oak Chain proves what the binary SHOULD be
Decentralized truth | Validators reach consensus on the CID, not on binary storage
Perfect cache key | The CID never changes, so a CDN can cache the binary forever
Trustless verification | Anyone can recompute the CID from the binary and compare it to the Oak Chain CID
Economic separation | Validators handle consensus, authors handle storage, the CDN handles delivery
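
The "perfect cache key" property translates directly into HTTP caching headers. The sketch below shows one way an edge or origin handler might label a CID-addressed response. The function name `immutableCacheHeaders` and the choice of a one-year `max-age` are assumptions for illustration, not an Oak Chain or CDN requirement.

```javascript
// Build response headers for a binary served under its CID.
// Because the CID is derived from the content, the body behind a given
// CID can never change, so the response can be marked immutable.
function immutableCacheHeaders(cid, mimeType) {
  return {
    'Content-Type': mimeType,
    // One year in seconds, plus the standard "immutable" directive.
    'Cache-Control': 'public, max-age=31536000, immutable',
    // The CID doubles as a strong validator.
    'ETag': `"${cid}"`,
  };
}
```

For example, `immutableCacheHeaders(cid, 'image/jpeg')` could be merged into the response an edge worker returns for `/ipfs/<cid>`.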

Architecture

Why IPFS?

Benefit | Description
Content-addressed | CID = hash of content, immutable
Decentralized | No single point of failure
Deduplication | Same content = same CID
Author-owned | Authors control their storage

How It Works

1. Author Uploads to IPFS

bash
# Upload binary to IPFS
ipfs add my-image.jpg
# Prints: added QmXyz...abc my-image.jpg (the second field is the CID)

2. Author Writes CID to Oak

bash
curl -X POST http://localhost:8090/v1/propose-write \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "proposalId=0x1111111111111111111111111111111111111111111111111111111111111111" \
  -d "walletAddress=0x742d35Cc..." \
  -d "organization=PixelPirates" \
  --data-urlencode "message={\"jcr:primaryType\":\"dam:Asset\",\"jcr:content\":{\"renditions\":{\"original\":{\"ipfs:cid\":\"QmXyz...abc\",\"jcr:mimeType\":\"image/jpeg\",\"size\":1048576}}}}" \
  -d "contentType=dam:Asset" \
  -d "ethereumTxHash=0x..." \
  -d "signature=0x..."
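
Because the write is paid for and signed, it can be worth checking that the `ipfs:cid` value at least looks like a CID before proposing it. The following is a minimal shape check only. It does not decode the CID or confirm that the content exists, and the helper name `looksLikeCid` is hypothetical.

```javascript
// Characters allowed in base58btc (no 0, O, I, or l).
const BASE58 = '[1-9A-HJ-NP-Za-km-z]';

// CIDv0: "Qm" followed by 44 base58 characters (46 characters in total).
const CID_V0 = new RegExp(`^Qm${BASE58}{44}$`);

// CIDv1 in base32: "b" prefix followed by lowercase base32 characters.
// 58 is the length for a SHA-256 CID; longer digests produce longer strings.
const CID_V1_BASE32 = /^b[a-z2-7]{58,}$/;

// Return true when the value has the outward shape of a CID.
function looksLikeCid(value) {
  return typeof value === 'string'
    && (CID_V0.test(value) || CID_V1_BASE32.test(value));
}
```

A truncated placeholder such as `QmXyz...abc` fails this check, which catches copy-paste mistakes before any payment is spent on the proposal.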

3. Readers Fetch from IPFS

bash
# Get metadata from Oak
curl "http://localhost:8090/api/explore?path=/oak-chain/.../content/dam/images/hero"

# Response includes the CID:
# {
#   "renditions": {
#     "original": {
#       "ipfs:cid": "QmXyz...abc",
#       ...
#     }
#   }
# }

# Fetch binary from IPFS gateway
curl https://ipfs.io/ipfs/QmXyz...abc > image.jpg
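
A single public gateway can be slow or unavailable, so readers may want to try more than one. The sketch below builds path-style gateway URLs for a CID and tries them in order. The gateway list, the function names, and the injectable `fetchImpl` parameter are assumptions for illustration; substitute the gateways or CDN hosts you actually use.

```javascript
// Gateways to try, in order. Replace with your own gateway or CDN hosts.
const DEFAULT_GATEWAYS = ['https://ipfs.io', 'https://dweb.link'];

// Build one path-style URL per gateway for the given CID.
function gatewayUrls(cid, gateways = DEFAULT_GATEWAYS) {
  return gateways.map(
    (base) => `${base.replace(/\/+$/, '')}/ipfs/${encodeURIComponent(cid)}`
  );
}

// Try each gateway in turn and return the first successful response.
// fetchImpl defaults to the global fetch (available in Node 18+ and browsers).
async function fetchFromGateways(cid, { gateways = DEFAULT_GATEWAYS, fetchImpl = fetch } = {}) {
  let lastError;
  for (const url of gatewayUrls(cid, gateways)) {
    try {
      const res = await fetchImpl(url);
      if (res.ok) return res;
      lastError = new Error(`${url} responded with HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network failure; move on to the next gateway
    }
  }
  throw lastError ?? new Error('No gateways configured');
}
```

Since every gateway serves the same content for the same CID, the caller does not need to care which one answered.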

Direct Binary Access

For performance, read only the metadata from the validator and fetch the binary straight from IPFS:

javascript
async function getAsset(path) {
  // 1. Get metadata from Oak
  const meta = await fetch(`http://validator:8090/api/explore?path=${encodeURIComponent(path)}`);
  if (!meta.ok) throw new Error(`Oak metadata request failed: HTTP ${meta.status}`);
  const { renditions } = await meta.json();

  // 2. Get binary from IPFS (direct, no validator)
  const cid = renditions.original['ipfs:cid'];
  const binary = await fetch(`https://ipfs.io/ipfs/${cid}`);
  if (!binary.ok) throw new Error(`IPFS gateway request failed: HTTP ${binary.status}`);

  return binary.blob();
}

Storage Model

Validators Store CIDs Only

Validator Storage:
├── Oak Segments (structured content + CIDs)
└── NO binary blobs
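
To make the split concrete, the arithmetic below compares what a validator holds for the CID with what the author holds for the binary. It is a rough sketch that counts only the 46-byte CIDv0 string and ignores the rest of the Oak segment (node names, MIME type, and other properties). The function name `storageSplit` is illustrative.

```javascript
// Length of a base58 "Qm..." CIDv0 string, in bytes.
const CID_V0_LENGTH = 46;

// Given the sizes of a set of binaries, report how many bytes land in
// consensus storage (CIDs only) versus author-owned storage (the binaries).
function storageSplit(assetSizesInBytes) {
  const binaryBytes = assetSizesInBytes.reduce((sum, n) => sum + n, 0);
  const cidBytes = assetSizesInBytes.length * CID_V0_LENGTH;
  // ratio is NaN for an empty list, since there is nothing to compare.
  return { cidBytes, binaryBytes, ratio: binaryBytes / cidBytes };
}
```

For the 1 MiB image used in the examples on this page, `storageSplit([1048576])` reports 46 bytes of CID against 1,048,576 bytes of binary, a ratio of about 22,795 to 1.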

Authors Own Binaries

Authors are responsible for:

  • Uploading to IPFS
  • Pinning (ensuring availability)
  • Paying for IPFS storage (Filecoin, Pinata, etc.)

This separates:

  • Consensus (validators) - structured content
  • Storage (IPFS) - binary blobs

Pinning Services

Service | Description
Pinata | Managed IPFS pinning
Infura IPFS | Enterprise IPFS
Filecoin | Decentralized storage deals
web3.storage | Free tier available
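
Authors who run their own IPFS node can pin without a hosted service. The sketch below targets the Kubo RPC API's `pin/add` endpoint on a local node. The API address is Kubo's default and is an assumption about your setup, and the helper names are illustrative. Hosted pinning services expose their own authenticated APIs, which this example does not cover.

```javascript
// Build the Kubo RPC request that pins a CID on a node.
// Kubo's RPC API expects POST requests with the CID passed as "arg".
function pinRequest(cid, apiBase = 'http://127.0.0.1:5001') {
  const url = new URL('/api/v0/pin/add', apiBase);
  url.searchParams.set('arg', cid);
  return { url: url.toString(), method: 'POST' };
}

// Send the pin request and return the parsed reply.
async function pinCid(cid, apiBase) {
  const { url, method } = pinRequest(cid, apiBase);
  const res = await fetch(url, { method });
  if (!res.ok) throw new Error(`pin/add failed with HTTP ${res.status}`);
  return res.json(); // Kubo replies with an object listing the pinned CIDs
}
```

Pinning right after upload, and before writing the CID to Oak, avoids publishing a reference to content that the node might later garbage-collect.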

Example: Image Upload Flow

javascript
import { create } from 'ipfs-http-client';

const ipfs = create({ url: 'https://ipfs.infura.io:5001' });

async function uploadImage(file, wallet, org, proposalId, paymentTxHash) {
  // 1. Upload to IPFS
  const { cid } = await ipfs.add(file);
  console.log('IPFS CID:', cid.toString());
  
  // 2. Write metadata to Oak
  const response = await fetch('http://validator:8090/v1/propose-write', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      proposalId,
      walletAddress: wallet,
      organization: org,
      message: JSON.stringify({
        'jcr:primaryType': 'dam:Asset',
        'ipfs:cid': cid.toString(),
        'jcr:mimeType': file.type,
        'size': file.size,
        'uploadedAt': new Date().toISOString()
      }),
      contentType: 'dam:Asset',
      ethereumTxHash: paymentTxHash, // From payment
      signature: '0x...' // Signed message
    })
  });
  
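  // fetch() returns a Response, so parse the body before reading fields.
  // Assumes the response JSON includes a `path` field.
  const result = await response.json();
  return { cid: cid.toString(), oakPath: result.path };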
}

For author-owned IPFS flows, the important fields are still proposalId, ethereumTxHash, signature, and the CID-backed metadata. If your payment flow uses a contract class, send the same paymentTier value with the write request.

Garbage Collection

When content is deleted from Oak:

  1. CID reference removed from Oak
  2. Binary remains in IPFS (content-addressed)
  3. Author can unpin if no longer needed
  4. IPFS GC eventually removes unpinned content
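
Because identical content always produces the same CID, two Oak nodes can point at one pinned binary. Before unpinning after a delete, it helps to confirm that no remaining content still references that CID. The sketch below shows that check as a pure function. The name `cidsSafeToUnpin` and both input lists are hypothetical, and how an author collects the remaining references is left open.

```javascript
// Return the CIDs that were referenced by deleted content and are not
// referenced by any content that still exists in Oak.
function cidsSafeToUnpin(deletedCids, remainingCids) {
  const stillReferenced = new Set(remainingCids);
  // De-duplicate the deleted list, then drop anything still in use.
  return [...new Set(deletedCids)].filter((cid) => !stillReferenced.has(cid));
}
```

Only the CIDs returned here would then be passed to `ipfs pin rm`, leaving shared binaries pinned for the content that still needs them.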

Apache 2.0 Licensed