Data sovereignty

Data sovereignty means you own and control your scholarly work. In Chive, your preprints, reviews, and endorsements belong to you, not the platform.

The ownership model

Traditional preprint servers store your work on their servers. You're a guest in their house. Chive inverts this relationship:

Traditional Model              Chive Model
┌─────────────────────┐        ┌─────────────────────┐
│   Platform Server   │        │      Your PDS       │
│                     │        │                     │
│  ┌───────────────┐  │        │  ┌───────────────┐  │
│  │ Your preprint │  │        │  │ Your preprint │  │
│  │ (they own it) │  │        │  │ (you own it)  │  │
│  └───────────────┘  │        │  └───────────────┘  │
│                     │        │                     │
│  Platform controls  │        │ You control access  │
│  access, deletion,  │        │ and can migrate     │
│  and terms          │        │ anytime             │
└─────────────────────┘        └─────────────────────┘
                                          │
                                          ▼
                               ┌─────────────────────┐
                               │   Chive (AppView)   │
                               │                     │
                               │  Indexes your work  │
                               │  Never stores it    │
                               └─────────────────────┘

What Chive stores (and doesn't)

Chive stores

Data type         | What Chive keeps                | Purpose
Metadata indexes  | Title, authors, abstract        | Search and discovery
BlobRefs          | Pointers to PDFs (not the PDFs) | Link to your files
Relationship data | Citations, endorsements         | Knowledge graph
Computed metrics  | View counts, trending scores    | Analytics

Chive never stores

Data type               | Why not
Your PDFs               | Those live in your PDS
Your private keys       | You control your identity
Source-of-truth records | Your PDS is authoritative
Blob data               | Only BlobRefs (CID pointers)

BlobRefs, not blobs

When you upload a PDF to Chive, it goes to your PDS, not to Chive's servers. Chive only stores a BlobRef, a pointer containing:

{
  "$type": "blob",
  "ref": {
    "$link": "bafyreibvocy34..." // CID (content hash)
  },
  "mimeType": "application/pdf",
  "size": 2847593
}

When someone views your preprint, Chive fetches the PDF from your PDS using the BlobRef. Because the CID (Content Identifier) is a hash of the file's content, a fetched file that does not match its CID has detectably been altered.
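As a rough illustration of the fetch step, the sketch below builds a request URL for the AT Protocol `com.atproto.sync.getBlob` endpoint from a BlobRef. Whether Chive fetches blobs exactly this way is an assumption; the function name `blob_url`, the PDS hostname, the DID, and the CID value are all hypothetical.

```python
from urllib.parse import urlencode


def blob_url(pds_endpoint: str, did: str, blob_ref: dict) -> str:
    """Build a com.atproto.sync.getBlob URL for the blob a BlobRef points to."""
    cid = blob_ref["ref"]["$link"]
    query = urlencode({"did": did, "cid": cid})
    return f"{pds_endpoint.rstrip('/')}/xrpc/com.atproto.sync.getBlob?{query}"


# Hypothetical values, for illustration only.
example_ref = {
    "$type": "blob",
    "ref": {"$link": "bafkreiexamplecid"},
    "mimeType": "application/pdf",
    "size": 2847593,
}
url = blob_url("https://pds.example.com/", "did:plc:example", example_ref)
```

The BlobRef carries only the CID, MIME type, and size, so the caller has to supply the account's DID and current PDS endpoint separately.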

Benefits of BlobRefs

Benefit       | Explanation
Integrity     | CID changes if file is modified
Deduplication | Identical files share the same CID
Portability   | Move your PDS; BlobRefs still work
Verification  | Anyone can verify file authenticity

Rebuildable indexes

Every index in Chive can be rebuilt from the AT Protocol firehose. This design principle ensures:

  1. No data loss: If Chive's database is wiped, user data remains safe in PDSes
  2. No lock-in: Another AppView could index the same data
  3. Auditability: The firehose provides a complete event history

┌────────────────────────────────────────────────────────────────┐
│                            Firehose                            │
│           (complete stream of all repository events)           │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│                         Chive Indexes                          │
│                                                                │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │
│  │  PostgreSQL  │  │ Elasticsearch│  │        Neo4j         │  │
│  │              │  │              │  │                      │  │
│  │ Can rebuild  │  │ Can rebuild  │  │ Can rebuild          │  │
│  │ from firehose│  │ from firehose│  │ from firehose        │  │
│  └──────────────┘  └──────────────┘  └──────────────────────┘  │
└────────────────────────────────────────────────────────────────┘

Rebuild process

If Chive needs to rebuild its indexes:

  1. Connect to the relay's firehose
  2. Replay all pub.chive.* events from the beginning
  3. Rebuild PostgreSQL, Elasticsearch, and Neo4j indexes
  4. Verify consistency with live PDS data

No user action required. No data migration. No export/import.
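The replay in steps 2 and 3 can be sketched as a fold over the event stream. This is a minimal illustration, not Chive's implementation: the `RepoEvent` shape is a simplified, hypothetical stand-in for real firehose events, and an in-memory dict stands in for PostgreSQL, Elasticsearch, and Neo4j.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class RepoEvent:
    """Simplified, hypothetical stand-in for a firehose repository event."""
    seq: int                       # position in the event stream
    action: str                    # "create", "update", or "delete"
    uri: str                       # at://<did>/<collection>/<record key>
    record: Optional[dict] = None  # record body; None for deletes


def collection_of(uri: str) -> str:
    # "at://did/collection/rkey".split("/") -> ["at:", "", did, collection, rkey]
    return uri.split("/")[3]


def rebuild_index(events: Iterable[RepoEvent]) -> Dict[str, dict]:
    """Replay events in sequence order and return a uri -> record index."""
    index: Dict[str, dict] = {}
    for event in sorted(events, key=lambda e: e.seq):
        if not collection_of(event.uri).startswith("pub.chive."):
            continue  # ignore records that belong to other applications
        if event.action == "delete":
            index.pop(event.uri, None)
        else:  # create and update both overwrite the indexed copy
            index[event.uri] = event.record or {}
    return index


# Hypothetical events, for illustration only.
preprint = "at://did:plc:example/pub.chive.preprint.submission/1"
events = [
    RepoEvent(1, "create", preprint, {"title": "Draft"}),
    RepoEvent(2, "create", "at://did:plc:example/app.example.note/1", {"text": "hi"}),
    RepoEvent(3, "update", preprint, {"title": "My Research"}),
]
index = rebuild_index(events)
```

Because the index is a pure function of the event history, wiping it and replaying the same events gives back the same state.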

PDS staleness detection

Since your PDS is the source of truth, Chive periodically checks for staleness:

┌──────────┐        ┌──────────┐        ┌──────────┐
│  Chive   │───────►│ Your PDS │───────►│  Result  │
│  Index   │ check  │  Record  │compare │          │
└──────────┘        └──────────┘        └──────────┘
     │                   │                   │
     │                   │                   │
  Indexed             Current             Match?
  version             version                │
     │                   │                   ├─ Yes: Index valid
     │                   │                   └─ No: Re-index
     ▼                   ▼
rev: 3k5...         rev: 3k7...

Staleness detection catches:

  • Missed firehose events
  • PDS migrations
  • Network partitions
  • Record updates outside the firehose window
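The comparison in the diagram comes down to checking the indexed revision against the one the PDS currently reports. A minimal sketch follows, assuming revisions are tracked per record URI and that a caller-supplied `fetch_current_rev` function (hypothetical) asks the PDS for the current value. The rev strings are placeholders.

```python
from typing import Callable, Dict, List


def find_stale(indexed_revs: Dict[str, str],
               fetch_current_rev: Callable[[str], str]) -> List[str]:
    """Return the keys whose indexed rev no longer matches the PDS."""
    return [key for key, indexed_rev in indexed_revs.items()
            if fetch_current_rev(key) != indexed_rev]


# Hypothetical data: record 1 is current, record 2 changed after indexing.
first = "at://did:plc:example/pub.chive.preprint.submission/1"
second = "at://did:plc:example/pub.chive.preprint.submission/2"
indexed = {first: "rev-a", second: "rev-a"}
on_pds = {first: "rev-a", second: "rev-b"}
stale = find_stale(indexed, on_pds.__getitem__)
```

Anything returned by `find_stale` would be queued for re-indexing from the PDS, regardless of why the firehose event was missed.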

Portability guarantees

Switching PDS providers

Your DID remains constant. When you migrate to a new PDS:

  1. Export your repository from the old PDS
  2. Import to the new PDS
  3. Update your DID document to point to the new PDS
  4. Chive detects the change and re-indexes

Your preprints, reviews, and endorsements move with you. No broken links. No lost citations.
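One way to detect the change in step 4 is to re-resolve the DID document and compare its PDS endpoint with the one recorded at indexing time. The sketch below assumes the standard AT Protocol service entry (`#atproto_pds`, type `AtprotoPersonalDataServer`). The helper names and hostnames are hypothetical, and how Chive actually detects migrations is not specified here.

```python
from typing import Optional


def pds_endpoint(did_document: dict) -> Optional[str]:
    """Return the PDS service endpoint declared in a DID document, if any."""
    for service in did_document.get("service", []):
        if (service.get("id", "").endswith("#atproto_pds")
                and service.get("type") == "AtprotoPersonalDataServer"):
            return service.get("serviceEndpoint")
    return None


def pds_changed(indexed_endpoint: str, did_document: dict) -> bool:
    """True when the DID document now points at a different PDS."""
    current = pds_endpoint(did_document)
    return current is not None and current != indexed_endpoint


# Hypothetical DID document after a migration.
did_doc = {
    "id": "did:plc:example",
    "service": [{
        "id": "#atproto_pds",
        "type": "AtprotoPersonalDataServer",
        "serviceEndpoint": "https://new-pds.example.com",
    }],
}
moved = pds_changed("https://old-pds.example.com", did_doc)
```

The DID itself never changes, so record URIs that embed it stay valid while only the endpoint lookup result differs.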

Using multiple AppViews

Your data is accessible to any compliant AppView:

                    ┌──────────────┐
                    │   Your PDS   │
                    └──────┬───────┘
                           │
           ┌───────────────┼───────────────┐
           │               │               │
           ▼               ▼               ▼
   ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
   │    Chive     │ │  Future App  │ │ Another App  │
   │ (preprints)  │ │ (analytics)  │ │ (citations)  │
   └──────────────┘ └──────────────┘ └──────────────┘

Each AppView provides a different lens on your data. You don't need to re-upload anything.

Exporting your data

At any time, you can export your complete repository:

# Export from your PDS (hypothetical command)
atproto repo export did:plc:abc123... --output my-research.car

The export contains:

  • All your records (preprints, reviews, endorsements)
  • Cryptographic signatures proving authenticity
  • Complete revision history
  • BlobRefs (pointers to your files)
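For a scripted export, the AT Protocol sync endpoint `com.atproto.sync.getRepo` returns a repository as a CAR file. The sketch below is one possible approach using only the Python standard library. The function names, hostname, and DID are hypothetical, and it assumes the PDS serves the endpoint without authentication.

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve


def repo_export_url(pds_endpoint: str, did: str) -> str:
    """Build the com.atproto.sync.getRepo URL for a repository."""
    query = urlencode({"did": did})
    return f"{pds_endpoint.rstrip('/')}/xrpc/com.atproto.sync.getRepo?{query}"


def export_repo(pds_endpoint: str, did: str, output_path: str) -> str:
    """Download the repository as a CAR file and return the local path."""
    path, _headers = urlretrieve(repo_export_url(pds_endpoint, did), output_path)
    return path


# Hypothetical values; call export_repo(...) against a real PDS to download.
export_url = repo_export_url("https://pds.example.com", "did:plc:example")
```

Since the CAR file holds BlobRefs rather than the files themselves, a full backup would also need to download each referenced blob.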

Cryptographic guarantees

Signed records

Every record you create is cryptographically signed:

{
  "record": {
    "$type": "pub.chive.preprint.submission",
    "title": "My Research",
    // ... content
  },
  "sig": "zQmY8GkP..." // Your signature
}

This signature proves:

  • Authenticity: You created this record
  • Integrity: The content hasn't changed
  • Non-repudiation: You can't deny creating it

Content addressing

Blobs (PDFs, images) use content-addressed storage:

PDF content → SHA-256 hash → CID: bafyreibvocy34...

If even one byte changes, the CID changes. This makes tampering detectable.
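To make the hash-to-CID step concrete, here is a minimal sketch that derives a CIDv1 string from raw bytes using only the standard library. It assumes the `raw` codec and a SHA-256 multihash, encoded as lowercase base32. The function name is hypothetical, and a different codec would produce a different prefix.

```python
import base64
import hashlib


def raw_sha256_cid(data: bytes) -> str:
    """Return a CIDv1 (raw codec, sha2-256) as a base32 multibase string."""
    digest = hashlib.sha256(data).digest()
    # Version 1 (0x01), raw codec (0x55), then the multihash header:
    # sha2-256 (0x12) with a 32-byte digest (0x20).
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    # Multibase base32: lowercase, unpadded, prefixed with "b".
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


original = raw_sha256_cid(b"%PDF-1.7 example content")
modified = raw_sha256_cid(b"%PDF-1.7 example content!")
```

The two inputs differ by a single byte, so `original` and `modified` are different CIDs, while hashing the same bytes twice always gives the same one.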

What if Chive disappears?

If Chive shuts down:

What happens               | What doesn't happen
Chive's indexes go offline | Your preprints are deleted
Search becomes unavailable | Your PDFs become inaccessible
Discovery features stop    | Your DOIs break
                           | Your citations vanish

Your work remains in your PDS. Another AppView could index it. Your DOIs (registered with external services) continue to resolve.

Comparison with traditional platforms

Feature               | Traditional preprint server | Chive
Data location         | Platform servers            | Your PDS
Ownership             | Platform owns copy          | You own original
Portability           | Export/import required      | Automatic via DID
Backup responsibility | Platform                    | You (or PDS provider)
Platform shutdown     | Data at risk                | Data unaffected
Terms of service      | Can restrict access         | You set terms
API access            | Platform controls           | Open AT Protocol

Your responsibilities

Data sovereignty comes with responsibility:

Responsibility     | How to handle it
PDS reliability    | Choose a reputable PDS provider or self-host
Backup             | Most PDS providers handle backups; verify their policies
Key management     | Secure your private keys; recovery is your responsibility
Content moderation | You're responsible for what you publish

Next steps