Knowledge graph

Chive uses a community-curated knowledge graph to classify and connect scholarly works. This Wikipedia-style approach allows the research community to build and maintain a structured taxonomy of academic fields.

Overview

The knowledge graph serves three purposes:

  1. Discovery: Find related preprints through field classifications
  2. Context: Understand how a work fits into broader research areas
  3. Navigation: Browse preprints by field, subfield, or topic

                    ┌──────────────────┐
                    │   Mathematics    │
                    └────────┬─────────┘
           ┌─────────────────┼─────────────────┐
           ▼                 ▼                 ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │   Algebra    │  │   Analysis   │  │   Geometry   │
    └──────┬───────┘  └──────────────┘  └──────────────┘
           │
    ┌──────┴──────┬──────────────┐
    ▼             ▼              ▼
┌────────┐   ┌──────────┐  ┌───────────┐
│ Group  │   │   Ring   │  │  Linear   │
│ Theory │   │  Theory  │  │  Algebra  │
└────────┘   └──────────┘  └───────────┘

Field nodes

A field node represents an academic discipline, subdiscipline, or topic. Each field has:

| Property | Description |
| --- | --- |
| name | Human-readable name (e.g., "Algebraic Geometry") |
| description | Brief explanation of the field's scope |
| aliases | Alternative names (e.g., "Algebraic Geometry" = "AG") |
| parentFields | Broader categories this field belongs to |
| childFields | Narrower specializations within this field |
| relatedFields | Fields with conceptual overlap |
| externalIds | Links to Wikidata, Library of Congress, etc. |
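
As a rough illustration of the properties above, a field node could be held in memory as a small dataclass. This is a sketch only: the class name, the snake_case attribute names, and the sample values are assumptions made for the example, not Chive's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class FieldNode:
    """Hypothetical in-memory shape of a field node (not Chive's real schema)."""

    name: str
    description: str = ""
    aliases: list[str] = field(default_factory=list)
    parent_fields: list[str] = field(default_factory=list)
    child_fields: list[str] = field(default_factory=list)
    related_fields: list[str] = field(default_factory=list)
    external_ids: dict[str, str] = field(default_factory=dict)


# Illustrative values only.
algebraic_geometry = FieldNode(
    name="Algebraic Geometry",
    description="Study of geometric objects defined by polynomial equations",
    aliases=["AG"],
    parent_fields=["Algebra", "Geometry"],
    related_fields=["Algebraic Topology"],
)
print(algebraic_geometry.name, algebraic_geometry.aliases)
```

List-valued properties default to empty lists, so a newly proposed field can start with only a name and gain relationships later.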

Field relationships

Fields connect through three relationship types:

Broader/Narrower (hierarchical)
Mathematics
└── Algebra
    └── Group Theory

Related (conceptual overlap)
Algebraic Topology ──related──► Algebraic Geometry

Cross-disciplinary
Computational Linguistics
├── parent: Linguistics
└── parent: Computer Science
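
Because a field can have more than one parent, the hierarchy is a directed graph rather than a strict tree. The sketch below collects every broader field reachable from a starting field. The `PARENTS` map and the `ancestors` helper are hypothetical names, populated with the examples above.

```python
# Hypothetical parent map built from the examples above; a field may have several parents.
PARENTS = {
    "Group Theory": ["Algebra"],
    "Algebra": ["Mathematics"],
    "Computational Linguistics": ["Linguistics", "Computer Science"],
}


def ancestors(field_name: str, parents: dict[str, list[str]]) -> set[str]:
    """Return every broader field reachable by following parent links."""
    seen: set[str] = set()
    stack = list(parents.get(field_name, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and shared ancestors
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("Group Theory", PARENTS)))
# ['Algebra', 'Mathematics']
print(sorted(ancestors("Computational Linguistics", PARENTS)))
# ['Computer Science', 'Linguistics']
```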

PMEST classification

Beyond hierarchical fields, Chive uses PMEST (Personality, Matter, Energy, Space, Time) faceted classification. This system allows filtering across orthogonal dimensions:

| Facet | Meaning | Example |
| --- | --- | --- |
| Personality | Core subject | "Quantum mechanics" |
| Matter | Material or substance | "Carbon nanotubes" |
| Energy | Process or action | "Oxidation" |
| Space | Geographic scope | "Arctic regions" |
| Time | Temporal scope | "Holocene" |

A single preprint can be classified across multiple facets:

Preprint: "Climate-Driven Carbon Nanotube Degradation in Arctic Soils"

Facets:
  Personality: Environmental Chemistry, Materials Science
  Matter: Carbon nanotubes, Soil
  Energy: Degradation, Climate change
  Space: Arctic
  Time: Contemporary (2020-present)

Users can combine facets to narrow searches:

GET /xrpc/pub.chive.graph.browseFaceted?
personality=materials-science&
matter=carbon-nanotubes&
space=arctic
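
A client could assemble the same request with the standard library. This is a minimal sketch: the host `chive.example` is a placeholder, since the base URL is not given here, and `browse_faceted_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder host: the service's real base URL is not stated in this document.
BASE_URL = "https://chive.example"


def browse_faceted_url(**facets: str) -> str:
    """Build a browseFaceted URL from facet keyword arguments, skipping empty ones."""
    params = {name: value for name, value in facets.items() if value}
    return f"{BASE_URL}/xrpc/pub.chive.graph.browseFaceted?{urlencode(params)}"


url = browse_faceted_url(
    personality="materials-science",
    matter="carbon-nanotubes",
    space="arctic",
)
print(url)
```

Using `urlencode` rather than manual string concatenation keeps facet values with spaces or punctuation correctly escaped.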

Authority records

Authority records ensure consistency across the knowledge graph. They're like library catalog entries for concepts:

// Example authority record for "Quantum Computing"
{
  "name": "Quantum Computing",
  "aliases": [
    "Quantum Computation",
    "QC"
  ],
  "description": "Computational paradigm using quantum-mechanical phenomena",
  "externalLinks": {
    "wikidata": "Q339",
    "lcsh": "sh2008010405",
    "viaf": "168470861"
  },
  "broaderTerms": ["Computer Science", "Quantum Mechanics"],
  "narrowerTerms": ["Quantum Error Correction", "Quantum Algorithms"],
  "relatedTerms": ["Quantum Information Theory"]
}

Authority records link to external controlled vocabularies:

| Vocabulary | Purpose |
| --- | --- |
| Wikidata | Multilingual structured knowledge |
| LCSH | Library of Congress Subject Headings |
| VIAF | Virtual International Authority File |
| FAST | Faceted Application of Subject Terminology |

Reconciliation

When users tag preprints, Chive reconciles tags against authority records:

User enters: "quantum computing"
        ↓
Chive matches: Authority record "Quantum Computing" (Q339)
        ↓
Preprint linked to canonical concept

This prevents fragmentation: "quantum computing", "Quantum Computation", and "QC" all map to the same canonical concept instead of becoming three separate tags.
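
The matching step can be approximated with a lookup over normalized names and aliases. The sketch below assumes exact matching after case-folding and whitespace cleanup. How Chive actually matches tags (for example, whether it uses fuzzy matching) is not specified here, and `AUTHORITIES`, `normalize`, `build_index`, and `reconcile` are hypothetical names.

```python
from typing import Optional

# Hypothetical authority data, taken from the example record above.
AUTHORITIES = {
    "Quantum Computing": ["Quantum Computation", "QC"],
}


def normalize(label: str) -> str:
    """Case-fold and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(label.casefold().split())


def build_index(authorities: dict[str, list[str]]) -> dict[str, str]:
    """Map every normalized name and alias to its canonical authority name."""
    index: dict[str, str] = {}
    for name, aliases in authorities.items():
        for label in [name, *aliases]:
            index[normalize(label)] = name
    return index


def reconcile(tag: str, index: dict[str, str]) -> Optional[str]:
    """Return the canonical name for a tag, or None if no authority record matches."""
    return index.get(normalize(tag))


index = build_index(AUTHORITIES)
for tag in ["quantum computing", "Quantum  Computation", "QC", "quantum biology"]:
    print(tag, "->", reconcile(tag, index))
```

Tags that return `None` stay as free-form user tags until an authority record covers them.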

Community governance

The knowledge graph uses Wikipedia-style moderation. Users can:

  1. Propose new fields or changes
  2. Discuss proposals in threaded comments
  3. Vote on whether to accept proposals

Proposal types

| Type | What it does | Approval threshold |
| --- | --- | --- |
| Create field | Add a new field to the taxonomy | 67% with 5+ votes |
| Update field | Modify name, description, or relationships | 60% with 3+ votes |
| Merge fields | Combine redundant fields | 67% with 5+ votes |
| Deprecate field | Mark a field as obsolete | 75% with 7+ votes |
| Authority change | Update authority records | 75% with 7+ votes |

Voter tiers

Not all votes carry equal weight. Expertise in the relevant field increases vote weight:

| Tier | Vote weight | Criteria |
| --- | --- | --- |
| Community member | 1.0x | Any authenticated user |
| Active contributor | 1.5x | 10+ preprints or reviews |
| Domain expert | 2.5x | Publications in the field |
| Trusted editor | 3.5x | Appointed by governance |
| Authority editor | 4.5x | Library science expertise |
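
To see how thresholds and weights might interact, the sketch below tallies a small vote. It rests on two assumptions that are not confirmed here: the minimum vote count is taken over individual voters (unweighted), and the approval percentage is the weighted share of approving votes. The dictionary keys and the `is_approved` helper are hypothetical.

```python
# Thresholds (approval share, minimum votes) and weights copied from the tables above.
THRESHOLDS = {
    "create_field": (0.67, 5),
    "update_field": (0.60, 3),
    "merge_fields": (0.67, 5),
    "deprecate_field": (0.75, 7),
    "authority_change": (0.75, 7),
}
WEIGHTS = {
    "community_member": 1.0,
    "active_contributor": 1.5,
    "domain_expert": 2.5,
    "trusted_editor": 3.5,
    "authority_editor": 4.5,
}


def is_approved(proposal_type: str, votes: list[tuple[str, bool]]) -> bool:
    """Decide a proposal from (voter_tier, approves) pairs.

    Assumption: the minimum counts voters unweighted, and the percentage
    is the weighted share of approving votes.
    """
    min_share, min_votes = THRESHOLDS[proposal_type]
    if len(votes) < min_votes:
        return False
    total = sum(WEIGHTS[tier] for tier, _ in votes)
    approving = sum(WEIGHTS[tier] for tier, approves in votes if approves)
    return approving / total >= min_share


votes = [
    ("domain_expert", True),
    ("active_contributor", True),
    ("community_member", True),
    ("community_member", False),
    ("community_member", False),
]
print(is_approved("create_field", votes))     # True: weighted share is 5.0 / 7.0
print(is_approved("deprecate_field", votes))  # False: fewer than 7 votes
```

Under these assumptions the example passes with a weighted share of about 71%, even though only three of five voters (60%) approved, which shows how expert votes can shift an outcome.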

Proposal workflow

┌──────────┐     ┌─────────────┐     ┌──────────┐     ┌───────────┐
│  Draft   │────►│ Discussion  │────►│  Voting  │────►│  Outcome  │
│          │     │  (7 days)   │     │ (5 days) │     │           │
└──────────┘     └─────────────┘     └──────────┘     └───────────┘
                        │                  │
                        │                  │
                        ▼                  ▼
                Revisions allowed   Threshold met?
                                    ├── Yes → Approved
                                    └── No → Rejected
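
The timing in the workflow can be expressed as a small function that maps a timestamp to a stage. This sketch assumes stages advance purely on elapsed time and that voting opens the moment discussion closes. Neither detail is confirmed here, and `proposal_stage` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

DISCUSSION_PERIOD = timedelta(days=7)  # from the workflow diagram
VOTING_PERIOD = timedelta(days=5)      # from the workflow diagram


def proposal_stage(submitted_at: datetime, now: datetime) -> str:
    """Return the stage of a proposal submitted for discussion at `submitted_at`.

    Assumption: stages advance purely on elapsed time, and voting opens
    the moment discussion closes.
    """
    voting_opens = submitted_at + DISCUSSION_PERIOD
    voting_closes = voting_opens + VOTING_PERIOD
    if now < submitted_at:
        return "draft"
    if now < voting_opens:
        return "discussion"
    if now < voting_closes:
        return "voting"
    return "outcome"


submitted = datetime(2025, 1, 1, tzinfo=timezone.utc)
for days in (0, 6, 7, 11, 12):
    print(days, proposal_stage(submitted, submitted + timedelta(days=days)))
```

With these periods, a proposal reaches an outcome twelve days after it enters discussion.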

User tags vs. authority terms

Chive distinguishes between user-generated tags and authority-controlled terms:

| User tags | Authority terms |
| --- | --- |
| Free-form text | Controlled vocabulary |
| Personal organization | Community consensus |
| No voting required | Proposal + voting |
| May be reconciled | Canonical concepts |

Users can tag preprints freely. Popular tags may be promoted to authority terms through a two-stage process:

  1. Automatic nomination: Tag used on 10+ preprints by 3+ users
  2. Community vote: Standard proposal process
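
The automatic nomination rule in step 1 can be checked with two set counts. The sketch assumes the rule counts distinct preprints and distinct users. The `qualifies_for_nomination` helper and the identifier format are hypothetical.

```python
def qualifies_for_nomination(
    usages: list[tuple[str, str]],
    min_preprints: int = 10,
    min_users: int = 3,
) -> bool:
    """Check the nomination rule for one tag.

    `usages` holds (preprint_id, user_id) pairs. Assumption: the rule counts
    distinct preprints and distinct users.
    """
    preprints = {preprint_id for preprint_id, _ in usages}
    users = {user_id for _, user_id in usages}
    return len(preprints) >= min_preprints and len(users) >= min_users


# Ten preprints tagged by three users in rotation.
usages = [(f"preprint-{i}", f"user-{i % 3}") for i in range(10)]
print(qualifies_for_nomination(usages))      # True
print(qualifies_for_nomination(usages[:9]))  # False: only nine preprints
```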

Graph algorithms

The knowledge graph enables advanced discovery features:

Citation analysis

Find papers that:
- Cite foundational works in the field
- Bridge multiple subfields
- Introduce new connections

Semantic similarity

Given a preprint about "quantum error correction":
- Find semantically similar preprints
- Suggest related fields to explore
- Identify key authors in adjacent areas

Field evolution

Track how fields change over time:
- New subfields emerging
- Fields merging or splitting
- Terminology shifts

The knowledge graph enhances search in several ways:

Expansion

A search for "machine learning" automatically includes:

  • Narrower terms: "deep learning", "neural networks"
  • Related terms: "artificial intelligence", "statistical learning"
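
A one-level version of this expansion can be sketched with two lookup tables. The `NARROWER` and `RELATED` maps below are hypothetical and contain only the example terms. Whether Chive expands more than one level, or ranks expanded terms lower than the original, is not stated here.

```python
# Hypothetical term maps holding only the example above.
NARROWER = {"machine learning": ["deep learning", "neural networks"]}
RELATED = {"machine learning": ["artificial intelligence", "statistical learning"]}


def expand_query(term: str) -> list[str]:
    """Return the search term followed by its narrower and related terms (one level)."""
    key = term.casefold()
    return [key, *NARROWER.get(key, []), *RELATED.get(key, [])]


print(expand_query("Machine Learning"))
print(expand_query("topology"))  # unknown terms expand to themselves only
```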

Disambiguation

A search for "network" prompts:

  • Computer networks?
  • Neural networks?
  • Social networks?
  • Network science?

Faceted browsing

Filter results by any PMEST dimension while staying within a field:

Field: Machine Learning
Filter by Matter: Medical imaging
Filter by Time: Last 5 years
Filter by Space: [any]

Wikidata integration

Chive synchronizes with Wikidata to:

  1. Import established classifications
  2. Link local concepts to global identifiers
  3. Contribute new academic concepts back

# Example SPARQL query to find related concepts
SELECT ?item ?itemLabel WHERE {
  wd:Q339 wdt:P279* ?item .       # Q339 = Quantum computing
  ?item wdt:P31 wd:Q11862829 .    # Instance of academic discipline
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
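
A query like this can be submitted to the public Wikidata Query Service over HTTP. The sketch below only builds the request and does not send it. The `build_request` helper and the User-Agent string are assumptions for illustration, and how Chive itself talks to Wikidata is not described here.

```python
from urllib.parse import urlencode
from urllib.request import Request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Same query shape as above, with the starting item left as a parameter.
QUERY_TEMPLATE = """
SELECT ?item ?itemLabel WHERE {{
  wd:{qid} wdt:P279* ?item .
  ?item wdt:P31 wd:Q11862829 .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
}}
"""


def build_request(qid: str) -> Request:
    """Build (but do not send) a GET request that runs the query for one Wikidata item."""
    query = QUERY_TEMPLATE.format(qid=qid).strip()
    params = urlencode({"query": query, "format": "json"})
    return Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical contact string; Wikimedia asks clients to identify themselves.
            "User-Agent": "chive-docs-example/0.1 (docs@chive.example)",
        },
    )


request = build_request("Q339")
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would execute the query and return JSON bindings for `?item` and `?itemLabel`.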

API endpoints

| Endpoint | Purpose |
| --- | --- |
| pub.chive.graph.getField | Get field details |
| pub.chive.graph.listFields | List fields (paginated) |
| pub.chive.graph.searchAuthorities | Search authority records |
| pub.chive.graph.getAuthority | Get authority record details |
| pub.chive.graph.browseFaceted | Faceted search |
| pub.chive.graph.getFieldPreprints | Preprints in a field |

Next steps