Methodology
How Science AI Journal performs rigorous peer review in under 15 minutes — training data, agent calibration, and the limits of what we claim.
The 5-step pipeline
1. Submission & intake
Authors upload a manuscript (PDF or text). The intake service extracts title, abstract, body, figures, and references; runs OCR if the PDF is scanned; and normalises citation syntax.
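A minimal sketch of this step, assuming the PDF has an embedded text layer and using pypdf for extraction; the production intake service, its OCR path, and its citation normaliser are not shown:

```python
# Illustrative intake sketch: pull the embedded text layer out of an uploaded
# PDF and take a first guess at the title. Library choice (pypdf) and the
# splitting heuristic are assumptions, not the production intake service.
from pypdf import PdfReader

def extract_manuscript(path: str) -> dict:
    reader = PdfReader(path)
    pages = [page.extract_text() or "" for page in reader.pages]
    full_text = "\n".join(pages)
    # A near-empty text layer usually means a scanned PDF, which would be
    # routed to the OCR pass instead (not shown here).
    if len(full_text.strip()) < 100:
        raise ValueError("no usable text layer; route to OCR")
    lines = [line for line in full_text.splitlines() if line.strip()]
    return {"title": lines[0], "body": full_text}
```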
2. RAG context retrieval
For each of the 8 specialist agents, 8–40 peer-review examples are retrieved from a SQLite FTS5 index of 23,000+ real reviews harvested from 15+ academic platforms (OpenReview, eLife, SciPost, PLOS ONE, BMJ Open, Nature Communications, and others). The retrieval is query-aware: each agent pulls the examples most similar to the manuscript's domain and structure.
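A sketch of what that retrieval can look like against an FTS5 table, assuming a reviews table with agent and review_text columns (the real schema and ranking may differ):

```python
# Query-aware retrieval sketch: fetch the top-k stored reviews that best match
# the manuscript for one agent. Table and column names are assumptions.
import sqlite3

def retrieve_examples(db_path: str, agent: str, query: str, k: int = 20) -> list[str]:
    conn = sqlite3.connect(db_path)
    # `query` should be a bag of keywords; raw abstract text may need escaping
    # to satisfy FTS5 query syntax. bm25() ranks the best matches first.
    rows = conn.execute(
        """
        SELECT review_text
        FROM reviews
        WHERE reviews MATCH ? AND agent = ?
        ORDER BY bm25(reviews)
        LIMIT ?
        """,
        (query, agent, k),
    ).fetchall()
    conn.close()
    return [text for (text,) in rows]
```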
3. Parallel agent review
The 8 agents — methodology, formulas, originality, literature coverage, reproducibility, clarity, figures, and prior publication — run against the manuscript. Each produces a score, a qualitative summary, and a structured report. Two use Claude Sonnet (methodology, literature); the rest use Claude Haiku for cost efficiency.
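The fan-out across agents can be pictured with the Anthropic SDK's async client; the agent names, prompts, and model identifiers below are placeholders, and the real agents also receive the review examples retrieved in step 2:

```python
# Run the specialist agents concurrently. Model IDs and prompts are illustrative.
import asyncio
import anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

AGENT_MODELS = {
    "methodology": "claude-3-5-sonnet-20241022",  # the two Sonnet-class agents
    "literature": "claude-3-5-sonnet-20241022",
    "clarity": "claude-3-5-haiku-20241022",       # remaining agents on Haiku-class models
    # ... five more Haiku-backed agents omitted for brevity
}

async def run_agent(name: str, model: str, manuscript: str) -> dict:
    message = await client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"Act as the {name} reviewer and return a score, summary, and report:\n\n{manuscript}",
        }],
    )
    return {"agent": name, "report": message.content[0].text}

async def review(manuscript: str) -> list[dict]:
    return await asyncio.gather(
        *(run_agent(name, model, manuscript) for name, model in AGENT_MODELS.items())
    )
```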
4. Prior-publication fan-out
The originality and prior-publication agents fan out in parallel to CrossRef, arXiv, medRxiv, bioRxiv, Unpaywall, and our institutional library of 900,000+ papers, with a 12-second timeout per source. Matches are surfaced with confidence scores so editors can adjudicate.
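A sketch of the fan-out pattern, querying two of the public sources (CrossRef and arXiv) with the per-source timeout; the remaining sources, response parsing, and confidence scoring are elided:

```python
# Parallel prior-publication lookups with a hard 12-second timeout per source.
import asyncio
import httpx

SOURCES = {
    "crossref": ("https://api.crossref.org/works", lambda t: {"query.title": t, "rows": 5}),
    "arxiv": ("http://export.arxiv.org/api/query", lambda t: {"search_query": f'ti:"{t}"', "max_results": 5}),
}

async def query_source(client: httpx.AsyncClient, name: str, url: str, params: dict) -> dict:
    try:
        resp = await client.get(url, params=params, timeout=12.0)
        return {"source": name, "status": resp.status_code, "raw": resp.text}
    except httpx.TimeoutException:
        return {"source": name, "status": "timeout", "raw": None}

async def fan_out(title: str) -> list[dict]:
    async with httpx.AsyncClient() as client:
        tasks = [
            query_source(client, name, url, build_params(title))
            for name, (url, build_params) in SOURCES.items()
        ]
        return await asyncio.gather(*tasks)
```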
5. Synthesis & editorial decision
A synthesis pass reconciles the eight agent reports into a single recommendation (accept, minor revision, major revision, reject). A human editor reviews borderline cases; the review report is published alongside the manuscript if accepted, so readers can judge the quality of the review itself.
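As a purely illustrative sketch of the reconciliation step (the weights and cut-offs below are hypothetical, and the production pass also weighs the qualitative reports):

```python
# Hypothetical score reconciliation; the real synthesis is richer and editor-reviewed.
def synthesise(scores: dict[str, float]) -> str:
    average = sum(scores.values()) / len(scores)  # assuming a 0-10 scale per agent
    weakest = min(scores.values())                # one very weak dimension blocks acceptance
    if average >= 8 and weakest >= 6:
        return "accept"
    if average >= 7:
        return "minor revision"
    if average >= 5:
        return "major revision"
    return "reject"
```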
Training data provenance
Every agent is calibrated against real peer reviews collected from publicly accessible sources — not synthetic data and not proprietary publisher corpora. The training set totals 23,000+ reviews across 15+ platforms:
- OpenReview (ICLR, NeurIPS, ICML)
- eLife
- SciPost
- PLOS ONE
- BMJ Open
- Nature Communications
- F1000 Research
- Peer Community In
- EGUsphere
- Copernicus (open peer review)
- PeerJ
- MDPI Reviewer Reports
- Crossref Public Review Data
- bioRxiv TRiP project
- Review Commons
Aggregated per-agent JSONL files live under training-data/by-agent/ and back the FTS5 retrieval index used during review.
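A sketch of how those per-agent files could be loaded into the FTS5 index used in step 2; the JSONL field names and table schema are assumptions:

```python
# Build the retrieval index from per-agent JSONL files. Field names are assumed.
import json
import sqlite3
from pathlib import Path

def build_index(jsonl_dir: str, db_path: str) -> None:
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS reviews USING fts5(agent, review_text)")
    for path in Path(jsonl_dir).glob("*.jsonl"):
        agent = path.stem  # e.g. a hypothetical methodology.jsonl -> "methodology"
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                record = json.loads(line)
                conn.execute(
                    "INSERT INTO reviews (agent, review_text) VALUES (?, ?)",
                    (agent, record["review"]),
                )
    conn.commit()
    conn.close()
```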
Agent specialisations
Eight specialist agents, each calibrated on a dedicated slice of the training corpus.
Methodology
Audits study design, statistical power, and analytical choices against field-specific rigour standards (CONSORT, STROBE, PRISMA).
Formulas & Equations
Verifies mathematical derivations, checks dimensional analysis, and flags algebraic errors.
Originality
Surfaces overlap with prior work across CrossRef, arXiv, medRxiv, bioRxiv, Unpaywall, and an institutional library of 900,000+ papers.
Literature Coverage
Evaluates citation completeness against OpenAlex's 250M+ scholarly works (a minimal lookup sketch follows this section).
Reproducibility
Inspects code availability, dataset accessibility, and sufficiency of methods detail for independent replication.
Clarity & Language
Assesses readability, structural flow, and adherence to scholarly writing norms.
Figures & Tables
Checks figure quality, caption completeness, and appropriateness of visual encodings.
Prior Publication
Fans out in parallel to six external sources to detect duplicate submissions and predatory overlap.
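For the literature-coverage check referenced above, one minimal way to test whether a cited work resolves in OpenAlex is its public works API; the matching heuristic here is illustrative, and the real agent does more than a title search:

```python
# Flag cited titles that do not resolve against OpenAlex's public works API.
import requests

def resolves_in_openalex(title: str) -> bool:
    resp = requests.get(
        "https://api.openalex.org/works",
        params={"search": title, "per-page": 1},
        timeout=10,
    )
    resp.raise_for_status()
    return bool(resp.json().get("results"))

def unresolved_references(cited_titles: list[str]) -> list[str]:
    return [title for title in cited_titles if not resolves_in_openalex(title)]
```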
What we don't claim
AI peer review is not a replacement for domain-expert human review in high-stakes settings (clinical trials, safety-critical systems, paradigm-shift claims). We're transparent about the limits:
- Agents can miss subtle methodological flaws that require cutting-edge domain knowledge.
- Prior-publication detection is strongest on exact or near-verbatim overlap and weaker for paraphrased or translated duplicates.
- Originality scoring depends on index coverage — niche non-English work may be under-indexed.
- The synthesis recommendation is a starting point; a human editor makes the final call for borderline cases.