GROBID Documentation
Getting Started
New to GROBID? Start here to get up and running quickly.
- Quick start — install and launch GROBID in minutes
- Run with Docker — the easiest way to deploy GROBID
- Troubleshooting and FAQ — common issues and solutions
Upgrading
- Upgrade guide — what to know when moving between major GROBID versions
User Guide
Everything you need to use GROBID once it's running.
- Using the REST API — endpoints, parameters, and client libraries
- Understanding the output (TEI) — structure of the TEI XML results
- PDF coordinates — extracting bounding boxes for structures in the original PDF
- Configuration — tuning GROBID for your use case
- Consolidation service — linking extracted references to external metadata
- Specialized processes — patents, medical, and other domain-specific workflows
About
- Introduction — what GROBID is and what it does
- How GROBID works — architecture and processing pipeline
- Benchmarks — evaluation methodology and overview of results
- References — publications about GROBID
- Community — mailing list, Discord, and how to get involved
Developer Guide
Building, training, and extending GROBID.
- Build from source — set up a development environment
- Training and evaluating models — retrain or fine-tune GROBID models
- End-to-end evaluation — evaluate full pipeline performance
- Deep Learning models — using DL models instead of the default CRF
- Developer notes — internal conventions and tips for contributors
- Recompiling CRF libraries — rebuilding native CRF dependencies
Annotation Guidelines
Guidelines for annotating training data.
Benchmarking
Detailed evaluation results on specific datasets.
Archive
Deprecated features kept for reference.