You can simply refer to the GitHub project:
GROBID (2008-2016) https://github.com/kermitt2/grobid
Presentations on GROBID
GROBID in 30 slides (2015).
GROBID in 20 slides (2012).
Papers on GROBID
GROBID: Combining Automatic Bibliographic Data Recognition and Term Extraction for Scholarship Publications. P. Lopez. Proceedings of the 13th European Conference on Digital Library (ECDL), Corfu, Greece, 2009.
Automatic Extraction and Resolution of Bibliographical References in Patent Documents. P. Lopez. First Information Retrieval Facility Conference (IRFC), Vienna, May 2010. LNCS 6107, pp. 120-135. Springer, Heidelberg (2010).
Automatic Metadata Extraction: The High Energy Physics Use Case. Joseph Boyd. Master's Thesis, EPFL, Switzerland, 2015.
M. Lipinski, K. Yao, C. Breitinger, J. Beel, and B. Gipp, Evaluation of Header Metadata Extraction Approaches and Tools for Scientific PDF Documents, in Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), Indianapolis, IN, USA, 2013.
Phil Gooch and Kris Jack, How well does Mendeley’s Metadata Extraction Work?
Articles on CRF for bibliographical extraction
Accurate Information Extraction from Research Papers using Conditional Random Fields. Fuchun Peng and Andrew McCallum. Proceedings of Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT-NAACL), 2004.
Isaac G. Councill, C. Lee Giles, Min-Yen Kan. (2008) ParsCit: An open-source CRF reference string parsing package. In Proceedings of the Language Resources and Evaluation Conference (LREC), Marrakesh, Morocco.
Other similar Open Source tools
CiteSeerX page on Scholarly Information Extraction, which lists many tools and related information.