GPO Posts All CGP Records as of June 29, 2021, on GitHub

  • Published: September 07, 2021

Library Services and Content Management (LSCM) has posted on GitHub all MARC bibliographic records (999,580 records) in the Catalog of U.S. Government Publications (CGP) as of June 29, 2021, with the exception of brief bibliographic records (also known as brief bibs). The records are available in UTF-8 in the cataloging-records-all-cgp-utf8 repository and in MARCXML in the cataloging-records-all-cgp-marcxml repository.

In combination with the monthly CGP files in the CGP_MARC_Records collection, these files essentially represent the entire CGP. GPO will periodically refresh the files with a new snapshot of the whole CGP.

The total size of the files is 1.45 GB (1,562,240,801 bytes). The records in each file are not organized in any particular way. The UTF-8 repository has 21 files, each of which holds approximately 48,000 records. The MARCXML repository has 67 files, each of which holds approximately 15,000 records.
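For readers who want to verify the record counts after downloading, the following is a minimal sketch of one way to count the records in a MARCXML file using only the Python standard library. It assumes the files use the standard MARC 21 "slim" XML namespace and are uncompressed XML. The function name and any file paths passed to it are illustrative and do not come from the repositories.

```python
import sys
import xml.etree.ElementTree as ET

# Standard MARCXML namespace (assumed to be the one used in the downloaded files).
MARC_NS = "http://www.loc.gov/MARC21/slim"
RECORD_TAG = f"{{{MARC_NS}}}record"


def count_marcxml_records(source):
    """Count <record> elements in a MARCXML file or file-like object.

    The file is parsed incrementally, so a large file does not need to
    be loaded into memory all at once.
    """
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == RECORD_TAG:
            count += 1
            elem.clear()  # discard the finished record's fields to limit memory use
    return count


if __name__ == "__main__":
    # Hypothetical usage: pass one or more downloaded MARCXML file paths.
    total = 0
    for path in sys.argv[1:]:
        n = count_marcxml_records(path)
        print(f"{path}: {n} records")
        total += n
    print(f"total: {total} records")
```

Run against all 67 MARCXML files, the total should match the 999,580 records stated above if the assumptions hold. Each file should report roughly 15,000 records.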

This is the first time GPO has made all CGP records available in a set of files for downloading, thus making it easier to repurpose and reuse our metadata to access U.S. Government publications.

For questions, please contact us via askGPO and select the category Cataloging/Metadata (Policy and Records).