Web Harvesting

This page lists documents related to GPO's Web Harvesting pilot project to capture official Environmental Protection Agency (EPA) publications that fall within the scope of GPO's information dissemination programs.


Publications from the sample pilot are available here and will later be cataloged in the Catalog of U.S. Government Publications (CGP).


Statement of Work (SOW) for providing products and services related to the discovery, harvesting, and assessment of documents and publications from Web sites, using Web crawler and other appropriate technologies (to be specified by the vendor).
Reports on the specific context of the pilot's results, including a summary of the analysis of the work performed, an assessment of lessons learned, and the planned direction and next steps for further developing the harvesting function. That function is to be implemented during Release 2 of GPO's Future Digital System (FDsys), currently scheduled for mid-2008.