diff --git a/app/views/pages/api.html.erb b/app/views/pages/api.html.erb
index 39a17859..ec817f7b 100644
--- a/app/views/pages/api.html.erb
+++ b/app/views/pages/api.html.erb
@@ -44,7 +44,7 @@

Harvesting full dumps and lists of deleted records

-The POD Data Lake uses ResourceSync, an extension of the Sitemaps Protocol, to expose aggregated data in three forms. The links below point to specifc ResourceSync resource lists that serve as the starting point for harvesting.
+The POD Data Lake uses ResourceSync, an extension of the Sitemaps Protocol, to expose aggregated data in three forms. The links below point to specific ResourceSync resource lists that serve as the starting point for harvesting.

Like uploads, harvesting data requires an access token. To harvest the data, start by fetching the appropriate sitemap, then retrieve the linked resource lists and resources. Currently, the POD Data Lake supports baseline synchronization; incremental synchronization (as defined by the ResourceSync specification) is not yet implemented. Please also note that you must parse and inspect the returned sitemaps to determine the URLs of the original or normalized data you wish to harvest.

Because harvesting with ResourceSync requires parsing the returned sitemaps to find additional links to follow, using a ResourceSync client such as resync is recommended.
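For readers who want to see what the sitemap walk involves before reaching for a client, here is a minimal Python sketch of a baseline harvest using only the standard library. It is illustrative and not the POD Data Lake's own tooling. The function names `make_fetcher` and `resource_urls` are hypothetical, and sending the access token as an HTTP `Authorization: Bearer` header is an assumption; check the API documentation for the actual authentication scheme.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterator

# Namespace shared by the Sitemaps Protocol and ResourceSync documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def make_fetcher(token: str) -> Callable[[str], bytes]:
    """Return a function that GETs a URL with the access token attached.

    Assumption: the token is accepted as an HTTP Bearer token.
    """
    def fetch(url: str) -> bytes:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            return response.read()

    return fetch


def resource_urls(start_url: str, fetch: Callable[[str], bytes]) -> Iterator[str]:
    """Yield every resource URL reachable from start_url (baseline sync only)."""
    root = ET.fromstring(fetch(start_url))
    if root.tag == SITEMAP_NS + "sitemapindex":
        # An index lists further resource lists; follow each one in turn.
        for loc in root.iterfind(f"{SITEMAP_NS}sitemap/{SITEMAP_NS}loc"):
            yield from resource_urls(loc.text.strip(), fetch)
    elif root.tag == SITEMAP_NS + "urlset":
        # A resource list names the resources themselves.
        for loc in root.iterfind(f"{SITEMAP_NS}url/{SITEMAP_NS}loc"):
            yield loc.text.strip()
```

A caller would pass one of the resource list links as `start_url` together with `make_fetcher(token)`, then download each yielded URL with the same fetcher. The fetcher is passed in as an argument so the traversal can be exercised against canned XML without network access.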

API documentation