# A Read Only RESTful API for DSpace

This project forks, refactors and backports the original DSpace REST API.

For the most part, features have been fixed or cleanly removed.

Please note that allowing use of some of the end points retained from the original design may be unwise.

## End Points

### Covered Well By Integration Tests

- items
- bitstream
- collections
- communities
- search
- harvest

### Light Or No Integration Tests

- users
- groups
- stats

PostgreSQL only.

The first time, run `create_integration_test_db.sh`.

Then run `mvn -DskipTests=false`.

## A fork of DSpace 1.5.2

In particular, the following is added to `ItemIterator` to support pagination:

```java
public void skip() throws SQLException {
    if (itemRows.hasNext())
    {
        itemRows.next();
    }
}
```
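As a rough sketch of how `skip()` might be used to honour page parameters, the helper below skips to a requested page and collects at most one page of items. The `Paginator` class and its `page` method are illustrative only; they are not part of DSpace 1.5.2 or this module.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import org.dspace.content.Item;
import org.dspace.content.ItemIterator;

// Illustrative helper, not part of DSpace 1.5.2 or this module.
public class Paginator {

    // Skip to the requested (1-based) page and collect at most perPage items.
    public static List<Item> page(ItemIterator items, int page, int perPage)
            throws SQLException {
        // Discard rows belonging to earlier pages without materialising them as Items.
        for (int i = 0; i < (page - 1) * perPage && items.hasNext(); i++) {
            items.skip();
        }

        List<Item> result = new ArrayList<Item>();
        while (items.hasNext() && result.size() < perPage) {
            result.add(items.next());
        }
        return result;
    }
}
```

The end point code can then render only the requested page rather than iterating the whole result set.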

- Problem: rich data slow to produce and consume
- Cause: no fine control over richness of data
- Solution: fetch groups
- Implementation: optional `fetch` parameter
### Supported Fetch Groups

The supported fetch groups are `light` and `display`.

| End Point | Example |
| --- | --- |
| items | `/items/5.json?fetch=display` |
| search [items] | `/search.json?query=search.resourcetype:2&fetch=light` |
| communities | `/communities/25.json?fetch=light` |
| collections/x/items | `/collections/1/items.json?fetch=light` |
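As a sketch of how a client might consume a fetch group, the snippet below requests the `light` representation of an item over plain HTTP. The base URL, context path and item id are placeholders, not values defined by this module.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FetchGroupClient {
    public static void main(String[] args) throws Exception {
        // Placeholder base URL and item id; adjust for your deployment.
        URL url = new URL("http://localhost:8080/dspace-rest/items/5.json?fetch=light");

        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");

        // Read the JSON response into a string.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"));
        StringBuilder body = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            body.append(line);
        }
        reader.close();

        // The "light" fetch group returns a smaller representation than the default.
        System.out.println(body);
    }
}
```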
- Problem: too much data using too many resources to produce and consume
- Cause: inefficient and absent pagination code
- Solution: push pagination into data access and paginate more end points
### Supported Pagination

Pagination is controlled with the `_page` and `_perpage` parameters; sorting is not yet supported (see Known Limitations).

| End Point | Example |
| --- | --- |
| search [items] | `/search.json?query=search.resourcetype:2&_page=2&_perpage=20` |
| collections/x/items | `/collections/1/items.json?_page=2` |
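Similarly, a client might walk search results by incrementing `_page`, as in the sketch below. The base URL and the fixed page count are placeholders.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class PaginationClient {
    public static void main(String[] args) throws Exception {
        // Placeholder base URL; adjust for your deployment.
        String base = "http://localhost:8080/dspace-rest/search.json"
                + "?query=search.resourcetype:2";

        // Fetch the first five pages, twenty results at a time.
        for (int page = 1; page <= 5; page++) {
            URL url = new URL(base + "&_page=" + page + "&_perpage=20");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");

            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), "UTF-8"));
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
            reader.close();

            System.out.println("Page " + page + ": " + body.length() + " characters");
        }
    }
}
```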

There is a hard coded limit (10000) on the maximum number of items that can be rendered to JSON.

## Known Limitations

- The binder does not stream. Given adequate memory for the required concurrent volume, this should not be an issue. Switching to a streaming binder would probably require replacing Sakai.
- The entity bus obscures the design. Dropping Sakai would allow simplification.
- Refactoring is incomplete. The code base is still difficult to work with.
- Each new fetch group requires at least one new class. Moving away from Sakai would allow more flexible binding.
- Sorting needs to be supported in DSpace. This would mean either substantial changes to core DSpace or a switch to a more flexible data access standard (for example, JPA).
- Fetch groups and pagination have been added on an ad hoc basis. Code will need to be added for areas not yet supported.
- There is no support for writing. Those needing write support might route GET here and PUT, DELETE and POST to the original code. Note that when reading and writing are mixed in the same security domain, care MUST be taken to limit vulnerability to scripts uploaded as repository content.
- Allowing access to some end points may not be sensible in production. Given a reasonable volume of repository data, allowing (unpaginated) access may effectively deny service. Those willing to break compatibility should consider preventing unsafe calls.
- Areas of the original specification are not particularly RESTful. Those willing to break compatibility should consider adding RESTful linking and pagination.
- Exception handling has been improved, but more work remains.
- Some regression tests rely on database order and may be fragile.
- Repackaging is incomplete. This is a logical fork, and the packaging should reflect that.
- The build script has not yet been updated. This is a logical fork, and the Maven packaging should reflect that.
- This module builds against a fork of DSpace 1.5.2.
  - A multi-module project would allow support for multiple versions.
  - Some changes need to be fed back into core, or a public fork created.
- The hard limit:
  - is applied only to items, and
  - is not configurable.
- Cache header support is poor. Ideally DSpace would expose a feed allowing upstream caches to be invalidated when data changes.