A Read Only RESTful API for DSpace

This project forks, refactors and backports the original DSpace REST API:

adding integration tests, hard limit and fetch groups;
removing write support; and
improving pagination.

For the most part, features have been fixed or cleanly removed.

Please note that allowing use of some of the end points retained from the original design may be unwise.

End Points

Covered Well By Integration Tests

	GET	PUT	POST	DELETE
items	✔	✘	✘	✘
bitstream	✔	✘	✘	✘
collections	✔	✘	✘	✘
communities	✔	✘	✘	✘
search	✔	✘	✘	✘
harvest	✔	✘	✘	✘

Light Or No Integration Tests

	GET	PUT	POST	DELETE
users	✓	✘	✘	✘
groups	✓	✘	✘	✘
stats	✓	✘	✘	✘

Running Integration And Regression Tests

PostgreSQL only.

The first time, run create_integration_test_db.sh.

Then run mvn -DskipTests=false

Which DSpace Version?

A fork of DSpace 1.5.2

In particular, add the following method to ItemIterator to support pagination:

// Discard the next row, if there is one.
public void skip() throws SQLException {
    if (itemRows.hasNext()) {
        itemRows.next();
    }
}
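
A minimal sketch of how a skip-style method could support pagination. It assumes a plain java.util.Iterator as a stand-in for ItemIterator; the names PageSketch and page are hypothetical and are not part of DSpace or of this module.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration only.
final class PageSketch {

    // Returns the rows on the given 1-based page, discarding the rows
    // that belong to earlier pages instead of collecting them.
    static <T> List<T> page(Iterator<T> rows, int page, int perPage) {
        if (page < 1 || perPage < 1) {
            throw new IllegalArgumentException("page and perPage must be >= 1");
        }
        long toSkip = (long) (page - 1) * perPage;
        for (long i = 0; i < toSkip && rows.hasNext(); i++) {
            rows.next(); // stand-in for a call to ItemIterator.skip()
        }
        List<T> result = new ArrayList<>();
        while (result.size() < perPage && rows.hasNext()) {
            result.add(rows.next());
        }
        return result;
    }
}
```

A page past the end of the data yields an empty list rather than an error; whether the module itself behaves that way is not stated here.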

Fetch Groups

Problem — rich data slow to produce and consume
Cause — no fine control over richness of data
Solution — fetch groups
Implementation — optional fetch parameter

Supported Fetch Groups

	light	display	example
items	✔	✔	`/items/5.json?fetch=display`
search [items]	✔	✔	`/search.json?query=search.resourcetype:2&fetch=light`
communities	✔	✘	`/communities/25.json?fetch=light`
collections/x/items	✔	✘	`/collections/1/items.json?fetch=light`
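
From a client's point of view, the optional fetch parameter might be appended as in the sketch below. The group names come from the table above; the class FetchUrlSketch and the method withFetch are hypothetical, and treating a missing group as "no parameter" is an assumption.

```java
import java.util.Locale;

// Hypothetical client-side helper for illustration only.
final class FetchUrlSketch {

    // Fetch group names as listed in the table of supported fetch groups.
    enum FetchGroup { LIGHT, DISPLAY }

    // Appends fetch=<group> to the path; a null group leaves the path unchanged.
    static String withFetch(String path, FetchGroup group) {
        if (group == null) {
            return path;
        }
        String separator = path.contains("?") ? "&" : "?";
        return path + separator + "fetch=" + group.name().toLowerCase(Locale.ROOT);
    }
}
```

For example, withFetch("/items/5.json", FetchGroup.DISPLAY) produces the first example URL in the table.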

Pagination

Problem — too much data using too many resources to produce and consume
Cause — inefficient and absent pagination code
Solution — push pagination into data access and paginate more end points

Supported Pagination

	page	perpage	sort	example
search [items]	✔	✔	✘	`/search.json?query=search.resourcetype:2&_page=2&_perpage=20`
collections/x/items	✔	✓	✘	`/collections/1/items.json?_page=2`
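
A client might build paginated request paths along the lines of the sketch below. The parameter names _page and _perpage come from the examples above; the class PageUrlSketch, the method withPage, and the convention that a non-positive perPage omits _perpage are assumptions made for illustration.

```java
// Hypothetical client-side helper for illustration only.
final class PageUrlSketch {

    // Appends _page, and _perpage when perPage is positive, to the path.
    static String withPage(String path, int page, int perPage) {
        StringBuilder url = new StringBuilder(path);
        url.append(path.contains("?") ? '&' : '?');
        url.append("_page=").append(page);
        if (perPage > 0) {
            url.append("&_perpage=").append(perPage);
        }
        return url.toString();
    }
}
```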

Hard Limit For Items

The maximum number of items that can be rendered to JSON is hard-coded to 10000.
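
A guard of this kind could look like the sketch below. Only the value 10000 comes from this document; the class HardLimitSketch, the method checkWithinLimit, and the choice to throw an exception rather than truncate the result are assumptions.

```java
// Hypothetical guard for illustration only.
final class HardLimitSketch {

    // Limit quoted in this document.
    static final int MAX_ITEMS = 10000;

    // Rejects requests that would render more items than the limit allows.
    // Throwing here is an assumption; the module may react differently.
    static void checkWithinLimit(int itemCount) {
        if (itemCount > MAX_ITEMS) {
            throw new IllegalStateException(
                "Refusing to render " + itemCount + " items; limit is " + MAX_ITEMS);
        }
    }
}
```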

Known Limitations

Binder does not stream. Given adequate memory for the required concurrent volume, this shouldn't be an issue. Switching to a streaming binder probably requires replacing Sakai.
Entity bus obscures design. Dropping Sakai would allow simplification.
Refactoring incomplete. Still difficult to work with the code base.
Each new fetch group requires at least one new class. Moving away from Sakai would allow more flexible binding.
Sorting needs to be supported in DSpace. This would mean either substantial changes to core DSpace or switching to a more flexible data access standard (for example JPA).
Fetch group and pagination added on an ad hoc basis. Code will need to be added for unsupported areas.
No support for writing. Those needing write support might route GET here and PUT, DELETE and POST to the original code. Note that when reading and writing are mixed in the same security domain, care MUST be taken to limit vulnerability to scripts uploaded as repository content.
Allowing access to some end points may not be sensible in production. Given a reasonable volume of repository data, allowing (unpaginated) access may effectively deny service. Those willing to break compatibility should consider preventing unsafe calls.
Areas of the original specification are not particularly RESTful. Those willing to break compatibility should consider adding RESTful linking and pagination.
Exception handling has been improved but more work remains.
Some regression tests rely on database order, and may be fragile.
Repackaging incomplete. This is a logical fork, and packaging should reflect this.
Build script not yet updated. This is a logical fork, and Maven packaging should reflect this.
This module builds against a fork of DSpace 1.5.2.
- A multi-module project would allow support for multiple versions.
- Some changes need to be fed back into core or a public fork created.
The hard limit:
- is applied only to items, and
- is not configurable.
Cache header support poor. Ideally DSpace would expose a feed allowing upstream caches to be invalidated when data changes.
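
For those needing write support, the routing suggested above (GET to this module, PUT, DELETE and POST to the original code) could be sketched as a decision on the HTTP method. The names MethodRouterSketch, Backend and route are hypothetical, and rejecting any other method is an assumption; a real deployment would more likely express this in a reverse proxy or servlet mapping.

```java
import java.util.Locale;

// Hypothetical routing decision for illustration only.
final class MethodRouterSketch {

    enum Backend { READ_ONLY_API, ORIGINAL_API }

    // Chooses a backend from the HTTP method alone.
    static Backend route(String httpMethod) {
        switch (httpMethod.toUpperCase(Locale.ROOT)) {
            case "GET":
                return Backend.READ_ONLY_API;
            case "PUT":
            case "POST":
            case "DELETE":
                return Backend.ORIGINAL_API;
            default:
                // Assumption: methods not mentioned in this document are refused.
                throw new IllegalArgumentException("Unsupported method: " + httpMethod);
        }
    }
}
```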
