This project forks, refactors and backports the original DSpace REST API by:
- adding integration tests, hard limit and fetch groups;
- removing write support; and
- improving pagination.
For the most part, features have been fixed or cleanly removed.
Please note that allowing use of some of the end points retained from the original design may be unwise.

| | GET | PUT | POST | DELETE |
|---|---|---|---|---|
| items | ✔ | ✘ | ✘ | ✘ |
| bitstream | ✔ | ✘ | ✘ | ✘ |
| collections | ✔ | ✘ | ✘ | ✘ |
| communities | ✔ | ✘ | ✘ | ✘ |
| search | ✔ | ✘ | ✘ | ✘ |
| harvest | ✔ | ✘ | ✘ | ✘ |

| | GET | PUT | POST | DELETE |
|---|---|---|---|---|
| users | ✓ | ✘ | ✘ | ✘ |
| groups | ✓ | ✘ | ✘ | ✘ |
| stats | ✓ | ✘ | ✘ | ✘ |

PostgreSQL only.
First time, run `create_integration_test_db.sh`.
Then `mvn -DskipTests=false`
A fork of DSpace 1.5.2.
In particular, add the following to `ItemIterator` (to support pagination):

    public void skip() throws SQLException {
        // Discard the next row, if there is one, without returning an item.
        if (itemRows.hasNext()) {
            itemRows.next();
        }
    }

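
How `skip()` might serve pagination can be sketched as follows. This is a minimal illustration, not the fork's actual code: it uses a plain `java.util.Iterator` as a stand-in for `ItemIterator`, and the `SkipPaging` class, its `page` method and the 1-based page numbering are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of DSpace: pages through any iterator by
// discarding rows, much as a caller might use ItemIterator.skip().
class SkipPaging {

    // Same guard as the skip() method above: advance once only if a row remains.
    static <T> void skip(Iterator<T> rows) {
        if (rows.hasNext()) {
            rows.next();
        }
    }

    // Returns the rows on the given page (pages are assumed to start at 1).
    static <T> List<T> page(Iterator<T> rows, int page, int perPage) {
        if (page < 1 || perPage < 1) {
            throw new IllegalArgumentException("page and perPage must be at least 1");
        }
        // Discard everything that belongs to earlier pages.
        long toSkip = (long) (page - 1) * perPage;
        for (long i = 0; i < toSkip && rows.hasNext(); i++) {
            skip(rows);
        }
        // Collect at most perPage rows for the requested page.
        List<T> result = new ArrayList<T>();
        while (result.size() < perPage && rows.hasNext()) {
            result.add(rows.next());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(ids.iterator(), 2, 3)); // prints [4, 5, 6]
        System.out.println(page(ids.iterator(), 3, 3)); // prints [7]
    }
}
```

Skipping rows this way avoids building objects for rows that are never rendered, although the underlying query still has to produce them.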
- Problem: rich data slow to produce and consume
- Cause: no fine control over richness of data
- Solution: fetch groups
- Implementation: optional `fetch` parameter

| | light | display | example |
|---|---|---|---|
| items | ✔ | ✔ | `/items/5.json?fetch=display` |
| search [items] | ✔ | ✔ | `/search.json?query=search.resourcetype:2&fetch=light` |
| communities | ✔ | ✘ | `/communities/25.json?fetch=light` |
| collections/x/items | ✔ | ✘ | `/collections/1/items.json?fetch=light` |

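
One way to validate the optional `fetch` parameter against the support matrix above is sketched below. The `FetchGroups` class, the end point keys and the choice to reject unsupported combinations are assumptions for illustration only. The source does not say how this fork handles an absent or unsupported value, so `null` stands in here for "parameter not supplied".

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: maps the fetch parameter to a fetch group.
class FetchGroups {

    enum FetchGroup { LIGHT, DISPLAY }

    // Support matrix copied from the table above; the keys are illustrative labels.
    static final Map<String, Set<FetchGroup>> SUPPORTED =
            new HashMap<String, Set<FetchGroup>>();
    static {
        SUPPORTED.put("items", EnumSet.of(FetchGroup.LIGHT, FetchGroup.DISPLAY));
        SUPPORTED.put("search", EnumSet.of(FetchGroup.LIGHT, FetchGroup.DISPLAY));
        SUPPORTED.put("communities", EnumSet.of(FetchGroup.LIGHT));
        SUPPORTED.put("collections/x/items", EnumSet.of(FetchGroup.LIGHT));
    }

    // Returns the requested group, or null when the optional parameter is absent.
    // Rejecting unknown or unsupported values is an assumption of this sketch.
    static FetchGroup parse(String endPoint, String fetch) {
        if (fetch == null || fetch.isEmpty()) {
            return null;
        }
        FetchGroup group;
        try {
            group = FetchGroup.valueOf(fetch.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown fetch group: " + fetch);
        }
        Set<FetchGroup> supported = SUPPORTED.get(endPoint);
        if (supported == null || !supported.contains(group)) {
            throw new IllegalArgumentException(
                    "Fetch group " + fetch + " is not supported for " + endPoint);
        }
        return group;
    }

    public static void main(String[] args) {
        System.out.println(parse("items", "display"));   // prints DISPLAY
        System.out.println(parse("communities", null));  // prints null
    }
}
```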
- Problem: too much data using too many resources to produce and consume
- Cause: inefficient and absent pagination code
- Solution: push pagination into data access and paginate more end points

| | page | perpage | sort | example |
|---|---|---|---|---|
| search [items] | ✔ | ✔ | ✘ | `/search.json?query=search.resourcetype:2&_page=2&_perpage=20` |
| collections/x/items | ✔ | ✓ | ✘ | `/collections/1/items.json?_page=2` |

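
As a rough illustration of pushing pagination into data access, the sketch below turns `_page` and `_perpage` values into a PostgreSQL `LIMIT`/`OFFSET` clause. It is not necessarily how this fork paginates. The `PageRequest` class, the 1-based page numbering, the caller-supplied default page size and the example query are all assumptions.

```java
// Hypothetical sketch: converts _page and _perpage into LIMIT/OFFSET.
class PageRequest {

    final int page;     // assumed to be 1-based, taken from _page
    final int perPage;  // taken from _perpage

    PageRequest(int page, int perPage) {
        if (page < 1 || perPage < 1) {
            throw new IllegalArgumentException("page and perPage must be at least 1");
        }
        this.page = page;
        this.perPage = perPage;
    }

    // Parses raw query parameter values; absent values fall back to page 1
    // and to a default page size chosen by the caller.
    static PageRequest fromParams(String page, String perPage, int defaultPerPage) {
        int p = (page == null) ? 1 : Integer.parseInt(page);
        int n = (perPage == null) ? defaultPerPage : Integer.parseInt(perPage);
        return new PageRequest(p, n);
    }

    // Number of rows that belong to earlier pages.
    long offset() {
        return (long) (page - 1) * perPage;
    }

    // Appends the clause to a query; both values are integers, so nothing
    // user-supplied is concatenated as text.
    String applyTo(String baseQuery) {
        return baseQuery + " LIMIT " + perPage + " OFFSET " + offset();
    }

    public static void main(String[] args) {
        PageRequest request = fromParams("2", "20", 10);
        // prints SELECT item_id FROM item ORDER BY item_id LIMIT 20 OFFSET 20
        System.out.println(request.applyTo("SELECT item_id FROM item ORDER BY item_id"));
    }
}
```

An explicit `ORDER BY` matters in a query like this one, because without it the database gives no guarantee that consecutive pages are consistent.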
A hard-coded limit (10000) caps the number of items that can be rendered to JSON.
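
A cap of this kind could be enforced with a small iterator wrapper, sketched below. The `LimitedIterator` class is hypothetical, and whether the fork truncates silently or reports an error when the limit is reached is not stated in the source. This sketch simply stops.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: stops handing out elements once a fixed cap is reached.
class LimitedIterator<T> implements Iterator<T> {

    // The value quoted in the text above.
    static final int HARD_LIMIT = 10000;

    private final Iterator<T> source;
    private final int limit;
    private int served = 0;

    LimitedIterator(Iterator<T> source, int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must not be negative");
        }
        this.source = source;
        this.limit = limit;
    }

    public boolean hasNext() {
        return served < limit && source.hasNext();
    }

    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        served++;
        return source.next();
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) {
        Iterator<Integer> capped =
                new LimitedIterator<Integer>(Arrays.asList(1, 2, 3, 4, 5).iterator(), 3);
        while (capped.hasNext()) {
            System.out.println(capped.next()); // prints 1, 2 and 3 on separate lines
        }
    }
}
```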
- Binder does not stream. Given adequate memory for the required concurrent volume, this shouldn't be an issue. Switching to a streaming binder probably requires replacing Sakai.
- Entity bus obscures design. Dropping Sakai would allow simplification.
- Refactoring incomplete. Still difficult to work with the code base.
- Each new fetch group requires at least one new class. Moving away from Sakai would allow more flexible binding.
- Sorting needs to be supported in DSpace. This would mean either substantial changes to core DSpace or switching to a more flexible data access standard (for example JPA).
- Fetch group and pagination added on an ad hoc basis. Code will need to be added for unsupported areas.
- No support for writing. Those needing write support might route `GET` here and `PUT`, `DELETE` and `POST` to the original code. Note that when reading and writing are mixed in the same security domain, care MUST be taken to limit vulnerability to scripts uploaded as repository content.
- Allowing access to some end points may not be sensible in production. Given a reasonable volume of repository data, allowing (unpaginated) access may effectively deny service. Those willing to break compatibility should consider preventing unsafe calls.
- Areas of the original specification are not particularly RESTful. Those willing to break compatibility should consider adding RESTful linking and pagination.
- Exception handling has been improved but more work remains.
- Some regression tests rely on database order, and may be fragile.
- Repackaging incomplete. This is a logical fork, and packaging should reflect this.
- Build script not yet updated. This is a logical fork, and Maven packaging should reflect this.
- This module builds against a fork of DSpace `1.5.2`.
- A multi-module project would allow support for multiple versions.
- Some changes need to be fed back into core or a public fork created.
- The hard limit:
  - is applied only to items, and
  - is not configurable.
- Cache header support poor. Ideally DSpace would expose a feed allowing upstream caches to be invalidated when data changes.
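
The routing suggested in the write-support note above (`GET` to this fork, `PUT`, `DELETE` and `POST` to the original code) might be decided by a function like the one sketched here. The `MethodRouter` class and the `Backend` names are hypothetical, and rejecting any other method is an assumption. In practice the decision would more likely live in a front-end proxy or servlet filter configuration.

```java
import java.util.Locale;

// Hypothetical sketch: picks a backend from the HTTP method alone.
class MethodRouter {

    enum Backend { READ_ONLY_FORK, ORIGINAL_API }

    static Backend route(String httpMethod) {
        String method = httpMethod.toUpperCase(Locale.ROOT);
        if (method.equals("GET")) {
            return Backend.READ_ONLY_FORK;   // reads are served by this fork
        }
        if (method.equals("PUT") || method.equals("POST") || method.equals("DELETE")) {
            return Backend.ORIGINAL_API;     // writes go to the original code
        }
        // Other methods are not discussed in the text, so this sketch rejects them.
        throw new IllegalArgumentException("Unsupported method: " + httpMethod);
    }

    public static void main(String[] args) {
        System.out.println(route("GET"));    // prints READ_ONLY_FORK
        System.out.println(route("delete")); // prints ORIGINAL_API
    }
}
```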