RFE: Bibtex/HTTP reference for datasets #85

Open
hendriks73 opened this issue Nov 6, 2015 · 7 comments

@hendriks73
Contributor

Most datasets are the result of a paper and thus can be linked to (web page, pdf, DOI).
It would be nice if there were a dedicated field for something like a BibTeX entry, as well as a URI for the paper.

I assume that this can currently go into the sandbox field, but since it's such a regular thing, perhaps a little special something is in order.

The object to extend would be Annotation_Metadata.

@bmcfee bmcfee added the question label Nov 6, 2015
@bmcfee
Contributor

bmcfee commented Nov 6, 2015

I suspect this discussion will go down the same path as this thread.

DOI is a great idea, and IMO equivalently canonical to MusicBrainz IDs for the track data. Since we don't schematize MusicBrainz IDs, it's hard to justify schematizing DOIs. Even beyond that, it gets pretty dicey as to which things you schematize and which you don't: BibTeX? Web page? Etc.?

The sandbox approach will definitely work, and if it's consistently keyed, is easy enough to extract by the search() mechanism.
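
A minimal sketch of the consistently keyed sandbox idea, using plain dicts as stand-ins for annotations rather than the JAMS API itself. The key name `paper_uri`, the helper `find_with_sandbox_key`, and the example.org URL are all hypothetical; the real `search()` mechanism may behave differently.

```python
# Plain-dict stand-ins for annotations; "paper_uri" is a hypothetical key convention.
annotations = [
    {"namespace": "beat", "sandbox": {"paper_uri": "http://example.org/paper.pdf"}},
    {"namespace": "chord", "sandbox": {}},
]


def find_with_sandbox_key(anns, key):
    """Return the annotations whose sandbox contains the given key."""
    return [a for a in anns if key in a.get("sandbox", {})]


matches = find_with_sandbox_key(annotations, "paper_uri")
```

The lookup only works if every dataset uses the same key, which is the "consistently keyed" condition above.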

@hendriks73
Contributor Author

So, let's simplify the suggestion a little:

Add one field to annotation_metadata, called publication_uri.

This would allow linking to a PDF or to a DOI via http://dx.doi.org/THE_DOI. Actually, it would allow linking to any resource... be it relative, absolute or whatever.
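
As a rough illustration of what such a field could hold, assuming the `http://dx.doi.org/THE_DOI` pattern above: the helper `doi_to_uri`, the plain dict standing in for annotation_metadata, and the placeholder DOI are hypothetical and not part of the JAMS schema.

```python
def doi_to_uri(doi):
    """Build a resolver link following the http://dx.doi.org/THE_DOI pattern."""
    return "http://dx.doi.org/" + doi.strip()


# Hypothetical metadata dict; "10.0000/placeholder" is not a real DOI.
annotation_metadata = {
    "publication_uri": doi_to_uri(" 10.0000/placeholder "),
}
```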

One could also introduce URIs for songs. E.g. http://musicbrainz.org/recording/ea2cd833-2be9-4150-a48c-55bf9c3c69a2 serves as a great URI.

In the end, we'd arrive at something that is RESTful and could also be used as a basis for generating web-content... Just a thought.

@bmcfee
Contributor

bmcfee commented Nov 6, 2015

I like this idea a lot.

Would it make sense to simplify it to just uri (hi @urinieto !) ? Not all content is a publication, after all.

I can see this being applicable in a few places:

  • JAMS (for the jams object, if we're to be self-referential and have a canonical location)
  • FileMetadata (for the track itself)
  • Corpus (for the collection)
  • AnnotationMetadata (for individual annotations)
  • Curator?

The only downside I see is that one uri may not suffice. Maybe additional uris can live in the sandbox by convention?
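
One possible shape for the "single uri plus extras in the sandbox" convention, sketched with a plain dict. The sandbox key `additional_uris`, the helper `all_uris`, and the example.org mirror URL are assumptions for illustration only.

```python
def all_uris(obj, extra_key="additional_uris"):
    """Collect the primary uri plus any extras stored in the sandbox."""
    uris = []
    if obj.get("uri"):
        uris.append(obj["uri"])
    uris.extend(obj.get("sandbox", {}).get(extra_key, []))
    return uris


# Hypothetical file metadata record with one canonical uri and one extra.
file_metadata = {
    "uri": "http://musicbrainz.org/recording/ea2cd833-2be9-4150-a48c-55bf9c3c69a2",
    "sandbox": {"additional_uris": ["http://example.org/mirror"]},
}

uris = all_uris(file_metadata)
```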

What do folks think? @ejhumphrey @justinsalamon @rabitt @urinieto ?

Since it's a schema change, I'd suggest that if we do it, it should go into 0.3. (Even though it's backwards-compatible, I'd rather limit minor revisions to implementation stuff as much as possible.)

@hendriks73
Contributor Author

If we wanted to go all out HATEOAS, i.e. completely self-describing and self-discoverable, things would need to look a little different. JSON examples can be found at, e.g., spring.io. There, each href is also described by a relationship attribute. This would allow for characterizing the link as a DOI, another representation of the same data, ...

You may see this as overkill though.
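
A small sketch of the link-plus-relationship idea, assuming each link is a dict with `rel` and `href` entries. The relation names (`self`, `doi`), the example.org URL, and the helper `hrefs_for` are hypothetical and not taken from any existing schema.

```python
# Each link carries a relationship name describing what the href points to.
links = [
    {"rel": "self", "href": "http://example.org/jams/track01.jams"},
    {"rel": "doi", "href": "http://dx.doi.org/THE_DOI"},
]


def hrefs_for(link_list, rel):
    """Return every href whose relationship matches rel."""
    return [link["href"] for link in link_list if link["rel"] == rel]


doi_links = hrefs_for(links, "doi")
```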

@urinieto
Contributor

My (embarrassingly belated) two cents:
I think having the identifiers as a sandbox in the JAMS schema already allows for this kind of URI insertion.
We could add some sort of document to the official docs to try to normalize the keys of the identifiers dictionary (e.g., use musicbrainz for MusicBrainz IDs).

Writing the metadata for the new SPAM dataset I realized I didn't know how to name these IDs in the schema, so this document could be helpful.
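
A sketch of what normalizing identifier keys could look like, assuming a documented table of canonical names. Only `musicbrainz` comes from the suggestion above; the alias table `CANONICAL_KEYS` and the helper `normalize_identifiers` are hypothetical.

```python
# Hypothetical alias table mapping common spellings to one canonical key.
CANONICAL_KEYS = {
    "mbid": "musicbrainz",
    "musicbrainz_id": "musicbrainz",
    "musicbrainz": "musicbrainz",
}


def normalize_identifiers(identifiers):
    """Rename known aliases to their canonical key; leave unknown keys as-is."""
    normalized = {}
    for key, value in identifiers.items():
        canonical = CANONICAL_KEYS.get(key.strip().lower(), key)
        normalized[canonical] = value
    return normalized


cleaned = normalize_identifiers({"MBID": "ea2cd833-2be9-4150-a48c-55bf9c3c69a2"})
```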

@justinsalamon
Contributor

Since it's a schema change, I'd suggest that if we do it, it should go into 0.3. (Even though it's backwards-compatible, I'd rather limit minor revisions to implementation stuff as much as possible.)

👍

@bmcfee
Contributor

bmcfee commented Aug 12, 2019

Linking back to #197 discussion -- if we put a little thought into this for the next round of schema changes, maybe this idea can replace curators entirely? The world looks different now than it did in 2015, and DOIs for datasets are now pretty common and easy to do.


4 participants