Update collection metadata #149

Open
ShanaLMoore opened this issue Oct 27, 2022 · 22 comments

@ShanaLMoore
Contributor

ShanaLMoore commented Oct 27, 2022

Make updates per spreadsheet changes. Re-verify that collection metadata matches the spreadsheet, since it has changed.

Note from client (Meredith):

Noting here that two changes were made on the spreadsheet. License was removed and a new URI was selected for the property for form (http://purl.org/dc/terms/format) due to an oversight on our main MAP for work types. These are in addition to Tuesday's changes of removing rights and adding note. I will not make any other changes after today.

Original ticket #126

@ShanaLMoore
Contributor Author

ShanaLMoore commented Oct 31, 2022

Q: Will there be consequences for removing Hyrax::BasicMetadata from the collection model? Both default hyrax and hyku include it.

@ShanaLMoore
Contributor Author

timebox removing hyrax metadata - 1hr

otherwise just update property uri and add note property

@DiemBTran
Contributor

Needs further review:

tested on:


  1. There are 25 display labels on a new collection form, while the Digital Collections: Vendor Supplied MAP (DC MAP) only has 22 properties. The 3 extra properties I found on the new collection form are:
    1. location
    2. identifier
    3. language
    4. These 3 should be removed from the new collection form
  2. There were 2 fields on the DC MAP that I did not find on the new collection form word-for-word but I thought could be “close enough” matches for each other:
    1. the form has rights notes but the DC MAP has note
    2. the form has related URL but the DC MAP has collection link
    3. These are interchangeable, so either the new collection form or the DC MAP should be updated to reflect that sameness

@ShanaLMoore
Contributor Author

ShanaLMoore commented Nov 2, 2022

Client changed collection_link to resource_link and wants to use notes instead of rights_notes to make it more generic.

@ShanaLMoore
Contributor Author

ShanaLMoore commented Nov 3, 2022

NOTE TO QA: cc @DiemBTran

resource_link will not be available as read only on the form, but the user should be able to set it via bulkrax: https://qa.utk-hyku-staging.notch8.cloud/dashboard/collections/678a5d13-64f9-4140-a68d-d2cdc011e29b/edit?locale=en

Updated sample file:

149-all-collection-metadata.csv

Image

@ShanaLMoore
Contributor Author

ShanaLMoore commented Nov 14, 2022

QA:

Clicking on Collections is causing a 500 error in staging. Looking into it!

@ShanaLMoore
Contributor Author

ShanaLMoore commented Nov 14, 2022

Pulling this ticket back to in dev to resolve this issue.

Note, this issue is present after importing a new metadata profile. However, Collection isn't controlled by allinson flex, so I'm not yet sure how it could have affected it.

Image

@ShanaLMoore
Contributor Author

ShanaLMoore commented Nov 15, 2022

As suspected, this was caused by an invalid metadata profile. However, I'm super surprised it affected the collection metadata. To correct this, the date fields needed multi_value: true
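A rough sketch of how a profile could be checked for this before import. The only detail taken from this thread is the `multi_value: true` setting; the hash shape, the property names (`date_created`, `date_issued`, `date_other`), and the helper `properties_missing_multi_value` are hypothetical and are not the allinson flex API.

```ruby
# Hypothetical helper: lists the named properties whose profile entry
# does not set multi_value to true. The profile is assumed to be a plain
# Hash (for example, loaded from YAML) with a "properties" key.
def properties_missing_multi_value(profile, names)
  properties = profile.fetch("properties", {})
  names.reject { |name| properties.dig(name, "multi_value") == true }
end

# Illustrative data only; these property names are assumptions.
profile = {
  "properties" => {
    "date_created" => { "multi_value" => true },
    "date_issued"  => { "multi_value" => false },
    "date_other"   => {}
  }
}

missing = properties_missing_multi_value(
  profile, %w[date_created date_issued date_other]
)
puts missing.inspect
```

Running a check like this against the date fields before importing a profile would flag the ones that still need the setting.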

This now passes QA. A user is able to create collections and save metadata per the client's requirement, manually and via bulkrax.

However, clicking on contributor causes a 500 error. This will be separated into its own ticket.

Image

@ShanaLMoore
Contributor Author

contributor link bug: #193

@mlhale7
Collaborator

mlhale7 commented Nov 22, 2022

@ShanaLMoore - I haven't been able to reproduce the contributor bug. I wanted to clarify - if we approve #149, issue #193 will still be kept open as it has been separated out as its own problem?

@mlhale7
Collaborator

mlhale7 commented Dec 6, 2022

@ShanaLMoore - thanks for this. Editing a collection in staging, one issue I noticed was that "Resource type" seems to be linked to a set of terms that we will not be using for Digital Collections (Article, Dataset, etc.). Expected values come from this vocabulary - https://id.loc.gov/vocabulary/resourceTypes.html (e.g. "Text", "Still image", etc.). I'm assuming this might be because the property we selected (http://purl.org/dc/terms/type) is used out of the box for IR purposes or something?

I hadn't realized that "Total Items" was a thing for every collection. We don't technically need an extent field then, but we can just keep it and not populate it.

@mlhale7
Collaborator

mlhale7 commented Dec 6, 2022

Title and abstract also aren't expected to be arrays or multiple values. Since the form allows it, students may accidentally add additional values that we don't need, but we can also live with this.

@ShanaLMoore
Contributor Author

ShanaLMoore commented Dec 14, 2022

Hi @mlhale7 I am so sorry I missed all of your comments until now!

Title and abstract are default hyrax metadata fields. When we redefine their data types, a bunch of fields in hyrax break. We can implement a validation to make sure there is only 1 element in the field; the form will refuse the submission if a user enters more. It should already be applied to title actually.
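To illustrate the kind of check meant here, a minimal sketch in plain Ruby rather than an actual hyrax validator. The helper `single_value_errors` and the constant `SINGLE_VALUE_FIELDS` are hypothetical names made up for this example; the fields stay arrays, and the check only counts non-blank values.

```ruby
# Hypothetical list of fields that should hold exactly one value.
SINGLE_VALUE_FIELDS = %i[title abstract].freeze

# Hypothetical helper, not part of hyrax or hyku: returns one error
# message per field that holds more than one non-blank value.
def single_value_errors(attributes, fields = SINGLE_VALUE_FIELDS)
  fields.each_with_object([]) do |field, errors|
    values = Array(attributes[field]).map(&:to_s).reject { |v| v.strip.empty? }
    errors << "#{field} accepts only one value (got #{values.size})" if values.size > 1
  end
end

errors = single_value_errors({ title: ["First", "Second"], abstract: ["Only one"] })
puts errors
```

A form could refuse the submission whenever the returned list is not empty, while the underlying data type of each field stays unchanged.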

I can also create a ticket to clean up the form later. For example, remove any "add" button if a field should only have one. I understand it's a bad user experience to act like they can add more when they really can't.

I need a moment to look into resource type. I believe by default, resource type is hooked into hyrax's vocab: https://github.com/scientist-softserv/utk-hyku/blob/main/config/authorities/resource_types.yml

If that's the case, could you provide an updated file similar to how you all did for license?

If it's meant to be a remote vocabulary, that work will be completed as part of the Questioning Authority epic #263. For now, since ingests are the priority, I'd just want to make sure that the remote uri saves correctly when importing with bulkrax. A reindex after that epic is complete should replace it with the proper label.

I can also go ahead and remove the extent field, if that's preferred vs keeping it around.

@ShanaLMoore
Contributor Author

ShanaLMoore commented Dec 14, 2022

> @ShanaLMoore - I haven't been able to reproduce the contributor bug. I wanted to clarify - if we approve #149, issue #193 will still be kept open as it has been separated out as it's own problem?

@mlhale7 Yes, that would be correct. Oftentimes when we find minor issues, we'll break them into their own tickets so that we can keep the majority of the work/feature moving forward. Bug tickets would remain open and be treated as separate issues.

@mlhale7
Collaborator

mlhale7 commented Dec 14, 2022

@ShanaLMoore - really the only critical problem here stopping approval would be the resource_type issue. The rest of the comments are "nice to haves." For resource_type for collections, we really just want to hard code in "Collection" as the dcterms:type - https://utk-mods-to-rdf.readthedocs.io/en/latest/contents/4_mapping.html#typeofresource-with-collection-yes. For individual records we'd want to use other values in LoC's resourceTypes vocab. I've attached a yaml with just the value expected for collections for resource_type: resource_type_collection.txt. The reason for this value is more for sharing elsewhere than display on Hyku. When we share collection records in Primo, we like to note that they are for collections rather than individual records.

@ShanaLMoore
Contributor Author

@mlhale7 I believe that most of your concerns will be resolved once we finish implementing the remote questioning authorities and update the form/ui portion. Instead of using the local authorities it will resolve the uri and save/display the label.

However, while investigating this ticket I discovered that the resource_type uri is not getting saved at all for work types or collections. So I will work on resolving that asap since that impacts ingests.

sample file:
heilman_full_with_collections_short.csv

Note that resource_type in parsed_metadata is [] when it should be the uri from the spreadsheet:

image

@ShanaLMoore
Contributor Author

@mlhale7 I created a placeholder ticket for us to revisit the UI concerns: #275

@ShanaLMoore
Contributor Author

ShanaLMoore commented Dec 14, 2022

@mlhale7 please reference this MR for the changes I made.

To reiterate, I believe all of your concerns will be addressed when we fix the UI and implement remote questioning authority.

To me, the most important part right now is to make sure the uri values get saved to the metadata properties. Regarding this ticket, if the metadata properties are present as required, please consider passing this along, as we will address functionality at a later time (after our ingest priority). Doing so will also unblock you all from doing additional ingests.

But if there are any additional questions or concerns, please let me know.

@ShanaLMoore
Contributor Author

Tested on staging and verified the value gets saved for image and collection: https://qa.utk-hyku-staging.notch8.cloud/importers/34?locale=en

@mlhale7
Collaborator

mlhale7 commented Dec 15, 2022

@ShanaLMoore - thanks for this. I'm seeing values I'd expect for resource_type on staging now and approve of the change to be deployed.

@mlhale7
Collaborator

mlhale7 commented Jan 17, 2023

This should be good to go @ShanaLMoore

Status: Done