Increase FAIRness of openwashdata by adding functions that export to different metadata schemas / enrich metadata #25

Open
larnsce opened this issue Oct 24, 2024 · 2 comments

larnsce commented Oct 24, 2024

This can be part of openwashdata phase 2 WP4: Increase FAIRness: https://openwashdata.org/pages/gallery/proposal-02/#wp4-increase-fairness

The idea is to take our existing metadata, enrich it, and then export it to other metadata schemas. One example is the dataspice R package for preparing data publications.

I am thinking particularly of the write_spice() function, which writes metadata from a set of CSVs into a JSON-LD file:

https://docs.ropensci.org/dataspice/reference/write_spice.html

Package: https://docs.ropensci.org/dataspice/

We should review this workflow and adapt some of it to our own needs.
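To make the CSV-to-JSON-LD idea concrete, here is a minimal Python sketch that turns a small table of variable descriptions into a schema.org Dataset document. It does not reproduce what write_spice() does internally. The column names (`variable_name`, `description`, `unit`), the function `build_jsonld`, and the sample rows are all assumptions made up for illustration.

```python
import csv
import io
import json

# Hypothetical attribute metadata, one row per variable.
# The column names are assumptions, not an existing openwashdata format.
ATTRIBUTES_CSV = """variable_name,description,unit
id,Unique identifier of the record,
capacity_l,Capacity of the container,litres
"""


def build_jsonld(title, description, attributes_csv):
    """Return a schema.org Dataset description as a JSON-serialisable dict."""
    variables = []
    for row in csv.DictReader(io.StringIO(attributes_csv)):
        variable = {
            "@type": "PropertyValue",
            "name": row["variable_name"],
            "description": row["description"],
        }
        # Only emit a unit when the metadata table provides one.
        if row["unit"]:
            variable["unitText"] = row["unit"]
        variables.append(variable)
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        "description": description,
        "variableMeasured": variables,
    }


doc = build_jsonld("Example dataset", "Illustrative description.", ATTRIBUTES_CSV)
jsonld_text = json.dumps(doc, indent=2)
print(jsonld_text)
```

In practice the rows would come from our own metadata tables rather than an inline string, and the Dataset would carry more fields (creators, licence, citation).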

Another option is the Frictionless Data Table Schema: https://specs.frictionlessdata.io//table-schema/
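As a rough idea of what a Table Schema export could look like, the Python sketch below infers a descriptor with a `fields` list of `name` and `type` entries from a CSV. It uses only the standard library rather than the Frictionless tooling, handles just three of the types the spec defines (`integer`, `number`, `string`), and the sample data and function names are assumptions for illustration.

```python
import csv
import io
import json

# Made-up sample data; real input would be a dataset from one of our packages.
SAMPLE_CSV = """id,capacity_l,location
1,7000.5,Market
2,6500,Depot
"""


def infer_type(values):
    """Pick the narrowest of integer, number, string that fits all values."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "string"
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in non_empty:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_table_schema(csv_text):
    """Build a minimal Table Schema descriptor from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    names = reader.fieldnames or []
    fields = [
        {"name": name, "type": infer_type([row[name] for row in rows])}
        for name in names
    ]
    return {"fields": fields}


schema = infer_table_schema(SAMPLE_CSV)
print(json.dumps(schema, indent=2))
```

A real export would probably also fill in each field's `description` from our existing data dictionaries instead of relying on inference alone.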

Lastly, I think we should consider building a proper data catalogue using a data management system like CKAN: https://ckan.org/

@bonschorno @yashdubey132: we can split these up into different issues, but I would like you two to work on this.

@larnsce larnsce added this to the V2.0.0 milestone Oct 24, 2024

larnsce commented Oct 24, 2024

Here is our current metadata table, which is created entirely by hand. We could extract more information automatically from the existing packages (e.g. nrow / ncol) and also add further manual fields. https://docs.google.com/spreadsheets/d/1vtw16vpvJbioDirGTQcy0Ubz01Cz7lcwFVvbxsNPSVM/edit?gid=0#gid=0
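A minimal sketch of how fields like nrow / ncol could be derived automatically instead of typed in by hand. It assumes each dataset is available as a CSV export, which may not hold for every package. The function name `table_dimensions` and the sample data are made up for illustration.

```python
import csv
import io

# Made-up sample; in practice this would be read from a package's CSV export.
SAMPLE_CSV = """id,capacity_l,location
1,7000.5,Market
2,6500,Depot
"""


def table_dimensions(csv_text):
    """Count data rows (excluding the header) and columns in CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Skip blank lines, which csv.reader yields as empty lists.
    nrow = sum(1 for row in reader if row)
    return {"nrow": nrow, "ncol": len(header)}


dims = table_dimensions(SAMPLE_CSV)
print(dims)
```

The resulting values could then be merged into the metadata table alongside the manually maintained fields.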

larnsce commented Oct 25, 2024

I am adding information here from Asana about the feasibility of also sharing our data as Python packages.


understand the need for extending GHE functionality to Python
discuss the necessity of building Python packages

Some resources:
Instruction:
https://towardsdatascience.com/step-by-step-guide-to-creating-r-and-python-libraries-e81bbea87911
https://docs.python-guide.org/writing/structure/
Opinion:
https://www.ethanrosenthal.com/2022/02/01/everything-gets-a-package/
https://packaging.python.org/en/latest/tutorials/packaging-projects/
Example:
http://www.data8.org/zero-to-data-8/datascience.html
https://mintcanary.com/frictionlessdata/tools/

Take-away from 24/01 discussion:
Nic has a proof-of-concept Python package for wasteskipsblantyre
current design is inside an owd R data package
pro: less redundant
con: can be confusing for most users and complica


Here is also the feedback from @n-raspi: https://docs.google.com/document/d/1DkzGRQTGXS0IuT_hFEFSUSLJIYczr5LXng10ELTwp78/edit#heading=h.6gmwdlbbkrxh

3 participants