Support YAML data #84

Open
idomingu opened this issue Jul 4, 2024 · 5 comments

Comments

@idomingu

idomingu commented Jul 4, 2024

Hi,

I wonder how RML-IO could handle datasets encoded in YAML. YAML can be translated to JSON without losing information, so I think rml:JSONPath could be used as the rml:ReferenceFormulation for YAML data.

In that case, the RML-IO spec should mention YAML alongside JSONPath, and note that RML engines must internally translate YAML to JSON to make this possible.
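
To make the idea concrete, here is a minimal sketch in Python of what "translate YAML internally, then apply JSONPath" could look like. It is not how any particular RML engine works. The `select` function is a hypothetical helper that handles only a tiny JSONPath subset (`$`, `.name`, `[index]`, `[*]`), and `load_yaml` assumes the third-party PyYAML package is installed. The sample documents and the `people` / `name` / `age` keys are made up for illustration.

```python
import json
import re


def select(data, path):
    """Evaluate a tiny JSONPath subset: $, .name, [index], [*].

    Hypothetical helper for illustration only. It silently ignores
    any syntax it does not recognise (filters, slices, '..', etc.).
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    tokens = re.findall(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]|\[(\*)\]", path[1:])
    nodes = [data]
    for name, index, star in tokens:
        next_nodes = []
        for node in nodes:
            if name and isinstance(node, dict) and name in node:
                next_nodes.append(node[name])
            elif index and isinstance(node, list) and int(index) < len(node):
                next_nodes.append(node[int(index)])
            elif star:
                if isinstance(node, list):
                    next_nodes.extend(node)
                elif isinstance(node, dict):
                    next_nodes.extend(node.values())
        nodes = next_nodes
    return nodes


def load_yaml(text):
    """Parse YAML into plain dicts, lists and scalars (assumes PyYAML)."""
    import yaml  # third-party; imported lazily so the rest runs without it
    return yaml.safe_load(text)


JSON_DOC = '{"people": [{"name": "Ana", "age": 31}, {"name": "Luis", "age": 27}]}'
YAML_DOC = """
people:
  - name: Ana
    age: 31
  - name: Luis
    age: 27
"""

json_data = json.loads(JSON_DOC)
print(select(json_data, "$.people[*].name"))  # prints ['Ana', 'Luis']

try:
    yaml_data = load_yaml(YAML_DOC)
except ImportError:
    yaml_data = None  # PyYAML not installed; skip the YAML half

if yaml_data is not None:
    # Same iterator, same result: the parsed YAML has the same shape.
    assert select(yaml_data, "$.people[*].name") == ["Ana", "Luis"]
```

Once both documents are parsed into the same in-memory model, the same iterator expression applies to either. One caveat with this sketch: PyYAML's `safe_load` resolves some unquoted scalars to types JSON does not have (for example, `2024-10-03` becomes a `datetime.date`), so an engine would still need to decide how to map those.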

Any thoughts?

Thanks!

@dachafra added the question (Further information is requested) and working-group labels on Jul 4, 2024
@dachafra
Member

dachafra commented Jul 4, 2024

Maybe this is more of an rml-io-registry issue than an rml-io one. @DylanVanAssche what do you think?

@DylanVanAssche
Collaborator

While translating YAML to JSON works, isn't there something similar to JSONPath/XPath for YAML?
If there is, then this belongs in rml-io; if not, then it belongs in rml-io-registry.

@idomingu
Author

idomingu commented Jul 4, 2024

AFAIK there is no such thing as "YAMLPath". The best example I know of is the Kubernetes API, which lets you select data in a YAML manifest using JSONPath (see JSONPath Support). Maybe someone can shed some light here.

What's the purpose of rml-io-registry and how does it relate to rml-io? In any case, I still think YAML should be mentioned in the RML-IO spec. Otherwise, it looks as if you cannot handle YAML data in RML.

@DylanVanAssche
Collaborator

What's the purpose of rml-io-registry and how does it relate to rml-io?

RML-IO focuses on the Logical Source/Target and the Source/Target of RML.
A Source/Target holds the access description, while the Logical Source/Target contains the Source/Target (as its access description) together with the reference formulation and other properties.

rml-io-registry aims to provide a detailed description, for each data format, of how to iterate over the data given a reference formulation. RML-IO is the abstraction, while rml-io-registry makes that abstraction concrete for each data format, e.g. SQL, JSON, XML, ... or YAML. The RML-IO spec aims to refer to this registry, because right now you get the impression that YAML isn't supported, which is what we want to avoid: RML-IO can support any data format, but requires a different access description, reference formulation, etc. depending on the data format at hand. As we cannot mention all existing and future data formats in a spec like RML-IO, we aim to move that to the registry so RML does not need to be revised at W3C each time a new format comes up (like YAML here).

So I propose that we define YAML, and how to iterate over such data, in the registry; we then add the reference formulation etc. to RML-IO as a possible reference formulation. How to do things in practice is then described in the registry entry for YAML.

@VladimirAlexiev

I'm glad this is being discussed!
There is also YAML-LD. The current spec "dumbs down" YAML to JSON,
but we've also discussed leveraging YAML's unique features.
E.g., for representing datatypes:
issued: !xsd!date 2024-10-03
is much nicer than
"issued": {"@type": "xsd:date", "@value": "2024-10-03"}

Perhaps RML should consider such special constructs.
