GrETEL parser version differs from SoNaR parser version #290

Open
JanOdijk opened this issue Oct 20, 2023 · 1 comment

Comments

@JanOdijk
Collaborator

The queries based on the MWE canonical forms 'iemand zal de gordiaanse knoop doorhakken' and 'iemand zal de Gordiaanse knoop doorhakken' fail because the parser assigns the lemma 'gordiaanse' to both 'gordiaanse' and 'Gordiaanse', instead of 'gordiaans'.

This is an Alpino error, but it should not be a problem as long as the treebank being searched was parsed with the same parser version. In the SoNaR treebank, however, the lemma for 'gordiaanse' is 'gordiaans', which suggests that SoNaR was parsed with a different version of the Alpino parser.
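To illustrate why the lemma mismatch makes the query come back empty, here is a minimal sketch. The XML fragment is a hypothetical, heavily simplified stand-in for an Alpino tree (real trees carry many more attributes), and `count_lemma_matches` is an illustrative helper, not part of GrETEL.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified treebank fragment: the lemma of 'gordiaanse'
# is 'gordiaans', as reported for SoNaR in this issue.
TREEBANK_XML = """<alpino_ds>
  <node cat="np" rel="obj1">
    <node rel="det" word="de" lemma="de"/>
    <node rel="mod" word="gordiaanse" lemma="gordiaans"/>
    <node rel="hd" word="knoop" lemma="knoop"/>
  </node>
</alpino_ds>"""


def count_lemma_matches(treebank_xml: str, lemma: str) -> int:
    """Count nodes whose lemma attribute equals the given lemma."""
    root = ET.fromstring(treebank_xml)
    return len(root.findall(f".//node[@lemma='{lemma}']"))


if __name__ == "__main__":
    # Lemma produced when parsing the canonical form (per this issue).
    print(count_lemma_matches(TREEBANK_XML, "gordiaanse"))
    # Lemma stored in the treebank.
    print(count_lemma_matches(TREEBANK_XML, "gordiaans"))
```

A query built from the lemma 'gordiaanse' finds no node in this fragment, while one built from 'gordiaans' finds the adjective.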

Is it known with which version SoNaR has been parsed?

More generally, ideally we should record the parser version that was used to create each treebank and call that same version of Alpino when parsing the example or the MWE canonical form, though I understand that this requires a lot of changes.

@tijmenbaarda
Contributor

GrETEL 5 has a command that shows the Alpino version of each treebank alongside the currently installed Alpino version, for comparison. I cannot log in to the server at the moment, but I remember that SoNaR was parsed with a higher version of Alpino than the other corpora. The installed version of Alpino is 1.3, which is also the version used for the other corpora. I believe SoNaR was parsed with version 1.6.

It would be good to match Alpino versions. The main difficulty is that GrETEL allows searching multiple treebanks at the same time, and that treebanks are currently selected only after the example sentence or MWE has been parsed. An easier solution would be to let the user select an Alpino version in the first step.
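As a rough sketch of the version check discussed here: given a mapping from treebank name to the Alpino version it was parsed with, one could group a multi-treebank selection by version and flag the treebanks that differ from the installed parser. All names below (`group_by_parser_version`, `mismatched_treebanks`, `OtherCorpus`) are hypothetical and not GrETEL's actual API, and the version numbers are the ones recalled in this thread, not verified values.

```python
from collections import defaultdict

# Assumed data: versions as recalled in this thread (unverified).
# "OtherCorpus" is a placeholder name for any of the other corpora.
INSTALLED_ALPINO = "1.3"
TREEBANK_VERSIONS = {
    "SoNaR": "1.6",
    "OtherCorpus": "1.3",
}


def group_by_parser_version(versions, selected):
    """Group the selected treebanks by the Alpino version that parsed them."""
    groups = defaultdict(list)
    for name in selected:
        groups[versions[name]].append(name)
    return dict(groups)


def mismatched_treebanks(versions, selected, installed):
    """Return the selected treebanks whose parser version differs from the installed one."""
    return sorted(name for name in selected if versions[name] != installed)


if __name__ == "__main__":
    selection = ["SoNaR", "OtherCorpus"]
    print(group_by_parser_version(TREEBANK_VERSIONS, selection))
    print(mismatched_treebanks(TREEBANK_VERSIONS, selection, INSTALLED_ALPINO))
```

If the grouping yields more than one version, a single parse of the example sentence cannot match every selected treebank, which is why selecting the Alpino version in the first step would be the simpler route.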
