Conflicts within language configurator #72

Open
tgalery opened this issue Jun 22, 2018 · 16 comments

Contributor

tgalery commented Jun 22, 2018

When using the language configurator to generate a config, we usually get a stacktrace when trying to associate an alias that has already been set. Looking at the magic words config that generates the wikiconfig, https://ja.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=magicwords&format=xml, the bug can be described as follows:

  1. The alias 名前空間 is registered for the NAMESPACE id.
  2. The alias 名前空間 appears again for the ns id. (Things are slightly more complicated because in the second case it appears followed by a :, but there's some suffix-modifying code that adds the colon under certain circumstances.)
  3. Since we throw an exception when detecting the conflict in (2), and the 名前空間 magic word is the first alias of the ns id, the whole id and its associated aliases are not added to the maps.
  4. When adding parser functions, we explicitly expect one for ns. Since no alias for it can be found in the maps, an exception is thrown.
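
For readers following along, the four steps can be reproduced with a small stand-alone Java sketch. It does not use the Sweble classes: `StrictAliasRegistry`, its two maps and the method bodies are hypothetical stand-ins, the `NS:` name is only illustrative, and the sketch assumes the colon handling from step 2 has already made both names identical. The two message formats mirror the exceptions reported later in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the alias maps; NOT the Sweble API. */
final class StrictAliasRegistry {
    private final Map<String, String> nameToId = new HashMap<>();
    private final Map<String, List<String>> idToNames = new HashMap<>();

    /** Registers all names for an id, or nothing at all if any name is taken. */
    void addAlias(String id, List<String> names) {
        for (String name : names) {
            String owner = nameToId.get(name);
            if (owner != null) {
                // Step 3: the conflict aborts registration of the whole id.
                throw new IllegalArgumentException("The name `" + name
                        + "' was already registered by the alias `" + owner
                        + "' when trying to register it for alias `" + id + "'.");
            }
        }
        for (String name : names) {
            nameToId.put(name, id);
        }
        idToNames.put(id, new ArrayList<>(names));
    }

    /** Step 4: a parser function needs an id that was registered before. */
    void addParserFunction(String id) {
        if (!idToNames.containsKey(id)) {
            throw new IllegalArgumentException(
                    "No alias registered for parser function `" + id + "'.");
        }
    }
}

final class StrictRegistryDemo {
    // The ja alias quoted in the issue, written with Unicode escapes and
    // assumed to be already normalized to the same string for both ids.
    static final String JA_NAME = "\u540D\u524D\u7A7A\u9593:";

    static String run() {
        StrictAliasRegistry registry = new StrictAliasRegistry();
        registry.addAlias("namespace", Arrays.asList(JA_NAME));      // step 1
        try {
            registry.addAlias("ns", Arrays.asList(JA_NAME, "NS:"));  // step 2
        } catch (IllegalArgumentException e) {
            System.err.println("Got: " + e.getMessage());            // step 3
        }
        try {
            registry.addParserFunction("ns");                        // step 4
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the second failure is a direct consequence of the first: because the ns id is dropped as a whole, even its non-conflicting name never reaches the maps.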

After PR #73, we no longer throw an exception when we find an alias already registered to an id. This means that the first ambiguous alias is assigned to the id it is first associated with, and ignored for the others. Thus, a WikiConfig object can be created, but it is not an optimal solution. If there is the possibility of two aliases being associated with the same id, we need the code to be able to handle that well.
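
A rough sketch of the "first id wins" behaviour described above, under the assumption that a conflicting name is simply skipped and remembered as a warning. `LenientAliasRegistry` and all of its members are hypothetical and are not taken from PR #73 or from Sweble; the `NS:` name is illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical "first id wins" registry; NOT the code from PR #73. */
final class LenientAliasRegistry {
    private final Map<String, String> nameToId = new LinkedHashMap<>();
    private final List<String> warnings = new ArrayList<>();

    /**
     * Registers every name that is still free for the given id and skips
     * names already owned by another id. Returns how many names were added.
     */
    int addAlias(String id, List<String> names) {
        int added = 0;
        for (String name : names) {
            String owner = nameToId.putIfAbsent(name, id);
            if (owner == null) {
                added++;
            } else if (!owner.equals(id)) {
                warnings.add("`" + name + "' stays with `" + owner
                        + "', ignored for `" + id + "'");
            }
        }
        return added;
    }

    /** Returns the single id that owns the name, or null if unknown. */
    String idFor(String name) {
        return nameToId.get(name);
    }

    List<String> warnings() {
        return warnings;
    }
}

final class LenientRegistryDemo {
    public static void main(String[] args) {
        // The ja alias from the issue, written with Unicode escapes.
        String jaName = "\u540D\u524D\u7A7A\u9593";
        LenientAliasRegistry registry = new LenientAliasRegistry();
        registry.addAlias("namespace", Arrays.asList(jaName));
        int added = registry.addAlias("ns", Arrays.asList(jaName, "NS:"));
        System.out.println("names added for ns: " + added);
        System.out.println("owner of the ja name: " + registry.idFor(jaName));
        System.out.println("warnings: " + registry.warnings());
    }
}
```

The limitation shows up directly: the Japanese name can only ever resolve to namespace, so text that uses it in the ns sense would be matched against the wrong id.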

mawiesne commented Jun 22, 2018

Referring to comment in dkpro/dkpro-jwpl#159:

@tgalery Sure, it merely "prints the stacktrace", but it's kind of confusing, e.g. for one of my students. Proposal:

  • Print a warning with the exception's message instead of a full stacktrace at LanguageConfigGenerator.java:209? Or even better:
  • Use/introduce a logger that can be configured to be less "alerting" for such a case, so devs can decide what they make out of duplicate keys then, e.g. ignore them?

Cheers
mawiesne
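
A minimal sketch of the second bullet, assuming `java.util.logging` as the logger (the thread does not say which logging library would be used). `DuplicateAliasLogging` and `registerQuietly` are hypothetical names; the real code at LanguageConfigGenerator.java:209 is not reproduced here.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper; NOT part of Sweble. */
final class DuplicateAliasLogging {
    private static final Logger LOG =
            Logger.getLogger(DuplicateAliasLogging.class.getName());

    /**
     * Runs one registration step. On a conflict, logs the exception message
     * at WARNING and the full stack trace at FINE, which the default JUL
     * console handler does not show. Returns true if registration succeeded.
     */
    static boolean registerQuietly(String id, Runnable registration) {
        try {
            registration.run();
            return true;
        } catch (IllegalArgumentException e) {
            LOG.warning("Skipping alias " + id + ": " + e.getMessage());
            LOG.log(Level.FINE, "Stack trace for alias " + id, e);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = registerQuietly("ns", () -> {
            throw new IllegalArgumentException("name already registered");
        });
        System.out.println("registered: " + ok);
    }
}
```

Developers who want to ignore duplicate keys entirely could then raise this logger's level, for example with `Logger.getLogger(DuplicateAliasLogging.class.getName()).setLevel(Level.SEVERE)`.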

Contributor Author

tgalery commented Jun 22, 2018

To my mind, a warning should suffice. I'm just wondering if there would be a way to allow a many-to-one relationship so these cases could be covered as well.

Contributor Author

tgalery commented Jul 25, 2018

I've been digging more into this issue. Confusing stacktraces aside, it seems that for some languages we are able to create a WikiConfig instance:

import org.sweble.wikitext.engine.utils.LanguageConfigGenerator
val deConfig = LanguageConfigGenerator.generateWikiConfig("de")
deConfig: org.sweble.wikitext.engine.config.WikiConfig = org.sweble.wikitext.engine.config.WikiConfigImpl@e3643ba4

This prints a bunch of stacktraces like the one above, but a config is returned. If we do the same thing for "ja", pretty much the same stacktraces are produced, but no config value is created, which is kind of a problem.

Contributor Author

tgalery commented Jul 25, 2018

@hannesd I've been trying to build this locally, but I'm getting some issues with some of the deps of the project:

[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR]   The project org.sweble:sweble-parent:3.1.7-SNAPSHOT (/Users/thiago/code/tgalery/sweble-wikitext/pom.xml) has 1 error
[ERROR]     Non-resolvable parent POM for org.sweble:sweble-parent:3.1.7-SNAPSHOT: Could not transfer artifact de.fau.cs.osr:tooling:pom:3.0.9-SNAPSHOT from/to osr-public-repo (http://mojo-maven.cs.fau.de/content/repositories/public): sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: timestamp check failed and 'parent.relativePath' points at wrong local POM @ line 23, column 10: NotAfter: Wed Jul 25 13:29:37 BST 2018 -> [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException

Any pointers?

hannesd self-assigned this Jul 26, 2018

Member

hannesd commented Jul 26, 2018

I'm sorry, but we screwed up and now our certificates have expired. If you turn off certificate validation it should work again, but this is of course not ideal: https://stackoverflow.com/questions/21252800/how-to-tell-maven-to-disregard-ssl-errors-and-trusting-all-certs

We'll try to renew our certificates as soon as possible.

Member

hannesd commented Jul 26, 2018

Concerning the actual problem with this issue: I'm quite busy at the moment and will not have the time to work on this. If you could provide a pull request I'd be more than happy to accept it. Simply turning the exception into a warning is fine with me, if this solves your problem.

Member

hannesd commented Jul 27, 2018

Certificates have been fixed. Sorry for the inconvenience :(

Contributor Author

tgalery commented Jul 27, 2018 via email

Contributor Author

tgalery commented Jul 27, 2018

Sorry @hannesd, I've actually tried building using mvn package install and I still get the errors. I've also tried the mvn CLI options from the Stack Overflow link you mentioned, but without success. I'm wondering whether there's a problem deeper than the certificate updates.
Could you try building this locally?

Contributor Author

tgalery commented Jul 27, 2018

fyi, I'm trying to build the dev branch, which I think is the base.

Member

hannesd commented Jul 30, 2018

Sorry again, the certificate chain had changed and I didn't notice. Should work now...

mawiesne commented Aug 3, 2018

@hannesd

"Simply turning the exception into a warning is fine with me, if this solves your problem."

^ This will at least solve spammed consoles which irritate most devs most of the time 👍.
@tgalery Could you check LanguageConfigGenerator.java:209 and fix this odd behaviour?

Yet I guess this is merely a cosmetic cure for the underlying problem.

Collaborator

sweble commented Aug 7, 2018 via email

Contributor Author

tgalery commented Aug 15, 2018

@mawiesne just back from a trip today, will have a look today

Contributor Author

tgalery commented Aug 15, 2018

OK, so I did some digging and kind of understand what's going on. Here is some debug output from a branch of mine. From the Scala console, we get something like this:

scala> val lang = "ja"
scala> val config = LanguageConfigGenerator.generateWikiConfig(lang)
Got: The name `名前空間:' was already registered by the alias `namespace' when trying to register it for alias `ns'. when adding alias ns
java.lang.IllegalArgumentException: No alias registered for parser function `ns'.
  at org.sweble.wikitext.engine.config.WikiConfigImpl.addParserFunction(WikiConfigImpl.java:449)

So in LanguageConfigGenerator.java, generateWikiConfig first runs addi18NAliases, which fails to register the ns alias because the Japanese magic word it needs is already registered under the keyword namespace. Then addParserFunctions expects something to be associated with the keyword ns, and since that exception is not handled, a language config is not created.
One option is to associate multiple keywords with an alias, e.g. namespace to be associated with [ns, 名前空間]. Does anyone see a problem with that?
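
To make the question concrete, here is one possible reading of that option as a stand-alone Java sketch: a localized name maps to a set of ids instead of exactly one. `MultiAliasRegistry` and its methods are hypothetical and not part of Sweble, and how the parser would pick among several candidate ids is deliberately left open.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical registry where one name may belong to several ids. */
final class MultiAliasRegistry {
    private final Map<String, Set<String>> nameToIds = new HashMap<>();
    private final Map<String, Set<String>> idToNames = new HashMap<>();

    /** Registers all names for the id; a shared name is never a conflict. */
    void addAlias(String id, List<String> names) {
        Set<String> own = idToNames.computeIfAbsent(id, k -> new LinkedHashSet<>());
        for (String name : names) {
            own.add(name);
            nameToIds.computeIfAbsent(name, k -> new LinkedHashSet<>()).add(id);
        }
    }

    /** Returns every id the name was registered for, in registration order. */
    Set<String> idsFor(String name) {
        Set<String> ids = nameToIds.get(name);
        return ids == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(ids);
    }

    /** True if at least one name is registered for the id. */
    boolean hasAliasFor(String id) {
        Set<String> names = idToNames.get(id);
        return names != null && !names.isEmpty();
    }
}

final class MultiRegistryDemo {
    public static void main(String[] args) {
        // The ja alias from the issue, written with Unicode escapes.
        String jaName = "\u540D\u524D\u7A7A\u9593";
        MultiAliasRegistry registry = new MultiAliasRegistry();
        registry.addAlias("namespace", Arrays.asList(jaName));
        registry.addAlias("ns", Arrays.asList(jaName));
        System.out.println(registry.idsFor(jaName));
        System.out.println(registry.hasAliasFor("ns"));
    }
}
```

With this shape, the check made when adding parser functions would find an entry for ns, at the cost of every lookup by name having to cope with more than one candidate.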

Contributor Author

tgalery commented Aug 15, 2018

Would be nice to have thoughts on the PR above ^
