generateWikiConfig won't work for some languages #159
Comments
I can confirm the stacktrace output for languages other than Japanese: German, French, ... are also affected. It seems that Sweble in version 3.1.7 (or below) has issues at
see: Besides the actual problem with multiple registrations ("was already registered by the alias"), it would be better to use a logging framework here; otherwise the output will not be captured in log files in a server environment (e.g., application containers). Can @ferschke or @reckart have a look at this? Or: contact Samy Ateia, samyateia [at] hotmail.de |
@mawiesne I'm trying to help out with PRs and build infrastructure etc a bit and I could even try running a release, but unfortunately, I have no spare resources to actually work on the code. |
FYI, I took a look at the code and this is not an error per se. The language configurator tries to associate aliases with specific tags and prints the stacktrace if the tag is already in use. The problem is in Sweble (sweble/sweble-wikitext#72) and not in the DKPro code. That being said, you should be able to create a Japanese config even with these stacktraces being printed. |
For some languages, the stacktraces are just printed, the config seems correct, and parsing goes on. For other languages, like Japanese, this leads not just to a printed stacktrace but to a real exception, and no config is created. |
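To illustrate the behaviour described above together with the earlier logging suggestion, here is a minimal sketch. It is not the Sweble or DKPro code: `AliasLoggingSketch`, `register` and `registerOrLog` are hypothetical names, and the registry is a plain map standing in for whatever the configurator really uses. It only shows a duplicate alias being rejected and the resulting exception going through `java.util.logging` instead of `printStackTrace()`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the alias registration step; not Sweble's or JWPL's real API.
class AliasLoggingSketch {
    private static final Logger LOG = Logger.getLogger(AliasLoggingSketch.class.getName());

    private final Map<String, String> aliasToTag = new HashMap<>();

    /** Throws if the alias is already taken, mimicking an "already registered" failure. */
    void register(String alias, String tag) {
        String existing = aliasToTag.putIfAbsent(alias, tag);
        if (existing != null) {
            throw new IllegalArgumentException(
                    "alias '" + alias + "' was already registered for '" + existing + "'");
        }
    }

    /** Registers the alias; on a duplicate, logs a warning and returns false. */
    boolean registerOrLog(String alias, String tag) {
        try {
            register(alias, tag);
            return true;
        } catch (IllegalArgumentException e) {
            // Goes through the configured log handlers, so a server setup can capture it.
            LOG.log(Level.WARNING, "Skipping duplicate alias " + alias, e);
            return false;
        }
    }

    public static void main(String[] args) {
        AliasLoggingSketch registry = new AliasLoggingSketch();
        registry.registerOrLog("redirect", "#REDIRECT");
        registry.registerOrLog("redirect", "#WEITERLEITUNG"); // duplicate: logged, not thrown
    }
}
```

Any logging facade would do in place of `java.util.logging`; the point is only that the exception reaches the application's log configuration rather than raw standard error.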
@reckart Thanks for your support. Just want to raise awareness that both #159 and #160 should be tackled before a release of 1.2.0 is conducted. IMHO #160 is a blocker atm. For #159 I'd say it could be compensated (by ignoring it), even though @claeyzre might have a point that things can fail later during runtime. @claeyzre Can you pls add a full stacktrace of the exceptions you encounter "later on" for a Japanese language/environment? |
@mawiesne The stacktrace is the one I gave you. The thing is that sometimes it's just printed and my program carries on. Sometimes it stops directly after the stacktrace being printed. To reproduce, you can execute this (Scala) code:
with language being a prefix like 'ja'. |
@tgalery Could you look into this issue, providing a workaround or fix? |
Will try to reproduce the bug first. But probably this would be a fix in sweble. |
Sounds reasonable. If you could provide a test case that demonstrates this scenario that would be of great help. |
@tgalery oh, and you can approve PRs by others now. |
FYI, I've confirmed the bug. For some languages, e.g. German, we can create the config, whereas for others, like Japanese, we cannot. I still have the feeling that the best place to fix it would be in Sweble, so I will create an issue there and reference it here. If things get nasty too quickly, I might just handle the exception and return the English config for the languages that are throwing the error. |
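The fallback idea mentioned above could look roughly like the sketch below. Everything in it is an assumption for illustration: `configOrEnglish` and the `generator` function are hypothetical names and not part of JWPL or Sweble, and the failure is assumed to surface as a `RuntimeException`.

```java
import java.util.function.Function;

// Hypothetical sketch; configOrEnglish and the generator are illustrative names,
// not JWPL or Sweble API.
class ConfigFallbackSketch {

    /**
     * Tries to build a config for the requested language and falls back to
     * the English one if generation throws. Assumes the failure surfaces as
     * a RuntimeException.
     */
    static <C> C configOrEnglish(String language, Function<String, C> generator) {
        try {
            return generator.apply(language);
        } catch (RuntimeException e) {
            System.err.println("Config generation failed for '" + language
                    + "', falling back to 'en': " + e.getMessage());
            return generator.apply("en");
        }
    }

    public static void main(String[] args) {
        // Stand-in generator that fails for Japanese, as reported in this issue.
        Function<String, String> generator = lang -> {
            if (lang.equals("ja")) {
                throw new IllegalStateException("alias already registered");
            }
            return "config-" + lang;
        };
        System.out.println(configOrEnglish("de", generator));
        System.out.println(configOrEnglish("ja", generator));
    }
}
```

One caveat with this approach: an English config would presumably lack the language-specific aliases, so parsing may succeed but miss localized constructs.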
and here's some note to self:
|
@tgalery Could you cross-link the issue in Sweble in here? |
Sure, the issue is sweble/sweble-wikitext#72. I've been swamped this week, but should crack on with it next week. |
Can someone test PR #202 to confirm this issue is gone? |
Hi,
I am trying to parse wikitext from many Wikipedias. For some languages, the configs output by generateWikiConfig work fine, but I end up with stacktraces like this one when I try, for example, Japanese. I am using "org.sweble.wikitext" % "swc-engine" % "3.1.7". Judging from the existing issues, the related bugs should have been fixed in 3.1.7.