More robust serialization mechanism needed - json serialization would be better in terms of robustness and performance #2

Open
gokhanercan opened this issue Jan 17, 2019 · 1 comment


Serialized objects depend heavily on class types and hierarchies. Java object serialization of NGram and similar classes is sensitive to changes in the members of those classes, so library users have to regenerate the serialized resource files in their own implementations whenever those classes change. If a file such as "word2.gram" was serialized with a previous version of the NGram or Word class, loading the old model throws an exception like the following:

java.io.InvalidClassException: Dictionary.Word; local class incompatible: stream classdesc serialVersionUID = 2071394599407976767, local class serialVersionUID = -2752853122563649172
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at java.util.HashMap.readObject(HashMap.java:1394)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at MorphologicalDisambiguation.NaiveDisambiguation.loadModel(NaiveDisambiguation.java:32)
at MorphologicalDisambiguation.RootFirstDisambiguation.loadModel(RootFirstDisambiguation.java:187)

https://github.com/olcaytaner/MorphologicalDisambiguation/blob/master/src/main/java/MorphologicalDisambiguation/NaiveDisambiguation.java#L27
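The mismatch in the trace is the kind that appears when a Serializable class does not declare its own serialVersionUID: the runtime then derives one from the class structure, so a member change is likely to produce a different value. As a minimal sketch of that mechanism (WordEntry and PinnedUidDemo are hypothetical names for illustration, not classes from this library), declaring the UID explicitly keeps it stable across compatible changes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class PinnedUidDemo {

    // Hypothetical stand-in for a class like Dictionary.Word; not the library's real class.
    static class WordEntry implements Serializable {
        // Declared explicitly, so the runtime does not compute a UID from the
        // class structure and compatible member changes keep the same value.
        private static final long serialVersionUID = 1L;

        private final String name;

        WordEntry(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Writes the object to an in-memory byte array with standard Java serialization.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Reads a single object back from the byte array.
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new WordEntry("kitap"));
        WordEntry restored = (WordEntry) deserialize(bytes);
        System.out.println(restored.getName());
    }
}
```

This only mitigates the problem. Files that were already written carry the old UID, so they would load only if the declared value matched the one stored in the stream and the fields were still compatible. Incompatible changes, such as changing a field's type, still break loading.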


gokhanercan commented Jan 17, 2019

For the record, every time I hit this issue, I regenerate the files with the following:

/*
Regenerates the files words.1gram, words.2gram, igs.1gram and igs.2gram.
After running it, copy the output files from the project root to the resources folder of your implementation.
 */
public static void ReserializeGrams(){
    RootFirstDisambiguation disambiguation = new RootFirstDisambiguation();
    DisambiguationCorpus corpus = new DisambiguationCorpus("dataset/disambiguation/penn_treebank.txt");
    disambiguation.train(corpus);
    disambiguation.saveModel();
}
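
To illustrate why a text-based format would avoid this regeneration step, here is a rough sketch that stores n-gram counts as plain data instead of serialized objects. It uses tab-separated lines rather than JSON, only because the JDK ships no JSON library. TextGramStore, its methods and the sample entries are hypothetical and not part of this library, and it assumes keys contain no line breaks:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public class TextGramStore {

    // Writes one "key<TAB>count" line per entry. No class metadata is stored,
    // so the file does not depend on the Java class layout at all.
    static void save(Map<String, Integer> counts, Path file) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, Integer> entry : counts.entrySet()) {
                writer.write(entry.getKey() + "\t" + entry.getValue());
                writer.newLine();
            }
        }
    }

    // Reads the lines back into a map, preserving file order.
    static Map<String, Integer> load(Path file) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Split on the last tab so the count is always the final column.
                int tab = line.lastIndexOf('\t');
                if (tab < 0) {
                    continue; // skip lines without a count column
                }
                counts.put(line.substring(0, tab), Integer.parseInt(line.substring(tab + 1)));
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("ev", 12);
        counts.put("ev güzel", 3);

        Path file = Files.createTempFile("words", ".gram.txt");
        save(counts, file);
        System.out.println(load(file).equals(counts));
        Files.delete(file);
    }
}
```

A file written this way stays loadable after the Java classes are refactored, since only the loader code has to know the format.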
