ClassCastException loading model in Apache Spark #17
I've seen this kind of problem a few times, and they are incredibly hard to track down. This is going to sound very hacky, but... could you try creating a new BrownClusterFeature object directly, to check whether it's a classloader issue? You might also appeal to the spark user list. I'm happy to help with it as best I can. -- David
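The "hacky" check suggested above can be generalized: load the class reflectively and report which classloader actually resolved it. This is an illustrative sketch, not code from the thread; inside the Spark job you would pass the epic class name (e.g. `"epic.features.BrownClusterFeature"`) instead of the JDK class used here so the sketch runs standalone.

```scala
object ClassLoaderCheck {
  // Resolve a class by name via the current thread's context classloader
  // (the loader Spark typically uses for user code) and report its origin.
  def describe(className: String): String = {
    val cl  = Thread.currentThread().getContextClassLoader
    val cls = Class.forName(className, false, cl)
    s"${cls.getName} loaded by ${cls.getClassLoader}"
  }

  def main(args: Array[String]): Unit = {
    // A JDK class is used here so the sketch is self-contained; bootstrap
    // classes report a null classloader.
    println(describe("java.util.ArrayList"))
  }
}
```

Running the same check on the driver and inside an executor task and comparing the reported loaders is one way to confirm or rule out a classloader mismatch.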
I tried your suggestion and was able to create a BrownClusterFeature object with no trouble, so it doesn't look like a classloader issue (as far as I can tell). It feels more like the kind of problem you might get serialising with one version and trying to deserialise with another, although given that the file can be deserialised using raw Scala, it's almost as if something is happening to the file stream. I'll have a closer look at the Spark side to see if I can find similar issues there. Thanks for the prompt response and for the library!
Is there maybe something going on with different Scala versions? (Or, less likely, different Breeze versions?)
I'm compiling against 2.10.4 and my installed Scala version matches that. However, there is a Breeze dependency at a different version - it looks like nak pulls in an older version of breeze_natives. 'What depends on' Breeze 0.8:

And the same for Breeze 0.9:

No idea if that might cause problems?
nak is declared intransitive() so that shouldn't be a problem. (Seems like a bug in the dependency graph plugin...)
Hi there, I just googled, looking for a solution to a similar problem in a project I'm working on, and we found and fixed the cause (I'm not sure if it fixes your current problem). We solved it by adding the missing classpath dependencies when creating the SparkContext (not only the direct dependencies):

Hope this helps. Regards!
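The code from this comment did not survive extraction; the following is only a minimal sketch of the idea, with placeholder jar paths and versions. `SparkConf.setJars` is the standard mechanism for shipping jars to executors, and the point of the fix is to list transitive dependencies (Breeze, natives) as well as the direct ones:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Placeholder paths/versions: list the full dependency closure, not just
// the jars your code depends on directly.
val conf = new SparkConf()
  .setAppName("epic-on-spark")
  .setJars(Seq(
    "/path/to/epic_2.10-0.3.jar",             // direct dependency
    "/path/to/breeze_2.10-0.8.1.jar",         // transitive dependency
    "/path/to/breeze-natives_2.10-0.8.1.jar"  // transitive dependency
  ))
val sc = new SparkContext(conf)
```

The equivalent on the command line would be passing the same list to `spark-submit --jars`.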
@timcroydon Any chance you found a solution to this problem? Running into the same issue. |
I don't recall now, I'm afraid. For various unrelated reasons, we ended up using a different library for similar functionality so I don't think I ever got round to investigating this fully - sorry! |
@reactormonk I haven't gotten it to work by that route, but perhaps I'm missing something. I assemble the project into a single jar and also add the dependent jars: `SparkConf().setJars(Seq("/root/myBigJar.jar", "/root/epic-ner-en-conll_2.10-2015.1.25.jar", "/root/epic_2.10-0.3.jar"))`. Perhaps I'm not following @JSantosP's suggestion correctly, as those should be included in myBigJar.jar anyway. @timcroydon Thanks for your reply!
There's a jar from February that works, I believe. Can't fix atm.
I've been using the 2015.2.19 data files combined with the sources from https://github.com/dlwh/epic/tree/e0238ceb16fc9adb9511240638357e8c44200a2f. The files from February work, but I believe this tree is the last one that works. I covered some of it in #24 IIRC. I don't know if this will solve your specific issue, but it is the latest version I believe will work. From there, maybe you could fix whatever CCE is holding back usage under Spark. https://gist.github.com/briantopping/369fb337735c1b726337 is the complete dependency closure from the subproject I am using. |
I had the same problem and the JSantosP solution worked for me. Thank you.
What is the final solution? I have the same problem: I build a single jar file, and locally it works, but when I submit to Spark it throws:

java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.HashMap$SerializationProxy to field epic.features.BrownClusterFeaturizer.epic$features$BrownClusterFeaturizer$$clusterFeatures of type scala.collection.immutable.Map in instance of epic.features.BrownClusterFeaturizer

Who can help me? Thanks a lot.
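The `HashMap$SerializationProxy` in that exception comes from the standard serialization-proxy pattern used by `scala.collection.immutable.HashMap`: `writeReplace` emits a proxy whose `readResolve` rebuilds the map. A small stdlib-only round trip (no Spark, purely illustrative) shows the mechanism working when one classloader handles both ends; the Spark failure is consistent with the classloader theory above, where `readResolve` is not applied correctly because driver and executor resolve the classes differently.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.immutable.HashMap

object SerializationDemo {
  def main(args: Array[String]): Unit = {
    val m: Map[String, Int] = HashMap("a" -> 1, "b" -> 2)

    // Serialize: HashMap writes itself via its SerializationProxy.
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(m)
    oos.close()

    // Deserialize: the proxy's readResolve yields a HashMap again,
    // so the cast back to Map succeeds in a single-classloader JVM.
    val ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
    val restored = ois.readObject().asInstanceOf[Map[String, Int]]
    ois.close()

    assert(restored == m)
    println(restored("a"))
  }
}
```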
@ltao80 I never got it to work and gave up. I'd be curious to hear from anyone else with a detailed solution. |
@acvogel Thank you for your reply. I gave up too and switched to Stanford NLP.
I'm facing the same problem (see here [1]). I've tried @JSantosP's suggestion and added several dependencies to the SparkContext.

Do I need the path here? I also wonder why I should add these jars to the SparkContext when they are already part of the assembly jar.

[1] https://github.com/Tooa/spark-fun
Hi there,
I'm trying to use epic in an Apache Spark Streaming environment but I'm experiencing some difficulty loading the models. I'm not really sure whether this is an Epic issue, a Breeze issue, a Spark issue or where/how to solve this now! I get the following exception (for English NER):
I've tried running my code (compiled into an uberjar using `sbt assembly`) in a raw Scala console and I can load the model and run it fine. However, using Spark, I get the exception described. The ONLY difference as far as I can tell is the way the model file is referenced. In the raw Scala environment, I can point directly at the model file on disk (e.g. `new File("mymodels/model.ser.gz")`) and it loads. In Spark, I have to load the file doing something similar to:

I've tried narrowing the code down, and whether I point at the model extracted from the jar or at the jar itself I get the same result. It's definitely loading the file (I think), as it fails in other ways if the file doesn't exist. I even tried bypassing Breeze's `nonStupidObjectInputStream`, to no avail.

Any idea what's going on or how to test? For reference, my JVM is 1.7.0_51, the same in both the Scala and Spark environments.
Thanks.
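The Spark-side loading code above was lost in extraction. As a stand-in, here is a hedged stdlib-only sketch of the general shape such loading takes: reading a gzipped, Java-serialized object from an `InputStream` (e.g. one obtained via `getClass.getResourceAsStream("/models/model.ser.gz")` when the model ships inside the assembly jar — resource path assumed, and this is not epic's actual loader API). The demo round-trips a stand-in "model" (a plain `Map`) so it runs without Spark or epic:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, ObjectInputStream, ObjectOutputStream}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

object GzModelIO {
  // Deserialize a gzipped, Java-serialized object from any stream source.
  def load[T](raw: InputStream): T = {
    val in = new ObjectInputStream(new GZIPInputStream(raw))
    try in.readObject().asInstanceOf[T]
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Write a stand-in "model" the same way (gzip + Java serialization)...
    val bos = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(new GZIPOutputStream(bos))
    out.writeObject(Map("answer" -> 42))
    out.close()

    // ...then load it back through the generic loader.
    val model = load[Map[String, Int]](new ByteArrayInputStream(bos.toByteArray))
    println(model("answer"))
  }
}
```

If loading from the classpath works in a plain Scala console but the same bytes fail under Spark, that again points at which classloader performs the `readObject`, not at the file itself.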