Spark - java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDDLike
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class SparkDemo {
    @SuppressWarnings({ "resource" })
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("Spark APP").setMaster("spark://xxx.xxx.xxx.xx:7077");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> lines = sc.textFile("/Users/mchaurasia/file.txt");
        JavaRDD<String> words = lines.flatMap((String s) -> {
            return Arrays.asList(s.split(" "));
        });
        JavaPairRDD<String, Integer> pairs = words.mapToPair((String s) -> {
            return new Tuple2<String, Integer>(s, 1);
        });
        JavaPairRDD<String, Integer> counts = pairs.reduceByKey((a, b) -> a + b);
        for (Tuple2<String, Integer> result : counts.collect()) {
            System.out.println(result._1 + " , " + result._2);
        }
        sc.close();
        System.out.println("================= DONE ===================");
    }
}
The code above works fine when I set the master to local (setMaster("local")). When I use setMaster("spark://xxx.xxx.xxx.xx:7077"), I get the following error:
6/06/21 09:28:59 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, xxx.xxx.xxx.xx): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1.f$3 of type org.apache.spark.api.java.function.FlatMapFunction in instance of org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1
at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2089)
at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1999)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 5, xxx.xxx.xxx.xx): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1.f$3 of type org.apache.spark.api.java.function.FlatMapFunction in instance of org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1
at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2089)
at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261)
...
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
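For context, the failure mode behind the ClassCastException above can be reproduced in plain Java without Spark. Spark's function interfaces (such as FlatMapFunction) are Serializable, so a lambda passed to flatMap is serialized as a java.lang.invoke.SerializedLambda proxy that names its capturing class. The sketch below (class and method names are my own, purely illustrative) round-trips such a lambda through Java serialization; it succeeds here because the capturing class is on the local classpath, which is exactly what a Spark worker lacks when it never received the application jar:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class LambdaSerDemo {
    // A functional interface that is also Serializable, analogous to
    // Spark's FlatMapFunction / PairFunction.
    interface SerFunction<T, R> extends Function<T, R>, Serializable {}

    static int roundTrip() throws Exception {
        SerFunction<String, Integer> f = String::length;

        // Serializing the lambda writes a SerializedLambda proxy that
        // records the capturing class (LambdaSerDemo) by name.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(f);
        }

        // Resolving the proxy on read requires the capturing class on the
        // local classpath. On a Spark worker without the application jar,
        // resolution fails and the raw SerializedLambda cannot be assigned
        // to the FlatMapFunction-typed field, producing the error above.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            SerFunction<String, Integer> g =
                    (SerFunction<String, Integer>) in.readObject();
            return g.apply("spark");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip()); // prints 5
    }
}
```

If this diagnosis applies, shipping the application jar to the executors (e.g. via SparkConf.setJars or by submitting with spark-submit rather than running from the IDE) is the usual remedy, though I have not confirmed that for this exact setup.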
Please help.