Corrélation Pyspark entre plusieurs colonnes
df1 = sc.parallelize([(0.0, 1.0), (1.0, 0.0)]).toDF(["x", "y"])
df1.stat.corr("x", "y")
# -1.0
Plif Plouf
df1 = sc.parallelize([(0.0, 1.0), (1.0, 0.0)]).toDF(["x", "y"])
df1.stat.corr("x", "y")
# -1.0