I want to check how strongly two time series RDDs are correlated over time, but they do not have the same cardinality (i.e., they have different numbers of data points, because the data were collected at different timestamps). According to the Statistics API, both series must have the same number of partitions and the same cardinality. Some example code is below. I looked for an interpolation library in Spark that could bring the two series to the same number of partitions and cardinality, but found none. Is this possible with this library? Thank you.
from pyspark.mllib.stat import Statistics
sc = ... # SparkContext
seriesX = ... # a series
seriesY = ... # must have the same number of partitions and cardinality as seriesX
# Compute the correlation using Pearson's method. Enter "spearman" for Spearman's method. If a
# method is not specified, Pearson's method will be used by default.
print(Statistics.corr(seriesX, seriesY, method="pearson"))
data = ... # an RDD of Vectors
# calculate the correlation matrix using Pearson's method. Use "spearman" for Spearman's method.
# If a method is not specified, Pearson's method will be used by default.
print(Statistics.corr(data, method="pearson"))
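To make the question concrete, here is a rough sketch of the kind of alignment I have in mind, written in plain Python and run on the driver rather than in Spark. It is not taken from this library or from Spark; the helper names (`interpolate`, `align`, `pearson`) and the sample data are made up for illustration. It linearly interpolates both series onto the union of their timestamps within the overlapping time range, so that both end up with the same cardinality.

```python
from bisect import bisect_left
from math import sqrt


def interpolate(points, t):
    """Linearly interpolate sorted (timestamp, value) pairs at time t.

    Assumes points[0][0] <= t <= points[-1][0] and distinct timestamps.
    """
    times = [p[0] for p in points]
    i = bisect_left(times, t)
    if times[i] == t:
        return points[i][1]
    # t lies strictly between times[i - 1] and times[i]
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(series_a, series_b):
    """Resample both series on the union of their timestamps in the overlap."""
    a, b = sorted(series_a), sorted(series_b)
    lo = max(a[0][0], b[0][0])    # start of the overlapping time range
    hi = min(a[-1][0], b[-1][0])  # end of the overlapping time range
    grid = sorted({t for t, _ in a + b if lo <= t <= hi})
    return grid, [interpolate(a, t) for t in grid], [interpolate(b, t) for t in grid]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Hypothetical sample data: two series sampled at different timestamps.
series_x = [(0, 0.0), (2, 2.0), (4, 4.0)]
series_y = [(1, 3.0), (3, 7.0), (5, 11.0)]

grid, xs, ys = align(series_x, series_y)
r = pearson(xs, ys)
```

After this step `xs` and `ys` have the same length, so I assume they could be passed to `sc.parallelize` with the same number of slices and then to `Statistics.corr`, but that only works when the data fits on the driver. What I am looking for is a way to do this alignment on the RDDs themselves.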