Code for simulated values #2

Open
egiovanoudi opened this issue Nov 5, 2023 · 2 comments

@egiovanoudi
Hello,
I would like to ask if you could provide me with the code for creating the simulated dataset.
Thanks in advance.

@bbardakk
Contributor

bbardakk commented Nov 14, 2023

Hi @egiovanoudi,

You can find the code block that creates the simulated dataset below.

import numpy as np
from pyts.image import RecurrencePlot

def create_simulated_data(number_of_time_points, number_of_cluster, samples):
    """
    Parameters
    number_of_time_points: (int), length of the time series
    number_of_cluster: (int), number of cluster
    samples: (int), number of samples
    
    Returns
    --------
    rp_clean: transformation of data_clean to image
    rp_noisy1: transformation of data_noisy1 to image
    rp_noisy2: transofrmation of data_noisy2 to image
    data_clean: 
    data_noisy1: 
    data_noisy2:
    """
    
    def add_noise(x):
        """Add a small Gaussian perturbation to the input.
        Parameters
        x: data

        return: noisy data
        """
        return x + np.random.normal(loc=0.0, scale=0.10, size=x.shape)

    # C groups (clusters) of time series will be generated.
    nt = number_of_time_points  # number of time points
    C = number_of_cluster  # number of clusters
    data = np.zeros((samples, nt, C))  # samples x time points x clusters
    targets = np.empty(shape=(1, 0))
    
    for i in range(C):
        initial_values = np.random.normal(loc=0.0, scale=1.0, size=(samples))  # (original note: this may not be realistic)
        a = np.random.normal(loc=1.0,scale=0.15,size=(nt))
        b = np.random.normal(loc=0.0,scale=1.0,size=(nt))

        # Place the initial values at t = 0.
        data[:, 0, i] = initial_values
        for j in range(1, nt):  # all time points except the first, since the initial values are already set
            data[:, j, i] = add_noise(np.ones(samples) * a[j]) * data[:, j-1, i] + np.ones(samples) * b[j]

        #Create targets!
        targets = np.hstack([targets,i*np.ones(shape=(1,samples))])

    # Stack the clusters vertically: data becomes (samples*C) x nt
    temp = np.empty(shape=(0, nt))
    for i in range(C):
        temp = np.vstack([temp, data[:, :, i]])
    data = temp

    #Noise addition
    data_clean = data
    data_noisy1 = data + np.random.normal(loc=1.0,scale=0.25,size=data.shape) #small noise (check SNR)
    data_noisy2 = data + np.random.normal(loc=1.0,scale=0.60,size=data.shape) #large noise (check SNR)

    #Let's normalize (starting point = 0)
    for i in range(data.shape[0]):
        data_clean[i,:] -= data[i,0]
        data_noisy1[i,:] -= data_noisy1[i,0]
        data_noisy2[i,:] -= data_noisy2[i,0]
        
    # Recurrence-plot images: one nt x nt image per sample
    rp_clean  = RecurrencePlot(percentage=20).fit_transform(data_clean)
    rp_noisy1 = RecurrencePlot(percentage=20).fit_transform(data_noisy1)
    rp_noisy2 = RecurrencePlot(percentage=20).fit_transform(data_noisy2)
    
    return rp_clean, rp_noisy1, rp_noisy2, data_clean, data_noisy1, data_noisy2
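
A minimal usage sketch (the parameter values below are illustrative, not taken from the issue):

# Hypothetical example call; adjust the arguments to your setting.
rp_clean, rp_noisy1, rp_noisy2, data_clean, data_noisy1, data_noisy2 = create_simulated_data(
    number_of_time_points=10, number_of_cluster=4, samples=50
)
print(data_clean.shape)  # (200, 10): samples*clusters rows, each a time series of length nt
print(rp_clean.shape)    # (200, 10, 10): one 10x10 recurrence-plot image per sample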

@egiovanoudi
Author

Thanks!
