Extracting width per frequency channel (VLBI) #64

tcassanelli · 2022-11-24T17:45:23Z

The purpose of this feature is to extract the width per frequency channel, giving an start and stop timestamp. Then this information will be used within the VLBI pipeline and applied to the outrigger sites. Applying this custom mask to each data set will reduce the noise and improve the correlation strength.
The idea is to run fitburst instantly once an event has been triggered and received by the outrigger sites, optimize DM and share these set of parameters to the outriggers sites (or perhaps a common site?). Then apply mask and correlate.

Run fitburst with baseband data (@ketansand branch)
Currently the software only returns a single width parameter, find and extract the full matrix of widths params per channel
Define width per frequency given a central timestamp (per frequency channel) and apply the geometric delay correction for the outrigger sites
Test first with PSR B0531+21 and older data sets, check whether the correlation strength improves
Test with the VLBI paper FRB, check whether the correlation strength improves
Integrate this feature in the main fitburst package
Integrate this feature in the future VLBI pipeline and automate the process

The text was updated successfully, but these errors were encountered:

SebastianManosalva · 2022-11-28T19:30:22Z

Right now I am using the crab pulsar to test fitburst. The code for the individual plots is:
`import matplotlib.pyplot as plt
%matplotlib inline

plt.imshow(
spectrum_downsampled[idx_good_freq], origin = "lower", aspect="auto", interpolation="nearest",
extent=[min_time, max_time, min_freq, max_freq])
plt.xlabel("Time [ms]")
plt.ylabel("frequency [MHz]")`

The original signal:

The bestfit model:

SebastianManosalva · 2022-11-28T20:43:10Z

To extract the initial time from the h5 file I am using:

`from astropy.time import Time
data = BBData.from_file(path)
data_time = data["time0"]

start_time = Time(
val=data_time[:]['ctime'],
val2=data_time[:]['ctime_offset'],
format="unix",
precision=9
)`

For the final time an I_cut value is used to know where the pulse ends.

`from sklearn import preprocessing
import numpy as np

fit_copy = np.copy(bestfit) #where bestfit is the model obtained from fitburst
#normalize
fit_copy = preprocessing.normalize(fit_copy)

I_cut = 1e-4 #this value was used as an example
pulse_time =[]
pulso = np.greater(fit_copy, I_cut)
final_time = []
for j in range(bestfit.shape[0]):

for i in range(len(bestfit[j])):
    if pulso[j][i] == True:
        pulse_time.append(times_windowed[i])
final_time.append(pulse_time[-1])

#now with the start and final time the width can be obtained `

SebastianManosalva · 2022-12-12T16:26:44Z

Here I show the functions that allow us to calculate the width, start time of the pulse and final time of the pulse per frequency channel:

`def t_dd(DM, t_0, freq_0, freq_1):
k = 4.148e3
t_delta = k * DM *(1/(freq_0)-1/(freq_1))
t_1 = t_delta + t_0
return t_1

def get_initial_and_final_time(bestfit, fname, path, ratio):
"""
input:
* bestfit The bestfit_model from run_fitburst
* fname name of the .npz file from make_input
* path path of the .npz file from make_input
* ratio percentage that decides where to cut the signal, for now its a float value
output:
* initial_time in seconds
* final_time in seconds
"""
#get the data from the npz file
fname = "fitburst_crabpulse.npz"
path = "/home/smanosal/documentos"
data = DataReader(fname, data_location=path)
data.load_data()

fit_copy = np.copy(bestfit)
fit_copy = preprocessing.normalize(fit_copy)

#Get the time frames where the pulse ends and start
frames = data.times/data.res_time   # time frames
pulso = np.greater(fit_copy, ratio) # Bool values that indicate where is the pulse
acceptable_pulse = []               
n_initial = np.zeros(bestfit.shape[0]) # number of frames until the first point of the pulse (x 1024 feeds)
n_final = np.zeros(bestfit.shape[0])   # number of frames until the last point of the pulse (x 1024 feeds)


for j in range(bestfit.shape[0]):  #1024 freq channels

    acceptable_pulse.append(frames[pulso[j]])  
    n_initial[j] = acceptable_pulse[j][0]     # First value of the pulse (in frames)
    n_final[j] = acceptable_pulse[j][-1]      # Last value of the pulse (in frames)
initial_time = n_initial*data.res_time        # Initial time in seconds 
final_time = n_final*data.res_time            # Final time in seconds

return initial_time, final_time

def get_start_time(h5_file, initial_time, final_time, DM):
"""
input:
* h5_file the path of the h5 file
* initial_time from get_initial_and_final_time
* final_time from get_initial_and_final_time
* DM of the event
output:
* startTime astropy.Time object with the date and time of the start of the pulse
* finalTime astropy.Time object with the date and time of the final of the pulse
* width of the pulse per frequency channel
"""
data_h5 = h5py.File(h5_file, 'r')

#Generate the astropy.Time object 
data_time = data_h5["time0"]
start_time = Time(                            #start_time form each freq channel
        val=data_time[:]['ctime'],
        val2=data_time[:]['ctime_offset'],
        format="unix",
        precision=9
        )
freq_idx = data_h5["index_map"]["freq"][:]["id"].astype(int)
frequencies = data_h5["index_map"]["freq"][:]["centre"].astype(float) #* apu.MHz

zero1 = np.zeros(len(initial_time))   #1024
zero2 = np.zeros(len(initial_time))

for i in range(len(start_time.jd)):
    zero1[freq_idx-1] = start_time.jd1[i]       #REVISAR los indices!
    zero2[freq_idx-1] = start_time.jd2[i]
complete_start_time = zero1 + zero2             #complete with zeros start_time


F_channel = np.linspace(min(frequencies), max(frequencies), 1024)

for i in range(1024):
    if complete_start_time[i] == 0:
        complete_start_time[i] = t_dd(DM, complete_start_time[i-1], F_channel[i-1], F_channel[i]) #start time for all the 1024 feeds
        
t = Time(complete_start_time, format='jd', precision=9)   #to astropy.time format
finalform_isot = t.isot                                   #In order to work with initial_time we should work with isot
finalform_seconds = np.zeros(1024)
finalform_date = []
for i in range(1024):
    finalform_seconds[i] = float(finalform_isot[i][17:])  #seconds in float
    finalform_date.append(finalform_isot[i][:17])         #date and time in str


t_0 = initial_time +  finalform_seconds  # 1024x1 in seconds numpy array

t_f = final_time + finalform_seconds  # 1024x1 in seconds


width = t_f - t_0   #The width of the pulse per freq channel

start_isot =[]
finish_isot =[]
for i in range(1024):
    start_isot.append(finalform_date[i] + str(t_0[i]))   
    finish_isot.append(finalform_date[i] + str(t_f[i]))

startTime = Time(np.array(start_isot), format = "isot", precision=9)   #initial_time of the pulse in astropy.Time format
finishTime = Time(np.array(finish_isot), format = "isot", precision=9)  #final_time of the pulse in astropy.Time format

return width, startTime, finishTime

def get_widths(bestfit, fname, path, ratio, h5_file, DM):
initial_time, final_time = get_initial_and_final_time(bestfit, fname, path, ratio)
width, startTime, finishTime = get_start_time(h5_file, initial_time, final_time, DM)

#maybe I can save some files in this function or some plots
return width, startTime, finishTime`

SebastianManosalva · 2022-12-12T16:28:59Z

Here is how I use the get_widths function:

`import numpy as np
import h5py
import matplotlib.pyplot as plt
%matplotlib inline
import h5py
from astropy.time import Time
from astropy import units as apu
from sklearn import preprocessing
from fitburst.backend.generic import DataReader

fit = np.load("bestfit_fitburst_crabpulse.npz")
#fit.files
bestfit = fit["arr_0"]
fname = "fitburst_crabpulse.npz"
path = "/home/smanosal/documentos"
ratio = 5*10e-3
h5_file = "/data/user-data/cassanelli/Cs/chime/singlebeam_172516423_gain_20210603T130623.h5"
DM = 56.755

width, startTime, finishTime = get_widths(bestfit, fname, path, ratio, h5_file, DM)`

SebastianManosalva · 2022-12-12T16:30:52Z

And this is a plot using the initial_time and final_time from get_initial_and_final_time

tcassanelli · 2022-12-13T12:00:58Z

@SebastianManosalva good result. First thing is, I'd like to know the start_time of the file and the number of frames for the position of the red and orange lines, if we have that then we are good. Can you test a bit more with the algorithm to select more of the pulse at lower frequencies? I see that I good portion of the signal is being lost at the end. Also can you plot those lines in the actual data? I think that will tell us whether your method is working or not.

SebastianManosalva · 2023-03-09T20:48:19Z

Here is what I have done since the last update:
I created a function that encapsulates all the processes of fitburst and my own code, the function receive the .npz file, a ratio and a save path. It returns the right and left limits of the pulse, the start_time per frequency channel (astropy.Time), the DM and the TOA.
But I am not sure if the TOA 's are correct, because they are far away from the pulse:

In green is the TOA's and orange/red are the limits.
I am looking what could be the issue.

tcassanelli · 2023-03-10T12:35:39Z

hmm I doubt that all the top section was taken out. I think it is a plotting error, since those missing channels should be spread out across the entire band and not just at the top!

SebastianManosalva · 2023-03-10T13:59:10Z

The first funtion of make_input is called cut_profile it returns the frequency channels with the name freq and this frequencies goes from 400.390625 to 656.25, the other channels are then filled with zeros. So it isn't a plotting error.

SebastianManosalva · 2023-03-17T15:31:33Z

I had been testing the code with a new event singlebeam_173568547 and with different arrival times:

First the TOA that I get from make_input
Second the average time between the two limits:
Third the TOA from baseband_analysis/analysis/toa/get_TOA
Fourth the fixed TOA from run_fitburst (this one has two values)

As I confirmed with Ziggy and Ketan the arrival time is always calculated for the 400.1953125 MHz band and between all the arrival times the last one looks really good!
But the code changed just a bit compared to the last time, so maybe the singlebeam_173098290 fit had something strange ...

SebastianManosalva · 2023-05-10T18:34:27Z

I have been trying different codes to get make_input to work as intended. The next function improves the data (coherent and incoherent dedispersion) and refines the DM.

`
from baseband_analysis.core.bbdata import BBData
from baseband_analysis.core.signal import _get_profile, get_floor, get_main_peak_lim, tiedbeam_baseband_to_power
from baseband_analysis.analysis.snr import get_snr
from fitburst.routines.profile import get_signal, count_components

def simplified_cut_profile(path, DM, t_res, DM_range = 1, w = None, downsample = 1, refine_RFI = True, fill_missing_time = None,
fill_nan = True, spectrum_lim = False, diagnostic_plots = False, valid_channels=None):

data = BBData.from_file(path)
#This function gets the power data in case the original data doesn't have it.
tiedbeam_baseband_to_power(data=data, dm=DM, time_downsample_factor=1, dedisperse=True, time_shift=False) 
#Optimize the DM, incoherent dedisp
freq_id, freq, power, offset, weight, valid_channels, time_range, DM, downsampling_factor = get_snr(data=data, DM=DM, diagnostic_plots=diagnostic_plots, w=w, valid_channels=valid_channels, spectrum_lim=spectrum_lim, DM_range=DM_range, downsample=downsample, refine_RFI=refine_RFI, fill_missing_time=fill_missing_time, fill_nan=fill_nan,return_full=True)
# get the two dimensional profile and cut the data 
profile = _get_profile(power)
start, end = get_signal(profile, ds = downsampling_factor)
# Find peaks in the profile
peaks = count_components(profile, t_res, ds = downsampling_factor, diagnostic_plots=diagnostic_plots)
peak_times = peaks * t_res
DM_err = 0.05/2

return data, freq_id, freq, power[...,start:end], valid_channels, DM, DM_err, downsampling_factor, profile, t_res, peak_times, start, power

`

tcassanelli · 2023-05-11T14:39:36Z

ok, can you please post and update with a figure as well.

tcassanelli added enhancement New feature or request Baseband Related to integrating Fitburst into baseband labels Nov 24, 2022

tcassanelli assigned tcassanelli and SebastianManosalva Nov 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extracting width per frequency channel (VLBI) #64

Extracting width per frequency channel (VLBI) #64

tcassanelli commented Nov 24, 2022

SebastianManosalva commented Nov 28, 2022

SebastianManosalva commented Nov 28, 2022

SebastianManosalva commented Dec 12, 2022

SebastianManosalva commented Dec 12, 2022

SebastianManosalva commented Dec 12, 2022

tcassanelli commented Dec 13, 2022

SebastianManosalva commented Mar 9, 2023

tcassanelli commented Mar 10, 2023

SebastianManosalva commented Mar 10, 2023

SebastianManosalva commented Mar 17, 2023

SebastianManosalva commented May 10, 2023

tcassanelli commented May 11, 2023

Extracting width per frequency channel (VLBI) #64

Extracting width per frequency channel (VLBI) #64

Comments

tcassanelli commented Nov 24, 2022

SebastianManosalva commented Nov 28, 2022

SebastianManosalva commented Nov 28, 2022

SebastianManosalva commented Dec 12, 2022

SebastianManosalva commented Dec 12, 2022

SebastianManosalva commented Dec 12, 2022

tcassanelli commented Dec 13, 2022

SebastianManosalva commented Mar 9, 2023

tcassanelli commented Mar 10, 2023

SebastianManosalva commented Mar 10, 2023

SebastianManosalva commented Mar 17, 2023

SebastianManosalva commented May 10, 2023

tcassanelli commented May 11, 2023