Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extracting width per frequency channel (VLBI) #64

Open
2 of 7 tasks
tcassanelli opened this issue Nov 24, 2022 · 12 comments
Open
2 of 7 tasks

Extracting width per frequency channel (VLBI) #64

tcassanelli opened this issue Nov 24, 2022 · 12 comments
Assignees
Labels
Baseband Related to integrating Fitburst into baseband enhancement New feature or request

Comments

@tcassanelli
Copy link

The purpose of this feature is to extract the width per frequency channel, giving an start and stop timestamp. Then this information will be used within the VLBI pipeline and applied to the outrigger sites. Applying this custom mask to each data set will reduce the noise and improve the correlation strength.
The idea is to run fitburst instantly once an event has been triggered and received by the outrigger sites, optimize DM and share these set of parameters to the outriggers sites (or perhaps a common site?). Then apply mask and correlate.

  • Run fitburst with baseband data (@ketansand branch)
  • Currently the software only returns a single width parameter, find and extract the full matrix of widths params per channel
  • Define width per frequency given a central timestamp (per frequency channel) and apply the geometric delay correction for the outrigger sites
  • Test first with PSR B0531+21 and older data sets, check whether the correlation strength improves
  • Test with the VLBI paper FRB, check whether the correlation strength improves
  • Integrate this feature in the main fitburst package
  • Integrate this feature in the future VLBI pipeline and automate the process
@tcassanelli tcassanelli added enhancement New feature or request Baseband Related to integrating Fitburst into baseband labels Nov 24, 2022
@SebastianManosalva
Copy link

Right now I am using the crab pulsar to test fitburst. The code for the individual plots is:
`import matplotlib.pyplot as plt
%matplotlib inline

plt.imshow(
spectrum_downsampled[idx_good_freq], origin = "lower", aspect="auto", interpolation="nearest",
extent=[min_time, max_time, min_freq, max_freq])
plt.xlabel("Time [ms]")
plt.ylabel("frequency [MHz]")`

The original signal:
crab pulse original

The bestfit model:
bestfit model crab pulse

@SebastianManosalva
Copy link

To extract the initial time from the h5 file I am using:

`from astropy.time import Time
data = BBData.from_file(path)
data_time = data["time0"]

start_time = Time(
val=data_time[:]['ctime'],
val2=data_time[:]['ctime_offset'],
format="unix",
precision=9
)`

For the final time an I_cut value is used to know where the pulse ends.

`from sklearn import preprocessing
import numpy as np

fit_copy = np.copy(bestfit) #where bestfit is the model obtained from fitburst
#normalize
fit_copy = preprocessing.normalize(fit_copy)

I_cut = 1e-4 #this value was used as an example
pulse_time =[]
pulso = np.greater(fit_copy, I_cut)
final_time = []
for j in range(bestfit.shape[0]):

for i in range(len(bestfit[j])):
    if pulso[j][i] == True:
        pulse_time.append(times_windowed[i])
final_time.append(pulse_time[-1])

#now with the start and final time the width can be obtained `

@SebastianManosalva
Copy link

Here I show the functions that allow us to calculate the width, start time of the pulse and final time of the pulse per frequency channel:

`def t_dd(DM, t_0, freq_0, freq_1):
k = 4.148e3
t_delta = k * DM *(1/(freq_0)-1/(freq_1))
t_1 = t_delta + t_0
return t_1

def get_initial_and_final_time(bestfit, fname, path, ratio):
"""
input:
* bestfit The bestfit_model from run_fitburst
* fname name of the .npz file from make_input
* path path of the .npz file from make_input
* ratio percentage that decides where to cut the signal, for now its a float value
output:
* initial_time in seconds
* final_time in seconds
"""
#get the data from the npz file
fname = "fitburst_crabpulse.npz"
path = "/home/smanosal/documentos"
data = DataReader(fname, data_location=path)
data.load_data()

fit_copy = np.copy(bestfit)
fit_copy = preprocessing.normalize(fit_copy)

#Get the time frames where the pulse ends and start
frames = data.times/data.res_time   # time frames
pulso = np.greater(fit_copy, ratio) # Bool values that indicate where is the pulse
acceptable_pulse = []               
n_initial = np.zeros(bestfit.shape[0]) # number of frames until the first point of the pulse (x 1024 feeds)
n_final = np.zeros(bestfit.shape[0])   # number of frames until the last point of the pulse (x 1024 feeds)


for j in range(bestfit.shape[0]):  #1024 freq channels

    acceptable_pulse.append(frames[pulso[j]])  
    n_initial[j] = acceptable_pulse[j][0]     # First value of the pulse (in frames)
    n_final[j] = acceptable_pulse[j][-1]      # Last value of the pulse (in frames)
initial_time = n_initial*data.res_time        # Initial time in seconds 
final_time = n_final*data.res_time            # Final time in seconds

return initial_time, final_time

def get_start_time(h5_file, initial_time, final_time, DM):
"""
input:
* h5_file the path of the h5 file
* initial_time from get_initial_and_final_time
* final_time from get_initial_and_final_time
* DM of the event
output:
* startTime astropy.Time object with the date and time of the start of the pulse
* finalTime astropy.Time object with the date and time of the final of the pulse
* width of the pulse per frequency channel
"""
data_h5 = h5py.File(h5_file, 'r')

#Generate the astropy.Time object 
data_time = data_h5["time0"]
start_time = Time(                            #start_time form each freq channel
        val=data_time[:]['ctime'],
        val2=data_time[:]['ctime_offset'],
        format="unix",
        precision=9
        )
freq_idx = data_h5["index_map"]["freq"][:]["id"].astype(int)
frequencies = data_h5["index_map"]["freq"][:]["centre"].astype(float) #* apu.MHz

zero1 = np.zeros(len(initial_time))   #1024
zero2 = np.zeros(len(initial_time))

for i in range(len(start_time.jd)):
    zero1[freq_idx-1] = start_time.jd1[i]       #REVISAR los indices!
    zero2[freq_idx-1] = start_time.jd2[i]
complete_start_time = zero1 + zero2             #complete with zeros start_time


F_channel = np.linspace(min(frequencies), max(frequencies), 1024)

for i in range(1024):
    if complete_start_time[i] == 0:
        complete_start_time[i] = t_dd(DM, complete_start_time[i-1], F_channel[i-1], F_channel[i]) #start time for all the 1024 feeds
        
t = Time(complete_start_time, format='jd', precision=9)   #to astropy.time format
finalform_isot = t.isot                                   #In order to work with initial_time we should work with isot
finalform_seconds = np.zeros(1024)
finalform_date = []
for i in range(1024):
    finalform_seconds[i] = float(finalform_isot[i][17:])  #seconds in float
    finalform_date.append(finalform_isot[i][:17])         #date and time in str


t_0 = initial_time +  finalform_seconds  # 1024x1 in seconds numpy array

t_f = final_time + finalform_seconds  # 1024x1 in seconds


width = t_f - t_0   #The width of the pulse per freq channel

start_isot =[]
finish_isot =[]
for i in range(1024):
    start_isot.append(finalform_date[i] + str(t_0[i]))   
    finish_isot.append(finalform_date[i] + str(t_f[i]))

startTime = Time(np.array(start_isot), format = "isot", precision=9)   #initial_time of the pulse in astropy.Time format
finishTime = Time(np.array(finish_isot), format = "isot", precision=9)  #final_time of the pulse in astropy.Time format

return width, startTime, finishTime

def get_widths(bestfit, fname, path, ratio, h5_file, DM):
initial_time, final_time = get_initial_and_final_time(bestfit, fname, path, ratio)
width, startTime, finishTime = get_start_time(h5_file, initial_time, final_time, DM)

#maybe I can save some files in this function or some plots
return width, startTime, finishTime`

@SebastianManosalva
Copy link

Here is how I use the get_widths function:

`import numpy as np
import h5py
import matplotlib.pyplot as plt
%matplotlib inline
import h5py
from astropy.time import Time
from astropy import units as apu
from sklearn import preprocessing
from fitburst.backend.generic import DataReader

fit = np.load("bestfit_fitburst_crabpulse.npz")
#fit.files
bestfit = fit["arr_0"]
fname = "fitburst_crabpulse.npz"
path = "/home/smanosal/documentos"
ratio = 5*10e-3
h5_file = "/data/user-data/cassanelli/Cs/chime/singlebeam_172516423_gain_20210603T130623.h5"
DM = 56.755

width, startTime, finishTime = get_widths(bestfit, fname, path, ratio, h5_file, DM)`

@SebastianManosalva
Copy link

And this is a plot using the initial_time and final_time from get_initial_and_final_time

Width crab pulse

@tcassanelli
Copy link
Author

@SebastianManosalva good result. First thing is, I'd like to know the start_time of the file and the number of frames for the position of the red and orange lines, if we have that then we are good. Can you test a bit more with the algorithm to select more of the pulse at lower frequencies? I see that I good portion of the signal is being lost at the end. Also can you plot those lines in the actual data? I think that will tell us whether your method is working or not.

@SebastianManosalva
Copy link

Here is what I have done since the last update:
I created a function that encapsulates all the processes of fitburst and my own code, the function receive the .npz file, a ratio and a save path. It returns the right and left limits of the pulse, the start_time per frequency channel (astropy.Time), the DM and the TOA.
But I am not sure if the TOA 's are correct, because they are far away from the pulse:
image
In green is the TOA's and orange/red are the limits.
I am looking what could be the issue.

@tcassanelli
Copy link
Author

hmm I doubt that all the top section was taken out. I think it is a plotting error, since those missing channels should be spread out across the entire band and not just at the top!

@SebastianManosalva
Copy link

The first funtion of make_input is called cut_profile it returns the frequency channels with the name freq and this frequencies goes from 400.390625 to 656.25, the other channels are then filled with zeros. So it isn't a plotting error.

@SebastianManosalva
Copy link

I had been testing the code with a new event singlebeam_173568547 and with different arrival times:

  1. First the TOA that I get from make_input
    image
  2. Second the average time between the two limits:
    image
  3. Third the TOA from baseband_analysis/analysis/toa/get_TOA
    image
  4. Fourth the fixed TOA from run_fitburst (this one has two values)
    image
    image

As I confirmed with Ziggy and Ketan the arrival time is always calculated for the 400.1953125 MHz band and between all the arrival times the last one looks really good!
But the code changed just a bit compared to the last time, so maybe the singlebeam_173098290 fit had something strange ...

@SebastianManosalva
Copy link

I have been trying different codes to get make_input to work as intended. The next function improves the data (coherent and incoherent dedispersion) and refines the DM.

`
from baseband_analysis.core.bbdata import BBData
from baseband_analysis.core.signal import _get_profile, get_floor, get_main_peak_lim, tiedbeam_baseband_to_power
from baseband_analysis.analysis.snr import get_snr
from fitburst.routines.profile import get_signal, count_components

def simplified_cut_profile(path, DM, t_res, DM_range = 1, w = None, downsample = 1, refine_RFI = True, fill_missing_time = None,
fill_nan = True, spectrum_lim = False, diagnostic_plots = False, valid_channels=None):

data = BBData.from_file(path)
#This function gets the power data in case the original data doesn't have it.
tiedbeam_baseband_to_power(data=data, dm=DM, time_downsample_factor=1, dedisperse=True, time_shift=False) 
#Optimize the DM, incoherent dedisp
freq_id, freq, power, offset, weight, valid_channels, time_range, DM, downsampling_factor = get_snr(data=data, DM=DM, diagnostic_plots=diagnostic_plots, w=w, valid_channels=valid_channels, spectrum_lim=spectrum_lim, DM_range=DM_range, downsample=downsample, refine_RFI=refine_RFI, fill_missing_time=fill_missing_time, fill_nan=fill_nan,return_full=True)
# get the two dimensional profile and cut the data 
profile = _get_profile(power)
start, end = get_signal(profile, ds = downsampling_factor)
# Find peaks in the profile
peaks = count_components(profile, t_res, ds = downsampling_factor, diagnostic_plots=diagnostic_plots)
peak_times = peaks * t_res
DM_err = 0.05/2

return data, freq_id, freq, power[...,start:end], valid_channels, DM, DM_err, downsampling_factor, profile, t_res, peak_times, start, power

`

@tcassanelli
Copy link
Author

ok, can you please post and update with a figure as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Baseband Related to integrating Fitburst into baseband enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants