
2022/06/02 meeting notes #1

Open

Victoria-Samboco opened this issue Jun 2, 2022 · 19 comments

Comments

@Victoria-Samboco
Contributor

Hi everyone

This is a summary of my understanding of the current situation and how to proceed:

1.) From the Change_dir.py script we get the RA and Dec of the Sun.

2.) We then add these coordinates to the 'DELAY_DIR', 'PHASE_DIR', 'REFERENCE_DIR' columns of the FIELD table in our MS.

3.) We then need to subtract the MODEL_DATA from the CORRECTED_DATA, or use CubiCal with residuals.

4.) This would remove most of the target field in our MS. We can then image the Sun using WSClean.

Please let me know if these are the correct steps.
Thank you.

@o-smirnov changed the title from "Today meeting notes" to "2022/06/02 meeting notes" on Jun 3, 2022
@o-smirnov
Collaborator

@Victoria-Samboco I have changed the title to include the actual meeting date instead of "today"; I suggest you do that for future issues like this.

Correction and clarification on the above: for a regular observation (i.e. of a stationary object, with constant RA/Dec), the DELAY_DIR, PHASE_DIR, REFERENCE_DIR columns of ::FIELD are already correctly populated; there is no need to touch them. The procedure should be:

  1. Subtract MODEL_DATA from corrected data (or alternatively, use CubiCal to generate corrected residuals), and write this into the CORRECTED_DATA column.

  2. Split the multiple-hour MS into scans (or even shorter units, e.g. 5 min or 10 min slices), using the CORRECTED_DATA column. This will give you a series of MSs, one per slice.

  3. Compute the RA/Dec of the Sun for the midpoint of each scan or slice, and run chgcentre on that MS to rephase it to the Sun (a sketch of this step is given below).

  4. Image the residuals.

For observations of Jupiter (or other moving targets), the FIELD subtable is not correctly populated. This is a hack done by the observatory to accommodate a moving target. For these observations, we have the additional step of computing the RA/Dec of Jupiter and writing it into the FIELD subtable of each scan or slice.
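
To make step 3 concrete, here is a minimal sketch of rephasing a single slice to the Sun. It assumes python-casacore and astropy are installed and that the chgcentre binary is on the PATH; the MS name is a hypothetical placeholder, and the geocentric solar position from astropy's get_sun is used, ignoring the small topocentric parallax.

import subprocess
import numpy as np
from casacore.tables import table
from astropy.time import Time
from astropy.coordinates import get_sun

myms = 'target_scan3.ms'   # hypothetical per-scan (or per-slice) MS

# Midpoint of the slice: the MS TIME column is in seconds on the MJD epoch
t = table(myms, ack=False)
times = np.unique(t.getcol('TIME'))
t.close()
tmid = Time(0.5 * (times[0] + times[-1]) / 86400.0, format='mjd')

# RA/Dec of the Sun at the slice midpoint, formatted as chgcentre expects
sun = get_sun(tmid)
ra = sun.ra.to_string(unit='hourangle', sep='hms', pad=True)
dec = sun.dec.to_string(unit='deg', sep='dms', pad=True)

# Rephase the MS to the Sun
subprocess.run(['chgcentre', myms, ra, dec], check=True)

For a moving target like Jupiter, the same structure applies, with the Sun's position replaced by the planet's ephemeris position at the slice midpoint.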

@Victoria-Samboco
Contributor Author

Noted. I will follow the instructions.

@Kincaidr
Collaborator

Kincaidr commented Jun 6, 2022

Could you send the script for splitting the MS?

@IanHeywood
Collaborator

If you want to split by scan then it's easy to use CASA, something like:

myms = '1620493279_sdp_l0_1024ch_NML.ms'

# Get the list of unique scan numbers from the main table
tb.open(myms)
scans = sorted(set(tb.getcol('SCAN_NUMBER')))
tb.done()

# Split each scan out into its own MS, keeping all data columns
for scan in scans:
    scan = str(scan)
    opms = myms.replace('.ms', '_scan'+scan+'.ms')
    split(vis=myms, outputvis=opms, scan=scan, datacolumn='all')

@IanHeywood
Collaborator

For finer-grained splitting I would make use of the timerange setting of the split task. There is a script here that prints the time information for a Measurement Set on a per-scan basis, or, if a scan number is specified as an argument along with the MS, it lists the times for each integration in that scan.

https://github.com/IanHeywood/oxkat/blob/master/tools/scan_times.py

This can probably be adapted to chop up an MS (or an individual scan MS) into a number of time bins.

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Jun 6, 2022 via email

@Victoria-Samboco
Contributor Author

Hi everyone
Here are some notes on what I have been doing since last week, as well as some doubts.

While carrying out points 1 and 2, we noticed that when splitting the MS after subtracting MODEL_DATA to obtain the residual data, the result of the split did not contain the RESIDUAL_DATA column, so we thought we could reverse the process and proceed as follows:

  1. Split the MS into its different scans.
  2. Subtract MODEL_DATA from CORRECTED_DATA to obtain RESIDUAL_DATA for each scan.
  3. Compute the RA/Dec for each scan and use chgcentre to change the phase centre of each Measurement Set.
  4. Image the residual data using WSClean.

Doubt 1:
I would like to know why, when doing the subtraction first and then splitting the MS, the RESIDUAL_DATA column no longer appears in each scan, so that we seem to have to subtract MODEL_DATA again for each scan. Wasn't it supposed that, once the subtraction was done on the full MS, each scan would also have the RESIDUAL_DATA column?

As for splitting the MS into 5 min slices, I managed to do it by adding 5 minutes manually at each iteration. I am now trying to do it automatically, because in the CASA split task the timerange parameter takes an initial and a final time, but there is no option to say "divide by this interval".

@IanHeywood
Collaborator

Hi Victoria. Are you adding a custom RESIDUAL_DATA column to the MS to hold the result of CORRECTED_DATA - MODEL_DATA? If so then CASA's split task might not recognise this non-standard column when the data are split, and it won't be added to the per-scan Measurement Sets.

For the five minute intervals it should be possible to convert the start time of the MS into a number of seconds, then produce a list of start / end times by adding 300 seconds per iteration. The time formatting options in astropy should then allow you to format this list into strings that CASA can recognise, to be passed to the timerange parameter.
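
A minimal sketch of that approach, assuming it is run inside a CASA session (so the tb tool and the split task are available) and that astropy is importable there; the MS name and the 300-second step are placeholders:

import numpy as np
from astropy.time import Time

myms = 'target.ms'   # hypothetical input MS
step = 300.0         # slice length in seconds (5 minutes)

# MS TIME values are seconds on the MJD epoch
tb.open(myms)
times = np.unique(tb.getcol('TIME'))
tb.done()

# Slice boundaries from the start to the end of the observation
edges = np.arange(times[0], times[-1] + step, step)

def casa_time(t):
    # Convert an MS TIME value (seconds) into a CASA timerange string
    date, clock = Time(t / 86400.0, format='mjd').iso.split()
    return date.replace('-', '/') + '/' + clock

for i in range(len(edges) - 1):
    trange = casa_time(edges[i]) + '~' + casa_time(edges[i + 1])
    opms = myms.replace('.ms', '_t%03d.ms' % i)
    split(vis=myms, outputvis=opms, timerange=trange, datacolumn='all')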

@Victoria-Samboco
Contributor Author

Yes, that's what I did. So does that mean it's not wrong to have to do the subtraction again for each scan after splitting the MS?

Also, about the imaging phase: I have doubts about the recommended number of iterations in the imaging process using WSClean. From what I have seen, the number of iterations influences the processing time, and I think it can also clean away some features that may be important in the image... I don't know if I'm wrong.

@Victoria-Samboco
Contributor Author

  1. First I wrote this code, where in timerange I specified the start and end date/time of the observation (which differ by 5 min in this case).

ms = '/home/samboco/VIVY/jovial_run/msdir/1622491578_sdp_l0-J2009_2026-corr_scan33.ms'
tb.open(ms)
T = sorted(set(tb.getcol('TIME')))
tb.done()

for scan in scans:
    scan = str(scan)
    SPLITED_ms = ms.replace('.ms', '_scan'+scan+'.ms')
    split(vis=ms, outputvis=SPLITED_ms, scan=scan, datacolumn='all',
          timerange='2021/06/01/03:37:43.059~2021/06/01/03:42:43.059')
    print(SPLITED_ms)

@Victoria-Samboco
Contributor Author

  1. To do the subtraction on the full MS we used:

from MSUtils.msutils import sumcols

ms = "/home/samboco/VIVY/jovial_run/msdir/1622491578_sdp_l0-J2009_2026-corr.ms" #call the measurement set

sumcols(ms, col1='MODEL_DATA', col2='CORRECTED_DATA', outcol= 'RESIDUAL_DATA', subtract=True)

  2. To do the subtraction for the individual scans:

import glob
from MSUtils.msutils import sumcols

def column_subtract(scans):
    for i in range(len(scans)):
        sumcols(scans[i], col1='MODEL_DATA', col2='CORRECTED_DATA', outcol='RESIDUAL_DATA', subtract=True)
    print('Done')

if __name__ == '__main__':
    scans = glob.glob('/home/samboco/VIVY/jovial_run/msdir/*.ms')
    scans.sort()
    column_subtract(scans)

@o-smirnov
Collaborator

@Victoria-Samboco I think what @IanHeywood means is that the CASA split task only knows about the three standard columns (DATA, MODEL_DATA, CORRECTED_DATA), and doesn't know how to split out a non-standard column like RESIDUAL_DATA.

So indeed, you have to split using datacolumn='all', and then do the subtraction on each individual sub-MS independently.

For future reference, when you clean up the workflow of the pipeline, it would be better to use a custom column (e.g. SELFCAL_DATA) for corrected visibilities during the calibration stages, then store the residuals into CORRECTED_DATA before splitting. That way you'll only need to split out the CORRECTED_DATA column. Faster, and uses less disk space.

Also, if you want to do quick column arithmetic from the command-line, you don't need MSUtils. The standard taql utility (part of casacore) can do it:

taql update 1622491578_sdp_l0-J2009_2026-corr.ms set CORRECTED_DATA=SELFCAL_DATA-MODEL_DATA

You'll probably find me using this in the jovial recipes somewhere... The only limitation is that this command will only work on existing columns (or maybe I just don't know how to make it create a new column).
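
If the output column does not exist yet, one option (a sketch, assuming python-casacore is installed; the column name follows the RESIDUAL_DATA example above) is to clone the description of an existing data column and add it first, after which taql can fill it:

from casacore.tables import table, maketabdesc, makecoldesc

ms = '1622491578_sdp_l0-J2009_2026-corr.ms'

t = table(ms, readonly=False, ack=False)
if 'RESIDUAL_DATA' not in t.colnames():
    # Copy the shape/type description of CORRECTED_DATA for the new column
    desc = t.getcoldesc('CORRECTED_DATA')
    t.addcols(maketabdesc(makecoldesc('RESIDUAL_DATA', desc)))
t.close()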

@IanHeywood
Collaborator

Also, about the imaging phase: I have doubts about the recommended number of iterations in the imaging process using WSClean. From what I have seen, the number of iterations influences the processing time, and I think it can also clean away some features that may be important in the image... I don't know if I'm wrong.

The optimum number of clean iterations will depend on the complexity of the field being imaged. If there are not enough iterations then not all of the emission will be deconvolved, and the image will be affected by the PSF sidelobes from the regions that have not been cleaned. Too many iterations risks over-cleaning, and as you say it can make the processing time unnecessarily long.

I think it is important to use a mask to constrain the deconvolution to regions where there is known to be genuine radio emission. In conjunction with this, I would set both a fixed number of iterations as well as a threshold. If the threshold is set to, for example, two times the expected RMS noise level, then this will ensure a deep clean within the mask without over-cleaning.
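
As a rough illustration of that combination (the image size, pixel scale, output name and mask file below are placeholders, not recommendations), such a run might look like:

wsclean -name scan3_sun -size 4096 4096 -scale 1.5asec -data-column CORRECTED_DATA -niter 10000 -mgain 0.8 -auto-threshold 2 -fits-mask sun_mask.fits target_scan3.ms

Here -auto-threshold 2 stops cleaning at twice the estimated noise, -niter caps the total number of iterations, and -fits-mask restricts deconvolution to the masked regions.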

Examining the residual image is very informative, as this shows you what remains after the part of the sky that has been deconvolved has been removed. If there is significant emission remaining then the number of iterations is probably too few (or perhaps the mask needs refining). The -continue option in wsclean is useful, as you can resume a deconvolution process in this situation.

Like everything, this will become clearer with repeated practice, so don't be afraid to just experiment with different imaging parameters. If something is not clear then please ask, and feel free to post images if there is something specific you would like to discuss. If you want to open a specific "imaging" issue to split out this discussion then please do so.

@Victoria-Samboco
Contributor Author

@IanHeywood and @o-smirnov thank you very much for the explanation. So below are the images of the scans.

In the first set of images, I see that from scan 10 to 15 the number of white points increases, which in my perception indicates an increase in emission in that region (correct me if I'm wrong). Then in scan 17 the image appears "empty", I can't see anything there, and in the following image it is almost completely white. Bearing in mind that the objective with these images is to identify/image the Sun, can we say that the white spots that appear over time are the result of solar radiation? And that the completely white image could be the Sun? In that case, could it be radio interference caused by the Sun?

These images are from the 17 scans extracted from the MS. I am now in the process of imaging the smaller slices of about 5 min (I didn't do the split in the most elegant way possible, but it is something I am still trying to solve with the help of the tips given by the Profs and Robert).

[attached images: scans_images_3_to_23, scan_image2_3_to_23, scans_image_26_to_41, scan_images2_26_to_41]

@o-smirnov
Collaborator

No, something is wrong here.

The completely blank (black) images are scans that are completely flagged; for whatever reason, this could be entirely legit. We can worry about that later. Ignore those scans for now.

The almost-white images are scans that are very heavily flagged. There can be legit reasons for this too. Ignore them too for now.

But the black-with-noise images are scans with good data. These look like images of empty sky, which means either you're imaging at the wrong position, or you haven't rephased correctly. If everything goes right, you must see the disk of the Sun in all good scans.

One step at a time -- I suggest concentrating on one good scan (e.g. scan 3) and figuring out what's going wrong. Until you can see the Sun in scan 3, there's no point in running the rest of the pipeline.

@Victoria-Samboco
Contributor Author

Ohh, okay. I'll have a look to see what happened. Is there any possibility that the problem was with the initial step, the calibration (step zero)? Or should I worry about the process that comes after calibration?

@o-smirnov
Collaborator

Check what sort of images you're getting after calibration, but before subtracting the model and rephasing. Is the target field looking sensible? If yes, then calibration is not the problem.

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Jun 24, 2022 via email

@Kincaidr
Collaborator

Kincaidr commented Jul 2, 2022

I think the problem might be chgcentre. Firstly, I split the MS in a different way compared to @Victoria-Samboco. Instead of giving a specific time from the listobs file, I just give a number (called interval) that tells us how finely to split the MS. I then use the start and end times of the MS (t0 and t1) to find the time interval dt of each MS:

import numpy
from astropy.time import Time

array = []
tb.open(path_to_ms)
all_times = list(numpy.unique(tb.getcol('TIME')))
t0 = all_times[0]
t1 = all_times[-1]
dt = (t1 - t0) / interval
for i in range(interval):
    t2 = dt * i + t0
    t_iso = Time(t2 / 86400.0, format='mjd').iso
    array.append(t_iso)

For my run I chose interval = 100, which means that each of the 100 MSs is approximately just over 5 minutes long. After subtracting the columns, I used chgcentre. This is the output for one of the MSs:

In [31]: os.system(f"chgcentre {scans[50]} {coordinates_ra[0]} {coordinates_dec[0]}")
A program to change the phase centre of a measurement set.
Written by André Offringa ([email protected]).

Processing field "J2009-2026": -03h50m23.2s -20d26m46s -> 00h04m48.69s 00d23m03.7s (61.1156 deg)
Denormal shifting: 0,0 -> 0,0
Old uvw: [0, 0, 0] (0)
New [0, 0, 0] (0)

Old uvw: [0, 0, 0] (0)
New [0, 0, 0] (0)

Old uvw: [0, 0, 0] (0)
New [0, 0, 0] (0)

Old uvw: [0, 0, 0] (0)
New [0, 0, 0] (0)

Old uvw: [0, 0, 0] (0)
New [0, 0, 0] (0)

Changing phase centre: 0%....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%
Out[31]: 0

Are we supposed to be getting zeros like this? The first line seems correct: Processing field "J2009-2026": -03h50m23.2s -20d26m46s -> 00h04m48.69s 00d23m03.7s (61.1156 deg), but I'm not sure why I am getting zeros for the old and new uvw.

Also, comparing the image from this output (5 min phase-shifted MS) to the image of the full MS (~9 hr observation) before any phase-shifting:

[attached image]

They look the same, so it seems no phase-shifting took place.
