Occasional buggy spectrum if array of equal errors given #110

Open
leigh2 opened this issue Feb 22, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@leigh2
Contributor

leigh2 commented Feb 22, 2024

I found that if TLS is provided an array of uniform errors, it will sometimes (often, in my tests) cause issues with the power spectrum, and hence with the (automatic) period selection.
I ran a few hundred tests of white-noise models with injected transits and found that in most cases the short-period end of the spectrum had massive erroneous peaks and troughs. The peaks naturally caused an incorrect period to be selected.
I've managed to reproduce this from your quick start tutorial notebook and pushed it to my fork here:
https://github.com/leigh2/tls/blob/error_array_bug_example/tutorials/01%20Quick%20start%20with%20synthetic%20data.ipynb
(Note that seeds 0 and 2 are fine, indicating that it only occasionally breaks!)

I've included a couple of other example power spectrum plots below, at different levels of brokenness.

This was using the current main branch version, on Linux with Python 3.11.5, FWIW.
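For context, a minimal sketch of the kind of call that triggers this (not the exact notebook setup; the time span, noise level, and seed below are illustrative):

```python
import numpy as np
from transitleastsquares import transitleastsquares

# Illustrative white-noise light curve; values are placeholders, not the
# notebook's exact parameters.
rng = np.random.default_rng(1)            # seeds 0 and 2 reportedly look fine
t = np.linspace(0, 27.4, 2000)            # days
y = 1.0 + rng.normal(0, 160e-6, t.size)   # normalised flux with white noise
dy = np.full(t.size, 160e-6)              # array of equal per-point errors

model = transitleastsquares(t, y, dy)     # passing the uniform error array
results = model.power()
# The short-period end of results.periods / results.power can then show the
# spurious peaks and troughs described above.
```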

[Power spectrum plots at 34 ppm/hr and 160 ppm/hr noise levels]

@leigh2 added the bug label Feb 22, 2024
@talensgj

I believe I've found the cause of this issue. If I'm right, it has nothing to do with using an array of uniform errors; instead, it is caused by the chi2 value used when no models are fit.

The function lowest_residuals_in_this_duration uses the number of datapoints in the lightcurve as the chi2 value when no models are fit. This is of course the expected value of the weighted sum of residuals, assuming the errors are Gaussian and the provided values are correct, but in reality the actual value of sum((y - 1)^2 / dy^2) will not be exactly equal to the number of points. A factor that greatly exacerbates this is the re-scaling of the user-provided uncertainties to have a mean of 1, which completely breaks this assumption.
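To illustrate the point numerically, here is a toy numpy sketch (made-up numbers, not TLS code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
sigma = 160e-6                                  # true white-noise level
y = 1.0 + rng.normal(0, sigma, n)
dy = np.full(n, sigma)                          # user-provided uniform errors

chi2_true = np.sum((y - 1.0)**2 / dy**2)        # ~n when dy matches the scatter
dy_rescaled = dy / dy.mean()                    # rescaled to a mean of 1
chi2_rescaled = np.sum((y - 1.0)**2 / dy_rescaled**2)  # ~n * sigma**2, nowhere near n

print(n, chi2_true, chi2_rescaled)
```

With the rescaled errors the actual weighted sum of squared residuals is orders of magnitude smaller than len(y), so len(y) is no longer a sensible stand-in for the no-model chi2.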

Most of the time this is not an issue, since the starting value will be replaced anyway, but especially at shorter periods it is possible that none of the trial depths exceed transit_depth_min = 10 ppm, and this wildly incorrect default value is returned.

I tested this assumption by replacing the assumed value of len(y) with the true value of sum((y - 1)^2 / dy^2) in my fork, and was able to make the spurious peaks and troughs go away in the lightcurve I was testing. I'm attaching two figures below comparing TLS to the astropy implementation of BLS. The first figure uses the current version of TLS; the other uses my fork with a fix for this issue.
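In code, the change amounts to something like the following (a simplified sketch, not the actual TLS source; the surrounding function in TLS will differ):

```python
import numpy as np

def no_model_chi2(y, dy):
    """Hypothetical helper: the chi2 to use when no transit model is fit.

    Instead of assuming the expected value len(y), compute the actual
    weighted sum of squared residuals of the flattened light curve.
    """
    return np.sum((y - 1.0) ** 2 / dy ** 2)

# Inside lowest_residuals_in_this_duration, the starting "no model" value
# would then be no_model_chi2(y, dy) rather than len(y), so periods where no
# trial depth exceeds transit_depth_min no longer fall back to a wildly
# wrong default chi2.
```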

TLS 1.32:
[Figure: TLS 1.32 vs astropy BLS power spectra]

My fix:
[Figure: fixed TLS vs astropy BLS power spectra]

@hippke
Owner

hippke commented Aug 20, 2024

Hey, thank you for the bug report @leigh2 and the fix @talensgj! I have come across this issue in the past myself but never had the time to dive deeper. Can you submit a pull request, @talensgj, or alternatively copy-paste the lines of code you changed? I will then publish a new version. Thank you so much!

@talensgj

Hello, I've made the pull request: #112. Glad I could help!
