[REVIEW]: quickBayes: An analytical approach to Bayesian loglikelihoods #6230

editorialbot opened this issue Jan 15, 2024 · 36 comments
Labels
Python review TeX Track: 5 (DSAIS) Data Science, Artificial Intelligence, and Machine Learning

@editorialbot
Collaborator

editorialbot commented Jan 15, 2024

Submitting author: @AnthonyLim23 (Anthony Lim)
Repository: https://github.com/ISISNeutronMuon/quickBayes
Branch with paper.md (empty if default branch): 82_paper
Version: 1.0.0b18
Editor: @osorensen
Reviewers: @JonasMoss, @mattpitkin, @prashjet
Archive: Pending

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/2439ebe608e0c76d3147e00735dab556"><img src="https://joss.theoj.org/papers/2439ebe608e0c76d3147e00735dab556/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/2439ebe608e0c76d3147e00735dab556/status.svg)](https://joss.theoj.org/papers/2439ebe608e0c76d3147e00735dab556)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@JonasMoss & @mattpitkin & @prashjet, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @osorensen know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Checklists

📝 Checklist for @JonasMoss

📝 Checklist for @mattpitkin

📝 Checklist for @prashjet

@editorialbot
Collaborator Author

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.88  T=0.15 s (676.3 files/s, 69408.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          69           1537           2092           5281
reStructuredText                16            201            316            279
YAML                             8             39             15            220
Markdown                         3             21              0             62
TeX                              1              3              0             40
DOS Batch                        1              8              1             26
make                             1              4              7              9
-------------------------------------------------------------------------------
SUM:                            99           1813           2431           5917
-------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

@editorialbot
Collaborator Author

Wordcount for paper.md is 666

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/0921-4526(92)90036-R is OK
- 10.1103/RevModPhys.83.943 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@mattpitkin

mattpitkin commented Jan 15, 2024

Review checklist for @mattpitkin

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the https://github.com/ISISNeutronMuon/quickBayes?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@AnthonyLim23) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@JonasMoss

JonasMoss commented Jan 16, 2024

Review checklist for @JonasMoss

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the https://github.com/ISISNeutronMuon/quickBayes?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@AnthonyLim23) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@mattpitkin

Hi @AnthonyLim23, I've just got a few minor things to suggest initially:

  • make the links to the documentation in the GitHub README file into actual working links (just switch from rst style links to Markdown style);
  • add the installation instructions to the documentation in https://quickbayes.readthedocs.io/;
  • fix the paper.bib file, so that the authors display properly in the paper, i.e., change it to:
@software{quasielasticbayes,
	  author = {{Sivia}, D. and {Howells}, S.},
	  title = {quasielasticbayes version 0.2.0}
}

@article{bayesPaper,
	 title = {Bayesian analysis of quasielastic neutron scattering data},
	 journal = {Physica B: Condensed Matter},
	 volume = {182},
	 number = {4},
	 pages = {341-348},
	 year = {1992},
	 note = {Quasielastic Neutron Scattering},
	 issn = {0921-4526},
	 doi = {10.1016/0921-4526(92)90036-R},
	 url = {https://www.sciencedirect.com/science/article/pii/092145269290036R},
	 author = {{Sivia}, D.~S. and {Carlile} C.~J. and {Howells}, W.~S. and {König}, S.},
	 abstract = {We consider the analysis of quasielastic neutron scattering data from a Bayesian point-of-view. This enables us to use probability theory to assess how many quasielastic components there is most evidence for in the data, as well as providing an optimal estimate of their parameters. We review the theory briefly, describe an efficient algorithm for its implementation and illustrate its use with both simulated and real data.}
}

@article{bayesReview,
         title = {Bayesian inference in physics},
         author = {{von Toussaint}, Udo},
         journal = {Rev. Mod. Phys.},
         volume = {83},
         issue = {3},
         pages = {943--999},
         year = {2011},
         month = {Sep},
         publisher = {American Physical Society},
         doi = {10.1103/RevModPhys.83.943},
         url = {https://link.aps.org/doi/10.1103/RevModPhys.83.943}
}

@book{bayesBook,
      author = {{Sivia}, D. and {Skilling}, J.},
      edition = {Second},
      publisher = {Oxford University Press},
      title = {Data Analysis A Bayesian Tutorial},
      year = {2006},
      isbn = {9780198568322}
}

@mattpitkin

In the developer section of the docs, could you please add a sentence or two stating how users/developers can:

  • Report issues or problems with the software
  • Seek support

i.e., just say that users can submit issues in the GitHub repo.

@AnthonyLim23

I have updated the PR for the paper with your bib file. I then edited the authors list for the software so the shared names are the same as the paper.

The other comments have been addressed in ISISNeutronMuon/quickBayes#88

@mattpitkin

@editorialbot generate pdf

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@mattpitkin

@AnthonyLim23 It looks like I made a typo in my version of the .bib file: {Carlile} C.~J. is missing a comma, so it should be {Carlile}, C.~J. Also, is there any possible date and URL that could be used for the Quasielasticbayes v0.2.0 reference?

While I'm at it, I spotted a minor typo: in the acknowledgements, quickBayes is spelled wrong.

@AnthonyLim23

For Quasielasticbayes 0.2.0 I don't have a date or URL. It was a programme written in the late 1980s or early 1990s for internal use by scientists, so it was never properly documented. I can put the GitHub repository that it has since been moved to (and updated to work with Python); what do you think?

@osorensen
Member

@AnthonyLim23, I think you can put the GitHub link that it has been moved to.

@AnthonyLim23

That should be everything updated.

@JonasMoss

I have now read much of the documentation and can only recommend a full rewrite of it.

Since this package is about Bayesian statistics, I need to know:

  • What's the model?
  • What are the priors?
  • How does the input data look?
  • Are there any approximations used, and if so, where, and what's the impact?
  • Where can I see the posteriors? If there are no posteriors, then what can I see? (E.g., the posterior mean or an approximation to it?)
  • What is the syntax of the functions and what do they return?
  • Can we see a real-world example?

To fix these problems it may help to take a look at, e.g., the docs for linear regression in pymc. Here it's clearly stated:

  • what the model is,
  • what the prior is,
  • how the model is specified in the program,
  • how the priors are specified in the program,
  • how to run a sampler and what it is called (NUTS).

More precise terminology would also help. I would also recommend fewer attempts at intuitive explanations of the statistics.

Here are a couple of more detailed responses regarding the loglikelihood page.

  • What you call "the loglikelihood" is called the "posterior probability", also in the book that you refer to. It is the probability of a parameter (number of peaks) conditioned on the data; the likelihood is a pmf/pdf as a function of its parameters. (Though this posterior probability is weird, as it conditions on $I$, which also contains parameters as far as I can see. But Sivia's writing is so opaque it's hard to tell.)
  • The code doesn't run properly when I copy and paste it.
  • Equation 4.20 in the book you refer to appears to hold for a certain kind of normal models; be explicit about this in the documentation - write up the regression model, all its assumptions, and its priors. (From Sivia et al's paper it appears the priors are flat.)
  • About this equation: You do not state what $D_k$, $\chi$ and $I$ are. And $H$ is the Hessian of what?
  • Finally, it isn't an equation, but an approximation proportional to the posterior probability. (I would add that you do not strictly speaking need to include this approximation in the documentation, a reference suffices. But you need to be clear that it is an approximation, what the model is, what the prior is, and what it approximates.)
  • It looks like Sivia's dealing with a regression model where $y\mid x =\sum_{j=1}^M A_j\phi(x;\mu_j)+\epsilon,$ where $\phi(x;\mu_j)$ are Gaussian functions and $\epsilon$ is a normal error term. Or maybe not? There is some background signal involved too, which I presume is known. Does your package support that?
  • What does the assumption "the fit for the model is at a minima" mean? I can't see any fitting model being described here.
  • Why do you have those logarithm rules in the documentation? That looks out of place.
  • Some writing does not serve a clear purpose, e.g., the paragraph following the line "This final equation is the unnormalized loglikelihood."
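
To make the distinction between the likelihood and the posterior probability concrete, here is a minimal sketch of the regression model conjectured in the list above: a sum of Gaussian profiles plus normal noise with known error bars. The function names, the fixed peak width `sigma`, and all numerical values are assumptions made for illustration; none of them are taken from quickBayes or from the Sivia & Skilling book.

```python
import math
import random


def gaussian_sum(x, amplitudes, centres, sigma=1.0):
    """Noise-free model: sum_j A_j * exp(-(x - mu_j)^2 / (2 sigma^2))."""
    return sum(
        a * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        for a, mu in zip(amplitudes, centres)
    )


def chi_squared(xs, ys, es, amplitudes, centres, sigma=1.0):
    """Sum of squared residuals, each weighted by its error bar."""
    return sum(
        ((y - gaussian_sum(x, amplitudes, centres, sigma)) / e) ** 2
        for x, y, e in zip(xs, ys, es)
    )


def log_likelihood(xs, ys, es, amplitudes, centres, sigma=1.0):
    """Gaussian log-likelihood: a function of the parameters, data held fixed."""
    norm = sum(math.log(e * math.sqrt(2.0 * math.pi)) for e in es)
    return -0.5 * chi_squared(xs, ys, es, amplitudes, centres, sigma) - norm


# Simulate data from a two-peak model with a known noise level (assumed values).
rng = random.Random(0)
xs = [i * 0.1 for i in range(-50, 51)]
es = [0.2] * len(xs)
true_amps, true_centres = [3.0, 2.0], [-1.5, 1.0]
ys = [
    gaussian_sum(x, true_amps, true_centres) + rng.gauss(0.0, e)
    for x, e in zip(xs, es)
]

# The likelihood scores candidate parameter values for a given model;
# it is not, by itself, a probability for the number of peaks.
print(log_likelihood(xs, ys, es, true_amps, true_centres))
print(log_likelihood(xs, ys, es, [3.0], [-1.5]))
```

A posterior probability for the number of peaks would additionally require priors on the amplitudes and centres and an integral over them, which is the step the documentation would need to spell out.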

@osorensen
Member

Thanks a lot for your thorough review of the documentation, @JonasMoss!

@mattpitkin

mattpitkin commented Feb 6, 2024

I agree with @JonasMoss's comments - the documentation needs very considerable work. The introduction should be a clear succinct summary of what the package is for - the example given is not very clear, and if included (I would actually recommend removing it) needs to be explained in more detail. As stated in the previous comments this should include: the model used, the model parameters that are being fit, how are they being fit, an explanation of the input data, e.g., something about choosing two different noise levels to show a particular effect. (As an aside, in the introduction example with the Gaussian profiles, the docs/code state that these have an amplitude of 103, but the plots show a value of about 50; how is the amplitude being defined?)

The API documentation also needs major work. Currently, the API docs exist, but following down the chain of links shows most functions/classes are very minimally documented, without any clear explanation of what they do or what the input arguments and outputs are. Having full API documentation is not a requirement for a JOSS paper, so long as there are several usage cases shown in examples that are well described and cover a decent range of the most used package functionality. I would recommend focusing on a few good quality examples and removing most of the API from the docs unless it points to well documented material.

@AnthonyLim23

In order to reduce the amount of work we all need to do, I want to make sure that we are agreed on what changes need to be made. I hope that is ok. I agree that the documentation needs more work, but I want it to be appropriate for what the package does.

This package is a bit strange as it uses some Bayesian statistics, but not directly. So I want to check how much detail on Bayesian statistics is appropriate.

My understanding is that normally Bayesian inference uses multiple "walkers"/"samples" that are initially distributed across the parameter space according to a prior distribution (usually flat). However, in this package a prior is not defined by the user. Instead they define some initial start parameters, which are then used for a fit (the user can choose the fit engine/method). In that sense there is only a single point for the prior (unless a multistart method is used).

There is currently no posterior distribution produced by the code (I would like to add it someday, but would want to verify the results against DREAM from bumps). At present the code just produces the peak value of the distribution. This result is then used to determine which model is more likely.

The models are defined as fitting functions, which when called return an array of the corresponding y values. These are used in the optimisation to calculate the local minima. I have included some "special" functions that allow the construction of more complicated models with minimal work (e.g. composite for addition and convolution). The models are then tried within a workflow to calculate the most likely models (e.g. is the data linear or quadratic).
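
As an illustration of the idea of fit functions that return an array of y values and can be combined by addition, here is a minimal sketch. `CompositeSum`, `gaussian` and `flat_background` are hypothetical names written for this example; they are not the package's classes, and the real interface may differ.

```python
import math


def gaussian(x, amplitude, centre, sigma):
    """Single Gaussian profile evaluated at one x value."""
    return amplitude * math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def flat_background(x, level):
    """Constant background, independent of x."""
    return level


class CompositeSum:
    """Hypothetical helper: add fit functions, each using its own parameters."""

    def __init__(self, *parts):
        # Each part is a (function, number_of_parameters) pair.
        self.parts = parts

    @property
    def n_params(self):
        return sum(n for _, n in self.parts)

    def __call__(self, xs, *params):
        if len(params) != self.n_params:
            raise ValueError(f"expected {self.n_params} parameters, got {len(params)}")
        ys = [0.0] * len(xs)
        start = 0
        for func, n in self.parts:
            # Hand each component its own slice of the flat parameter list.
            chunk = params[start:start + n]
            for i, x in enumerate(xs):
                ys[i] += func(x, *chunk)
            start += n
        return ys


# Background level first, then amplitude, centre and sigma of the Gaussian.
model = CompositeSum((flat_background, 1), (gaussian, 3))
ys = model([-1.0, 0.0, 1.0], 0.5, 2.0, 0.0, 1.0)
print(ys)
```

Keeping the parameters in one flat list is a design choice that makes the composite easy to pass to a generic optimiser, at the cost of the user having to remember the parameter order.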

Input data is generally just x, y, e and the user can also include additional x, y, e data that correspond to a resolution for Quasi Elastic Neutron Scattering instruments (e.g. Osiris at ISIS).

Most of the approximations are hidden in the derivation of the peak posterior probability. The big assumption is that it is being evaluated at the local minima, which is why we can use fitting to calculate the peak posterior probability.
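
The role of the fit in this kind of calculation can be sketched for the "linear or quadratic" question mentioned earlier. The code below is not the quickBayes implementation: it assumes polynomial models, Gaussian errors and flat priors of an arbitrary width of 20 on every coefficient, and it scores each model with a Laplace-type estimate of the log evidence evaluated at the $\chi^2$ minimum. All function names are made up for this example, and only the standard library is used to keep it dependency-free.

```python
import math


def solve_and_logdet(matrix, rhs):
    """Solve matrix @ sol = rhs by Gaussian elimination; also return log|det|."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    log_det = 0.0
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        log_det += math.log(abs(a[col][col]))
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    sol = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(a[row][k] * sol[k] for k in range(row + 1, n))
        sol[row] = (a[row][n] - tail) / a[row][row]
    return sol, log_det


def log_evidence_polynomial(xs, ys, es, degree, prior_width):
    """Laplace estimate of the log evidence, up to a model-independent constant."""
    k = degree + 1
    # Normal equations of weighted least squares: (A^T W A) p = A^T W y.
    ata = [[sum(x ** (i + j) / e ** 2 for x, e in zip(xs, es)) for j in range(k)]
           for i in range(k)]
    aty = [sum(y * x ** i / e ** 2 for x, y, e in zip(xs, ys, es)) for i in range(k)]
    params, log_det_ata = solve_and_logdet(ata, aty)
    chi2_min = sum(
        ((y - sum(p * x ** i for i, p in enumerate(params))) / e) ** 2
        for x, y, e in zip(xs, ys, es)
    )
    # The Hessian of chi^2 is 2 * A^T W A, so its log-determinant gains k * log(2).
    log_det_hessian = log_det_ata + k * math.log(2.0)
    return (-0.5 * chi2_min + 0.5 * k * math.log(4.0 * math.pi)
            - 0.5 * log_det_hessian - k * math.log(prior_width))


def preferred_degree(xs, ys, es, degrees=(1, 2), prior_width=20.0):
    """Return the polynomial degree with the largest estimated log evidence."""
    scores = {d: log_evidence_polynomial(xs, ys, es, d, prior_width) for d in degrees}
    return max(scores, key=scores.get)


# Noise-free test data (kept deterministic on purpose) with assumed error bars.
xs = [-1.0 + 0.04 * i for i in range(51)]
es = [0.1] * len(xs)
ys_linear = [1.0 + 0.5 * x for x in xs]
ys_quadratic = [1.0 + 0.5 * x + 2.0 * x * x for x in xs]
print("linear data    -> degree", preferred_degree(xs, ys_linear, es))
print("quadratic data -> degree", preferred_degree(xs, ys_quadratic, es))
```

For models that are linear in their parameters, as here, the Gaussian integral behind this estimate is exact provided the prior box comfortably contains the peak; for non-linear models such as sums of Gaussians it is only an approximation around the fitted minimum, which is why the quality of the fit matters.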

@JonasMoss for a real world example is it sufficient to just have images and show a line of code reading some data (from the tests) for the input?

I have included the log rules because my colleagues are unlikely to know them and I wanted them to have the information they needed to get from the equation in the book to the one implemented in the code.

@mattpitkin for the API examples would you like to see examples for the individual pieces (e.g. fitting functions, fitting engines) or just a few example workflows/model selection? Most of the API docs are autogenerated. I did try to write the docs in the style of a tutorial, but I assume that those examples are insufficient. What changes would you recommend that I make to those examples to improve the API documentation?

@mattpitkin

Hi @AnthonyLim23, sorry for the slow response.

> @mattpitkin for the API examples would you like to see examples for the individual pieces (e.g. fitting functions, fitting engines) or just a few example workflows/model selection? Most of the API docs are autogenerated. I did try to write the docs in the style of a tutorial, but I assume that those examples are insufficient. What changes would you recommend that I make to those examples to improve the API documentation?

I think it would be good to see, say, two fully worked examples demonstrating a realistic pipeline of usage of the code, including either real or simulated datasets. One could be, say, determining the number of Gaussian-like "lines" in some data and one could also incorporate some modelling of a background (or whatever you think appropriate). If there's a real analysis that it has been used on where the data is open and you are free to post it in the docs, that would be great, e.g., the QENS example that you have in the paper. You could do these examples in Jupyter notebooks and either just link to them in the docs or get Sphinx to render them, so that you can easily include figures (although this is not a requirement).

Some other bits of the API do need work though. The fit functions all at least need a short doc string saying what they are and what input parameters they require, otherwise they are not really useful.
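
As a rough indication of the level of detail meant here, a short docstring along the following lines would already be enough. The function `exp_decay`, its signature and its NumPy-style docstring layout are hypothetical and written only for this illustration; they are not taken from the package.

```python
import math


def exp_decay(x, amplitude, decay_rate):
    """Exponential decay, ``amplitude * exp(-decay_rate * x)``.

    Hypothetical example of a documented fit function; the name and
    signature are illustrative only.

    Parameters
    ----------
    x : sequence of float
        Points at which to evaluate the model (e.g. time).
    amplitude : float
        Value of the model at ``x = 0``.
    decay_rate : float
        Decay constant, in inverse units of ``x``.

    Returns
    -------
    list of float
        Model values, one per entry of ``x``.
    """
    return [amplitude * math.exp(-decay_rate * xi) for xi in x]


print(exp_decay([0.0, 1.0, 2.0], amplitude=2.0, decay_rate=0.5))
```

Docstrings in this layout are picked up by autogenerated API pages, so the same text serves both `help(exp_decay)` and the rendered documentation.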

In terms of other general comments (and again in relation to @JonasMoss's comments), I think the first paragraph of the paper nicely described the use case of the code, so that should be echoed in the docs. Essentially this code is for model comparison, and for each model you are (I think/assume) approximating the marginalised likelihood (also known as the evidence), where you are marginalising over all the unknown model parameters (can you hold any of the parameters at fixed values, or do you have to marginalise over them all?), as opposed to an analytical maximum likelihood approach. Maybe this is done using Laplace's method? You can state this in the docs and paper, but you should also briefly mention other approximation methods such as the AIC, BIC and nested sampling (which is the Monte Carlo method described at the end of the Sivia book). So, when you talk about loglikelihood in the docs I think you really mean marginal likelihood/evidence - please check if this is what you mean and make changes. You should also be explicit that priors on all model parameters are assumed to be flat and infinite in bounds (unless they are not?). This affects how you interpret the model comparison, so it does need to be stated. Is there any way to specifically set the prior ranges, or use different priors (say Gaussian priors, which can also be analytically marginalised over in certain cases)?
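
For reference, the quantities mentioned in the paragraph above can be written down as follows. This is the textbook Laplace-approximation argument, not a transcription of the package's documentation. It assumes Gaussian noise (so the likelihood is proportional to $\exp(-\chi^2/2)$) and flat priors of finite width $\Delta\theta_i$ that comfortably contain the peak. The symbols $k$ (number of free parameters), $N$ (number of data points), $\hat\theta$ (best-fit parameters) and $\Delta\theta_i$ are notation introduced here.

```latex
\begin{align}
Z &= P(D \mid M) = \int P(D \mid \theta, M)\, P(\theta \mid M)\, \mathrm{d}^k\theta, \\
P(D \mid \theta, M) &\propto \exp\!\left(-\tfrac{1}{2}\chi^2(\theta)\right), \qquad
P(\theta \mid M) = \prod_{i=1}^{k} \frac{1}{\Delta\theta_i}, \\
\chi^2(\theta) &\approx \chi^2_{\min}
  + \tfrac{1}{2}\,(\theta - \hat\theta)^{\mathsf{T}}\, \nabla\nabla\chi^2\, (\theta - \hat\theta), \\
Z &\propto \frac{(4\pi)^{k/2}\, \exp\!\left(-\tfrac{1}{2}\chi^2_{\min}\right)}
  {\sqrt{\det \nabla\nabla\chi^2}\; \prod_{i=1}^{k} \Delta\theta_i}, \\
\mathrm{AIC} &= \chi^2_{\min} + 2k + \text{const}, \qquad
\mathrm{BIC} = \chi^2_{\min} + k \ln N + \text{const}.
\end{align}
```

The factor $(4\pi)^{k/2}$ rather than $(2\pi)^{k/2}$ appears because the Hessian of $-\ln P(D \mid \theta, M)$ is $\tfrac{1}{2}\nabla\nabla\chi^2$. The prior widths enter the evidence explicitly, which is why the prior ranges matter for the model comparison, whereas AIC and BIC penalise the parameter count without reference to any prior.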

In the docs, it currently states that:

"Fit functions are the mathematical representations of the data"

but I don't really think this is quite what they are. The fit functions are a (potentially approximate) mathematical model for a process/phenomenon that you have collected data to measure. The data will be a combination of that modelled process and some noise (where the noise could just be measurement noise, or a combination of other unmodelled processes). So, you could say something like: "The fit functions are approximate mathematical models of the process that we intend to measure.".

@AnthonyLim23

@mattpitkin I have data in the tests for QENS and simulated muon data (number of exp decays). I can do two examples using them.

I will look into the rest of the API docs and make them clearer.

I believe that it is the marginalised likelihood, but I will double check. At present all of the parameters have to be free. I do want to make it easier to add fixes (the current method is https://github.com/ISISNeutronMuon/quickBayes/blob/main/src/quickBayes/fit_functions/stretch_exp_fixed.py, which is a bit clunky).

In terms of priors, from a user perspective it is purely the parameter range and start values for fit parameters. I believe the flat priors are baked into the assumptions used for deriving the analytical equation. I can touch on it, but the user will never be able to do anything about it.

@prashjet

prashjet commented Feb 27, 2024

Review checklist for @prashjet

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the https://github.com/ISISNeutronMuon/quickBayes?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@AnthonyLim23) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@prashjet

prashjet commented Feb 27, 2024

Hi there. Thanks for your patience waiting for my review, I've just got round to it after a busy period.

After reading the documentation and paper, I am still quite confused about which calculations this software is doing, and what information those calculations are telling me. I think the comment from @JonasMoss provides an excellent checklist of what should be included in the documentation so that someone can clearly understand what your software does and when to use it. I won't repeat the entire checklist here, but I'll highlight some key points in a response to your last message. You say...

> This package is a bit strange as it uses some Bayesian statistics, but not directly. So I want to check how much detail on Bayesian statistics is appropriate.

In principle, I think that black box software for Bayesian statistics may sometimes be useful (i.e. software that provides a solution, without necessarily going into details) but only if you can clearly explain what situations the black-box is suitable for. In the case of quickBayes, to answer this question you'll need to understand Equation 4.20 of the Sivia & Skilling book. This will require:

  • explaining all of the terms in Equation 4.20,
  • explaining the assumptions that have gone into deriving Equation 4.20,
  • doing both of the above using language/notation that is consistent throughout your software package.

That last point is a little opaque, so I'll try to clarify it with an example. Currently, your loglikelihood function takes an argument N_peaks. In the code snippet on the loglikelihood page, you set N_peaks=1 for both examples of fit_functions, even though one of them is a model with a single Gaussian, and one with two Gaussians. This leaves me asking what is the meaning of N_peaks and how (if at all?) does it relate to the number of peaks I have defined in my fit_function?

I think unravelling Equation 4.20 of the Sivia & Skilling book is likely to be a difficult task; however, given that this calculation is currently at the heart of your software package, I think it is unavoidable in order for anyone to know if quickBayes is the right tool for them. I think this will involve an entire re-write of the documentation.

p.s. I also had a problem with installation.

@osorensen
Member

Thanks to @JonasMoss, @prashjet and @mattpitkin for your thorough reviews!

@AnthonyLim23, I understand that a large number of issues need to be addressed here. It is ok if it takes time, but please keep us updated on any progress in this thread.

@AnthonyLim23

@osorensen thank you. It will take me a while as I have another project deadline that I need to focus on first.

@AnthonyLim23

AnthonyLim23 commented Apr 30, 2024

I have outlined my plan for the new documentation in ISISNeutronMuon/quickBayes#92. I think this should cover all of the comments.

I also have an issue to do the API changes ISISNeutronMuon/quickBayes#93

I have a potential fix for ISISNeutronMuon/quickBayes#89

@osorensen
Member

👋 @AnthonyLim23, how is it going with the work towards the issues mentioned above?

@AnthonyLim23

AnthonyLim23 commented May 28, 2024

@osorensen I have written an introduction to Bayes and done a derivation of the main equation, using simplified notation (I have also extended it). I still need to clean it up a bit, I also need to write the comparison to other methods and the examples. Unfortunately, I have to do this around other things. I am sorry it is taking so long. I hope to get some more done next week (I am happy to stick it in a PR so you can have a look).

I think I have fixed the issue with not being able to install it (ISISNeutronMuon/quickBayes#89); the fix is in ISISNeutronMuon/quickBayes#90.

@AnthonyLim23

The docs are a work in progress, but what I have so far is here: https://quickbayes--94.org.readthedocs.build/en/94/
The PR is ISISNeutronMuon/quickBayes#94

I still need to do:

  • add details on other Bayesian methods
  • add examples (QENS, muon and background selection)
  • a general clean up of the wording (particularly in the theory section)

However, it should give a good indication of the new documentation and if it is suitable.

@osorensen
Member

Thanks for the update @AnthonyLim23

@osorensen
Member

Could you please give us a progress update, @AnthonyLim23?

@AnthonyLim23

@osorensen I'm sorry this is taking so long; I am having to do this around other projects. I believe the only things left to do for the docs are:

  • Tidy up the wording for the derivations
  • The comparison to other techniques/models
  • The examples (I do have the Python notebooks working)

I am hoping to get some time this week to do some work on it.

@osorensen
Member

@AnthonyLim23, any updates on your progress with the docs?

@AnthonyLim23

Unfortunately, I have not had any opportunity to work on this. However, I am happy to receive feedback on what I have so far:

ISISNeutronMuon/quickBayes#94

@osorensen
Member

@AnthonyLim23, I suggest you report here when you have addressed all the issues, so the reviewers can consider it all at once.
