Feature Request [enhancement]: Support for Multiple Response Variables using Syntax in TuringGLM.jl #93

Open
kiante-fernandez opened this issue Jun 25, 2023 · 6 comments

Comments

@kiante-fernandez
Collaborator

I would like to inquire if any ongoing work exists for a feature in TuringGLM.jl that supports a syntax for specifying multiple response variables within a single model. Based on my understanding, the current formula syntax in TuringGLM.jl only allows modeling of a single response variable. However, having the capability to model multiple response variables would significantly enhance the usability and convenience of TuringGLM.jl.

Requested Feature

Compact Syntax for Multiple Response Variables: Implement a compact syntax in TuringGLM.jl that enables users to specify multiple response variables within a single model specification. This would allow users to model the relationships between multiple dependent variables and independent variables.

Proposed Formula Syntax

A compact syntax for multiple response variables could look similar to that of brms:

formula = @formula(y ~ condition + (condition | id),
                   w ~ condition + (condition | id),
                   v ~ condition + (condition | id))

In this proposed syntax, each response variable (y, w, v) is specified on a separate line, followed by the fixed effects (condition) and the random effects ((condition | id)).

If someone could provide guidance on where to start with these modifications, I would be happy to contribute to the implementation.

@kiante-fernandez kiante-fernandez changed the title Feature Request: Support for Multiple Response Variables using Syntax in TuringGLM.jl Feature Request [enhancement]: Support for Multiple Response Variables using Syntax in TuringGLM.jl Jun 25, 2023
@storopoli
Member

There is no ongoing work on this, but PRs are welcome.

@DominiqueMakowski

Related to this, we (also tagging @itsdfish) would like to see whether it's possible to provide a TuringGLM interface for the reaction time models implemented in SequentialSamplingModels.jl.

If I understand correctly, most of the heavy lifting is done in turing_model(), which defines the model, priors, etc., so my guess is that we should implement a _model() function for the distributions we are interested in.

Could you perhaps guide us (or add a section to the documentation) on which methods need to be implemented in order to add new model families, such as _model(), _prior(), etc.?

Perhaps the new package extension system would be useful?

From there, we could see how to extend the formula macro to work with multi-parameter formulas.

Thanks a lot!

@storopoli
Member

You would need to extract the multiple responses from the @formula macro, which comes from StatsModels.jl:

using StatsModels: @formula

Then, indeed, create _model() and _prior() functions.

You would, of course, also need to add docs and tests.
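
As a rough illustration of the "extract the responses" step, here is a minimal sketch that assumes one single-response formula per outcome rather than the proposed multi-formula syntax, which @formula does not provide. The variable names (y, w, v, condition) are taken from the proposal above; the `formulas` and `responses` names are hypothetical and not part of TuringGLM.jl.

```julia
using StatsModels: @formula, FormulaTerm

# Assumption: one ordinary single-response formula per outcome,
# collected in a vector (a stand-in for the proposed compact syntax).
formulas = [
    @formula(y ~ 1 + condition),
    @formula(w ~ 1 + condition),
    @formula(v ~ 1 + condition),
]

# Each formula is a FormulaTerm with `lhs` and `rhs` fields.
# For a single response, `lhs` is a Term whose `sym` field is the name.
responses = [f.lhs.sym for f in formulas]
```

A multi-response front end could loop over such a collection and hand each response and its right-hand side to the corresponding _model() and _prior() methods.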

@storopoli
Member

> Perhaps the new package extension system would be useful?

I don't mind reviewing PRs. If you want to implement this inside TuringGLM.jl, let me know.

@itsdfish

itsdfish commented Jun 12, 2024

Last year we had a discussion and tentatively converged on the following syntax:

@formula((c,rt) ~ LBA,
         drift ~ 1 + Condition,
         threshold ~ 1 + Condition,
         ndt ~ 1 + Condition)

where drift is an unbounded vector, and threshold and ndt are non-negative scalars. Before I dig into the package more, I was hoping to get an idea of the feasibility of implementing the macro for these types of models. If you don't mind, can you please tell me whether the following are possible?

  1. Can we support vector parameters? For example, can we broadcast ~ normal(0, 1) over the vector, and can we assign specific priors to each element e.g., drift[1] ~ normal(0, 1), drift[2] ~ normal(1, 2)?
  2. Is there a way to enforce bounds on parameters, e.g., ndt ~ beta0 + x1 * beta1 ... + ... xn * betan >= 0?

Edit

Maybe the solution to item 2 is as simple as using a truncated normal?

@storopoli
Member

> Is there a way to enforce bounds on parameters, e.g., ndt ~ beta0 + x1 * beta1 ... + ... xn * betan >= 0?

Yes: use truncated(d; lower, upper) from Distributions.jl.
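
For instance, a minimal sketch of a lower-bounded prior, assuming a standard normal base distribution (the choice of Normal(0, 1) and the name `ndt_prior` are illustrative only). Note that this constrains the distribution being truncated, not a linear predictor built from several coefficients.

```julia
using Distributions

# Hypothetical prior for a non-negative parameter such as `ndt`:
# a standard normal restricted to [0, Inf).
ndt_prior = truncated(Normal(0, 1); lower=0)

# Every draw respects the lower bound.
draws = rand(ndt_prior, 1_000)
```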

> Can we support vector parameters? For example, can we broadcast ~ normal(0, 1) over the vector, and can we assign specific priors to each element e.g., drift[1] ~ normal(0, 1), drift[2] ~ normal(1, 2)?

Maybe with filldist and arraydist, though there may be performance concerns.
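
A minimal sketch of both options in a plain Turing model, assuming a drift vector of length 2. The model name `drift_priors` and the variable names are hypothetical, and this is not how TuringGLM.jl builds its models.

```julia
using Turing

@model function drift_priors()
    # Same prior for every element of the vector (assumed length 2).
    drift_shared ~ filldist(Normal(0, 1), 2)
    # A separate prior for each element.
    drift_each ~ arraydist([Normal(0, 1), Normal(1, 2)])
end

# Draw from the priors only, to check that the model is well formed.
chain = sample(drift_priors(), Prior(), 100; progress=false)
```

filldist covers the "broadcast one prior over the vector" case, while arraydist covers element-specific priors.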


4 participants