BlockDiagLinearOperator with blocks of differing sizes #73
Replies: 1 comment
Yeah, having this functionality would be nice. Basically, what this would entail is, as you said, (i) allowing the blocks to be represented as a list rather than a batched operator, and (ii) making sure that the various specialized methods on …
I've been working on an additive GP term that represents the effect of uncontrolled factors in experiments: factors that cause repeated experiments at the same settings to give somewhat different results, where the differences are not captured by noise terms but rather by covariances that are correlated within each experiment and uncorrelated between different experiments. Sort of like a "noise" term represented by a multivariate normal instead of a one-dimensional normal.
If the data is arranged with the results from all the experiments appended experiment-by-experiment, the resulting covariance has a block-diagonal structure, to be added to a "physics" covariance representing the actual effects of interest, and to a genuine noise term. I've been using BlockDiagLinearOperator to do this, and so far it seems to be working. One problem that I've had to work around is that BlockDiagLinearOperator assumes that all diagonal sub-blocks have the same dimension. This is tantamount to requiring that all experiments take the same amount of data, i.e. that each individual experiment's dataset is the same size as all the others.
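To make the structure concrete, here is a minimal dense sketch (assumption: plain torch tensors rather than the LinearOperator API, and random SPD matrices standing in for actual kernels) of the additive covariance described above, with equal-size blocks as the current operator requires:

```python
import torch

torch.manual_seed(0)

def random_spd(m):
    # Hypothetical stand-in for a kernel matrix: a random SPD matrix.
    a = torch.randn(m, m)
    return a @ a.T + m * torch.eye(m)

sizes = [3, 3, 3]   # equal block sizes, as BlockDiagLinearOperator assumes
n = sum(sizes)

# Per-experiment covariance blocks, assembled block-diagonally: entries
# coupling different experiments are exactly zero.
blocks = [random_spd(m) for m in sizes]
K_block = torch.block_diag(*blocks)      # (n, n)

K_physics = random_spd(n)                # "physics" covariance over all data
noise = 0.1 * torch.eye(n)               # genuine i.i.d. noise term

K = K_physics + K_block + noise          # full additive covariance
```

Note that `torch.block_diag` itself happily accepts blocks of differing sizes; the equal-size restriction comes from BlockDiagLinearOperator's batched representation, not from the math.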
This is not a natural requirement, nor a necessary one, since different experiments can have different data sizes in general, and this poses no problem of principle for the model, only for the implementation in GPyTorch. Moreover, at prediction time, it requires that one predict a dataset the same size as the training sets (to get the predictive covariance right), which is a physically absurd requirement. I can work around it by padding the prediction data points out to the per-experiment dataset size from training, calling the model to get a predictive normal distribution, and using __getitem__() to eliminate the padding from the predictive distribution. But this is very kludgy and inefficient.
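The padding workaround looks roughly like the following sketch (assumption: dense torch tensors stand in for the model's predictive distribution; the actual code would call the GPyTorch model and index the resulting MultivariateNormal):

```python
import torch

m, m_star = 5, 2                 # training block size, actual test size
pad = m - m_star

x_star = torch.linspace(0.0, 1.0, m_star)
x_padded = torch.cat([x_star, torch.zeros(pad)])   # dummy padding points

# Stand-ins for the predictive mean/covariance over the padded inputs.
mean_padded = torch.randn(m)
cov_padded = torch.eye(m)

# What __getitem__ on the predictive distribution effectively does:
# keep only the entries corresponding to the real test points.
keep = torch.arange(m_star)
mean = mean_padded[keep]
cov = cov_padded[keep][:, keep]
```

The kludge is that the model still pays the full cost of predicting (and factorizing) over the padded points, only to throw most of that work away.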
So what I would really like is a variant (possibly a subclass) of BlockDiagLinearOperator that admits diagonal sub-blocks of varying sizes, possibly initialized by an iterable of LinearOperators rather than by batch indices. I don't think I understand the LinearOperator code well enough to do this myself without breaking something, but I'd be happy to help test such new functionality. I hope that you'll consider it.
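For what it's worth, the core operation such a variant would need is simple to sketch. The following is a minimal, hypothetical illustration (assumption: the class name and method are made up, not the LinearOperator API) of a matmul over a list of square blocks of differing sizes:

```python
import torch

class VarBlockDiagSketch:
    """Hypothetical sketch: block-diagonal operator over a list of
    square blocks of possibly differing sizes."""

    def __init__(self, blocks):
        self.blocks = list(blocks)                       # (m_i, m_i) tensors
        self.sizes = [b.shape[-1] for b in self.blocks]

    def matmul(self, rhs):
        # Split rhs row-wise to match the block sizes, then multiply
        # block-by-block and re-concatenate.
        pieces = torch.split(rhs, self.sizes, dim=0)
        return torch.cat([b @ p for b, p in zip(self.blocks, pieces)], dim=0)

# Blocks of differing sizes, which the batched representation cannot hold.
op = VarBlockDiagSketch([torch.eye(2), 3.0 * torch.eye(4)])
out = op.matmul(torch.ones(6, 1))
```

A real implementation would of course also need the specialized methods (solves, log-determinants, slicing) to loop over the block list in the same way.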
Thanks,
Carlo