🚀 Feature Request

Represent [TN, TM] tensors by TxT blocks of NxM lazy tensors. While block matrices are supported, an efficient representation is currently only available when there is a diagonal structure over the T dimensions.
Motivation
Here is an example that linear_operator cannot deal with:
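A minimal sketch of the kind of setup and computation involved (the names As, Bs, Cs, L, T, the shapes, and the dense reference below are illustrative assumptions, chosen to match the fast implementation that follows):

import itertools
import torch

# Illustrative setup: M is a [T*N, T*N] matrix whose (t1, t2) block is
# As[t1] @ Bs[t1][t2] @ As[t2].T, plus Cs[t1] on the diagonal blocks, and we want
# the diagonal of (L kron I_N) @ M @ (L kron I_N).T
T, N, K = 3, 5, 2
As = [torch.randn(N, K) for _ in range(T)]
Bs = [[torch.randn(K, K) for _ in range(T)] for _ in range(T)]
Cs = [torch.diag(torch.rand(N)) for _ in range(T)]
L = torch.randn(T, T)

# Dense reference computation, which materializes the full [T*N, T*N] matrix
M = torch.cat(
    [torch.cat([As[t1] @ Bs[t1][t2] @ As[t2].T + (Cs[t1] if t1 == t2 else 0)
                for t2 in range(T)], dim=1)
     for t1 in range(T)], dim=0)
rot = torch.kron(L, torch.eye(N))

print("slow way")
print(torch.diagonal(rot @ M @ rot.T))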
This calculation turns up in some multi-output GP models. It has a straightforward efficient implementation:
M_diag = {}
# We only need the diagonal of each block of M
for t1, t2 in itertools.product(range(T), range(T)):
    r = (As[t1].T * (Bs[t1][t2] @ As[t2].T)).sum(0)
    if t1 == t2:
        r += torch.diag(Cs[t1])
    M_diag[(t1, t2)] = r

# The rotation is applied blockwise due to the kron structure
R = {}
for t in range(T):  # we don't need the off-diag blocks
    r = 0
    for i1, i2 in itertools.product(range(T), range(T)):
        r += L[t, i1] * M_diag[(i1, i2)] * L[t, i2]
    R[t] = r

print("fast way")
print(torch.concat([R[t] for t in range(T)]))
Currently, this calculation could be implemented inside linear_operator like this
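(a rough sketch reusing the setup above; the operator classes are from linear_operator, but the exact block assembly here is illustrative)

from linear_operator.operators import (DenseLinearOperator, DiagLinearOperator,
                                       KroneckerProductLinearOperator)

# Each block of M as a lazy operator
block_ops = [[DenseLinearOperator(As[t1] @ Bs[t1][t2] @ As[t2].T) for t2 in range(T)]
             for t1 in range(T)]
for t in range(T):
    block_ops[t][t] = block_ops[t][t] + DiagLinearOperator(torch.diagonal(Cs[t]))

# There is no operator that keeps the T x T block structure lazy, so the blocks
# have to be materialized with to_dense() before they can be stitched together
M_op = DenseLinearOperator(torch.cat(
    [torch.cat([op.to_dense() for op in row], dim=1) for row in block_ops], dim=0))
rot = KroneckerProductLinearOperator(DenseLinearOperator(L),
                                     DiagLinearOperator(torch.ones(N)))
rot_T = KroneckerProductLinearOperator(DenseLinearOperator(L.T),
                                       DiagLinearOperator(torch.ones(N)))

print("linear_operator way")
print((rot @ M_op @ rot_T).diagonal())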
Removing the to_dense() gives an error, however.

Pitch

Add a block linear operator class that keeps track of the [T, T] block structure, represented as T^2 lazy tensors of the same shape. Implement matrix multiplication between block matrices as the appropriate linear operators on the blocks.
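For the blockwise product, something along these lines might be enough, since operator @ operator and operator + operator already produce lazy Matmul/Sum operators (illustrative sketch; block_matmul is a hypothetical helper, not existing API):

from functools import reduce
from operator import add

def block_matmul(A_blocks, B_blocks):
    # (A @ B)[i][j] = sum_k A[i][k] @ B[k][j], with each entry kept as a lazy operator
    T = len(A_blocks)
    return [
        [reduce(add, (A_blocks[i][k] @ B_blocks[k][j] for k in range(T)))
         for j in range(T)]
        for i in range(T)
    ]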
As a workaround, I have written manual implementations of specific cases, such as the one above.
I'm willing to work on a PR for this.
Additional context
None
Thanks for the suggestion, @hughsalimbeni! @gpleiss, @jacobrgardner and I have talked in the past about expanding linear_operator beyond the current focus on square (really, symmetric PSD) matrices.
The BlockDiagLinearOperatorNonSquare extending BlockDiagLinearOperator seems like a nifty way of realizing this without a ton of refactoring, but ideally we'd rethink the inheritance structure so that we have something general like
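(purely illustrative; the hooks and property names below are hypothetical, not existing linear_operator API)

class LinearOperator:
    def _matmul(self, rhs): ...            # the one hard requirement
    def _size(self): ...                   # (*batch, m, n), with m != n allowed
    def _transpose_nonbatch(self): ...

    @property
    def is_square(self):
        # computed from the trailing two dimensions, assuming a shape property
        return self.shape[-2] == self.shape[-1]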
where operators are not assumed to be square (there could just be an is_square property computed from the trailing two dimensions), symmetric, or positive definite (those could also be properties).
This would of course be a major redesign of the whole library, and so presumably out of scope for what you're trying to achieve here. But adding your suggestion could be a step on the way to a more general setup, and could inform / be absorbed into a larger rewrite down the road. So I'm happy to help review a PR for this.
Looks like a great addition. The key question is what functions need to be implemented to make this a reality. From the library description, we must implement:
_matmul
_transpose_nonbatch
I'm not sure what else makes sense. It seems like we might want
_diagonal
_root_decomposition?
_root_inv_decomposition?
_solve?
inv_quad_logdet?
_svd?
_symeig?
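As a very rough sketch of what such a class might look like (illustrative only; the class name and internals are assumptions built on the _matmul / _size / _transpose_nonbatch hooks, not working library code):

import torch
from linear_operator.operators import LinearOperator

class BlockLinearOperator(LinearOperator):
    # Sketch: a [T*N, T*M] operator stored as a T x T grid of [N, M] lazy operators
    def __init__(self, blocks):
        # blocks: T x T nested list of LinearOperators, all with the same [N, M] shape
        super().__init__(*[block for row in blocks for block in row])
        self.blocks = blocks
        self.T = len(blocks)
        self.N, self.M = blocks[0][0].shape[-2:]

    def _size(self):
        return torch.Size([self.T * self.N, self.T * self.M])

    def _matmul(self, rhs):
        # blockwise product with a dense right-hand side
        chunks = rhs.split(self.M, dim=-2)
        rows = []
        for t1 in range(self.T):
            acc = self.blocks[t1][0] @ chunks[0]
            for t2 in range(1, self.T):
                acc = acc + self.blocks[t1][t2] @ chunks[t2]
            rows.append(acc)
        return torch.cat(rows, dim=-2)

    def _transpose_nonbatch(self):
        # swap block positions and transpose each block
        return BlockLinearOperator(
            [[self.blocks[t2][t1].transpose(-2, -1) for t2 in range(self.T)]
             for t1 in range(self.T)])

    def _diagonal(self):
        # only the diagonal blocks contribute (assumes N == M)
        return torch.cat([self.blocks[t][t].diagonal() for t in range(self.T)])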