Hi, everyone! I'm having issues with one thing that I wish to test. I have a model that produces attention coefficients for the directed edges in a graph, like this:
The issue is that, since I have a full-mesh topology, most of my alphas tend to be values near 0 and the biggest attention coefficients have a low weight.
I would like to generate a binary mask that keeps only the top K alpha coefficients (or those coefficients with a value above the mean) in each index in [...]

Thank you for your time :-)
Replies: 1 comment 1 reply
You can use `to_dense_batch(fill_value=float('-inf'))` to convert `alpha` into shape `[batch_size, num_nodes]`, and then call `topk(dim=1)` on it. Would that work in your case?