LIP0006
| LIP | 6 |
|---|---|
| Title | Sums node with varying value input sizes |
| Author | J. van de Wolfshaar |
| Status | Draft |
| Type | Standard |
| Discussion | Issue #36 |
| PR | |
| Created | Feb 7, 2018 |
At this point, the `Sums` class in the branch `multi_nodes_dense_generator` in nash911/libspn can only model multiple sums of the same dimensionality, that is, each of the sums considers the same number of input columns. This LIP proposes a way to break that constraint. By doing so, the new `SumsLayer` class could model all sums at a single depth of a tree-structured SPN. By concentrating many computations in fewer TensorFlow Ops, we most likely improve efficiency and reduce the graph size significantly.
This LIP requires us to alter many of the existing methods of `Sums`. For simplicity, we'll consider the `_compute_value` method, which in pseudocode does something like:
```
function _compute_value(self, weight_tensor, ivs_tensor, *value_tensors)
    # Get the column indices for each value input
    value_tensor_indices = get_indices_for_all_value_inputs()
    # For each value input, gather the columns to sum and concatenate them
    values_subsets = map_gather(value_tensors, value_tensor_indices)
    values = concatenate_columns(values_subsets)
    # Reshape so that each sum occupies one slice of the last dimension
    values = reshape(values, sum_reduce_shape)
    return sum_over_last_dim(values * weight_tensor)
```
So this takes subsets of columns, concatenates them, reshapes the result, and sums over the last dimension after element-wise multiplication with the weights.
The reshaping operation relies directly on the assumption that all sums have the same dimensionality, so we propose a different approach that works for sums with different dimension sizes. Other class methods contain other pieces of code that rely on this assumption, but we'll limit the discussion to this function for brevity.
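To make that assumption concrete, here is a minimal sketch (plain TensorFlow 2 style with made-up shapes, not libspn code) of the reshape-and-reduce pattern that only works when every sum has the same size:

```python
import tensorflow as tf

batch_size, num_sums, sum_size = 32, 4, 3  # hypothetical sizes; all sums equally wide

# Concatenation of the gathered columns of all sums: [batch, num_sums * sum_size]
values = tf.random.uniform([batch_size, num_sums * sum_size])
weights = tf.random.uniform([num_sums, sum_size])  # one weight row per sum

# The reshape only makes sense because every sum contributes exactly sum_size columns.
values = tf.reshape(values, [batch_size, num_sums, sum_size])
sums_out = tf.reduce_sum(values * weights, axis=-1)  # [batch, num_sums]
```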
We want the class to be able to model multiple sums of heterogeneous sizes. As noted before, this requires altering multiple methods, but we'll focus once again on the example from the previous section. It could be replaced by the following:
```
function _compute_value(self, masked_weight_tensor, ivs_tensor, *value_tensors)
    # Get the column indices for each value input
    value_tensor_indices = get_indices_for_all_value_inputs()
    # Concatenate the indices AND the value input tensors
    value_tensor_indices_concat = concatenate_indices(value_tensor_indices)
    values_concat = concatenate_columns(value_tensors)
    # Gather the columns of each sum into a 3D tensor, padding shorter sums with zeros
    values = gather_cols_3d_with_padding(
        values_concat, value_tensor_indices_concat)
    return sum_over_last_dim(values * masked_weight_tensor)
```
The `gather_cols_3d_with_padding` functionality already exists as an Op in the `multi_nodes_dense_generator` branch and is called `gather_cols_3D`. This Op takes an input tensor and a list of indices. The list of indices may be nested, meaning that we could specify:
```python
gathered_tensor = gather_cols_3d(value_tensor, [[0, 1, 2], [0, 2], [3, 4, 5], [1]])
```
This Op returns a tensor with dimensions `[batch_size, num_sums, max_num_cols]`, or in this case `[batch_size, 4, 3]`, since there are 4 lists in the nested list and the maximum inner list length is 3 (either `[0, 1, 2]` or `[3, 4, 5]`).
Note that `gathered_tensor[:, 1, :]` will contain the zeroth and second columns of `value_tensor` followed by a trailing zero column, so the padding is done implicitly.
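The snippet below is a rough emulation of this behavior with standard TensorFlow ops (it is not the actual `gather_cols_3D` kernel): a zero column is appended and the shorter index lists are padded with the index of that column.

```python
import tensorflow as tf

def gather_cols_3d_padded(value_tensor, nested_indices):
    """Emulate gathering nested column-index lists with implicit zero padding.
    Assumes value_tensor has a statically known number of columns."""
    max_size = max(len(ind) for ind in nested_indices)
    num_cols = value_tensor.shape[1]
    # Append a zero column so that padding indices gather zeros.
    padded = tf.concat([value_tensor, tf.zeros_like(value_tensor[:, :1])], axis=1)
    # Pad every inner index list with the index of the zero column.
    idx = [list(ind) + [num_cols] * (max_size - len(ind)) for ind in nested_indices]
    return tf.gather(padded, idx, axis=1)  # [batch, num_sums, max_size]

x = tf.random.uniform([32, 6])
out = gather_cols_3d_padded(x, [[0, 1, 2], [0, 2], [3, 4, 5], [1]])
print(out.shape)  # (32, 4, 3)
```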
The `_compute_log_value` method would require a similar implementation, where the element-wise multiplication is replaced by element-wise addition.
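A hedged sketch of what the log-space counterpart could look like, assuming the reduction over the last dimension becomes a log-sum-exp (for marginal inference) and that the padded weight entries are zero, so their logs are `-inf` and drop out of the reduction:

```python
import numpy as np
import tensorflow as tf

# Hypothetical shapes matching the example below: 3 sums, maximum size 5.
log_values = tf.zeros([32, 3, 5])                    # gathered log-values, padded with 0.0
masked_weights = np.zeros((3, 5), dtype=np.float32)  # zero weights on padded entries
masked_weights[0, :5] = 0.2
masked_weights[1, :3] = 1.0 / 3.0
masked_weights[2, :2] = 0.5

with np.errstate(divide="ignore"):                   # log(0) -> -inf on padded entries
    log_weights = np.log(masked_weights)

# Element-wise addition replaces multiplication; the -inf log-weights make the
# padded entries vanish from the log-sum-exp reduction.
log_out = tf.reduce_logsumexp(log_values + log_weights, axis=-1)  # [32, 3]
```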
The `weights_tensor` will have zero values for the 'padded' parts of the sums that are modeled. Let's say we have a `SumsLayer` that models three sums, one of size 5, one of size 3 and another of size 2. We would need 10 weights in total, but to be able to perform the component-wise multiplication after the `gather_cols_3d` operation, we construct a weight matrix with shape `[num_sums, max_sum_size]`, which in this case amounts to `[3, 5]`.
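A small sketch of how such a masked weight matrix could be laid out for the sizes above (illustrative values, not libspn's actual weight handling):

```python
import numpy as np

sum_sizes = [5, 3, 2]          # three sums of different sizes
num_sums = len(sum_sizes)      # -> 3
max_size = max(sum_sizes)      # -> 5

# Hypothetical flat weight vector with one weight per (sum, input) pair: 10 values.
flat_weights = np.arange(1, sum(sum_sizes) + 1, dtype=np.float32)

# Scatter the flat weights into a [num_sums, max_sum_size] matrix, zero elsewhere.
masked = np.zeros((num_sums, max_size), dtype=np.float32)
offset = 0
for row, size in enumerate(sum_sizes):
    masked[row, :size] = flat_weights[offset:offset + size]
    offset += size
# masked has shape (3, 5); rows 1 and 2 end in zero padding, which nullifies the
# contribution of the padded columns produced by gather_cols_3d.
```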
In short, we propose the following:
- Create a `SumsLayer` class that can model sums of different dimension sizes
- Revise methods relying on the assumption of equally sized sums, including:
  - `_compute_value`
  - `_compute_log_value`
  - `_compute_mpe_value`
  - `_compute_log_mpe_value`
  - `_compute_value_common`
  - `_compute_mpe_path`
  - `_compute_log_mpe_path`
  - `_compute_mpe_path_common`
After this, we should be able to replace the following:
```python
iv_x = spn.IVs(num_vars=3, num_vals=2, name="iv_x")
sums_1 = spn.ParSums((iv_x, [0, 1]), num_sums=2, name="sums_1")
sums_2 = spn.ParSums((iv_x, [2, 3, 4, 5]), num_sums=1, name="sums_2")
```
with:
```python
iv_x = spn.IVs(num_vars=3, num_vals=2, name="iv_x")
sums_1_2 = spn.SumsLayer([(iv_x, [0, 1]), (iv_x, [0, 1]), (iv_x, [2, 3, 4, 5])],
                         num_sums_or_sizes=[4, 2, 2], name="sums_1_2")
```
The parameter `num_sums_or_sizes` either designates the total number of sums (if an `int`) or the size of each modeled sum (if a `list`). If an `int` is passed to the constructor, the sums will be equally sized and divided over the tensor that holds the concatenation of all inputs. If a list is provided, the sums will be divided in the same way but with potentially different sizes.
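A hedged sketch of how the constructor might normalize this argument internally (the helper name and error handling are made up):

```python
def normalize_sum_sizes(num_sums_or_sizes, total_input_size):
    """Hypothetical helper: turn num_sums_or_sizes into a list of per-sum
    sizes over the concatenated input columns."""
    if isinstance(num_sums_or_sizes, int):
        num_sums = num_sums_or_sizes
        if total_input_size % num_sums != 0:
            raise ValueError("Cannot split the input evenly over the requested sums")
        return [total_input_size // num_sums] * num_sums
    if sum(num_sums_or_sizes) != total_input_size:
        raise ValueError("Sum sizes must cover the concatenated input exactly")
    return list(num_sums_or_sizes)

# With the example above the concatenated input has 2 + 2 + 4 = 8 columns.
normalize_sum_sizes(2, 8)          # -> [4, 4]
normalize_sum_sizes([4, 2, 2], 8)  # -> [4, 2, 2]
```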
There are a few possible extensions:
- Automatically convert multiple `ParSums` to a single `Sums`
- Revise the `Products` class to be able to handle products of varying dimensionality
- Think of another input format that simplifies the first argument of the `SumsLayer` constructor
In the test below, we model 100 sum nodes. The sums' inputs are divided in 10 groups, each having
its own IVs
input. We can model this using 10 ParSums
where each ParSum
connects to a single
IVs
and computes 10 sums. We compare this against a single SumsLayer
that models the very same
structure but combines the full component-wise application of latent IVs
, the component-wise
weighting and the reduction in non-log or log space to compute the outcome of each sum. The test
measures the wall time for 250 runs of MPE path calculation.
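Roughly, the two configurations look like the sketch below (the exact `IVs` sizes, the weight initialization, and the acceptance of bare nodes instead of `(node, indices)` tuples by `SumsLayer` are assumptions of this sketch, not confirmed details of the test):

```python
import libspn as spn

num_groups, sums_per_group = 10, 10  # 100 sums in total

# One IVs input per group (num_vars/num_vals chosen arbitrarily here).
ivs_inputs = [spn.IVs(num_vars=1, num_vals=2, name="iv_%d" % i)
              for i in range(num_groups)]

# (a) Ten ParSums nodes, each connected to a single IVs and modeling 10 sums.
par_sums = [spn.ParSums(iv, num_sums=sums_per_group, name="par_sums_%d" % i)
            for i, iv in enumerate(ivs_inputs)]

# (b) A single SumsLayer modeling the same 100 sums: each IVs is repeated 10
# times so the concatenated columns split into 10 equally sized sums per group.
layer_inputs = [iv for iv in ivs_inputs for _ in range(sums_per_group)]
sums_layer = spn.SumsLayer(layer_inputs,
                           num_sums_or_sizes=num_groups * sums_per_group,
                           name="sums_layer")
```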
It can be seen that the `sums_layer` implementations yield much smaller graph sizes and are significantly faster for all configurations. We consider three different versions of `sums_layer`:
- `sums_layer` computes MPE counts and scatters the counts to each input, even though some input nodes might not be unique, so summing of the counts for these non-unique nodes happens later in the graph.
- `sums_layer_v2` computes the sum of the counts of each unique input before passing it to `_scatter_to_input_tensors`, yielding a considerably smaller graph size (see the sketch after this list).
- `sums_layer_v3` computes sums of counts as `sums_layer_v2` does, but also reuses the component-wise weighted tensor computed during the forward pass in the backward pass (see also LIP7).
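As a hedged illustration of the `sums_layer_v2` idea (names and structure here are illustrative, not libspn's actual internals), the counts destined for the same input node can be accumulated before any scattering takes place:

```python
from collections import defaultdict
import tensorflow as tf

def merge_counts_per_unique_input(counts_and_inputs):
    """counts_and_inputs: list of (counts_tensor, input_node) pairs, one per
    modeled sum; the same input node may occur several times. Summing counts
    per unique node means a single scatter per node instead of one per
    occurrence, which shrinks the graph considerably."""
    grouped = defaultdict(list)
    for counts, node in counts_and_inputs:
        grouped[node].append(counts)
    return {node: tensors[0] if len(tensors) == 1 else tf.add_n(tensors)
            for node, tensors in grouped.items()}
```

The merged tensors would then be handed to something like `_scatter_to_input_tensors` once per unique node.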
```
-----------------------
InferenceType: MPE
-----------------------
CPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
GPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
     par_sums       342   Yes      No   150.62      16.50              34.53           5.42           True
     sums_layer     359   Yes      No   141.29      13.50              10.24           4.69           True
     sums_layer_v2  57    Yes      No   61.04       49.43              18.15           4.64           True
     sums_layer_v3  52    Yes      No   56.04       49.98              17.33           4.69           True
-----------------------
InferenceType: MPE
-----------------------
CPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
GPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
     par_sums       532   Yes      Yes  221.21      20.78              38.81           6.94           True
     sums_layer     378   Yes      Yes  227.03      13.81              12.20           5.29           True
     sums_layer_v2  76    Yes      Yes  67.63       49.98              20.58           5.06           True
     sums_layer_v3  68    Yes      Yes  63.75       50.69              19.98           5.52           True
-----------------------
InferenceType: MPE-LOG
-----------------------
CPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
GPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
     par_sums       363   Yes      No   160.13      17.09              23.44           5.67           True
     sums_layer     362   Yes      No   142.26      13.25              9.78            4.58           True
     sums_layer_v2  60    Yes      No   61.77       49.30              17.80           4.49           True
     sums_layer_v3  55    Yes      No   56.47       49.26              18.15           4.74           True
-----------------------
InferenceType: MPE-LOG
-----------------------
CPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
GPU  op             size  indices  ivs  setup_time  weights_init_time  first_run_time  rest_run_time  correct
     par_sums       563   Yes      Yes  317.90      21.56              41.37           7.35           True
     sums_layer     382   Yes      Yes  149.24      13.64              12.20           5.20           True
     sums_layer_v2  80    Yes      Yes  70.97       55.02              34.58           5.22           True
     sums_layer_v3  72    Yes      Yes  64.55       50.17              20.28           5.22           True
```