# LIP0016
| LIP | 16 |
|---|---|
| Title | Dense Generator with Layer-op nodes |
| Author | A. Ranganath |
| Status | Draft |
| Type | Standard |
| Discussion | Issue #49 |
| PR | #51 |
| Created | April 16, 2018 |
This implementation augments the current `DenseSPNGenerator` class to utilize the layer-nodes `SumsLayer` and `ProductsLayer` for generating dense SPN graphs, in which each layer of the graph can be modeled as a single node.
The current implementation of `DenseSPNGenerator` generates a dense SPN graph using either single-op nodes (`Sum` and `Product`) or multi-op nodes (`ParSums` and `PermProducts`, a.k.a. block-nodes, since they model blocks of operations). However, neither of these op-node types can model an entire layer of an SPN graph as a single node.
Moving from single-op nodes to block-nodes already yields a performance improvement of ~2.3x and a reduction in TF graph size of ~2.5x. To further improve performance by decreasing the overhead of TF operations, and to further shrink the TF graph, the next logical step is to model an entire layer of an SPN graph as a single node. To this end, two layer-node types have been proposed and implemented: `SumsLayer` and `ProductsLayer`.
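To make the distinction concrete, the snippet below builds the same pair of sums over an indicator variable, first as two single-op `Sum` nodes and then as one `ParSums` block-node (the calls mirror the custom-SPN example later in this proposal, assuming the usual `import libspn as spn`):

```python
import libspn as spn

X = spn.IVs(num_vars=2, num_vals=2, name="X")

# SINGLE: one op-node per modeled sum
sum_1 = spn.Sum((X, [0, 1]), name="Sum-1")
sum_2 = spn.Sum((X, [0, 1]), name="Sum-2")

# BLOCK: a single block-node modeling both sums at once
sums_12 = spn.ParSums((X, [0, 1]), num_sums=2, name="Sums-12")
```

A `SumsLayer` would go one step further and absorb every sum of an entire layer, whatever its inputs, into one node.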
In the current implementation of `DenseSPNGenerator`, a boolean flag (`multi_nodes`) chooses between single-op nodes and block-nodes for the dense SPN graph. This parameter will be replaced with a `NodeType` enum that can take the values `SINGLE`, `BLOCK` or `LAYER`, indicating the node type to be used in the generated dense SPN graph.
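For illustration, the enum could be declared along these lines. This is a minimal sketch: only the names `NodeType`, `SINGLE`, `BLOCK` and `LAYER` come from this proposal, while the member values are assumptions (the examples below access it as `spn.DenseSPNGenerator.NodeType`, suggesting it would be nested inside `DenseSPNGenerator`):

```python
from enum import Enum

class NodeType(Enum):
    """Granularity of the nodes used to build a dense SPN graph."""
    SINGLE = 0  # single-op nodes: Sum, Product
    BLOCK = 1   # block-nodes: ParSums, PermProducts
    LAYER = 2   # layer-nodes: SumsLayer, ProductsLayer
```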
The layer-nodes implementation of the dense generator first generates a dense SPN graph using block-nodes, and then converts the resulting SPN so that each layer is modeled by the appropriate layer-node op. The conversion is implemented as a standalone function that accepts an SPN graph (consisting of single-nodes, block-nodes, or a combination of both) and returns the equivalent SPN graph built with layer-nodes. This makes it possible to convert not only densely generated SPNs, but also custom-created SPNs, into SPNs with layer-nodes.
The proposed new `DenseSPNGenerator` would contain the following changes:
- Replace the boolean flag `multi_nodes` with the `NodeType` enum.
- Add a new function, `convert_to_layer_nodes`, that accepts an SPN and returns the converted SPN, modeled with layer-nodes (a generic sketch of its first step follows this list).
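The conversion algorithm itself is beyond the scope of this document, but its first step can be sketched generically: group the nodes of the (acyclic) SPN graph by depth, so that each depth level can afterwards be merged into a single `SumsLayer` or `ProductsLayer`. The following is plain Python under that assumption, not the actual libspn implementation; `group_by_depth` and `children_of` are hypothetical names:

```python
from collections import defaultdict

def group_by_depth(root, children_of):
    """Return {depth: [nodes]} for a DAG, depth measured from the root.

    A node reachable along several paths is assigned its maximum depth,
    so every parent lands in a strictly shallower layer than its
    children, which is what a per-layer merge requires.
    """
    depth_of = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children_of(node):
            new_depth = depth_of[node] + 1
            if depth_of.get(child, -1) < new_depth:
                depth_of[child] = new_depth
                stack.append(child)
    layers = defaultdict(list)
    for node, depth in depth_of.items():
        layers[depth].append(node)
    return dict(layers)
```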
After this, we should be able to do the following:
- Generate a Dense SPN with layer-nodes
```python
import libspn as spn

# Generate a Dense SPN with Layer-nodes
X = spn.IVs(num_vars=20, num_vals=2, name="X")
dense_gen = spn.DenseSPNGenerator(num_decomps=1,
                                  num_subsets=2,
                                  num_mixtures=3,
                                  input_dist=spn.DenseSPNGenerator.InputDist.RAW,
                                  node_type=spn.DenseSPNGenerator.NodeType.LAYER)
root = dense_gen.generate(X, root_name="root")
```
- Generate a Dense SPN with single-nodes and then convert it to layer-nodes
```python
import libspn as spn

# Generate a Dense SPN with Single-nodes
X = spn.IVs(num_vars=20, num_vals=2, name="X")
dense_gen = spn.DenseSPNGenerator(num_decomps=1,
                                  num_subsets=2,
                                  num_mixtures=3,
                                  input_dist=spn.DenseSPNGenerator.InputDist.MIXTURE,
                                  node_type=spn.DenseSPNGenerator.NodeType.SINGLE)
root = dense_gen.generate(X, root_name="root")

# Convert the Single-nodes Dense SPN to a Layer-nodes SPN
root = dense_gen.convert_to_layer_nodes(root)
```
- Create a custom SPN with non-layer-nodes, and then convert it to layer-nodes
```python
import libspn as spn

# Create a custom SPN
X = spn.IVs(num_vars=2, num_vals=2, name="X")
# Layer 1
sums_12 = spn.ParSums((X, [0, 1]), num_sums=2, name="Sums-12")
sum_3 = spn.Sum((X, [2, 3]), name="Sum-3")
sum_4 = spn.Sum((X, [2, 3]), name="Sum-4")
# Layer 2
prod_1 = spn.Product((sums_12, 0), sum_3, name="Product-1")
prod_2 = spn.Product((sums_12, 0), sum_4, name="Product-2")
prod_3 = spn.Product((sums_12, 1), sum_4, name="Product-3")
# Layer 3
root = spn.Sum(prod_1, prod_2, prod_3, name="root")

# Create a DenseSPNGenerator object
dense_gen = spn.DenseSPNGenerator(num_decomps=1,
                                  num_subsets=2,
                                  num_mixtures=1)

# Convert the custom SPN to a Layer-nodes SPN
root = dense_gen.convert_to_layer_nodes(root)
```
The following performance-test results compare training time per epoch between densely generated SPNs using the `SINGLE`, `BLOCK` and `LAYER` node types, trained on the MNIST data set. Other test-case dimensions include `input_dist` (RAW or MIXTURE), `device` (CPU or GPU, an E5-1650v3 and a GTX 1080 Ti respectively) and `inference_type` (MARGINAL or MPE, both in log space).
**InferenceType: MARGINAL-LOG**

*CPU (E5-1650v3):*

| op | node_type | SPN_size | TF_size | mem_used | input_dist | setup_time | weights_init_time | first_run_time | rest_run_time | test_accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| mnist_all | SINGLE | 24014 | 673842 | 0.0000 | RAW | 280.1227 | 36.4041 | 57.4890 | 23.8387 | 84.9700 |
| mnist_all | SINGLE | 31854 | 926882 | 0.0000 | MIXTURE | 402.9728 | 58.8921 | 89.1962 | 33.9523 | 85.4500 |
| mnist_all | BLOCK | 5854 | 243822 | 0.0000 | RAW | 120.1742 | 16.6501 | 62.8902 | 47.3282 | 85.4200 |
| mnist_all | BLOCK | 9774 | 416302 | 0.0000 | MIXTURE | 173.0755 | 24.7570 | 80.4209 | 57.2403 | 85.7300 |
| mnist_all | LAYER | 244 | 8862 | 0.0000 | RAW | 6.1890 | 0.7779 | 22.1308 | 21.2759 | 85.0400 |
| mnist_all | LAYER | 264 | 9882 | 0.0000 | MIXTURE | 15.8137 | 0.7306 | 28.6546 | 27.6661 | 85.1600 |

*GPU (GTX 1080 Ti):*

| op | node_type | SPN_size | TF_size | mem_used | input_dist | setup_time | weights_init_time | first_run_time | rest_run_time | test_accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| mnist_all | SINGLE | 24014 | 673843 | 3849.6328 | RAW | 307.4479 | 42.5885 | 55.1779 | 15.4929 | 85.8100 |
| mnist_all | SINGLE | 31854 | 926883 | 4961.5785 | MIXTURE | 396.9775 | 62.2324 | 81.5671 | 22.0822 | 85.9000 |
| mnist_all | BLOCK | 5854 | 243823 | 6837.7677 | RAW | 106.1877 | 14.5253 | 19.2461 | 6.3139 | 86.1100 |
| mnist_all | BLOCK | 9774 | 416303 | 8106.5953 | MIXTURE | 170.6007 | 24.5830 | 33.3518 | 9.9782 | 85.7500 |
| mnist_all | LAYER | 244 | 8863 | 8106.5953 | RAW | 6.7496 | 0.4102 | 2.2442 | 1.7716 | 85.7100 |
| mnist_all | LAYER | 264 | 9883 | 8858.6742 | MIXTURE | 7.3206 | 0.4578 | 2.8466 | 2.3593 | 85.3100 |
**InferenceType: MPE-LOG**

*CPU (E5-1650v3):*

| op | node_type | SPN_size | TF_size | mem_used | input_dist | setup_time | weights_init_time | first_run_time | rest_run_time | test_accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| mnist_all | SINGLE | 24014 | 603870 | 0.0000 | RAW | 257.4592 | 32.9899 | 46.6822 | 19.0344 | 84.8100 |
| mnist_all | SINGLE | 31854 | 809870 | 0.0000 | MIXTURE | 352.2344 | 50.8663 | 69.2659 | 26.9178 | 83.7400 |
| mnist_all | BLOCK | 5854 | 218470 | 0.0000 | RAW | 102.8327 | 15.2007 | 55.0029 | 43.4540 | 84.3600 |
| mnist_all | BLOCK | 9774 | 365470 | 0.0000 | MIXTURE | 151.3747 | 21.8418 | 69.3199 | 48.6915 | 84.6800 |
| mnist_all | LAYER | 244 | 7820 | 0.0000 | RAW | 5.5678 | 0.5634 | 16.8420 | 16.3934 | 84.3800 |
| mnist_all | LAYER | 264 | 8710 | 0.0000 | MIXTURE | 6.3771 | 0.5707 | 21.9313 | 21.2506 | 84.5700 |

*GPU (GTX 1080 Ti):*

| op | node_type | SPN_size | TF_size | mem_used | input_dist | setup_time | weights_init_time | first_run_time | rest_run_time | test_accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| mnist_all | SINGLE | 24014 | 603871 | 8858.6742 | RAW | 272.7215 | 37.8374 | 42.3436 | 13.1057 | 84.4500 |
| mnist_all | SINGLE | 31854 | 809871 | 8858.6742 | MIXTURE | 354.9050 | 54.5285 | 61.5796 | 17.9023 | 83.9500 |
| mnist_all | BLOCK | 5854 | 218471 | 8858.6742 | RAW | 93.8918 | 12.1474 | 14.4012 | 5.1138 | 84.6900 |
| mnist_all | BLOCK | 9774 | 365471 | 8858.6742 | MIXTURE | 148.4445 | 20.8485 | 24.4300 | 8.0788 | 83.7100 |
| mnist_all | LAYER | 244 | 7821 | 8858.6742 | RAW | 12.5599 | 0.3401 | 1.8216 | 1.5022 | 84.6500 |
| mnist_all | LAYER | 264 | 8711 | 8858.6742 | MIXTURE | 6.7061 | 0.3785 | 2.2467 | 1.9182 | 84.0500 |
On GPU, the results show a performance improvement of ~3.9x from block-nodes to layer-nodes, and a ~35x reduction in TF graph size between the two.
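For reference, these headline figures follow from the GPU MARGINAL-LOG rows above: averaging the RAW and MIXTURE ratios of `rest_run_time` and `TF_size` gives ~3.9x and ~35x respectively. A few lines of Python reproduce them:

```python
# (TF_size, rest_run_time) from the GPU MARGINAL-LOG table above
block = {"RAW": (243823, 6.3139), "MIXTURE": (416303, 9.9782)}
layer = {"RAW": (8863, 1.7716), "MIXTURE": (9883, 2.3593)}

for dist in ("RAW", "MIXTURE"):
    speedup = block[dist][1] / layer[dist][1]
    shrink = block[dist][0] / layer[dist][0]
    print(f"{dist}: {speedup:.1f}x faster, {shrink:.1f}x smaller TF graph")
# RAW: 3.6x faster, 27.5x smaller TF graph
# MIXTURE: 4.2x faster, 42.1x smaller TF graph
```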