Avinash Ranganath edited this page Apr 16, 2018 · 1 revision

LIP 16 - Dense Generator with Layer-op nodes

LIP 16
Title Dense Generator with Layer-op nodes
Author A. Ranganath
Status Draft
Type Standard
Discussion Issue #49
PR #51
Created April 16, 2018


This proposal augments the current DenseSPNGenerator class to use the layer-nodes SumsLayer and ProductsLayer when generating dense SPN graphs, so that each layer of the graph can be modeled as a single node.

Technical Background

The current implementation of DenseSPNGenerator generates a dense SPN graph using either single-op nodes (Sum and Product) or multi-op nodes (ParSums and PermProducts), also known as block-nodes because they model blocks of operations. Neither op-node type can model an entire layer of an SPN graph as a single node.

Moving from single-op nodes to block-nodes improves performance by ~2.3x and reduces the TF graph size by ~2.5x. To improve performance further by reducing the overhead of TF operations, and to shrink the TF graph further, the next logical step is to model an entire layer of an SPN graph as a single node. To this end, two layer-node types have been proposed and implemented: SumsLayer and ProductsLayer.


In the current implementation of DenseSPNGenerator, a boolean flag (multi_nodes) selects between single-op nodes and block-nodes for the dense SPN graph. This parameter will be replaced with a NodeType enumeration that takes the values SINGLE, BLOCK or LAYER, indicating the node-type to use in the generated dense SPN graph.
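The three-way choice can be illustrated with a small stand-alone sketch. This is not the library's code: the `NodeType` class below is a hypothetical stand-in built on Python's standard `enum` module, and `node_classes` is an illustrative helper that only maps each value to the node class names mentioned in this proposal.

```python
from enum import Enum


class NodeType(Enum):
    """Hypothetical stand-in for the proposed enumeration."""
    SINGLE = 0  # one operation per node (Sum, Product)
    BLOCK = 1   # a block of operations per node (ParSums, PermProducts)
    LAYER = 2   # one node per layer (SumsLayer, ProductsLayer)


def node_classes(node_type):
    """Return the (sum, product) node class names for a node type.

    Illustrative helper only; it is not part of DenseSPNGenerator.
    """
    names = {
        NodeType.SINGLE: ("Sum", "Product"),
        NodeType.BLOCK: ("ParSums", "PermProducts"),
        NodeType.LAYER: ("SumsLayer", "ProductsLayer"),
    }
    return names[node_type]


sum_cls, prod_cls = node_classes(NodeType.LAYER)
```

Compared with a boolean flag, an enumeration leaves room for a third option without adding a second flag or giving `True`/`False` an unclear meaning.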

The layer-nodes path of the dense generator first generates a dense SPN graph using block-nodes, and then converts the resulting SPN so that each layer is modeled by the appropriate layer-node op. The conversion is implemented as a standalone function that accepts an SPN graph (consisting of single-nodes, block-nodes, or a combination of both) and converts it to an SPN graph with layer-nodes. This makes it possible to convert not only densely generated SPNs, but also custom-built SPNs, into SPNs with layer-nodes.
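The first step of such a conversion, deciding which nodes belong to the same layer, can be sketched on a toy graph. This is a minimal illustration and not the proposed convert_to_layer_nodes: the graph is a plain dictionary of node names rather than library objects, and placing each node at its longest-path distance from the root is an assumption about how layers are defined.

```python
from collections import defaultdict


def group_into_layers(children, root):
    """Group nodes by their longest-path distance from the root.

    children maps each node name to the names of its input nodes.
    Returns a list of layers, root layer first; each layer is sorted.
    """
    depth = {}

    def visit(node, d):
        # Keep the largest depth seen, so a node sits below all its parents.
        if depth.get(node, -1) >= d:
            return
        depth[node] = d
        for child in children.get(node, ()):
            visit(child, d + 1)

    visit(root, 0)
    layers = defaultdict(list)
    for node, d in depth.items():
        layers[d].append(node)
    return [sorted(layers[d]) for d in sorted(layers)]


# Toy graph with a sum layer, a product layer and a root (names are illustrative).
children = {
    "root": ["Product-1", "Product-2", "Product-3"],
    "Product-1": ["Sums-12", "Sum-3"],
    "Product-2": ["Sums-12", "Sum-4"],
    "Product-3": ["Sums-12", "Sum-4"],
    "Sums-12": ["X"],
    "Sum-3": ["X"],
    "Sum-4": ["X"],
}
layers = group_into_layers(children, "root")
```

Here `layers` has four entries: the root, the three products, the sum nodes, and the input `X`. In a real conversion, each group of sums or products would then be replaced by a single layer-node.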

The proposed new DenseSPNGenerator would contain the following changes:

  • Replace the boolean flag multi_nodes with the NodeType enumeration.
  • Add a new function, convert_to_layer_nodes, that accepts an SPN and returns the converted SPN, modeled with layer-nodes.


After this, we should be able to do the following:

  • Generate a Dense SPN with layer-nodes
# Generate a Dense SPN with Layer-nodes
X = spn.IVs(num_vars=20, num_vals=2, name="X")
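dense_gen = spn.DenseSPNGenerator(num_decomps=1,
                                  # remaining arguments are truncated in the source;
                                  # presumably they include the NodeType selection
                                  )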
root = dense_gen.generate(X, root_name="root")
  • Generate a Dense SPN with single-nodes and then convert it to layer-nodes
# Generate a Dense SPN with Single-nodes
X = spn.IVs(num_vars=20, num_vals=2, name="X")
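dense_gen = spn.DenseSPNGenerator(num_decomps=1,
                                  # remaining arguments are truncated in the source;
                                  # presumably they include the NodeType selection
                                  )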
root = dense_gen.generate(X, root_name="root")
# Convert the Single-nodes Dense SPN to Layer-nodes SPN
root = dense_gen.convert_to_layer_nodes(root)
  • Create a custom SPN with non-layer-nodes, and then convert it to layer-nodes
# Create a custom SPN
X = spn.IVs(num_vars=2, num_vals=2, name="X")
# Layer 1
sums_12 = spn.ParSums((X, [0, 1]), num_sums=2, name="Sums-12")
sum_3 = spn.Sum((X, [2, 3]), name="Sum-3")
sum_4 = spn.Sum((X, [2, 3]), name="Sum-4")
# Layer 2
prod_1 = spn.Product((sums_12, 0), sum_3, name="Product-1")
prod_2 = spn.Product((sums_12, 0), sum_4, name="Product-2")
prod_3 = spn.Product((sums_12, 1), sum_4, name="Product-3")
# Layer 3
root = spn.Sum(prod_1, prod_2, prod_3, name="root")
# Create a DenseGenerator object
dense_gen = spn.DenseSPNGenerator(num_decomps=1,
                                  # remaining arguments are truncated in the source
                                  )
# Convert the custom SPN to Layer-nodes SPN
root = dense_gen.convert_to_layer_nodes(root)

Performance comparison

The following performance test results compare training time per epoch for densely generated SPNs using the SINGLE, BLOCK and LAYER node-types, trained on the MNIST data set. The other test dimensions are input_dist (RAW or MIXTURE), device (CPU: E5-1650v3, or GPU: GTX1080TI) and inference_type (MARGINAL or MPE).

InferenceType: MARGINAL-LOG

Device: CPU
op         node_type  SPN_size  TF_size  mem_used   input_dist  setup_time  weights_init_time  first_run_time  rest_run_time  test_accuracy
mnist_all  SINGLE     24014     673842   0.0000     RAW         280.1227    36.4041            57.4890         23.8387        84.9700
mnist_all  SINGLE     31854     926882   0.0000     MIXTURE     402.9728    58.8921            89.1962         33.9523        85.4500
mnist_all  BLOCK      5854      243822   0.0000     RAW         120.1742    16.6501            62.8902         47.3282        85.4200
mnist_all  BLOCK      9774      416302   0.0000     MIXTURE     173.0755    24.7570            80.4209         57.2403        85.7300
mnist_all  LAYER      244       8862     0.0000     RAW         6.1890      0.7779             22.1308         21.2759        85.0400
mnist_all  LAYER      264       9882     0.0000     MIXTURE     15.8137     0.7306             28.6546         27.6661        85.1600

Device: GPU
op         node_type  SPN_size  TF_size  mem_used   input_dist  setup_time  weights_init_time  first_run_time  rest_run_time  test_accuracy
mnist_all  SINGLE     24014     673843   3849.6328  RAW         307.4479    42.5885            55.1779         15.4929        85.8100
mnist_all  SINGLE     31854     926883   4961.5785  MIXTURE     396.9775    62.2324            81.5671         22.0822        85.9000
mnist_all  BLOCK      5854      243823   6837.7677  RAW         106.1877    14.5253            19.2461         6.3139         86.1100
mnist_all  BLOCK      9774      416303   8106.5953  MIXTURE     170.6007    24.5830            33.3518         9.9782         85.7500
mnist_all  LAYER      244       8863     8106.5953  RAW         6.7496      0.4102             2.2442          1.7716         85.7100
mnist_all  LAYER      264       9883     8858.6742  MIXTURE     7.3206      0.4578             2.8466          2.3593         85.3100

InferenceType: MPE-LOG

Device: CPU
op         node_type  SPN_size  TF_size  mem_used   input_dist  setup_time  weights_init_time  first_run_time  rest_run_time  test_accuracy
mnist_all  SINGLE     24014     603870   0.0000     RAW         257.4592    32.9899            46.6822         19.0344        84.8100
mnist_all  SINGLE     31854     809870   0.0000     MIXTURE     352.2344    50.8663            69.2659         26.9178        83.7400
mnist_all  BLOCK      5854      218470   0.0000     RAW         102.8327    15.2007            55.0029         43.4540        84.3600
mnist_all  BLOCK      9774      365470   0.0000     MIXTURE     151.3747    21.8418            69.3199         48.6915        84.6800
mnist_all  LAYER      244       7820     0.0000     RAW         5.5678      0.5634             16.8420         16.3934        84.3800
mnist_all  LAYER      264       8710     0.0000     MIXTURE     6.3771      0.5707             21.9313         21.2506        84.5700

Device: GPU
op         node_type  SPN_size  TF_size  mem_used   input_dist  setup_time  weights_init_time  first_run_time  rest_run_time  test_accuracy
mnist_all  SINGLE     24014     603871   8858.6742  RAW         272.7215    37.8374            42.3436         13.1057        84.4500
mnist_all  SINGLE     31854     809871   8858.6742  MIXTURE     354.9050    54.5285            61.5796         17.9023        83.9500
mnist_all  BLOCK      5854      218471   8858.6742  RAW         93.8918     12.1474            14.4012         5.1138         84.6900
mnist_all  BLOCK      9774      365471   8858.6742  MIXTURE     148.4445    20.8485            24.4300         8.0788         83.7100
mnist_all  LAYER      244       7821     8858.6742  RAW         12.5599     0.3401             1.8216          1.5022         84.6500
mnist_all  LAYER      264       8711     8858.6742  MIXTURE     6.7061      0.3785             2.2467          1.9182         84.0500

On GPU, layer-nodes run about 3.4x to 4.2x faster than block-nodes in rest_run_time (~3.9x on average over the four GPU configurations), and the TF graph is about 27x to 42x smaller (~35x on average).
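The summary figures can be checked against the GPU rows of the tables. The sketch below assumes that the comparison is based on rest_run_time and TF_size, and that the quoted figures are plain averages of the BLOCK/LAYER ratios over the four GPU configurations; the variable and function names are illustrative.

```python
# (rest_run_time, TF_size) for the GPU runs, copied from the tables above.
gpu_results = {
    ("MARGINAL", "RAW"): {"BLOCK": (6.3139, 243823), "LAYER": (1.7716, 8863)},
    ("MARGINAL", "MIXTURE"): {"BLOCK": (9.9782, 416303), "LAYER": (2.3593, 9883)},
    ("MPE", "RAW"): {"BLOCK": (5.1138, 218471), "LAYER": (1.5022, 7821)},
    ("MPE", "MIXTURE"): {"BLOCK": (8.0788, 365471), "LAYER": (1.9182, 8711)},
}


def mean_ratio(results, index):
    """Average the BLOCK / LAYER ratio of one metric over all configurations."""
    ratios = [run["BLOCK"][index] / run["LAYER"][index] for run in results.values()]
    return sum(ratios) / len(ratios)


speedup = mean_ratio(gpu_results, 0)         # ratio of rest_run_time
size_reduction = mean_ratio(gpu_results, 1)  # ratio of TF_size

print(f"speedup: ~{speedup:.1f}x, TF graph-size reduction: ~{size_reduction:.0f}x")
```

The individual ratios differ noticeably between RAW and MIXTURE inputs, so the averages are best read as rough indicators.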

