
Hints for Performance Testing

Andrzej Pronobis edited this page May 5, 2017 · 6 revisions
  • Always feed all inputs instead of embedding them as constants. Constants enable pre-computation in the graph (such as constant folding), which can result in ops being run on the CPU even if they are supposed to run on a GPU.

  • Use tf.identity(tf.placeholder(...)) to provide inputs. This way, we can expect the input to be copied to the GPU only once, into the identity op, and then served from there to the other ops.

  • Instead of stacking ops, make them dependent on each other using control flow ops. This simulates op stacking, while permitting the use of arbitrary inputs. For instance:

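          # First op in the chain reads the input directly.
          ops = op_fun(params_op, indices)
          for _ in range(self.num_ops - 1):
              # tf.tuple passes params_op through, but only after `ops` has been computed.
              ops = op_fun(tf.tuple([params_op, ops])[0], indices)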
    

    Here, tf.tuple adds a dependency on the output of the previous op, but still feeds the original input (params_op) to the next op.

  • Always test performance on inputs of different dtypes. Performance may vary greatly between dtypes.

  • Test on several machines, since performance varies between GPU/CPU configurations.
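
The hints above can be combined into one small benchmark. The following is a minimal sketch, not the project's actual test code: build_chain and time_chain are hypothetical helper names, tf.gather stands in for op_fun, and the input sizes are arbitrary. It assumes a TensorFlow version that provides tf.compat.v1 (1.14 or later) and NumPy. It does not pin ops to a device, so run it on each machine and device configuration of interest.

```python
import time

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API; assumes TensorFlow >= 1.14


def build_chain(op_fun, params_ph, indices_ph, num_ops):
    """Chain num_ops calls of op_fun, each waiting on the previous one."""
    # Route each fed value through an identity op, as suggested in the hints.
    params_op = tf.identity(params_ph)
    indices = tf.identity(indices_ph)
    ops = op_fun(params_op, indices)
    for _ in range(num_ops - 1):
        # tf.tuple passes params_op through, but only after `ops` is computed.
        ops = op_fun(tf1.tuple([params_op, ops])[0], indices)
    return ops


def time_chain(dtype, num_ops=10, runs=5):
    """Return mean wall-clock seconds per run for the given params dtype."""
    graph = tf.Graph()
    with graph.as_default():
        # All inputs are placeholders, so nothing can be folded as a constant.
        params_ph = tf1.placeholder(dtype, shape=[None, None])
        indices_ph = tf1.placeholder(tf.int32, shape=[None])
        result = build_chain(tf.gather, params_ph, indices_ph, num_ops)
    feed = {
        params_ph: np.random.rand(1000, 64).astype(dtype.as_numpy_dtype),
        indices_ph: np.random.randint(0, 1000, size=500).astype(np.int32),
    }
    with tf1.Session(graph=graph) as sess:
        sess.run(result, feed_dict=feed)  # warm-up run, not timed
        start = time.perf_counter()
        for _ in range(runs):
            sess.run(result, feed_dict=feed)
        return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Compare dtypes, since timings may differ greatly between them.
    for dtype in (tf.float32, tf.float64):
        print(dtype.name, time_chain(dtype))
```

Each dtype gets its own graph and session, so timings for one dtype are not affected by ops left over from another. The warm-up run keeps one-time setup cost out of the measurement.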
