Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tensorflow backend in IPE mode reports warning "expects data to be 64 byte aligned" on every inference #484

Open
vkuzmin-uber opened this issue Mar 26, 2021 · 0 comments
Assignees

Comments

@vkuzmin-uber
Copy link
Contributor

vkuzmin-uber commented Mar 26, 2021

Bug

  • Using OPE: no
  • Neuropod backend TensorFlow, CPU

Found during benchmark that in IPE mode TF backend reports warning on every request that itself affects performance:

neuropod/backends/tensorflow/tf_tensor.cc:140: SPDLOG_WARN("In order to wrap data, TensorFlow expects data to be 64 byte aligned! Making a copy...");

We implemented allocator (in Java) that guarantees buffer alignment 64, so this isn't a blocker. But still wanted to discuss the following suggestions:

Expected behavior, Suggestions:

Problem №1

In high performant systems we should "guard" logger from being problem for system health (read/write I/O, busy and full disk) and info, warning, errors should not have a visible performance impact under high load.

Options:

  1. Logger should support "suppress" if logging the same message more than N. It probably means that Logger need to compare logging message with previous one. To mitigate it we may have a special flag or call for "suppress" and perform suppress only if "flood" is detected (ideally find some "standard" algo with options that can be used).

  2. Implement Logger with Ring Buffer, with pattern Multiple Producers (concurrent LOG calls) ->Single Consumer (dedicated thread that reads from Ring Buffer and writes to log file (or whatever) then. Depending on policies, Producers may overwrite "old" or drop own logs if buffer is full. Consumer can detect duplicates and avoid spam in log, writing as "log message ... suppressed another 10K occurrences. Vivek, do you know any OSS C++ logger like that? (searched for C++ version of LMAX disruptor" https://github.com/fsaintjacques/disruptor-- that can be used as with current Neuropod Logger). Such approach can be a + for neuropod for high load and for further development to control extra "cost".

(My experience: I used similar approach for logger in C++ Algo Trading service)

Problem №2

I investigated Tensorflow code and it seems it is related to https://github.com/tensorflow/runtime/blob/4adbd9585428dea3fed2f69e721b7369410a27d0/third_party/eigen/BUILD#L47
Looking at current TF repo code and https://eigen.tuxfamily.org/dox/TopicPreprocessorDirectives.html, I am not sure if align 64 is really necessary to avoid "copy". Vivek, can you at least audit it and if it is still the best way to guarantee "no copy", leave a comment at the warning. Ideally, if Neuropod report warning (with suppress)"send" it to TF as is, let it decide. In this case caller/user can experiment with alignments, benchmark it. Maybe have it as an "option", if user want Neuropod to do "smart" things or be thing layer instead and give user/caller more control.

I put both together for now we can split onto separate issues if find it "valid" first

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants