Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set filesystem constructor parameter for FilePrefetchBuffer inside PrefetchTail #13157

Closed
wants to merge 1 commit into from

Conversation

archang19
Copy link
Contributor

@archang19 archang19 commented Nov 25, 2024

Summary

After #13118 was merged, I did some investigation to see whether the file system buffer reuse code was actually being used.

The good news is that I was able to see from the CPU profiling results that my code is getting invoked through the warm storage stress tests.

The bad news is that most of the time, the optimization is not being used, so we end up going through the regular old RandomAccessFileReader::Read path.

Here is the entire function call chain up to FilePrefetchBuffer::Read

  1. rocksdb::DB::MultiGet
  2. rocksdb::DBImpl::MultiGet
  3. rocksdb::DBImpl::MultiGetCommon
  4. rocksdb::DBImpl::MultiGetImpl
  5. rocksdb::Version::MultiGet
  6. rocksdb::Version::MultiGetFromSST
  7. rocksdb::TableCache::MultiGet
  8. rocksdb::TableCache::FindTable
  9. rocksdb::TableCache::GetTableReader
  10. rocksdb::BlockBasedTableFactory::NewTableReader
  11. rocksdb::BlockBasedTable::Open
  12. rocksdb::BlockBasedTable::PrefetchTail
  13. rocksdb::FilePrefetchBuffer::Prefetch
  14. rocksdb::FilePrefetchBuffer::Read

At this point, we split into rocksdb::RandomAccessFileReader::Read and
rocksdb::FilePrefetchBuffer::FSBufferDirectRead. FSBufferDirectRead gets called <3% of the time.

I think the root cause is that the FileSystem* fs parameter is not getting passed into the FilePrefetchBuffer constructor. When fs is nullptr, UseFSBuffer() will always return false and we do not end up calling FSBufferDirectRead.

Luckily, it does not seem like there are too many places I need to change. BlockBasedTable resets its prefetch_buffer in 3 separate places. When it disables the prefetch buffer (2/3 of the instances), we don't care about whether the fs parameter is there. This PR is addressing the third instance, where it is not trying to disable the buffer.

Note that there is another method, PrefetchBufferCollection::GetOrCreatePrefetchBuffer that creates new FilePrefetchBuffers without the fs parameter. This method gets called by compaction_iterator and merge_helper. I think we can address this in a subsequent PR:

  1. Each of these changes effectively "unlocks" the buffer reuse feature. Separating the changes would be helpful when I look at the profiling results again, since I can isolate what impact this PR had on the percentage of time that rocksdb::FilePrefetchBuffer::FSBufferDirectRead was invoked.
  2. I still need to look into what exactly I would need to changes I need to make to PrefetchBufferCollection
  3. This code seems to be for blob prefetching in particular, and I don't think it has the biggest ROI anyways.
      const Status s = blob_fetcher_->FetchBlob(
          user_key(), blob_index, prefetch_buffer, &blob_value_, &bytes_read);
  1. I am not sure if the current benchmark I am using for warm storage exercises this blob prefetching code, so I may need to find another way to assess the performance impact.

Test Plan

The existing unit test coverage guards against obvious bugs. I ran another set of performance tests to confirm there were no regressions in CPU utilization.

@archang19 archang19 changed the title Set fs constructor parameter for FilePrefetchBuffer Set filesystem constructor parameter for FilePrefetchBuffer Nov 25, 2024
@archang19 archang19 changed the title Set filesystem constructor parameter for FilePrefetchBuffer Set filesystem constructor parameter for FilePrefetchBuffer inside PrefetchTail Nov 25, 2024
@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@archang19 archang19 requested a review from anand1976 November 25, 2024 20:32
@archang19 archang19 marked this pull request as ready for review November 25, 2024 20:32
@facebook-github-bot
Copy link
Contributor

@archang19 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@archang19 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@anand1976 anand1976 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Thanks for the change. The table open path is not activating the fs buffer reuse feature.

@facebook-github-bot
Copy link
Contributor

@archang19 merged this pull request in 31408c0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants