-
Notifications
You must be signed in to change notification settings - Fork 6.5k
Add a new GetNewestUserDefinedTimestamp API #13547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5204d1e
to
fdb6b91
Compare
@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
InstrumentedMutexLock l(&mutex_); | ||
bool enter_write_thread = sv->mem == cfd->mem(); | ||
WriteThread::Writer w; | ||
// Enter write thread to read the mutable memtable to avoid racing access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to support concurrent access to newest_udt_ from memtable instead of blocking writes for each call to GetNewestUserDefinedTimestamp()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a more performant direction. Right now, the newest_udt_
is a Slice so concurrent access is not available. For uint64_t timestamp, we can consider using std::atomic<uint64_t>.
Since the internal user for this API currently only need to call this API after DB::Open before any writes start, to get the timestamp to recover from. I think entering write thread to avoid the concurrent issue isn't a bad idea. If we need this API to be more performant, I can revise the implementation. We introduce this API for them because the GetFullHistoryTsLow
API doesn't feel generic enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
db/db_impl/db_impl.cc
Outdated
// There is currently no metadata in SST files or MANIFEST to track | ||
// timestamps for this case. | ||
// TODO(yuzhangyu): add support for this case. | ||
status = Status::NotSupported( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relying on data being in memtable feels not reliable, maybe it's better to just not support persist_user_defined_timestamps
at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. Our internal users currently indeed doesn't persist user defined timestamps. I think it's a good idea to not support it at all if it is not fully supported. Let me update the PR.
fdb6b91
to
3f2662b
Compare
@jowlyzhang has updated the pull request. You must reimport the pull request before landing. |
@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@jowlyzhang merged this pull request in 476a98c. |
Summary: This PR adds a DB::GetNewestUserDefinedTimestamp API to get the newest timestamp of the column family. This is only for when the column family enables user defined timestamp. It checks the mutable memtable, the immutable memtable and the SST files, and returns the first newest user defined timestamp found. When user defined timestamp is not persisted in SST files, there is metadata in MANIFEST tracking upperbound of flushed timestamps, so the newest timestamp in SST files can be found. If user defined timestamps are persisted in SST files, currently no timestamp metadata info is persisted. A NotSupported status will be returned if SST files need to be checked in that case. Pull Request resolved: facebook#13547 Test Plan: Added tests Reviewed By: cbi42 Differential Revision: D73123575 Pulled By: jowlyzhang fbshipit-source-id: 460ac4f9c96926d3c8fcf7944edab8dc0feae1dd
This PR adds a DB::GetNewestUserDefinedTimestamp API to get the newest timestamp of the column family. This is only for when the column family enables user defined timestamp.
It checks the mutable memtable, the immutable memtable and the SST files, and returns the first newest user defined timestamp found. When user defined timestamp is not persisted in SST files, there is metadata in MANIFEST tracking upperbound of flushed timestamps, so the newest timestamp in SST files can be found. If user defined timestamps are
persisted in SST files, currently no timestamp metadata info is persisted. A NotSupported status will be returned if SST files need to be checked in that case.
Test plan:
Added tests