Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to provide timeline callback #23

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

jamesmunns
Copy link
Contributor

This introduces the ability for the user to report the current timeline information (a UUID/TimelineId and name string).

This somewhat significantly reworks how current timeline information is tracked, which removes some thread locals, and introduces some globals.

Marked as WIP as I am still working on getting this going in mnemOS (with Tokio and Maitake task IDs) to validate it works as expected. Early feedback is welcome.

@jamesmunns jamesmunns requested a review from azdle August 11, 2022 13:59
@jamesmunns
Copy link
Contributor Author

Oof, I thought "get current task ID" was already possible in tokio, but it doesn't seem to be possible yet, but is being worked on in tokio-rs/tokio#4721.

Eliza has suggested I could (manually) use the "Task Local Storage" API to manually hold this - which isn't a great interface at the moment, but will be acceptable for my project specifically.

That being said - I think this PR is still good, and doesn't change any current user-facing behavior if a custom callback is not provided, and leaves us open to supporting task IDs once 4721 lands. Unfortunately we won't be able to provide an "out of the box" tokio callback option on day one of this landing.

@jamesmunns jamesmunns marked this pull request as ready for review August 15, 2022 18:40
jamesmunns added a commit to tosc-rs/mnemos that referenced this pull request Aug 15, 2022
@jamesmunns
Copy link
Contributor Author

I've integrated this successfully in tosc-rs/mnemos#40, let me know if you'd like repro steps :)

@jamesmunns jamesmunns marked this pull request as draft August 16, 2022 18:29
@jamesmunns
Copy link
Contributor Author

Note: Moving this back to draft after speaking with @mullr, the plan is to add some additional "data munging" callback features as well to avoid a couple deficiencies this change introduces (or doesn't address), including unbounded lookup cost, as well as the user application needing to keep modality-consistent IDs for things like source and destination tasks/timelines.

@jamesmunns jamesmunns marked this pull request as ready for review August 22, 2022 21:07
@jamesmunns jamesmunns force-pushed the james/timeline-callback branch from 70d2019 to 4d8d563 Compare August 22, 2022 21:08
Copy link
Contributor

@azdle azdle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on board with the attribute handler stuff, just a couple of nits related to that.

The timeline stuff is throwing me for a bit of a loop and I need some more time to understand that stuff and convince myself that the LRU is necessary. Will get back to this first thing next week.

Comment on lines +206 to +208
// TODO(AJM): Does this NEED to be in the global metadata? The old code did this.
let run_id = Uuid::new_v4();
options.add_metadata("run_id", run_id.to_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it does because it needs to get applied to every timeline. That might change in the future, but can't say for sure.

Comment on lines +32 to +37
//! ## Fallback handler
//!
//! Additionally, there is a "fallback handler" that will take any tracing key NOT matched by the attribute
//! handlers by ensuring that the given key starts with "event.*", and the contained data is converted into
//! a format usable by modality. If this conversion is suitable for data logged by your application, then no
//! specific attribute handler needs to be provided.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this to "Default handler" just so we're not risking giving people the impression that it's a bad thing for this to not be overridden?

Copy link
Contributor

@azdle azdle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a handful of changes I'd like to see before this merges. The big ones being:

I don't think we should change the library's idea of a thread ID to a u64. That's going to cause a whole bunch of tracking overhead for programs that aren't already tracking its timelines like that. I'd also be really worried about accidental collisions. (More in a comment.)

I would also really like to see the timeline LRU tracking become optional. I think it's just overhead in most instances except for your particular edge case. I believe that most systems will have some way of knowing when they create a new timeline or at least tracking that information with the timeline itself so that we don't have to guess when we can stop tracking it.

Also, before this gets merged, please squash out as much of your added & reverted changes as you can.

Comment on lines +8 to +12

# TODO(AJM): Work around mismatched UUID version
[patch.crates-io.modality-ingest-client]
git = "https://github.com/auxoncorp/modality-sdk"
rev = "4a3b9e3cca08e8225e084f9e8409e818646589ec"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, this should no longer be needed.

Comment on lines 42 to 44
pub fn set_timeline_identifier(&mut self, identifier: fn() -> TimelineInfo) {
self.timeline_identifier = identifier;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should get a chainable with_ variant.

pub id: TimelineId,
}

pub(crate) static TIMELINE_IDENTIFIER: OnceCell<fn() -> TimelineInfo> = OnceCell::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we either put a verb in this name or use a word other than "identifier" to make it more obvious it's a function that returns an identifier rather than being an identifier itself?

use std::fmt::Debug;
use thiserror::Error;

#[derive(Debug, Clone)]
pub struct TimelineInfo {
// TODO: Can we make this a &'static str to avoid allocations? Maybe a Cow<str>?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will come, there's a lot of places where we can do this, we just haven't had a need yet and will need changes in the SDK first.

Comment on lines 301 to 305
#[inline]
fn generate_timeline_id(&self, user_id: u64) -> TimelineId {
// We deterministically generate timeline IDs using UUIDv5, with
uuid::Uuid::new_v5(&self.run_id, &user_id.to_ne_bytes()).into()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be the responsibility of the user function. We are essentially using the u64 as the thread id since all are being hashed with the same namespace (run_id). I'm worried this would be too prone to timeline id collisions.

I would rather keep this library to a single timeline id concept and have the user provided callback generate that so that they have an opportunity to choose a different namespace based on the source of the user_id.

pub(crate) enum TracingValue {
pub enum TracingValue {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like pubing this either. But I'm not sure if there's a good alternative.

This is for users to implement an AttrHandler, right? Does tracing have any types we can steal? Or what do you think of always doing the default conversion first and having AttrHandler take in an AttrVal instead?

Comment on lines +10 to +12
//! Handlers are matched on a SPECIFIC key text. If a matching key text is found, the
//! provided handler will be called, will replace the key and value of the data reported
//! by `tracing`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "SPECIFIC" supposed to be drawing my attention to? It seems like it's important since it's in all caps. But I don't understand what distinction you're trying to make between that and "Handlers are matched by key."

Comment on lines +41 to +55
//! ## Tracing Metadata
//!
//! Additionally, `tracing-modality` will provide the following event keys with data sourced from `tracing`'s
//! metadata fields, IF the values have not already been filled by any of the Default or Fallback handlers.
//!
//! | modality key | tracing metadata (or other) source |
//! | :--- | :--- |
//! | "event.name" | `metadata.name()` |
//! | "event.severity" | `metadata.level()` |
//! | "event.source.module" | `metadata.module_path()` |
//! | "event.source.file" | `metadata.file()` |
//! | "event.source.line" | `metadata.line()` |
//! | "event.internal.rs.tick" | nanosecond ticks since first tracing event |
//! | "event.timestamp" | nanosecond ticks since unix epoch |
//!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the wrong place for this. I think it might make more sense to just move this to the readme. Thoughts?

Comment on lines +218 to +223
//
// TODO(AJM): Grumble grumble, this is partially duplicating the work of `bind_user_timeline()`,
// and also not actually registering the metadata for this timeline. Since this is our worker
// thread, and we will never produce tracing information that is sent to modality, it is probably
// fine. If we do at a later time, it will then be registered via the actual call to
// `bind_user_timeline()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should execute on the program's main thread (or wherever the user does init from), right? I don't think it'll create a timeline for the worker thread (which would be a problem).

Comment on lines +590 to 593
// TODO(AJM): This check is redundant everywhere AFAICT?
// Should we make a NewType that guarantees the string starts with
// 'event.'?
let key = if key.starts_with("event.") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had it like that before and explicitly removed it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants