Skip to content

chore: [MCP-2] add is_container_env to telemetry #298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Conversation

fmenezes
Copy link
Collaborator

@fmenezes fmenezes commented Jun 13, 2025

is_container_env should be true / false

@fmenezes fmenezes marked this pull request as ready for review June 13, 2025 18:01
@Copilot Copilot AI review requested due to automatic review settings June 13, 2025 18:01
@fmenezes fmenezes requested a review from a team as a code owner June 13, 2025 18:01
@fmenezes fmenezes changed the title MCP-2: add is_container_env to telemetry chore: [MCP-2] add is_container_env to telemetry Jun 13, 2025
Copilot

This comment was marked as outdated.

@fmenezes fmenezes requested a review from Copilot June 13, 2025 18:04
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for detecting whether the application is running in a containerized environment by introducing the is_container_env telemetry property. Key changes include updating telemetry types, modifying the Telemetry class to asynchronously retrieve container environment status via file checks and environment variables, and updating tests to properly await asynchronous telemetry common properties.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
tests/unit/telemetry.test.ts Updated tests to await asynchronous verification of common properties and buffering state.
tests/integration/telemetry.test.ts Adjusted telemetry creation to include the new getContainerEnv and updated asynchronous checks.
src/telemetry/types.ts Added is_container_env to the Telemetry CommonProperties type.
src/telemetry/telemetry.ts Modified Telemetry to support container environment detection and asynchronous common properties.

public async getAsyncCommonProperties(): Promise<CommonProperties> {
return {
...this.getCommonProperties(),
is_container_env: (await this.containerEnvPromise) ? "true" : "false",
Copy link
Preview

Copilot AI Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider returning a boolean for is_container_env in getAsyncCommonProperties (e.g., true/false) instead of a string to more directly reflect its boolean nature, if this change is compatible with TelemetryBoolSet.

Suggested change
is_container_env: (await this.containerEnvPromise) ? "true" : "false",
is_container_env: await this.containerEnvPromise,

Copilot uses AI. Check for mistakes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not compatible TelemetryBoolSet expects strings "true" or "false"

Comment on lines 37 to 43
for (const file of ["/.dockerenv", "/run/.containerenv", "/var/run/.containerenv"]) {
const exists = await fileExists(file);
if (exists) {
return true;
}
}
return !!process.env.container;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fairly minor as this is clearly not on a hot path, but would it make sense to first check for the env variable since that is cheapest? And then for the files, we can check them in parallel rather than 1 by 1.

/** Resolves when the device ID is retrieved or timeout occurs */
private bufferingEvents: number = 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name no longer makes much sense. How about pendingPropertyPromises or something similar?

@@ -117,6 +163,14 @@ export class Telemetry {
};
}

public async getAsyncCommonProperties(): Promise<CommonProperties> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have super strong feelings here, but I'm not a huge fan of making methods public just for the sake of tests. While I agree it's convenient to test the internals here, this and isBufferingEvents are implementation details, so they shouldn't be made public.


async function fileExists(filePath: string): Promise<boolean> {
try {
await fs.stat(filePath);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
await fs.stat(filePath);
await fs.access(filePath);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't need read access

Copy link
Collaborator Author

@fmenezes fmenezes Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since these files are controlled by docker container I'm not too sure what happens when using non root / non default user over docker (meaning docker run -u <some user>)

Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without read access without permission check access, we'd not be able to check for permission stat either.stat also gives you stats which we're discarding, access is lighter in that sense.

Node Docs example uses access to check for file existance:
https://nodejs.org/api/fs.html#fs_fs_access_path_mode_callback

if (process.env.container) {
return true;
}
for (const file of ["/.dockerenv", "/run/.containerenv", "/var/run/.containerenv"]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the / paths work on windows?

return instance;
}

private async start(): Promise<void> {
private start(): void {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's usually always cleaner to use async function then wrap them in finally-s etc. I'd revert this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idea here was to start the promises and not wait for them to finish

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the benefit of that? we're already spawning an async function without waiting for it to finish (with void above) so this isn't blocking the main thread.

We'd want to buffer events beforehand so we can make sure telemetry stuff we send is with device_id resolved if possible

if (!this.isTelemetryEnabled()) {
return;
}

this.deviceIdPromise = getDeviceId({
Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are probably looking for Promise.all (or Promise.allSettled)?

Comment on lines 44 to 45
private deviceIdPromise: Promise<string> | undefined;
private containerEnvPromise: Promise<boolean> | undefined;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably turn this into a single setup promise with Promise.all or something, see below:

Copy link
Collaborator

@gagik gagik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can manage the multiple promises better. Likely having 1 setup promise that combines the other 1 promises and just awaiting those instead of having a counter. Though maybe I'm missing some bit of the implementation here.

@fmenezes
Copy link
Collaborator Author

I think we can manage the multiple promises better. Likely having 1 setup promise that combines the other 1 promises and just awaiting those instead of having a counter. Though maybe I'm missing some bit of the implementation here.

I did not know either it was prior to my time, but just checked the code it should be ok just to wait on both promises, just be instant

@gagik
Copy link
Collaborator

gagik commented Jun 16, 2025

should be ok just to wait on both promises

yeah we can keep the async start + buffering like before and just add the second await into it with Promise.all before setting isBuffering to false

@fmenezes fmenezes requested a review from gagik June 16, 2025 13:36
mcp_client_version: this.session.agentRunner?.version,
mcp_client_name: this.session.agentRunner?.name,
session_id: this.session.sessionId,
config_atlas_auth: this.session.apiClient.hasCredentials() ? "true" : "false",
config_connection_string: this.userConfig.connectionString ? "true" : "false",
is_container_env: (await this.containerEnvPromise) ? "true" : "false",
device_id: await this.deviceIdPromise,
Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be awaiting here; an explicit unawaited + buffering with an async start like it was before is easier to reason about, it just needs another to await for container inside start

Otherwise there's potentially hundreds of floating functions awaiting device_id or whatever else. We can assume in worst case scenario this isn't instant (and could never even resolve)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the device ID there is a timeout, on the container env it is just fs/promises so we can be sure they will resolve. These promises are private, I think the buferring logic only makes sense if we are tagging on finally or something, to me feels unneeded complication. Our current timeout is 3 seconds, no reason to be extra defensive.

Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fair that we don't need to be extra defensive but either way we're still essentially going to end up buffering these events.

The difference is that with awaiting here the "buffering" would happen through a bunch of async functions in parallel in memory stack waiting on these promises as opposed to more explicit buffering where we store the functions we want to execute ourselves and execute them in order after resolving these promises in one place.

With awaiting in common properties approach we'd, for example, not have any guarantees that these functions end up resolving in the same order. Which we don't need to care about, but it's nice to have better idea of how async execution is going to happen and have 1 point of logic where we resolve all the async issues.

With discussions with @nirinchev, we also want to create a shared telemetry service component across devtools so we would not want to deviate too much unless there are good reasons for it (this setup is modeled after mongosh and Compass telemetry).

this.commonProperties.device_id = await this.deviceIdPromise;

this.isBufferingEvents = false;
this.containerEnvPromise = this.getContainerEnv();
Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.containerEnvPromise = this.getContainerEnv();
const [deviceId, isContainerEnv] = await Promise.all([this.deviceIdPromise, this.getContainerEnvPromise]);
this.commonProperties.device_id = deviceId;
this.commonProperties.is_container_env = isContainerEnv;

something like this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with this approach start would take up to 3s which I think it's acceptable, will change

Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will have identical behavior to awaiting inside the getCommonProperties, we do not await the start() function anywhere so it's essentially same as declaring a combined promise, we instantly initialize the telemetry and start buffering event when emit is run.

In both cases we'd instantly be able to emit events, and in both cases if device ID takes 3 seconds, the eventual emission of events will take 3 seconds.

But with start there's only 1 point where we deal with asynchronous logic, and in the other case we have a bunch of less predictable asyncronous functions listening in into 2 different promises.

@fmenezes fmenezes requested a review from gagik June 16, 2025 16:40

void instance.start();
await instance.start();
Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
await instance.start();
void instance.start();

This was an important detail in the existing implementation: start was not getting awaited before. This is why the behavior is equivalent to what was trying to be accomplished before with skipping the promise. Telemetry was already not waiting for the device ID resolution.

So the create function does not need to be asynchronous and can be left as is.

mcp_client_version: this.session.agentRunner?.version,
mcp_client_name: this.session.agentRunner?.name,
session_id: this.session.sessionId,
config_atlas_auth: this.session.apiClient.hasCredentials() ? "true" : "false",
config_connection_string: this.userConfig.connectionString ? "true" : "false",
is_container_env: (await this.containerEnvPromise) ? "true" : "false",
device_id: await this.deviceIdPromise,
Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fair that we don't need to be extra defensive but either way we're still essentially going to end up buffering these events.

The difference is that with awaiting here the "buffering" would happen through a bunch of async functions in parallel in memory stack waiting on these promises as opposed to more explicit buffering where we store the functions we want to execute ourselves and execute them in order after resolving these promises in one place.

With awaiting in common properties approach we'd, for example, not have any guarantees that these functions end up resolving in the same order. Which we don't need to care about, but it's nice to have better idea of how async execution is going to happen and have 1 point of logic where we resolve all the async issues.

With discussions with @nirinchev, we also want to create a shared telemetry service component across devtools so we would not want to deviate too much unless there are good reasons for it (this setup is modeled after mongosh and Compass telemetry).

Comment on lines +42 to +43
private deviceId: string | undefined;
private containerEnv: boolean | undefined;
Copy link
Collaborator

@gagik gagik Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not have them inside commonProperties so it easier to see which fields in the state are used solely for common properties?

@@ -20,7 +20,7 @@ try {
version: packageInfo.version,
});

const telemetry = Telemetry.create(session, config);
const telemetry = await Telemetry.create(session, config);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const telemetry = await Telemetry.create(session, config);
const telemetry = Telemetry.create(session, config);

the creation logic does not be turned async.

I'd say the main change needed is to add this.commonProperties.is_container_env intialization into start (possibly rename it to setup as that naming seems confusing)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants