Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

image-rs: use index for layers store path #902

Merged
merged 4 commits into from
Feb 19, 2025

Conversation

squarti
Copy link
Contributor

@squarti squarti commented Feb 10, 2025

This PR associates an index for a given image layer and uses the index to build the layer store path. This reduces the path length significantly. The next available layer index is computed by reading the existent directories in the data directory. This is done during pull client initialization.

Fixes: #901

@squarti squarti marked this pull request as ready for review February 11, 2025 19:29
@squarti squarti requested a review from a team as a code owner February 11, 2025 19:29
Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome bug report, thanks. Can you describe in more detail what you are doing in your fix in the PR description?

@squarti
Copy link
Contributor Author

squarti commented Feb 11, 2025

A possible issue with the approach taken is that pull client cannot run in parallel since the layer index is managed in memory. Although, this already seems to be an issue since the meta store is only saved at the end of pull_image.

Copy link
Member

@Xynnn007 Xynnn007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix. Btw I did not find any place that shows mount does not support data longer than 4096 bytes although the behavior shows this.

Another thing I worry about is synchronization issues. Now ImageClient can respond to multiple pull_image requests asynchronously. Each pull_image request will create a PullClient. Assuming multiple pull_image requests are initiated at the same time, it will be possible to obtain multiple identical or close layers_index, and overwriting might occur.

The previous method of using hash values ​​to name layers can avoid this risk assuming that the hashes do not collide. This is possible if an auto-incrementing number is used.

If number is used, do we need to consider using PullClient as part of ImageClient and perform operations such as async_pull_layers and pull_manifest exclusively and atomically?

@squarti squarti force-pushed the layers-index branch 4 times, most recently from 5b30e95 to 2f85db2 Compare February 12, 2025 14:58
@squarti
Copy link
Contributor Author

squarti commented Feb 12, 2025

Would moving the counter to ImageClient as above address these issues?

@squarti squarti force-pushed the layers-index branch 11 times, most recently from 0259c8c to 06eac92 Compare February 12, 2025 22:52
@ChengyuZhu6
Copy link
Member

I'm thinking would it be possible to use the footprint instead of the full hash value as the layer store name? For example, changing from run/image-rs/layers/sha256_055008870f0b0eef21dc8f9be651f07c3f52fa33724d899f8d14bded0a4fa38d to run/image-rs/layers/sha256_055008870f0b?

@Xynnn007
Copy link
Member

I'm thinking would it be possible to use the footprint instead of the full hash value as the layer store name? For example, changing from run/image-rs/layers/sha256_055008870f0b0eef21dc8f9be651f07c3f52fa33724d899f8d14bded0a4fa38d to run/image-rs/layers/sha256_055008870f0b?

I have considered it before, but considering that the probability of collision will increase exponentially after truncating the hash value, it may cause unexpected bugs in extreme cases. If we use the self-increasing index as proposed, layer collisions can be absolutely avoid.

@mkulke
Copy link
Contributor

mkulke commented Feb 13, 2025

is there a way to write this code in a way that we can unit-test the construction of the layer storage paths? this would be good also for specifying what we are doing.

@squarti squarti force-pushed the layers-index branch 5 times, most recently from 1e9be67 to 2e2de58 Compare February 13, 2025 19:09
Copy link
Member

@Xynnn007 Xynnn007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @mkulke PTAL

@squarti squarti force-pushed the layers-index branch 12 times, most recently from cded759 to 4b1a11c Compare February 14, 2025 21:27
Copy link
Member

@ChengyuZhu6 ChengyuZhu6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @squarti . While I'm concerned about the ci failure with ubuntu-24.04-arm.

@fitzthum
Copy link
Member

CI Green. @mkulke are we good to merge this?

Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at the moment we're just logging the errors instead of bailing out. since this is a sensitive area of the code, I would prefer if we would propagate the errors and fail on anything that we encounter and do not expect, i.e. get_layers_index() should return a Result<> since it's a fallible operation.

This PR creates a layer store abstraction which uses an incrementing
index as a unique store path for image layers instead of layer
digests. This reduces the layer path length significantly 
allowing more layers to be mounted given the 4096 page size
limit of mount command.

Fixes: confidential-containers#901

Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for accommodating, a few nits.

it's not great that we swallow the errors at the upper layers now, but apparently that is also what we do with the meta store. It's still an improvement because we don't get something undefined on malformed images, but a predictable and unusuable 0-ed out layerstore. if we unwrap we still need to log the errors though (see suggestion), otherwise it won't be fun to debug.

squarti and others added 3 commits February 19, 2025 09:17
Co-authored-by: Magnus Kulke <mkulke@gmail.com>
Signed-off-by: squarti <silenio_quarti@ca.ibm.com>
Co-authored-by: Magnus Kulke <mkulke@gmail.com>
Signed-off-by: squarti <silenio_quarti@ca.ibm.com>
Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>
@mkulke mkulke merged commit 7269cfd into confidential-containers:main Feb 19, 2025
8 checks passed
@BbolroC BbolroC mentioned this pull request Feb 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

image-rs fails to mount images with many layers
5 participants