Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logic to build intel-gpu image to build_llama_and_whisper.sh #724

Merged
merged 1 commit into from
Feb 4, 2025

Conversation

cgruver
Copy link
Collaborator

@cgruver cgruver commented Feb 3, 2025

Modify the Containerfile for intel-gpu to use the common build script.

Add entrypoint to intel-gpu image to set environment.

Signed-off-by: Charro Gruver <cgruver@redhat.com>
Copy link
Contributor

sourcery-ai bot commented Feb 3, 2025

Reviewer's Guide by Sourcery

This pull request modifies the intel-gpu container image to use the common build script and adds an entrypoint to set the environment.

Sequence diagram for intel-gpu container startup process

sequenceDiagram
    participant Container as Container Start
    participant Entrypoint as Entrypoint Script
    participant System as System Config

    Container->>Entrypoint: Start container
    Entrypoint->>System: Check/Create HOME directory
    Entrypoint->>System: Configure user in /etc/passwd
    Entrypoint->>System: Add user to render/video groups
    Entrypoint->>System: Configure shell (.bashrc/.zshrc)
    Entrypoint->>System: Source Intel OneAPI environment
    Entrypoint->>Container: Execute CMD
Loading

File-Level Changes

Change Details Files
Use the common build script to build llama.cpp.
  • Removed the manual build steps for llama.cpp.
  • Added a call to the common build script build_llama_and_whisper.sh.
  • The build script is executed with the intel-gpu argument.
container-images/intel-gpu/Containerfile
Add an entrypoint script to set the environment.
  • Added an entrypoint.sh script to set environment variables.
  • The Containerfile now uses the entrypoint.sh script as the entrypoint.
  • The entrypoint script sets the HOME variable, creates the home directory, adds the user to the /etc/passwd and /etc/group files, configures zsh and bash shells, and sources the oneAPI environment variables.
container-images/intel-gpu/Containerfile
container-images/intel-gpu/entrypoint.sh
Modify the common build script to support intel-gpu.
  • Added intel-specific packages to the install list.
  • Added logic to source the oneAPI environment variables.
  • Added logic to set the cmake flags for intel-gpu.
container-images/scripts/build_llama_and_whisper.sh

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cgruver - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟡 Security: 1 issue found
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +25 to +26
sed "s|${render_group}|${render_group_new}|g" /etc/group > /tmp/group
cat /tmp/group > /etc/group
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 issue (security): The group file modifications could race when multiple containers start simultaneously

Consider using a more atomic approach to modify the group file. Also, the USER_NAME variable should be properly escaped to prevent potential injection issues.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah:

mv /tmp/group /etc/group

would be more atomic, same on line 28.

I don't see the race condition myself, but mv is a nicer option, if it works, more atomic to replace the inode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that mv /tmp/group /etc/group is more atomic and generally a better approach for replacing the file. However, the race condition I'm referring to isn't about replacing the file itself, but about concurrent modifications to the file. If multiple containers based on this image start simultaneously, they could all be reading /etc/group, modifying it in memory, and then writing their modified version back. This could lead to some updates being lost.

For example:

  1. Container 1 reads /etc/group.
  2. Container 2 reads /etc/group (same content as Container 1).
  3. Container 1 adds llama-user to the render group and writes the updated /etc/group.
  4. Container 2 adds llama-user to the video group and writes the updated /etc/group, overwriting the changes made by Container 1 regarding the render group.

Using mv doesn't solve this. A more robust solution would be to use a tool designed for concurrent file modifications, like usermod and groupmod. These tools handle locking and ensure that changes aren't overwritten. They also address the escaping concern I mentioned.

Could you try using usermod -a -G render ${USER_NAME} and usermod -a -G video ${USER_NAME} instead of directly manipulating /etc/group? This would be safer and more idiomatic. Also, ensure that USER_NAME is properly defined and sanitized before being used in these commands.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't each container have it's own filesystem anyway? Hence no race condition

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that each container has its own filesystem, after it starts. However, the /etc/group file is typically part of the container image. So, when multiple containers are launched from the same image, they initially share the same /etc/group file. The modifications made by the entrypoint script happen after the container starts but before any user-specified commands run.

So, if multiple containers based on this image start simultaneously, they will all run this entrypoint script concurrently, and they will all be modifying their initially-shared copy of /etc/group. This is where the race condition occurs. Even though they each have their own filesystem later, the changes made during startup can be overwritten by other containers starting at the same time.

Using tools like usermod and groupmod is crucial because they handle file locking and ensure that concurrent modifications to /etc/group are applied correctly, avoiding the race condition even when multiple containers start simultaneously from the same image. They also handle escaping and other potential security issues related to directly manipulating /etc/group.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried mv initially but it requires rw access to the underlying folder, where cat > just requires rw on the file.

Copy link
Collaborator

@ericcurtin ericcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhatdan
Copy link
Member

rhatdan commented Feb 4, 2025

The CI is broken outside of this PR.

@rhatdan rhatdan merged commit 341f962 into containers:main Feb 4, 2025
10 of 13 checks passed
@cgruver
Copy link
Collaborator Author

cgruver commented Feb 4, 2025

The CI is broken outside of this PR.

Ah, I was digging through logs trying to figure out what I had broken... Thanks!

@cgruver cgruver deleted the add-intel-gpu branch February 4, 2025 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants