feat: add ability to export feature labels for static node #2170
base: master
Conversation
For the HPC use case, we want to be able to export static features without Kubernetes. Signed-off-by: vsoch <vsoch@users.noreply.github.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by: vsoch. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing the approval command in a comment.

Welcome @vsoch!

Hi @vsoch. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with the verification command; once the patch is verified, the new status will be reflected by the corresponding label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
How about using stdout to unify the command line?
@ChaoyiHuang thank you for taking a look! How would you distinguish these two?

```shell
# Export labels to the terminal for inspection
nfd-worker --export=stdout
nfd-worker --export
```

They are both going to dump to the output stream. Did you have in mind using different streams, or minimizing other output for the rest of the run in one case? Or are you saying to keep the command the same, but have the bare flag default to stdout?
Keep the command in the same format (stdout is standard output). `--export` is also OK, it just takes a little more code.
It could just export to standard output, with no need to support exporting to a path. If you want to save the content to a file, that can be done by redirecting the output, so the code can be simplified to just support exporting to standard output.
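The redirect suggested above might look like this (the output file name is illustrative):

```shell
# Export to the terminal, or redirect to a file for later use
nfd-worker --export > node-labels.txt
```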
The nfd-worker generates a lot of logs (and other unexpected output), so I wasn't sure how to do that cleanly. Whereas by writing the file explicitly, I can control exactly the contents.
```diff
@@ -178,7 +181,8 @@ func NewNfdWorker(opts ...NfdWorkerOption) (NfdWorker, error) {
 	}

 	// k8sClient might've been set via opts by tests
-	if nfd.k8sClient == nil {
+	// Export mode does not assume in Kubernetes environment
+	if nfd.k8sClient == nil && !nfd.args.Export {
```
I believe it's better to keep the worker k8s-centric (as it implements the logic for updating k8s resources), operate on raw features rather than labels, and add a new subcommand to the nfd client for dumping features.
Raw features can easily be transformed into labels if needed. However, since labels are derived from raw features and the worker configuration, relying solely on labels may cause certain node capabilities to be missed due to default or custom configurations.
The same principle applies to the image compatibility implementation in NFD, where comparisons are performed by checking the features (not the labels).
Thoughts?
Problem: We want to use NFD (labels and other functionality) for HPC, but a Kubernetes environment is expected.
Solution: Add an `nfd-worker --export` mode that allows generation and export of static labels.

For the HPC use case, we want to be able to export static features without Kubernetes, either to the screen (for inspection) or to a file (for saving and use with a scheduler, or another compatibility-checking algorithm). Adding support for this is fairly easy: we just need to expose the flag, disable the check for Kubernetes assets, and then generate the labels once and exit (without starting a server or sending to one). The usage is simple and intuitive.
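Based on the flag behavior described in this PR, usage might look like the following (the file path is illustrative):

```shell
# Print the generated labels to the terminal
nfd-worker --export

# Write the generated labels to a file instead
nfd-worker --export=/var/tmp/node-labels.txt
```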
The flag is also useful for cases beyond HPC when a quick inspection is desired. Note that instead of adding two flags (a boolean for export and a string for a filepath), I opted for the design above: one flag that can be used as a boolean or given a path. A new flag type is added to support that, and it is generic, so it could be used for other future cases. For my implementation, in order to check how this new flag was set, we need to pass it to the initFlags function.
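A bool-or-path flag like this can be built on Go's `flag.Value` interface. The sketch below is illustrative only; the type and helper names (`exportFlag`, `parseExport`) are mine, not NFD's actual implementation:

```go
package main

import (
	"flag"
	"fmt"
)

// exportFlag is a hypothetical flag type that works both as a bare
// boolean (--export) and with a value (--export=/path/to/file).
type exportFlag struct {
	Enabled bool
	Path    string // empty means "write to stdout"
}

func (e *exportFlag) String() string { return e.Path }

// Set is called by the flag package; a bare --export arrives as "true".
func (e *exportFlag) Set(v string) error {
	switch v {
	case "", "true":
		e.Enabled = true
	case "false":
		e.Enabled = false
	default:
		e.Enabled = true
		e.Path = v
	}
	return nil
}

// IsBoolFlag tells the flag package that --export may appear without a value.
func (e *exportFlag) IsBoolFlag() bool { return true }

// parseExport parses args with the export flag registered.
func parseExport(args []string) (exportFlag, error) {
	var e exportFlag
	fs := flag.NewFlagSet("nfd-worker", flag.ContinueOnError)
	fs.Var(&e, "export", "export labels to stdout or to the given file")
	err := fs.Parse(args)
	return e, err
}

func main() {
	e, _ := parseExport([]string{"--export=/var/tmp/labels.txt"})
	fmt.Println(e.Enabled, e.Path)
}
```

The `IsBoolFlag` method is what lets the standard library accept the flag with no value, so a single flag can serve both call styles.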
And I think this is OK for now; if more flags were defined in this manner, we would want to pass a structure with all of them, but I decided that was overkill for the time being. As another design choice, I think the generation of the labels should live in the same function as `Run()`; to support this I check `w.args.Export` before setting up any servers / Prometheus, and run the generation (which handles the export to terminal or file and exits early, if applicable). I have also added tests and documentation, and I tested on an HPC node (in user space) to make sure the labels generated without issue.

I realize I didn't open an issue for discussion, so am happy to have it here! To better explain the use case: this is a feature intended to better support compatibility checking, but for HPC. We won't be able to run compatibility checks assuming labels are available via Kubernetes, but we will be able to export them to files that represent nodes, and then use associated tools to check compatibility artifacts against the known features. This is what I'm working on today. Thanks!
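The early-exit pattern described above could be sketched like this; all types and helpers (`Args`, `Worker`, `formatLabels`, `generateLabels`) are stand-ins of my own invention, not NFD's actual API:

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// Args is a hypothetical stand-in for the worker's parsed arguments.
type Args struct {
	Export bool
}

// Worker is a hypothetical stand-in for the nfd worker state.
type Worker struct {
	args Args
}

// formatLabels renders labels deterministically, one key=value per line.
func formatLabels(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := ""
	for _, k := range keys {
		out += k + "=" + labels[k] + "\n"
	}
	return out
}

// generateLabels is a placeholder for the real feature discovery step.
func (w *Worker) generateLabels() map[string]string {
	return map[string]string{"feature.node.kubernetes.io/cpu-model": "x86-64"}
}

// Run checks export mode before any server/Prometheus setup, generates the
// labels once, writes them out, and returns early.
func (w *Worker) Run() error {
	if w.args.Export {
		_, err := fmt.Fprint(os.Stdout, formatLabels(w.generateLabels()))
		return err // no gRPC/metrics servers are ever started
	}
	// Normal mode would start servers and enter the discovery loop here.
	return nil
}

func main() {
	w := &Worker{args: Args{Export: true}}
	if err := w.Run(); err != nil {
		os.Exit(1)
	}
}
```

Keeping the check at the top of `Run()` means the export path shares all of the worker's normal initialization up to that point, which matches the design choice described in the PR body.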