[Bug]: Fleet plan includes offers from backends without multi-node support with placement: cluster #2300

Open
r4victor opened this issue Feb 14, 2025 · 0 comments

Comments

@r4victor
Collaborator

Steps to reproduce

  1. Create a fleet configuration with placement: cluster.
  2. Apply the configuration.
  3. Notice that the plan includes offers from backends that do not support multi-node:
✗ dstack apply -f .dstack/confs/fleet.yaml            
 Project        r4victor-main                          
 User           victor                                 
 Configuration  .dstack/confs/fleet.yaml               
 Type           fleet                                  
 Fleet type     cloud                                  
 Nodes          2                                      
 Placement      cluster                                
 Resources      2..xCPU, 8GB.., 8xa100, 100GB.. (disk) 
 Spot policy    on-demand                              

 #  BACKEND  REGION            INSTANCE     RESOURCES        SPOT  PRICE      
 1  lambda   europe-central-1  gpu_8x_a100  124xCPU,         no    $10.32     
                                            1933GB, 8xA100                    
                                            (40GB),                           
                                            6598.7GB (disk)                   
 2  lambda   asia-northeast-1  gpu_8x_a100  124xCPU,         no    $10.32     
                                            1933GB, 8xA100                    
                                            (40GB),                           
                                            6598.7GB (disk)                   
 3  vastai   us-washington     17831846     96xCPU, 1814GB,  no    $13.8944   
                                            8xA100 (80GB),                    
                                            100.0GB (disk)                    
    ...                                                                       
 Shown 3 of 37 offers, $51.2073 max

Requesting 8xa100 in the resources helps bring offers from backends without multi-node support to the top of the list.

Actual behaviour

This is caused by the multinode check, which relies on fleet_model, and fleet_model does not exist yet when the plan is requested:

multinode = fleet.spec.configuration.placement == InstanceGroupPlacement.CLUSTER

Note that offers from backends without multi-node support are filtered out when the fleet is actually provisioned, so this bug affects only the fleet plan.
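
One possible direction, sketched under assumptions: compute the multinode flag from the requested configuration (which is available at plan time) and apply the same backend filter to the plan that provisioning already applies. Everything below is a simplified stand-in, not dstack's actual code: `Offer`, `BACKENDS_WITH_MULTINODE`, and `filter_plan_offers` are hypothetical names, the enum is a local copy with assumed values, and the backend list is an illustrative placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class InstanceGroupPlacement(str, Enum):
    """Stand-in for the enum named in the issue; member values are assumed."""

    ANY = "any"
    CLUSTER = "cluster"


# Hypothetical allow-list; the real set of multi-node backends is not given here.
BACKENDS_WITH_MULTINODE = frozenset({"aws", "gcp"})


@dataclass(frozen=True)
class Offer:
    """Simplified offer record, for illustration only."""

    backend: str
    instance: str


def filter_plan_offers(
    offers: Iterable[Offer], placement: InstanceGroupPlacement
) -> List[Offer]:
    # Derive the flag from the requested configuration, which exists at plan
    # time, rather than from a stored fleet record that may not exist yet.
    multinode = placement == InstanceGroupPlacement.CLUSTER
    if not multinode:
        return list(offers)
    return [o for o in offers if o.backend in BACKENDS_WITH_MULTINODE]


offers = [
    Offer("lambda", "gpu_8x_a100"),
    Offer("vastai", "17831846"),
    Offer("aws", "example-instance"),  # made-up instance name
]
plan = filter_plan_offers(offers, InstanceGroupPlacement.CLUSTER)
print([o.backend for o in plan])  # ['aws']
```

With placement: cluster, the lambda and vastai offers from the plan above would be dropped; with any other placement the offer list passes through unchanged.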

Expected behaviour

No response

dstack version

master

Server logs

Additional information

No response

r4victor added the bug label Feb 14, 2025
r4victor self-assigned this Feb 14, 2025
r4victor changed the title from "[Bug]: Fleet plan includes offers from backends without multi-node support with placement: cluster." to "[Bug]: Fleet plan includes offers from backends without multi-node support with placement: cluster" Feb 14, 2025