-
Hi @finalelement, could you please share some comments on this question? Thanks in advance.
-
Hello @nmj14, I think the weights have been loaded in a way that is meant for fine-tuning on a different downstream task. In your particular case, it seems you just want to see visually what the reconstructions look like. I would load the weights the standard way with model.load_state_dict and not pop any weights (unlike the screenshot you posted, where some layers are popped out of the state dict). Please try the above, and if the problem persists, please also share the training curve; that will help us give more insight. Thank you
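A minimal sketch of that standard loading path (the checkpoint file name ssl_pretrained.pt is an assumption; the model arguments are copied from the question below, and the checkpoint is assumed to have been saved as a plain state dict):

import torch
from monai.networks.nets import ViTAutoEnc

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ViTAutoEnc(
    in_channels=1,
    img_size=(128, 128),
    patch_size=(16, 16),
    pos_embed="conv",
    hidden_size=768,
    mlp_dim=3072,
    spatial_dims=2,
).to(device)

# Load the full pretrained state dict as-is; no keys are popped.
state_dict = torch.load("ssl_pretrained.pt", map_location=device)
model.load_state_dict(state_dict)
model.eval()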
-
Hello,
I am working with self_supervised_pretraining/vit_unetr_ssl/ssl_train.ipynb to train a model on image patches that undergo patch masking, coarse dropout, and coarse shuffle in the training transform, so that the model learns to recreate the original patches through SSL. I want to apply the trained model to a set of test image patches using the code below. However, after stitching the model's outputs on the image patches back into the original image size, the resulting image does not match the original image. Please see my code snippets and screenshots below. If you have any suggestions on how I might solve this issue, I would greatly appreciate it!
My training images are of size [1246, 934] and look like the following:

I run the ssl_train Jupyter notebook with parameters very similar to the tutorial's, adjusted for 2D grayscale images: the intensity-scaling transform maps 0-255 to 0-1 and the spatial crop is [128, 128]. The model parameters are as follows:
model = ViTAutoEnc(
    in_channels=1,
    img_size=(128, 128),
    patch_size=(16, 16),
    pos_embed="conv",
    hidden_size=768,
    mlp_dim=3072,
    spatial_dims=2,
)
I train the model following the ssl_train tutorial notebook and save the model.
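For reference, a simplified 2D training transform along those lines (adapted from the tutorial's 3D version; the crop-sampling and hole parameters here are assumptions, and the second augmented view the tutorial keeps for its contrastive loss is omitted) might look like:

from monai.transforms import (
    Compose,
    CopyItemsd,
    EnsureChannelFirstd,
    LoadImaged,
    RandCoarseDropoutd,
    RandCoarseShuffled,
    RandSpatialCropSamplesd,
    ScaleIntensityRanged,
)

train_transforms = Compose(
    [
        LoadImaged(keys=["image"]),
        EnsureChannelFirstd(keys=["image"]),
        # Map the 0-255 grayscale range to 0-1.
        ScaleIntensityRanged(keys=["image"], a_min=0, a_max=255, b_min=0.0, b_max=1.0, clip=True),
        # Sample [128, 128] crops to match the 2D ViTAutoEnc input size.
        RandSpatialCropSamplesd(keys=["image"], roi_size=(128, 128), num_samples=2, random_size=False),
        # Keep an uncorrupted copy as the reconstruction target.
        CopyItemsd(keys=["image"], times=1, names=["gt_image"]),
        # Corrupt the input with coarse dropout and coarse shuffle.
        RandCoarseDropoutd(keys=["image"], prob=1.0, holes=6, spatial_size=5, dropout_holes=True, max_spatial_size=32),
        RandCoarseShuffled(keys=["image"], prob=0.8, holes=10, spatial_size=8),
    ]
)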
For testing, I take a new image similar to the one above, but of size [1338, 1004], and split it into 1,408 overlapping patches of [128, 128] with a step size of 28. I perform the following transform on each [128, 128] patch:
test_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        EnsureChannelFirstd(keys=["image", "label"]),
        ScaleIntensityRanged(
            keys=["image", "label"],
            a_min=0,
            a_max=255,
            b_min=0.0,
            b_max=1.0,
            clip=True,
        ),
        RandCoarseDropoutd(keys=["image"], prob=1.0, holes=6, spatial_size=5, dropout_holes=True, max_spatial_size=32),
        RandCoarseShuffled(keys=["image"], prob=0.8, holes=10, spatial_size=8),
        ToTensord(keys=["image", "label"]),
    ]
)
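(The patch extraction itself is not shown above. A minimal sliding-window sketch consistent with the numbers quoted earlier, 128 x 128 patches at a step of 28 giving 1,408 patches for a [1338, 1004] image, could look like the following; the function name and the stacking layout are assumptions.)

import numpy as np

def extract_patch_grid(image, patch_size=128, step=28):
    # Slide a patch_size x patch_size window over a 2D array with the given step.
    patches = []
    height, width = image.shape
    for top in range(0, height - patch_size + 1, step):
        for left in range(0, width - patch_size + 1, step):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    # Shape: (num_patches, patch_size, patch_size)
    return np.stack(patches, axis=0)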
An example test image patch after the transform is shown below:

I loaded the previously trained model weights per the following screenshot:

Finally, I tried to apply the model to the new test image patches I created with the following code:
for num, patch in enumerate(test_loader):
    start_time = time.time()
    inputs = patch["image"].to(device)
    outputs_v1, hidden_v1 = model(inputs)
    predictions = outputs_v1.array[0, 0, :, :]
    if num == 0:
        patches = predictions
    else:
        patches = np.dstack((patches, predictions))

patches_ch = np.moveaxis(patches, -1, 0)
reconstructed_image = reconstruct_from_patches_2d(patches_ch, (1004, 1338))
test_rs = pre.MinMaxScaler().fit_transform(reconstructed_image)
test_rs255 = test_rs * 255
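A slightly tidied variant of the loop above, run with the model in eval mode and under torch.no_grad(), collecting the per-patch outputs in a list instead of repeated np.dstack calls (variable names such as test_loader, model, and device are reused from the snippets above and assumed to be set up as shown earlier):

import numpy as np
import torch

model.eval()
prediction_list = []
with torch.no_grad():
    for patch in test_loader:
        inputs = patch["image"].to(device)
        outputs_v1, hidden_v1 = model(inputs)
        # Move the reconstructed patch back to the CPU as a numpy array.
        prediction_list.append(outputs_v1[0, 0, :, :].detach().cpu().numpy())

# Shape: (num_patches, 128, 128), ready for stitching back into the full image.
patches_ch = np.stack(prediction_list, axis=0)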
The following image is the variable reconstructed_image:

The following image is the variable test_rs255:

The goal is to have either reconstructed_image or test_rs255 look similar, in terms of having cellular structures, to the original image at the top of this post (the test image and the training image are not the exact same image, but they show similar cells). Any help would be greatly appreciated! Thank you!!