UNETR using random 96x96x96 patches and non-overlapping 16x16x16 sub-patches? #421
Replies: 4 comments
-
Hi @rekalantar, In the tutorial, we set … Thanks in advance.
-
Hi @rekalantar Thanks for your interest in our work. We first sample inputs of size (96,96,96) from the entire volume and then extract (16,16,16) non-overlapping patches from each sample. This process is similar to common ImageNet computer-vision pipelines, in which images of different sizes are first resized to (256,256) and then center-cropped. On another note, using the entire imaging volume is not feasible in most cases due to memory constraints. Thanks
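The two-stage sampling described above can be sketched in a few lines of NumPy. The array names and the random sample are illustrative; the reshape/transpose trick is just one way to split a (96,96,96) crop into the (16,16,16) non-overlapping sub-patches the transformer consumes:

```python
import numpy as np

# Hypothetical 96^3 sample already cropped from the full volume.
sample = np.random.rand(96, 96, 96).astype(np.float32)

patch = 16
n = 96 // patch  # 6 sub-patches per axis

# Split into non-overlapping 16x16x16 sub-patches:
# (6,16,6,16,6,16) -> (6,6,6,16,16,16) -> (216,16,16,16)
patches = (
    sample.reshape(n, patch, n, patch, n, patch)
          .transpose(0, 2, 4, 1, 3, 5)
          .reshape(-1, patch, patch, patch)
)
print(patches.shape)  # (216, 16, 16, 16)
```

Each of the 216 sub-patches becomes one token for the transformer encoder.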
-
Great, thank you for your response. I wonder if applying light embeddings and/or dilated convolutions across the 96x96x96 patches would help at all. Perhaps the attention modules could be made lighter or replaced by separable convolutions to avoid memory overshoot. In any case, exciting project. Keep up the good work!
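For concreteness, here is a minimal sketch of the separable-convolution idea mentioned above: a depthwise 3D convolution (one filter per channel) followed by a pointwise 1x1x1 mix. This is purely illustrative and not part of UNETR; the class name and sizes are made up for the example:

```python
import torch
import torch.nn as nn

class SeparableConv3d(nn.Module):
    """Illustrative depthwise-separable 3D conv with optional dilation."""
    def __init__(self, in_ch, out_ch, kernel=3, dilation=1):
        super().__init__()
        pad = dilation * (kernel // 2)  # keep spatial size unchanged
        # Depthwise: groups=in_ch gives one spatial filter per channel.
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel, padding=pad,
                                   dilation=dilation, groups=in_ch)
        # Pointwise: 1x1x1 conv mixes information across channels.
        self.pointwise = nn.Conv3d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 8, 16, 16, 16)
y = SeparableConv3d(8, 32, dilation=2)(x)
print(y.shape)  # torch.Size([1, 32, 16, 16, 16])
```

A depthwise-separable layer uses far fewer parameters than a full `Conv3d` with the same kernel size, which is why it comes up as a memory-saving substitute.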
-
Hi @rekalantar I believe making the attention modules lighter/more efficient is a promising direction. Thanks
-
Hi Ali, thank you for sharing this great work.
I had a question regarding the image and patch sizes. I noticed that UNETR uses non-overlapping sub-patches of size 16x16x16 from 96x96x96 patches, which are randomly selected from the image volume. In this case, is it fair to say that the network is still not able to take the entire imaging volume into account? From the tutorial, my understanding is that `RandCropByPosNegLabeld` only picks out one patch at a time. I would appreciate it if you could provide an explanation for this.
Thanks!
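To make the question concrete, a quick back-of-the-envelope calculation shows how small one crop is relative to a typical volume. The full-volume shape below is hypothetical (CT/MR volumes vary widely); only the 96 and 16 come from the thread:

```python
# Hypothetical full-volume shape, for illustration only.
vol_voxels = 512 * 512 * 128

# One randomly cropped training sample.
crop_voxels = 96 ** 3
fraction = crop_voxels / vol_voxels

# Sub-patches (transformer tokens) per crop: 6 per axis, cubed.
sub_patches = (96 // 16) ** 3
print(sub_patches)              # 216
print(round(fraction, 3))       # 0.026
```

So each crop yields 216 tokens but covers only a small fraction of such a volume, which is exactly why sliding-window inference over many crops is used at test time.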