
Result Mismatch with the original results in paper in human faces #6


Open
mu-cai opened this issue Aug 9, 2020 · 11 comments

@mu-cai

mu-cai commented Aug 9, 2020

Hi Rosinality,

Thanks for your excellent code!
However, I found that the results on human faces are not good. For example, your results are:

[image]

(The eyeglasses in the generated images clearly differ from what the paper shows.)

[image]

Maybe the problem is the training scheme? Or the cropping method?

@rosinality
Owner

Yes, in many cases the effect of the texture code is quite restrictive. I suspect that differences in the training scheme might be crucial, but I don't know what they could be. (Maybe it is related to image/patch resolutions.)

@mu-cai
Author

mu-cai commented Aug 17, 2020

@rosinality
Thanks for your information!
Over the past few days, I have trained your model on the CelebA-HQ dataset, and the results are quite bad.

[image]

Each row shows original A, original B, reconstructed A, and structure A + texture B. You can see that the resulting image can't even keep the pose of A.

I also trained it on the LSUN church dataset, and the result is not good either.

[image]

You can see that the reconstruction quality is not very good, not to mention the swapping results.

I think there may be several possible issues:

(1) Padding: someone has already pointed that out.
(2) The cropping: for the church dataset, they don't resize first; instead, they crop.

[image]

I also wonder how they keep the original aspect ratio while fixing the shorter side to 256 (see the resize sketch after this list).

(3) The co-occurrence unit: in the paper they state that

[image]

And for each prediction, they do the following operation:

[image]

So this operation should be done 8 times.
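
For reference, a minimal sketch of keeping the shorter side at 256 while preserving the aspect ratio with standard torchvision transforms (the file path is a placeholder, and this is not the repo's actual preprocessing pipeline):

```python
from PIL import Image
from torchvision import transforms

# Passing an int to Resize scales the *shorter* side to that length and keeps
# the aspect ratio; passing a (256, 256) tuple would instead squash the image
# into a square and change the ratio.
preprocess = transforms.Compose([
    transforms.Resize(256),       # shorter side -> 256, ratio preserved
    transforms.RandomCrop(256),   # square crop for training
    transforms.ToTensor(),
])

img = Image.open("church.jpg")    # placeholder path
x = preprocess(img)               # 3 x 256 x 256 tensor, no aspect distortion
```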

Thanks again for your nice work!

Best,
Mu

@rosinality
Owner

rosinality commented Aug 17, 2020

  1. Could you let me know which part of the padding is incorrect?
  2. The LSUN church dataset is already resized so that the shorter side is 256, so resizing will not affect the result.
  3. Do you mean that the co-occurrence discriminator should be applied to 8 patches? Hmm, it seems this is different from the paper. I will try to fix this.

Thank you for your testing & checking!

@mu-cai
Author

mu-cai commented Aug 18, 2020

@rosinality

Thanks for your reply!

  1. You have already fixed this problem yesterday (already committed).

  2. Yes, the shorter side is 256, but the longer side is not fixed. During training, however, your code resizes the image into a square, which changes the aspect ratio and does not match the paper.
    [image]

  3. Yes! Your single operation should be done 8 times, because when you sample one patch from the real image and 8 patches from the fake image, you get just one prediction. You need 8N predictions, not N (see the patch-counting sketch below).
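
To make the counting concrete, here is a toy sketch of the difference; `patch_enc`, `cooccur_logit`, and all shapes are placeholders for illustration, not the repo's actual modules:

```python
import torch
from torch import nn

N, P, R = 4, 8, 4   # batch size, fake patches per image, reference patches

# Tiny placeholder encoder and head, only to make the shape bookkeeping runnable;
# the real co-occurrence discriminator is of course more involved.
patch_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
head = nn.Linear(128 * 2, 1)

def cooccur_logit(patch, refs):
    # One logit per target patch, conditioned on the mean of its reference features.
    target = patch_enc(patch)                                                  # (B, 128)
    ref = patch_enc(refs.flatten(0, 1)).view(refs.size(0), refs.size(1), -1).mean(1)
    return head(torch.cat([target, ref], dim=1))                               # (B, 1)

fake_patches = torch.randn(N, P, 3, 64, 64)   # 8 crops from each generated image
ref_patches = torch.randn(N, R, 3, 64, 64)    # 4 crops from each texture source image

# Treating each image as one sample gives only N predictions; treating every
# fake patch as its own sample gives 8N predictions instead.
flat_fake = fake_patches.view(N * P, 3, 64, 64)
flat_ref = ref_patches.unsqueeze(1).expand(N, P, R, 3, 64, 64).reshape(N * P, R, 3, 64, 64)

logits = cooccur_logit(flat_fake, flat_ref)   # shape (N * P, 1) -> 8N predictions
```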

Thanks again for your answer!

Best,
Mu

@rosinality
Owner

  1. Actually, the padding will not affect the results, as that bug only affects 1x1 convs in the current implementation.
  2. As prepare_data.py does the resizing with torchvision, it will respect aspect ratios by default.
  3. It seems like an important issue. Fixed it in 38cb3ae.

@mu-cai
Author

mu-cai commented Aug 18, 2020

@rosinality

Thanks for the quick fix!
I just ran your code, and I have one more question:
In your code, for each structure/texture pair, you have 8 crops for the real/fake images but only 4 crops for the reference image. However, I think that for each crop of the real/fake image we need 4 reference patches, that is, 4*8 = 32 reference patches in total.

This is my understanding; however, the author didn't state this in the paper... What is your opinion? (A quick shape sketch of the two readings is below.)
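
Spelling out the two readings in tensor shapes (the sizes are arbitrary and only for illustration):

```python
import torch

N = 2                          # images in the batch
crops, refs_per_crop = 8, 4    # target crops per image, reference crops per target crop

# Reading A (current code): 4 reference crops shared by all 8 target crops of one image.
shared_refs = torch.randn(N, refs_per_crop, 3, 64, 64)

# Reading B (proposed): 4 distinct reference crops for every target crop,
# i.e. 4 * 8 = 32 reference crops per image.
distinct_refs = torch.randn(N, crops, refs_per_crop, 3, 64, 64)

print(tuple(shared_refs.shape))    # (2, 4, 3, 64, 64)   ->  4 reference crops per image
print(tuple(distinct_refs.shape))  # (2, 8, 4, 3, 64, 64) -> 32 reference crops per image
```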

Mu

@rosinality
Owner

Hmm, maybe you are right. But since the model uses the mean of the reference patch vectors, it may not be very different from using distinct reference patches for each sample. (Hopefully.)

@rosinality
Owner

rosinality commented Aug 18, 2020

I have changed it to use distinct reference samples for each sample. It is less resource-consuming than I thought, and I suspect it will be a more robust way to do the training.
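
A toy sketch of what sampling distinct reference crops per fake patch could look like; `random_crops` is a made-up helper, not the patchify function actually used in the repo:

```python
import torch

def random_crops(imgs, n_crops, size=64):
    # Sample n_crops random square crops from every image in the batch.
    N, C, H, W = imgs.shape
    crops = []
    for img in imgs:
        for _ in range(n_crops):
            top = torch.randint(0, H - size + 1, (1,)).item()
            left = torch.randint(0, W - size + 1, (1,)).item()
            crops.append(img[:, top:top + size, left:left + size])
    return torch.stack(crops).view(N, n_crops, C, size, size)

real = torch.randn(2, 3, 256, 256)                  # stand-in for the texture source images
# 4 distinct reference crops for each of the 8 fake patches: 8 * 4 = 32 per image.
refs = random_crops(real, n_crops=8 * 4).view(2, 8, 4, 3, 64, 64)
# Only the crop tensors grow, so the extra memory cost stays small.
```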

@mu-cai
Author

mu-cai commented Aug 18, 2020

@rosinality

Thanks for your work! In my opinion, if there are enough training iterations, fixed reference samples would produce the same result as distinct reference samples. I also agree that the model should be more robust when adopting distinct reference patches. And the GPU memory doesn't increase much when doing so... I was also surprised.

Mu

@zhangqianhui

My TF implementation: https://github.com/zhangqianhui/Swapping-Autoencoder-tf. Hope it helps.

@virgile-blg

virgile-blg commented Oct 29, 2020

Hi @mu-cai,

Did the above corrections lead to better structure/style swapping results on your side?
