Result mismatch with the original results in the paper on human faces #6
Yes, in many cases the effect of the texture code is quite restrictive. I suspect that the difference in training scheme may be crucial, but I don't know what it could be. (Maybe it is related to image/patch resolutions.)
@rosinality Also, I trained it on the LSUN Church dataset, and the result is also not good. You can see that the reconstruction quality is not very good, let alone the swapping results. I think there may be several possible issues: (1) Padding; someone has already pointed that out. (2) Resizing: I also wonder how they keep the original image size while keeping the aspect ratio and the short side (256) unchanged. (3) The co-occurrence unit. You can see what they stated in the paper, and for each prediction they did the following operation [shown in the quoted figure], so this operation should be done 8 times. Thanks again for your nice work! Best,
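For concreteness, the resizing question and the "8 patches per prediction" operation above can be sketched in plain Python. This is a minimal sketch, not the repository's code: the function names, the default `patch_size=64`, and the box representation are all assumptions made for illustration.

```python
import random

def resized_dims(w, h, short_side=256):
    """Resize (w, h) so the short side becomes `short_side`,
    keeping the aspect ratio (the scheme questioned above)."""
    if w <= h:
        return short_side, round(h * short_side / w)
    return round(w * short_side / h), short_side

def random_patch_boxes(w, h, patch_size=64, n_patches=8):
    """Sample n_patches random crop boxes (x0, y0, x1, y1) from a
    w-by-h image, i.e. the per-prediction operation repeated 8 times.
    patch_size is an assumed value, not taken from the paper."""
    boxes = []
    for _ in range(n_patches):
        x = random.randint(0, w - patch_size)
        y = random.randint(0, h - patch_size)
        boxes.append((x, y, x + patch_size, y + patch_size))
    return boxes
```

Note that `resized_dims` keeps the aspect ratio but generally does not keep the original image size, which is exactly the tension raised in point (2).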
Thank you for your testing and checking!
Thanks for your quick programming! This is my understanding; however, the author didn't state this in the paper... What is your opinion? Mu
Hmm, maybe you are right. But as the model uses the mean of the reference image vectors, maybe it is not very different from using distinct reference patches for each sample. (Hopefully.)
I have changed it to use distinct reference samples for each sample. It is less resource-consuming than I thought, and I suspect it will be a more robust way to do the training.
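The two sampling schemes discussed in this exchange can be sketched in plain Python. This is a minimal sketch under assumptions: the names `pick_refs_shared` / `pick_refs_distinct` and the flat patch-pool representation are invented for illustration and are not the repository's actual API.

```python
import random

def pick_refs_shared(batch_size, n_refs, patch_pool):
    """Shared scheme: one reference set sampled once and reused
    for every sample in the batch."""
    refs = random.sample(patch_pool, n_refs)
    return [refs for _ in range(batch_size)]

def pick_refs_distinct(batch_size, n_refs, patch_pool):
    """Distinct scheme (the change described above): an independently
    sampled reference set per sample in the batch."""
    return [random.sample(patch_pool, n_refs) for _ in range(batch_size)]
```

Since the co-occurrence discriminator averages the reference features, the distinct scheme mainly costs a larger reference batch, which is consistent with the observation below that GPU memory does not grow much.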
Thanks for your work! Yes, in my opinion, if the number of training iterations is large enough, then fixed reference samples would produce the same result as distinct reference samples. Yes, I also think the model would be more robust with distinct reference patches. The GPU memory won't increase too much when doing so... I am also surprised. Mu
My TF implementation: https://github.com/zhangqianhui/Swapping-Autoencoder-tf. I hope it helps you.
Hi @mu-cai, did the above corrections lead to better structure/style swapping results on your side?
Hi Rosinality,
Thanks for your excellent code!

However, I found that the results on human images are not good. For example, in your results, the eyeglasses in the generated images clearly differ from what the paper claims.

Maybe the problem is the training scheme? Or the cropping method?