An explanation for the source code of finding the alignment path in GlowTTS? #72
Comments
Hi! I assume that by now you might not need this anymore, but I am writing it down in case someone else runs into the same question.

    x_s_sq_r = torch.exp(-2 * x_logs)
    logp1 = torch.sum(-0.5 * math.log(2 * math.pi) - x_logs, [1]).unsqueeze(-1) # [b, t, 1]
    logp2 = torch.matmul(x_s_sq_r.transpose(1,2), -0.5 * (z ** 2)) # [b, t, d] x [b, d, t'] = [b, t, t']
    logp3 = torch.matmul((x_m * x_s_sq_r).transpose(1,2), z) # [b, t, d] x [b, d, t'] = [b, t, t']
    logp4 = torch.sum(-0.5 * (x_m ** 2) * x_s_sq_r, [1]).unsqueeze(-1) # [b, t, 1]
    logp = logp1 + logp2 + logp3 + logp4 # [b, t, t']

This computes the log-likelihood of every latent frame in z under a Gaussian with mean x_m and log standard deviation x_logs, one Gaussian per text position. The result logp[b, i, j] is the log-likelihood of frame j under the Gaussian of text position i. Then, in

    attn = monotonic_align.maximum_path(logp, attn_mask.squeeze(1)).unsqueeze(1).detach()

they run a Viterbi-style search (dynamic programming) over logp to find the monotonic alignment with the highest total log-likelihood. Hope this helps!
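To make the dynamic programming step concrete, here is a minimal pure-Python sketch of a monotonic alignment search over a single `[t, t']` log-likelihood table. It is an illustration only, not the repository's `monotonic_align.maximum_path` code: the function name `maximum_path_sketch` is made up, there is no batching or masking, and it assumes there are at least as many frames as text positions.

```python
import math


def maximum_path_sketch(logp):
    """Return a 0/1 alignment matrix for one [t_text][t_frames] table of log-likelihoods.

    Hypothetical helper for illustration. Each frame is assigned to exactly one
    text position, the path starts at (0, 0), ends at the last text position,
    and never moves backwards or skips a text position.
    """
    t_x, t_y = len(logp), len(logp[0])
    if t_y < t_x:
        raise ValueError("need at least as many frames as text positions")

    # q[i][j]: best total log-likelihood of a monotonic path ending at (i, j)
    q = [[-math.inf] * t_y for _ in range(t_x)]
    q[0][0] = logp[0][0]
    for j in range(1, t_y):
        for i in range(min(t_x, j + 1)):
            stay = q[i][j - 1]                                # same text position
            advance = q[i - 1][j - 1] if i > 0 else -math.inf  # next text position
            q[i][j] = logp[i][j] + max(stay, advance)

    # Backtrack from the last text position at the last frame.
    path = [[0] * t_y for _ in range(t_x)]
    i = t_x - 1
    for j in range(t_y - 1, -1, -1):
        path[i][j] = 1
        # Step back to the previous text position if it is forced (i == j)
        # or if advancing was the better predecessor.
        if i > 0 and (i == j or q[i - 1][j - 1] > q[i][j - 1]):
            i -= 1
    return path


# Two text positions, three frames: frames 0 and 1 fit position 0 best.
example = [[-1.0, -1.0, -5.0],
           [-5.0, -2.0, -1.0]]
print(maximum_path_sketch(example))  # [[1, 1, 0], [0, 0, 1]]
```

The `.detach()` in the original line fits this picture: the search returns a hard 0/1 path, so it is treated as a constant and no gradient flows through it.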
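For a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$, independent across the feature dimensions $d$, the density of $z$ is:

$$\mathcal{N}(z; \mu, \sigma) = \prod_{d} \frac{1}{\sqrt{2\pi}\,\sigma_d} \exp\left(-\frac{(z_d - \mu_d)^2}{2\sigma_d^2}\right)$$

Taking the logarithm, we get:

$$\log \mathcal{N}(z; \mu, \sigma) = \sum_{d} \left[ -\tfrac{1}{2}\log(2\pi) - \log\sigma_d - \frac{(z_d - \mu_d)^2}{2\sigma_d^2} \right]$$

Now, let's see how each term in the code corresponds to the above formula. Here x_m is $\mu$, x_logs is $\log\sigma$, and x_s_sq_r = exp(-2 * x_logs) is $\sigma^{-2}$. Expanding the square $(z_d - \mu_d)^2 = z_d^2 - 2\mu_d z_d + \mu_d^2$ gives four terms:

$$\text{logp1} = \sum_{d} \left[ -\tfrac{1}{2}\log(2\pi) - \log\sigma_d \right]$$

$$\text{logp2} = \sum_{d} \sigma_d^{-2} \left( -\tfrac{1}{2} z_d^2 \right)$$

$$\text{logp3} = \sum_{d} \mu_d\, \sigma_d^{-2}\, z_d$$

$$\text{logp4} = \sum_{d} -\tfrac{1}{2}\, \mu_d^2\, \sigma_d^{-2}$$

Adding these four terms together, we get:

$$\text{logp1} + \text{logp2} + \text{logp3} + \text{logp4} = \log \mathcal{N}(z; \mu, \sigma)$$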
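As a quick sanity check of the four-term split, the sketch below compares it with the Gaussian log-density computed directly, for a single text position and a single frame. It uses plain Python lists instead of tensors, and the function names are made up for illustration; only the variable names `logp1` to `logp4` and `x_s_sq_r` mirror the code in the thread.

```python
import math


def gaussian_logp_direct(z, mu, log_sigma):
    """Log-density of z under a diagonal Gaussian, summed over feature dimensions."""
    total = 0.0
    for z_d, mu_d, ls_d in zip(z, mu, log_sigma):
        sigma_d = math.exp(ls_d)
        total += -0.5 * math.log(2 * math.pi) - ls_d - 0.5 * ((z_d - mu_d) / sigma_d) ** 2
    return total


def gaussian_logp_four_terms(z, mu, log_sigma):
    """Same quantity, split into the four terms used in the quoted code."""
    x_s_sq_r = [math.exp(-2 * ls_d) for ls_d in log_sigma]  # 1 / sigma^2
    logp1 = sum(-0.5 * math.log(2 * math.pi) - ls_d for ls_d in log_sigma)
    logp2 = sum(s * (-0.5 * z_d ** 2) for s, z_d in zip(x_s_sq_r, z))
    logp3 = sum(mu_d * s * z_d for mu_d, s, z_d in zip(mu, x_s_sq_r, z))
    logp4 = sum(-0.5 * mu_d ** 2 * s for mu_d, s in zip(mu, x_s_sq_r))
    return logp1 + logp2 + logp3 + logp4


# Arbitrary illustrative values for one frame and one text position (d = 3).
z = [0.3, -1.2, 2.0]
mu = [0.0, -1.0, 1.5]
log_sigma = [0.1, -0.4, 0.25]
assert math.isclose(gaussian_logp_direct(z, mu, log_sigma),
                    gaussian_logp_four_terms(z, mu, log_sigma))
```

The split pays off in the tensor version because logp2 and logp3 become matrix products over the feature dimension, which evaluates all text-position and frame pairs at once without building a `[b, d, t, t']` intermediate.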
Hi. I'm reading the source code of the GlowTTS model for educational purposes. One of the sections that I can't really understand is where the alignment path is found using Monotonic Alignment Search in the training phase. Could anyone please explain the following lines of code to me?
Thanks in advance.