
Titanet-Large: Compute EER in every epoch #12881


Closed
ukemamaster opened this issue Apr 4, 2025 · 17 comments

@ukemamaster

Hi @nithinraok, to compute the test EER after every epoch, I have to set is_audio_pair: true in the titanet-large.yaml file. What should the corresponding test data manifest file look like?

@ukemamaster
Author

@nithinraok On the other hand, if I set is_audio_pair to false and pass a test-split manifest file (created while generating the train manifest from the file list by passing the --split argument to scripts/speaker_tasks/filelist_to_manifest.py), I get a CUDA Out of Memory error, even with a batch size of 1. I have 24 GB GPUs.
I really need to know what the test manifest file should look like, for both settings of is_audio_pair.

@nithinraok
Collaborator

@stevehuang52 do you have a sample manifest showing what the test manifest should look like when is_audio_pair is set to true?

@stevehuang52
Collaborator

stevehuang52 commented Apr 4, 2025

Example of a line in the manifest file when is_audio_pair=True:
{
"audio_filepath": ["/path/to/audio_wav_0.wav", "/path/to/audio_wav_1.wav"],
"duration": null, # not used but need the field, will load the whole audio
"offset": 0.0, # not used but need the field, will load the whole audio
"label": "0" # label for the pair, 0 for not the same speaker, 1 for same speaker
}
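
Purely as an illustration (not from the thread), a minimal Python sketch that writes such a pair-trial manifest as JSON lines; the trial list, paths, and output filename are hypothetical:

import json

# Hypothetical trials: (path_a, path_b, label), where "1" = same speaker, "0" = different
trials = [
    ("/path/to/audio_wav_0.wav", "/path/to/audio_wav_1.wav", "0"),
    ("/path/to/audio_wav_2.wav", "/path/to/audio_wav_1.wav", "1"),
]

with open("pair_trials_manifest.json", "w") as fout:
    for path_a, path_b, label in trials:
        entry = {
            "audio_filepath": [path_a, path_b],
            "duration": None,  # not used, but the field must be present (serialized as null)
            "offset": 0.0,     # not used, but the field must be present
            "label": label,
        }
        fout.write(json.dumps(entry) + "\n")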

@stevehuang52
Collaborator

When is_audio_pair=False, the manifest should look like a normal speaker recognition manifest:
{
"audio_filepath": "/path/to/audio_wav_0.wav",
"duration": 10.0,
"offset": 0.0,
"label": "speaker_id_000",
}
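
The thread already points to scripts/speaker_tasks/filelist_to_manifest.py for generating these. Purely to illustrate the format, a stand-alone sketch using only the Python standard library (paths and speaker labels are placeholders), computing the duration field from the wav header:

import json
import wave

# Hypothetical (wav_path, speaker_label) pairs
utterances = [
    ("/path/to/audio_wav_0.wav", "speaker_id_000"),
    ("/path/to/audio_wav_1.wav", "speaker_id_001"),
]

with open("speaker_manifest.json", "w") as fout:
    for wav_path, speaker in utterances:
        with wave.open(wav_path, "rb") as wav:
            duration = wav.getnframes() / wav.getframerate()
        fout.write(json.dumps({
            "audio_filepath": wav_path,
            "duration": duration,
            "offset": 0.0,
            "label": speaker,
        }) + "\n")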

@ukemamaster
Author

Thanks @stevehuang52, I will try it.
One more question: can I pass multiple test manifest files to compute EER for multiple sets of trials?

@stevehuang52
Collaborator

Yes, you can pass them as a list of manifests: model.validation_ds.manifest_filepath=[manifest_1.json,manifest_2.json]

@ukemamaster
Author

Great. And is it possible to pass them from the titanet-large.yaml file?

@stevehuang52
Collaborator

Yes, you can specify them directly in the yaml file.
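
For reference, a sketch of how that could look inside titanet-large.yaml; the field path is inferred from the override model.validation_ds.manifest_filepath above, and the file names are placeholders:

model:
  validation_ds:
    manifest_filepath:
      - /path/to/manifest_1.json
      - /path/to/manifest_2.json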

@ukemamaster
Author

ukemamaster commented Apr 7, 2025

Hi @stevehuang52
I created the manifest file for validation pairs. It looks like this:

{"audio_filepath": ["path/to/00000.wav", "path/to/00001.wav"], "duration": 0.0, "offset": 0.0, "label": "0"}
{"audio_filepath": ["path/to/00002.wav", "path/to/00001.wav"], "duration": 0.0, "offset": 0.0, "label": "1"}

It seems OK now, but I get this error:

Error executing job with overrides: []
Traceback (most recent call last):
  File "/NeMo/examples/speaker_tasks/recognition/my_speaker_reco.py", line 69, in main
    trainer.fit(speaker_model)
  File "nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 980, in _run
    results = self._run_stage()
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1021, in _run_stage
    self._run_sanity_check()
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1050, in _run_sanity_check
    val_loop.run()
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/loops/utilities.py", line 181, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 115, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "//nemo/lib/python3.10/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 376, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_kwargs.values())
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 293, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/nemo/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 393, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/NeMo/nemo/collections/asr/models/label_models.py", line 565, in validation_step
    return self.evaluation_step(batch, batch_idx, dataloader_idx, 'val')
  File "/NeMo/nemo/collections/asr/models/label_models.py", line 418, in evaluation_step
    return self.pair_evaluation_step(batch, batch_idx, dataloader_idx, tag)
  File "/NeMo/nemo/collections/asr/models/label_models.py", line 467, in pair_evaluation_step
    self._macro_accuracy.update(preds=logits, target=labels)
  File "/nemo/lib/python3.10/site-packages/torchmetrics/metric.py", line 550, in wrapped_func
    update(*args, **kwargs)
  File "/nemo/lib/python3.10/site-packages/torchmetrics/classification/stat_scores.py", line 339, in update
    _multiclass_stat_scores_tensor_validation(
  File "/nemo/lib/python3.10/site-packages/torchmetrics/functional/classification/stat_scores.py", line 283, in _multiclass_stat_scores_tensor_validation
    raise ValueError(
ValueError: If `preds` have one dimension more than `target`, `preds.shape[1]` should be equal to number of classes.

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

It seems like a shape incompatibility issue.
The shapes of logits and labels are torch.Size([64, 2]) and torch.Size([64]); here logits.shape[1]=2, while n_classes=54988 for the training set. I think the problem arises from here. The labels for the validation set have only 2 classes.

Do you think the manifest file is correct? Are the label shapes correct? Do I need to convert them to one-hot vectors?
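
For reference, a small stand-alone snippet (assuming only torch and torchmetrics are installed) that reproduces the same ValueError: the metric was constructed with the training-set class count, but the pair-evaluation logits only have two columns:

import torch
from torchmetrics import Accuracy

# Metric built with the training-set number of classes (54988 in this run)
macro_acc = Accuracy(task='multiclass', num_classes=54988, top_k=1, average='macro')

logits = torch.randn(64, 2)          # pair-eval logits: [1 - cos_sim, cos_sim]
labels = torch.randint(0, 2, (64,))  # binary same/different-speaker labels

# Raises ValueError: preds have one dimension more than target,
# so preds.shape[1] (= 2) must equal num_classes (= 54988)
macro_acc.update(preds=logits, target=labels)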

@ukemamaster
Author

ukemamaster commented Apr 7, 2025

On the other hand, if I set is_audio_pair to false and pass a normal manifest file for the test set as you said, I get a CUDA Out of Memory error, even with a batch size of 1. I have 24 GB GPUs.

@nithinraok
Collaborator

CUDA OOM also depends on the length of each of your audio samples. Keep them <= 3 sec.

@stevehuang52
Collaborator

Hi @ukemamaster, regarding the macro-accuracy error, there's a bug in the model code, which will be fixed by this PR: #12908.

Regarding the OOM error, that's probably due to the lengths of the audio files; could you please share the statistics of your audio lengths? It would also help to know where in the code the OOM occurred, since the classification layer can take a lot of GPU memory if you have a huge number of speakers and long audio. As @nithinraok suggests, we normally use audio shorter than 3 s during training with is_audio_pair=false.
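
Not part of the original exchange, but one quick way to gather those duration statistics from a (non-pair) manifest, assuming its duration fields are populated; the manifest filename is a placeholder:

import json
import statistics

durations = []
with open("train_manifest.json") as f:
    for line in f:
        durations.append(float(json.loads(line)["duration"]))

print(f"count={len(durations)}  min={min(durations):.2f}s  max={max(durations):.2f}s  "
      f"mean={statistics.mean(durations):.2f}s  median={statistics.median(durations):.2f}s")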

@ukemamaster
Author

ukemamaster commented Apr 8, 2025

Hi @stevehuang52, after incorporating the PR, I still get the same error. Printing self._macro_accuracy.num_classes inside the pair_evaluation_step() method still gives 54988.

If I re-initialize self._macro_accuracy inside the pair_evaluation_step() method, like:

self._macro_accuracy = Accuracy(num_classes=2, top_k=1, average='macro', task='multiclass').to(labels.get_device())

the training then proceeds correctly.

BUT is it safe to do that? The method looks like this:

    def pair_evaluation_step(self, batch, batch_idx, dataloader_idx: int = 0, tag: str = 'val'):
        audio_signal_1, audio_signal_len_1, audio_signal_2, audio_signal_len_2, labels, _ = batch
        _, audio_emb1 = self.forward(input_signal=audio_signal_1, input_signal_length=audio_signal_len_1)
        _, audio_emb2 = self.forward(input_signal=audio_signal_2, input_signal_length=audio_signal_len_2)

        # convert binary labels to -1, 1
        loss_labels = (labels.float() - 0.5) * 2
        cosine_sim = torch.cosine_similarity(audio_emb1, audio_emb2)
        loss_value = torch.nn.functional.mse_loss(cosine_sim, loss_labels)

        logits = torch.stack([1 - cosine_sim, cosine_sim], dim=-1)
        acc_top_k = self._accuracy(logits=logits, labels=labels)

        ################# re-initialize self._macro_accuracy
        self._macro_accuracy = Accuracy(num_classes=2, top_k=1, average='macro', task='multiclass').to(labels.get_device())

        correct_counts, total_counts = self._accuracy.correct_counts_k, self._accuracy.total_counts_k
        self._macro_accuracy.update(preds=logits, target=labels)
        stats = self._macro_accuracy._final_state()

        output = {
            f'{tag}_loss': loss_value,
            f'{tag}_correct_counts': correct_counts,
            f'{tag}_total_counts': total_counts,
            f'{tag}_acc_micro_top_k': acc_top_k,
            f'{tag}_acc_macro_stats': stats,
            f"{tag}_scores": cosine_sim,
            f"{tag}_labels": labels,
        }

        if tag == 'val':
            if isinstance(self.trainer.val_dataloaders, (list, tuple)) and len(self.trainer.val_dataloaders) > 1:
                self.validation_step_outputs[dataloader_idx].append(output)
            else:
                self.validation_step_outputs.append(output)
        else:
            if isinstance(self.trainer.test_dataloaders, (list, tuple)) and len(self.trainer.test_dataloaders) > 1:
                self.test_step_outputs[dataloader_idx].append(output)
            else:
                self.test_step_outputs.append(output)

        return output

@stevehuang52
Collaborator

Hi @ukemamaster, re-initializing the metric in the validation step is probably fine in this particular case. Meanwhile, I just updated the PR to use a separate metric, self._pair_macro_accuracy, for pair evaluation; could you please try it out?
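
A rough sketch of what such a dedicated two-class metric could look like (an assumption for illustration, not the actual PR code); it mirrors the re-initialization above but would be created once in the model's __init__ and reused in pair_evaluation_step:

from torchmetrics import Accuracy

# In __init__ (hypothetical): a fixed two-class metric for pair evaluation
self._pair_macro_accuracy = Accuracy(num_classes=2, top_k=1, average='macro', task='multiclass')

# In pair_evaluation_step, replacing the per-step re-initialization:
self._pair_macro_accuracy.update(preds=logits, target=labels)
stats = self._pair_macro_accuracy._final_state()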

@ukemamaster
Author

ukemamaster commented Apr 9, 2025

@stevehuang52 With the updated PR the training goes fine.

One question regarding the EER:

In the logs,

Epoch 0, global step 20816: 'val_loss' reached 1.23330 (best 1.23330), 
Epoch 1, global step 41632: 'val_loss' reached 1.31718 (best 1.23330), 
Epoch 2, global step 62448: 'val_loss' reached 1.14597 (best 1.14597), 
Epoch 3, global step 83264: 'val_loss' reached 0.91044 (best 0.91044), 
Epoch 4, global step 104080: 'val_loss' reached 0.81943 (best 0.81943), 
Epoch 5, global step 124896: 'val_loss' reached 0.77752 (best 0.77752), 
Epoch 6, global step 145712: 'val_loss' reached 0.75393 (best 0.75393), 

Does the val_loss refer to the EER computed from the trials? If not, where is the EER logged?

@stevehuang52
Collaborator

Does the val_loss refer to the EER computed from the trials? If not, where is the EER logged?

@ukemamaster In the paired eval case, the val_loss is the MSE loss between the predicted cosine similarity and the ground-truth labels converted to -1 and 1.

To save checkpoints based on the EER value, you need to set exp_manager.checkpoint_callback_params.monitor='val_eer'. You can refer to this example for how to configure checkpoint_callback_params.

If you only need to monitor the EER but still save checkpoints based on val_loss, you don't need the above change, and the EER values will be logged in WandB as well.
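
For reference, a hedged sketch of the corresponding exp_manager section in the yaml config; the field names follow the override above, and mode: "min" is an assumption (EER should be minimized):

exp_manager:
  checkpoint_callback_params:
    monitor: "val_eer"
    mode: "min"      # assumption: lower EER is better
    save_top_k: 3    # hypothetical value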

@ukemamaster
Author

Thanks @stevehuang52.
Now I can save model checkpoints based on val_eer.

@ashors1 ashors1 closed this as completed May 2, 2025