This webpage provides a sound demo on the test set of the SMS-WSJ corpus. For technical details, please refer to the following paper:
Z.-Q. Wang and S. Watanabe, "UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures", in Conference on Neural Information Processing Systems (NeurIPS), 2023.
Utterance ID: 0_442c040o_443c040g | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | -3.2 | 3.4
UNSSOR (row 2a) | 11.7 | 15.0
IVA (row 3b) | 5.8 | 9.5
iRAS w/ causal filters (row 3c) | 1.7 | 8.0
iRAS w/ non-causal filters (row 3d) | 2.4 | 8.8
PIT (row 4a) | 15.5 | 18.5
Clean speaker images | - | -
Utterance ID: 1015_446c0415_442c040c | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 0.3 | -0.5
UNSSOR (row 2a) | 15.4 | 15.5
IVA (row 3b) | 8.3 | 9.6
iRAS w/ causal filters (row 3c) | 8.2 | 6.2
iRAS w/ non-causal filters (row 3d) | 9.0 | 7.7
PIT (row 4a) | 18.3 | 18.4
Clean speaker images | - | -
Utterance ID: 1120_445c040c_441c040m | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 5.6 | -5.5
UNSSOR (row 2a) | 17.1 | 12.0
IVA (row 3b) | 6.7 | 2.3
iRAS w/ causal filters (row 3c) | 9.1 | 1.9
iRAS w/ non-causal filters (row 3d) | 9.4 | 2.2
PIT (row 4a) | 20.5 | 15.6
Clean speaker images | - | -
Utterance ID: 15_443c040a_444c0415 | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 1.0 | -0.9
UNSSOR (row 2a) | 20.6 | 18.9
IVA (row 3b) | 5.7 | 4.3
iRAS w/ causal filters (row 3c) | 9.0 | 5.1
iRAS w/ non-causal filters (row 3d) | 9.2 | 5.4
PIT (row 4a) | 21.9 | 20.9
Clean speaker images | - | -
Utterance ID: 999_441c040c_447c040k | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | -2.4 | 3.0
UNSSOR (row 2a) | 15.4 | 18.2
IVA (row 3b) | 11.2 | 8.4
iRAS w/ causal filters (row 3c) | 5.6 | 8.5
iRAS w/ non-causal filters (row 3d) | 7.0 | 9.5
PIT (row 4a) | 18.4 | 20.6
Clean speaker images | - | -
Utterance ID: 989_441c040l_445c0401 | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | -1.6 | 1.8
UNSSOR (row 2a) | 15.8 | 17.4
IVA (row 3b) | 13.7 | 8.0
iRAS w/ causal filters (row 3c) | 11.2 | 12.1
iRAS w/ non-causal filters (row 3d) | 8.8 | 10.0
PIT (row 4a) | 22.9 | 23.6
Clean speaker images | - | -
Utterance ID: 39_443c040s_444c040o | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 0.2 | -0.0
UNSSOR (row 2a) | 16.6 | 16.6
IVA (row 3b) | 11.4 | 13.6
iRAS w/ causal filters (row 3c) | 7.2 | 5.6
iRAS w/ non-causal filters (row 3d) | 7.5 | 5.8
PIT (row 4a) | 18.8 | 19.1
Clean speaker images | - | -
Utterance ID: 725_444c040l_442c040p | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 2.6 | -2.5
UNSSOR (row 2a) | 15.6 | 13.3
IVA (row 3b) | 11.0 | 5.8
iRAS w/ causal filters (row 3c) | 8.7 | 5.9
iRAS w/ non-causal filters (row 3d) | 8.6 | 5.3
PIT (row 4a) | 18.7 | 16.0
Clean speaker images | - | -
Utterance ID: 314_440c040r_441c040p | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 0.9 | -0.7
UNSSOR (row 2a) | 20.2 | 19.3
IVA (row 3b) | 13.1 | 13.2
iRAS w/ causal filters (row 3c) | 16.3 | 15.8
iRAS w/ non-causal filters (row 3d) | 11.1 | 8.5
PIT (row 4a) | 22.3 | 22.1
Clean speaker images | - | -
Utterance ID: 60_441c0410_442c040c | SDR (dB) of Speaker 1 | SDR (dB) of Speaker 2
Mixture (row 0a) | 0.2 | -0.1
UNSSOR (row 2a) | 15.9 | 16.3
IVA (row 3b) | 9.0 | 8.6
iRAS w/ causal filters (row 3c) | 8.0 | 8.4
iRAS w/ non-causal filters (row 3d) | 8.7 | 9.3
PIT (row 4a) | 19.4 | 19.3
Clean speaker images | - | -
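
For readers who want a rough sense of what the SDR (dB) columns measure, the following is a minimal sketch of one common, plain definition of the signal-to-distortion ratio: the energy of the reference signal divided by the energy of the estimation error, in decibels. This is an assumption for illustration only. The paper's evaluation may use a different SDR variant or toolkit, and the function name `sdr_db` and the synthetic signals below are hypothetical stand-ins, not the demo's audio.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Plain SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Illustrative definition only; evaluation toolkits may compute SDR
    differently (for example, by allowing a short distortion filter).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    error_power = np.sum((reference - estimate) ** 2)
    # eps guards against division by zero and log of zero.
    return 10.0 * np.log10((signal_power + eps) / (error_power + eps))


# Synthetic stand-ins for two speaker images (not real speech).
rng = np.random.default_rng(0)
s1 = rng.standard_normal(16000)
s2 = rng.standard_normal(16000)
mixture = s1 + s2

# Scoring the unprocessed mixture against one speaker gives a value
# near 0 dB when both sources have similar energy.
print(f"Mixture vs. speaker 1: {sdr_db(s1, mixture):.1f} dB")

# A hypothetical estimate with a small residual of the other speaker
# scores much higher.
estimate = s1 + 0.1 * s2
print(f"Estimate vs. speaker 1: {sdr_db(s1, estimate):.1f} dB")
```

Under this definition, the "Mixture" rows correspond to scoring the unprocessed mixture against each clean speaker image, which is why those values sit near 0 dB when the two speakers have similar energy.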