Sound Demo of M2M on SMS-WSJ-FF-CT Corpus


This webpage provides a sound demo based on the test set of the SMS-WSJ-FF-CT corpus. For technical details, please refer to the following paper:

Z.-Q. Wang, "Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation", in arXiv preprint arXiv:2402.09313, 2024.

This demo presents the separation results of 10 randomly-selected test signals. We consider all the six channels for multi-channel processing (i.e., corresponding to the models in Table 2).
The SDR scores are computed, in default, by using far-field clean speaker images as the reference signals, except for the case of close-talk mixtures, where close-talk speaker images are used as the reference signals.
Firefox is recommended.


Utterance ID: 1046_440c0414_444c040n Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) 0.7 -0.6
M2M (row 1a) 15.3 15.0
UNSSOR (row 4a) 11.6 11.2
PIT (row 4b) 16.9 17.0
Far-Field Clean Speaker Image - -
Close-Talk Mixture 3.8 (ref: close-talk speech) 6.9 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 1257_441c0412_442c040j Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) 2.0 -2.0
M2M (row 1a) 20.0 18.5
UNSSOR (row 4a) 18.3 17.0
PIT (row 4b) 20.6 19.1
Far-Field Clean Speaker Image - -
Close-Talk Mixture 15.0 (ref: close-talk speech) 7.8 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 279_445c040l_444c0408 Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) 0.1 0.0
M2M (row 1a) 23.3 23.1
UNSSOR (row 4a) 19.7 20.1
PIT (row 4b) 26.1 25.3
Far-Field Clean Speaker Image - -
Close-Talk Mixture 18.1 (ref: close-talk speech) 19.6 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 442_440c040i_442c0405 Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) -1.5 1.4
M2M (row 1a) 19.6 20.8
UNSSOR (row 4a) 15.9 18.3
PIT (row 4b) 21.4 22.1
Far-Field Clean Speaker Image - -
Close-Talk Mixture 18.9 (ref: close-talk speech) 17.4 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 315_447c0404_442c040o Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) -0.4 0.3
M2M (row 1a) 23.7 22.6
UNSSOR (row 4a) 20.5 20.0
PIT (row 4b) 26.3 24.4
Far-Field Clean Speaker Image - -
Close-Talk Mixture 17.7 (ref: close-talk speech) 24.3 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 158_440c040e_442c040k Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) 0.5 -0.5
M2M (row 1a) 20.1 19.6
UNSSOR (row 4a) 18.1 17.4
PIT (row 4b) 21.5 21.1
Far-Field Clean Speaker Image - -
Close-Talk Mixture 19.9 (ref: close-talk speech) 24.1 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 39_443c040s_444c040o Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) 0.2 -0.0
M2M (row 1a) 18.0 18.0
UNSSOR (row 4a) 16.6 16.6
PIT (row 4b) 18.8 19.1
Far-Field Clean Speaker Image - -
Close-Talk Mixture 19.9 (ref: close-talk speech) 13.1 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 281_447c040c_445c0402 Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) -0.5 0.4
M2M (row 1a) 13.6 14.2
UNSSOR (row 4a) 10.2 9.5
PIT (row 4b) 15.4 15.9
Far-Field Clean Speaker Image - -
Close-Talk Mixture 7.1 (ref: close-talk speech) 8.0 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 176_440c0416_442c040l Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) -1.0 0.9
M2M (row 1a) 11.1 12.3
UNSSOR (row 4a) 6.0 8.5
PIT (row 4b) 11.7 13.1
Far-Field Clean Speaker Image - -
Close-Talk Mixture 2.8 (ref: close-talk speech) 8.6 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -







Utterance ID: 948_444c0403_446c040r Spk 1 Log Mag. of Spk 1 Spk 2 Log Mag. of Spk 2 SDR (dB) of Spk 1 SDR (dB) of Spk 2
Far-Field Mixture (row 0) -1.4 1.4
M2M (row 1a) 18.7 21.6
UNSSOR (row 4a) 16.6 18.6
PIT (row 4b) 21.2 23.5
Far-Field Clean Speaker Image - -
Close-Talk Mixture 18.4 (ref: close-talk speech) 25.0 (ref: close-talk speech)
Close-Talk Clean Speaker Image - -