Sound Demo of CTRnet with Oracle Speaker Diarization on the CHiME-6 Corpus


We provide a sound demo based on the test set of the CHiME-6 corpus. For technical details, please refer to the following paper:

Z.-Q. Wang and S. Cornell, "Cross-Talk Speech Reduction, by Separation, for Separation", in submission, 2026.
This demo presents the results of CTRnet with oracle speaker diarization.

Note:
• Each spectrogram is masked using oracle speaker-activity timestamps, so the masked-out regions appear pure black.
• Each session has four speakers, so there are four close-talk mixtures and four separated signals per algorithm.
• To identify the target speaker of a close-talk mixture, listen to it carefully (or take a hint from the CTRnet results): the target is the speaker who talks the most over the unmasked regions.
• CTRnet clearly outperforms GSS, the supervised approach, and the unprocessed mixtures. It removes cross-talk dramatically while introducing little distortion to the target speech.
• Firefox/Chrome is recommended.
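
The masking described in the first note can be illustrated with a minimal sketch. This is not the code used to produce the figures on this page; the STFT settings (512-sample Hann window, 128-sample hop), the function names, and the choice to paint inactive frames with the minimum value (assuming the colormap renders the lowest value as black) are all assumptions made for illustration.

```python
import numpy as np

def log_power_spectrogram(signal, n_fft=512, hop=128, eps=1e-8):
    """Log-compressed power spectrogram, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log10(power + eps)

def activity_frame_mask(intervals, n_frames, sr, n_fft=512, hop=128):
    """True for frames whose centre lies inside any (start_s, end_s) interval."""
    centres = (np.arange(n_frames) * hop + n_fft / 2) / sr
    mask = np.zeros(n_frames, dtype=bool)
    for start, end in intervals:
        mask |= (centres >= start) & (centres < end)
    return mask

def mask_spectrogram(log_spec, mask):
    """Set inactive frames to the minimum value (assumed to render as black)."""
    masked = log_spec.copy()
    masked[~mask] = log_spec.min()
    return masked

# Hypothetical example: 2 s of noise at 16 kHz, speaker active from 0.5 s to 1.0 s.
sr = 16000
rng = np.random.default_rng(0)
signal = rng.standard_normal(2 * sr)
spec = log_power_spectrogram(signal)
mask = activity_frame_mask([(0.5, 1.0)], spec.shape[0], sr)
masked = mask_spectrogram(spec, mask)
```

In this sketch the activity intervals stand in for the oracle speaker-activity timestamps; frames outside them are flattened to a constant, which is what produces uniform dark regions when the array is plotted.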

We provide the following examples (click to jump to the example):
S01 at 00:02:06
S01 at 00:22:57
S01 at 01:04:32
S01 at 01:34:23
S01 at 02:04:53
S01 at 02:14:47
S01 at 02:21:02
S21 at 00:02:19
S21 at 00:20:51
S21 at 01:17:05
S21 at 01:53:04
S21 at 02:14:40
S21 at 02:18:49
S21 at 02:25:53




Signal segment:
S01 at 00:02:06

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S01 at 00:22:57

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S01 at 01:04:32

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S01 at 01:34:23

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S01 at 02:04:53

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S01 at 02:14:47

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S01 at 02:21:02

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P01
Mixture
(0 of Table II)
P02
Mixture
(0 of Table II)
P03
Mixture
(0 of Table II)
P04




Supervised
(2 of Table II)
P01
Supervised
(2 of Table II)
P02
Supervised
(2 of Table II)
P03
Supervised
(2 of Table II)
P04




GSS (8-channel)
(1b of Table II)
P01
GSS (8-channel)
(1b of Table II)
P02
GSS (8-channel)
(1b of Table II)
P03
GSS (8-channel)
(1b of Table II)
P04




Semi-sup. CTRnet
(10b of Table II)
P01
Semi-sup. CTRnet
(10b of Table II)
P02
Semi-sup. CTRnet
(10b of Table II)
P03
Semi-sup. CTRnet
(10b of Table II)
P04




Signal segment:
S21 at 00:02:19

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48




Signal segment:
S21 at 00:20:51

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48




Signal segment:
S21 at 01:17:05

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48




Signal segment:
S21 at 01:53:04

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48




Signal segment:
S21 at 02:14:40

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48




Signal segment:
S21 at 02:18:49

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48




Signal segment:
S21 at 02:25:53

Systems Speaker Signal Log-compressed Power Spectrogram
Mixture
(0 of Table II)
P45
Mixture
(0 of Table II)
P46
Mixture
(0 of Table II)
P47
Mixture
(0 of Table II)
P48




Supervised
(2 of Table II)
P45
Supervised
(2 of Table II)
P46
Supervised
(2 of Table II)
P47
Supervised
(2 of Table II)
P48




GSS (8-channel)
(1b of Table II)
P45
GSS (8-channel)
(1b of Table II)
P46
GSS (8-channel)
(1b of Table II)
P47
GSS (8-channel)
(1b of Table II)
P48




Semi-sup. CTRnet
(10b of Table II)
P45
Semi-sup. CTRnet
(10b of Table II)
P46
Semi-sup. CTRnet
(10b of Table II)
P47
Semi-sup. CTRnet
(10b of Table II)
P48