INDEX
Explanations
references to reality TV shows and their participants
Negative Logits
Token        Logit
pong         -0.17
853          -0.16
olesterol    -0.15
戲           -0.14
ikit         -0.14
eor          -0.14
thủy         -0.14
altet        -0.13
[Unit        -0.13
ylland       -0.13
Positive Logits
Token        Logit
elim         0.24
contestant   0.24
contestants  0.23
elimination  0.23
judges       0.22
audition     0.22
Season       0.21
season       0.21
Idol         0.20
Elim         0.20
Activations Density 0.062%