INDEX
Explanations
mentions of reality television shows and their participants
Negative Logits
Token      Logit
*/(        -0.77
grounds    -0.69
Samoa      -0.68
rade       -0.67
eer        -0.65
eers       -0.65
gered      -0.65
Sequ       -0.63
rall       -0.63
holders    -0.61
Positive Logits
Token      Logit
OST        1.07
OA         0.93
CP         0.92
JA         0.92
INO        0.91
ONY        0.90
ythm       0.90
onda       0.88
CE         0.87
YD         0.87
Activations Density: 0.004%