Explanations
mentions of the TV show "Survivor"
references to the TV show "Survivor" and its contestants
Negative Logits
forward   -0.76
pains     -0.75
Stur      -0.74
ware      -0.72
brid      -0.70
ensing    -0.69
heter     -0.67
rity      -0.66
anical    -0.66
resent    -0.66
Positive Logits
contestants   1.34
contestant    1.33
Solitaire     1.13
Survivor      1.09
Contest       1.02
Apprentice    0.96
Surv          0.92
ipedia        0.87
achel         0.83
Simulator     0.82
Activations Density: 0.025%