INDEX
Explanations
mentions of the name "Simon" at a high level of activation
mentions of the name "Simon."
Negative Logits
olulu       -0.73
00200000    -0.72
reek        -0.72
late        -0.71
eals        -0.70
merce       -0.69
doors       -0.67
ktop        -0.67
rings       -0.66
laws        -0.66
Positive Logits
Simon       1.16
Simon       1.08
Says        0.80
Richie      0.79
Gerr        0.79
Fraser      0.73
irtual      0.73
Baron       0.72
zman        0.72
Baz         0.72
Activations Density: 0.007%