Explanations
mentions of Barack Obama
Negative Logits
bley        -0.83
heed        -0.81
fman        -0.77
itionally   -0.73
cig         -0.73
jelly       -0.72
ishes       -0.68
kish        -0.66
perm        -0.65
ruciating   -0.65
Positive Logits
Obama       0.97
Obama       0.97
irez        0.79
ration      0.74
-|          0.73
ostics      0.73
-+          0.72
inois       0.69
kson        0.67
ãĤ¿         0.66
Activations Density: 0.060%