INDEX
Explanations
mentions of Barack Obama
Negative Logits
ancia    -0.15
ít       -0.15
ials     -0.15
onse     -0.14
ance     -0.14
殿       -0.14
rence    -0.14
alls     -0.14
org      -0.14
ails     -0.14
Positive Logits
.gdx          0.20
addin         0.16
roots         0.16
usic          0.16
azo           0.15
ames          0.15
planation     0.15
Sesso         0.15
pillar        0.14
istrovství    0.14
Activations Density: 0.002%