INDEX
Explanations
names of political figures
references to political figures and their associated eras
Negative Logits
citiz      -0.77
exper      -0.71
additive   -0.69
plur       -0.68
appre      -0.68
(>         -0.68
PRODUCT    -0.67
orally     -0.66
mods       -0.66
*.         -0.65
Positive Logits
era        1.33
esque      1.24
inspired   1.22
Murray     1.14
themed     1.13
style      1.10
Clinton    1.08
backed     1.08
Pal        1.07
induced    1.07
Activations Density: 0.074%