INDEX
Explanations
proper nouns, specifically names of political figures and locations
variations of the word "assess"
Negative Logits
istically   -0.78
glam        -0.65
ests        -0.65
isher       -0.62
iggins      -0.62
ised        -0.61
olls        -0.61
opsy        -0.61
spring      -0.61
opener      -0.60
Positive Logits
asse        1.12
ment        0.79
uve         0.79
ments       0.78
bec         0.76
âĹ¼         0.76
Hollande    0.75
xual        0.73
rand        0.72
uble        0.72
Activations Density 0.012%