INDEX
Explanations
names of political figures
names of prominent individuals, particularly political figures
Negative Logits
yip      -0.75
footed   -0.70
tle      -0.68
drm      -0.67
draw     -0.66
Score    -0.65
emouth   -0.65
Stories  -0.65
ipop     -0.64
soType   -0.62
Positive Logits
ãĥĺ      0.77
ÃĽ       0.70
ा        0.69
à¨       0.65
embargo  0.64
ij士      0.62
ļéĨĴ     0.60
rand     0.60
afort    0.59
Ö¼       0.58
Activations Density: 0.181%