INDEX
Explanations
phrases or names related to political figures or organizations
proper nouns, specifically names of people and organizations
Negative Logits
creen       -0.90
wagen       -0.86
Sieg        -0.79
tons        -0.69
rolled      -0.68
strip       -0.68
Reson       -0.68
Telephone   -0.66
orks        -0.66
pter        -0.65
Positive Logits
sterdam     0.75
Ambrose     0.74
auga        0.73
Orig        0.71
afa         0.71
ibles       0.70
endi        0.69
qua         0.66
amus        0.65
anth        0.65
Activations Density: 0.018%