INDEX
Explanations
proper nouns related to political figures and leadership positions
phrases referring to leadership positions or roles
Negative Logits
partName        -0.73
rouse           -0.70
solicitation    -0.70
ritz            -0.66
gerald          -0.66
trap            -0.66
Feature         -0.64
bid             -0.64
BI              -0.63
norm            -0.63
Positive Logits
Yugoslavia      0.79
Wales           0.78
Kamp            0.76
Honduras        0.75
Nations         0.75
Britain         0.74
England         0.73
Burlington      0.72
Ethiopia        0.70
NXT             0.70
Activations Density: 0.098%