Explanations
words related to leadership positions or titles
references to military leadership positions
Negative Logits
Token        Logit
eat          -0.75
aver         -0.71
430          -0.65
alle         -0.64
employment   -0.63
ropy         -0.63
encers       -0.63
apy          -0.62
ted          -0.61
Democrats    -0.60
Positive Logits
Token        Logit
chief        0.82
commands     0.77
commanding   0.77
stration     0.76
commander    0.76
ials         0.74
ially        0.73
ufact        0.72
itatively    0.72
commanders   0.70
Activations Density: 0.012%