Explanations
mentions of military or authoritative figures, specifically commanders
references to military commanders
Negative Logits
Token     Logit
eat       -0.73
pher      -0.71
ether     -0.67
apers     -0.65
oric      -0.65
roma      -0.62
encers    -0.62
aver      -0.62
ramid     -0.61
apy       -0.61
Positive Logits
Token         Logit
commander     0.93
Commander     0.91
chief         0.90
guiActiveUn   0.83
colonel       0.81
ufact         0.79
commanders    0.78
cknow         0.75
utenant       0.73
officer       0.73
Activations Density: 0.014%