INDEX
Explanations
words related to military ranks and positions
titles or roles related to military and authority figures
Negative Logits
ilion      -0.76
fman       -0.71
oho        -0.69
etting     -0.68
bloom      -0.66
undai      -0.65
iggurat    -0.64
esting     -0.64
ether      -0.64
anwhile    -0.64
Positive Logits
extraord    1.38
esses       1.15
hood        1.03
ial         0.99
Beware      0.98
ess         0.94
stown       0.90
NAME        0.89
Candidate   0.88
beware      0.86
Activations Density: 0.250%