INDEX
Explanations
phrases containing military ranks and names
references to military and police ranks
Negative Logits
ahime     -0.83
rights    -0.76
theless   -0.75
FORE      -0.68
phal      -0.65
spoiler   -0.64
flix      -0.64
女        -0.64
destro    -0.63
GPU       -0.63
Positive Logits
geant     1.18
Sgt       1.11
Sergeant  0.93
gt        0.91
Pepper    0.90
veland    0.84
Maj       0.82
sergeant  0.79
cha       0.77
Clancy    0.77
Activations Density 0.031%