INDEX
Explanations
terms related to military and armed forces
Negative Logits
bane        -0.17
æ¿          -0.16
Pron        -0.15
retaining   -0.15
iras        -0.15
opy         -0.14
Shell       -0.14
apan        -0.14
Aub         -0.14
Shell       -0.14
Positive Logits
bedo        0.15
otron       0.15
oksen       0.14
klar        0.14
CLUD        0.14
_Impl       0.14
stad        0.14
638         0.13
news        0.13
ammen       0.13
Activations Density: 0.010%