INDEX
Explanations
military ranks and unit designations
Negative Logits
imb         -0.16
rema        -0.16
mbH         -0.16
ucz         -0.15
tility      -0.15
lei         -0.14
ģında       -0.14
_iteration  -0.14
ulkan       -0.14
ekim        -0.14
Positive Logits
Kil         0.14
Weed        0.14
екÑģ        0.14
Mus         0.14
int         0.14
ovic        0.13
Kr          0.13
łĢ          0.13
ecure       0.13
↵           0.13
Activations Density 0.014%