INDEX
Explanations
references to political entities and titles
Negative Logits
Token       Logit
atoon       -0.16
431         -0.15
addtogroup  -0.14
анÑĮ        -0.14
forme       -0.14
.cfg        -0.14
.mods       -0.14
LEC         -0.14
enler       -0.14
ziej        -0.13
Positive Logits
Token       Logit
Beam        0.24
pras        0.23
Rat         0.22
Fin         0.22
beam        0.21
Sta         0.21
Rates       0.20
Gener       0.20
Ober        0.19
Beam        0.19
Activations Density 0.024%