INDEX
Explanations
proper nouns and specific terms related to organizations and policies
Negative Logits
krát      -0.15
Mocks     -0.14
tep       -0.14
¬ģ        -0.13
UNG       -0.13
iew       -0.13
Prov      -0.13
ackers    -0.13
ÑĥÑĩа     -0.13
Karlov    -0.13
Positive Logits
y         0.19
041       0.14
elia      0.14
vig       0.14
hir       0.13
abin      0.13
ØŃداث     0.13
CppType   0.13
ÌĨ        0.12
032       0.12
Activations Density 0.002%
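
The density figure above is presumably the share of sampled tokens on which this feature fires at all. The page does not define it, so the sketch below is an assumption: it treats density as the percentage of activations strictly above a threshold of zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumed definition (not stated by the source): density is the
    fraction of sampled tokens on which the feature is active.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in values if a > threshold)
    return 100.0 * active / len(values)


# Hypothetical sample: the feature fires on 2 of 100,000 tokens.
sample = [0.0] * 99_998 + [0.7, 1.3]
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.002%
```

Under this reading, a density of 0.002% means the feature is active on roughly 2 in every 100,000 tokens, which fits a narrow explanation such as specific proper nouns.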