Explanations
forms of the word "won" and its variations, indicating victory or denial
Negative Logits
Token    Logit
ÂĿ       -0.17
еÑİ      -0.16
lyn      -0.16
enny     -0.15
tre      -0.15
ей       -0.15
zac      -0.14
echa     -0.14
erç      -0.14
ett      -0.14
Positive Logits
Token    Logit
't       0.30
’t       0.28
kish     0.21
ldr      0.18
ky       0.18
ner      0.17
ildo     0.16
DER      0.16
rg       0.16
nt       0.16
Activations Density: 0.080%