INDEX
Explanations
references to research studies and their findings
Negative Logits
İnsan          -0.20
Bölüm          -0.18
bağlantılar    -0.18
Düş            -0.18
Şu             -0.17
Bütün          -0.17
Kiş            -0.17
Müş            -0.17
Düny           -0.17
Bazı           -0.16
Positive Logits
Sey            0.20
TOK            0.19
Bey            0.18
ı              0.18
̧              0.17
Cay            0.17
Pam            0.17
Bir            0.16
Erd            0.15
.dep           0.15
Activations Density 0.074%