INDEX
Explanations
prepositions, particularly "of" and "on"
Negative Logits
                -0.69
                -0.61
                -0.55
for             -0.54
                -0.53
si              -0.48
a               -0.48
/               -0.46
has             -0.45
↵               -0.45
Positive Logits
Efq             1.00
niająca         0.99
دانشنامهٔ        0.98
AndEndTag       0.96
tartalomajánló  0.93
NUMX            0.93
شهاد            0.92
^(@)            0.92
itſelf          0.90
$_(             0.89
Activations Density 0.937%