INDEX
Explanations
question starters like because, who, porque
Negative Logits
(no visible token)  -2.63
чем  -2.11
favoritas  -2.11
躂  -2.09
But  -2.09
ཫ  -2.09
焢  -2.08
擼  -2.05
เพราะ  -2.02
と言われる  -1.98
Positive Logits
0  2.41
{  2.16
to  2.09
$  2.00
You  1.99
的人物  1.94
”  1.93
buses  1.90
;  1.89
遴  1.88
Activations Density: 0.000%