INDEX
Explanations
someone actions or descriptions
Negative Logits
an    1.75
ம்    1.58
y    1.52
a    1.30
ا    1.26
ură    1.20
iere    1.19
ações    1.18
ală    1.17
logrado    1.16
Positive Logits
else    1.87
із    1.35
else    1.33
</    1.22
щодо    1.20
ဦ    1.20
факт    1.20
ੁ    1.20
montée    1.16
är    1.15
Activations Density 0.282%