INDEX
Explanations
adverbs that modify actions
Negative Logits
Token    Logit
ُن       -0.15
θμ       -0.14
раб      -0.14
akes     -0.14
kers     -0.14
874      -0.13
号       -0.13
(blank)  -0.13
оны      -0.13
cate     -0.13
POSITIVE LOGITS
Token    Logit
ürk      0.17
igh      0.16
ๆ        0.16
ahoma    0.15
_Style   0.15
manner   0.15
Booth    0.15
uluk     0.15
issa     0.15
/at      0.15
Activations Density 0.322%
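
As a rough illustration of what an activation density figure like 0.322% might measure, the sketch below assumes that density is the fraction of sampled token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the toy activations are assumptions made for illustration; none of them comes from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: a feature "fires" on a token when its activation
    is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (hypothetical): the feature fires on 1 of 8 token positions.
toy = [0.0, 0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(toy):.3%}")
```

Under this reading, a density of 0.322% would correspond to the feature firing on roughly 3 of every 1,000 sampled tokens.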