Explanations
phrases that indicate possession or experience, particularly with the verb "has"
Negative Logits
Token       Logit
507         -0.16
å¥ı         -0.15
eve         -0.15
sm          -0.15
uze         -0.14
PERT        -0.14
ستاÙĨÛĮ     -0.14
Temmuz      -0.14
ILING       -0.13
ÃŁen        -0.13
Positive Logits
Token       Logit
ĵ¨          0.15
CONTROL     0.14
avana       0.14
otron       0.14
iable       0.14
ayscale     0.13
erli        0.13
kov         0.13
itals       0.13
otto        0.13
Activations Density 0.039%