INDEX
Explanations
expressions of desire or intention related to preferences and actions
Negative Logits
isma      -0.21
pedia     -0.18
doz       -0.17
eless     -0.15
swick     -0.15
vrier     -0.14
addock    -0.14
ęd        -0.14
ǎ         -0.14
uming     -0.14
Positive Logits
else       0.21
who        0.20
except     0.19
Except     0.17
except     0.16
whom       0.16
who        0.15
Except     0.15
including  0.15
谁         0.15
Activations Density 0.093%
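
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a non-zero
    activation; the dashboard's exact definition is not stated in the source.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 9 of which fire.
sample = [0.0] * 9991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is sparse: it is silent on the vast majority of tokens, so its top logits describe behavior on a small slice of the data.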