Explanations
expressions of dissatisfaction or grievances
Negative Logits
VIC             -0.17
lernen          -0.16
lify            -0.16
upt             -0.15
ales            -0.15
witch           -0.15
oping           -0.15
exp             -0.15
æĪ              -0.14
flush           -0.14
Positive Logits
ICTURE          0.16
ylon            0.15
iskey           0.15
ably            0.15
currentColor    0.15
oppins          0.15
achts           0.14
окол            0.14
antd            0.14
areth           0.14
Activations Density: 0.026%