Explanations
negative sentiments or expressions of doubt and resistance in contexts of social or personal dynamics
Negative Logits
abar      -0.17
PURE      -0.15
åģ¥       -0.15
Giov      -0.15
Bik       -0.14
ancell    -0.14
shint     -0.14
μÎŃν      -0.13
ntl       -0.13
iggins    -0.13
Positive Logits
nor       0.19
fore      0.15
bout      0.15
Nor       0.15
Nor       0.14
ÃŃcia     0.14
lef       0.14
reak      0.14
gesch     0.14
_HIDE     0.14
Activations Density 0.583%