Explanations
negative sentiments towards entertainment or media, particularly movies
Negative Logits
ンブ       -0.08
CLUDING    -0.06
_GF        -0.06
_SW        -0.06
yny        -0.06
lẫn        -0.06
тро        -0.06
563        -0.06
اله        -0.06
_TS        -0.06
Positive Logits
even       0.31
even       0.26
Even       0.24
Even       0.23
EVEN       0.23
даже       0.19
sogar      0.19
_even      0.18
навіть     0.16
incluso    0.16
Activations Density 0.051%