INDEX
Explanations
negative sentiments or rejections of ideas
Negative Logits
ymes        -0.17
EP          -0.15
vý          -0.15
Watcher     -0.14
orado       -0.14
دفتر        -0.14
otta        -0.14
_CONTEXT    -0.14
isay        -0.14
ICES        -0.13
Positive Logits
整个        0.25
overall     0.23
вообще      0.23
overall     0.21
Overall     0.20
altogether  0.20
Overall     0.19
entire      0.19
Entire      0.17
ANDOM       0.16
Activations Density 0.219%
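
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that "density" means the percentage of sampled tokens on which the feature's activation is strictly above zero. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density = active tokens / total sampled tokens * 100.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 3 of 1000 tokens.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this definition, a density of 0.219% would correspond to the feature firing on roughly 2 of every 1000 sampled tokens.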