INDEX
Explanations
negative evaluations or disappointments related to experiences and expectations
Negative Logits
edback    -0.17
ossa      -0.16
_tid      -0.15
ạp        -0.15
ίτ        -0.14
uhn       -0.13
redd      -0.13
itmap     -0.13
ZO        -0.13
iny       -0.13
Positive Logits
actually  0.16
ไซ        0.15
enton     0.14
实际      0.14
Nos       0.14
860       0.14
kla       0.14
asal      0.14
nos       0.14
otle      0.13
Activations Density 0.335%