Explanations
expressions of negativity and hopelessness
Negative Logits
(strpos    -0.17
oog        -0.17
desires    -0.15
Äħż        -0.15
otton      -0.15
Це         -0.14
çĹ         -0.14
odable     -0.14
eru        -0.14
desire     -0.14
Positive Logits
negative      0.35
negativity    0.34
Negative      0.34
pessim        0.33
Negative      0.31
essim         0.31
negative      0.31
glo           0.30
-negative     0.30
_negative     0.29
Activations Density 0.254%