INDEX
Explanations
terms related to negative experiences or perceptions
Negative sentiment or failure
negative outcomes
Negative Logits
atown          -0.53
uninterrupted  -0.50
сшта           -0.48
]='\           -0.48
ESTON          -0.47
undamaged      -0.46
estekak        -0.45
undisturbed    -0.45
smooth         -0.45
<>(            -0.45
Positive Logits
worse          0.77
poor           0.72
incapable      0.72
Worse          0.70
ineffective    0.68
ácara          0.67
unworthy       0.66
Worse          0.65
unable         0.65
incapa         0.64
Activations Density: 0.997%