INDEX
Explanations
expressions or statements conveying regret or disappointment
negative sentiments and expressions of disappointment
Negative Logits
natureconservancy  -0.86
Rated              -0.74
figure             -0.72
iership            -0.67
é¾                 -0.66
sbm                -0.66
cffffcc            -0.65
ivated             -0.65
appro              -0.65
noticed            -0.64
Positive Logits
imaru        0.78
Olivier      0.63
soever       0.63
Ares         0.62
Toledo       0.59
Nero         0.59
Adrian       0.59
ा            0.58
Corrections  0.58
THAT         0.58
Activations Density: 0.133%