INDEX
Explanations
feelings or emotions in text
expressions of emotions or feelings
Negative Logits
cedes        -0.80
afort        -0.72
annis        -0.69
nect         -0.69
itors        -0.66
wald         -0.66
estyles      -0.65
umbn         -0.65
aciously     -0.65
Cosponsors   -0.63
Positive Logits
that          0.86
sympathy      0.80
of            0.80
feeling       0.76
sadness       0.75
satisfaction  0.72
compulsion    0.72
impression    0.71
inev          0.71
enthusi       0.70
Activations Density: 0.076%