INDEX
Explanations
evaluative statements with positive connotations
instances of positive evaluations of individuals or their performances
Negative Logits
selves          -0.73
attRot          -0.68
),"             -0.68
urers           -0.67
".[             -0.64
.''.            -0.63
selves          -0.63
collectively    -0.62
candidates      -0.61
"—              -0.60
Positive Logits
himself         0.99
understatement  0.86
pseudonym       0.79
blogging        0.78
POV             0.76
vocals          0.73
eloqu           0.72
herself         0.72
his             0.71
Himself         0.71
Activations Density 1.212%