INDEX
Explanations
sentences expressing positive opinions about individuals, particularly in a professional context
expressions of admiration and appreciation for individuals
Negative Logits
unspecified       -0.62
specified         -0.60
(blank)           -0.59
congressional     -0.54
illance           -0.53
unprotected       -0.53
(blank)           -0.53
orum              -0.52
unlawful          -0.52
unconstitutional  -0.52
Positive Logits
mentors           0.76
compliment        0.75
compliments       0.73
humility          0.73
empath            0.66
Favorite          0.66
inspirational     0.65
motivate          0.65
ment              0.65
laughs            0.65
Activations Density 1.023%