INDEX
Explanations
phrases expressing gratitude
expressions of gratitude or appreciation
Negative Logits
gren        -0.68
alse        -0.67
chem        -0.66
arc         -0.66
chemical    -0.66
idth        -0.64
alties      -0.63
ctl         -0.63
UGC         -0.62
paio        -0.61
Positive Logits
contacting  0.96
patience    0.91
trusting    0.89
stopping    0.88
tuning      0.87
bothering   0.86
helping     0.86
letting     0.86
visiting    0.86
noticing    0.85
Activations Density: 0.031%