Explanations
expressions or mentions of regret
expressions of regret
Negative Logits
Token          Logit
ammy           -0.78
ipt            -0.71
ymph           -0.66
dotted         -0.66
arnaev         -0.66
IPS            -0.65
chool          -0.63
iop            -0.62
Appalachian    -0.61
IP             -0.61
Positive Logits
Token          Logit
regrets        1.30
regret         1.25
regretted      1.12
fully          1.00
remorse        0.90
imaru          0.84
gratification  0.84
ful            0.84
fulness        0.82
faced          0.81
Activations Density 0.006%