INDEX
Explanations
words related to personal experiences and emotions
phrases that describe distressing or uncomfortable situations
Negative Logits
ranch     -0.89
abet      -0.78
riber     -0.76
Resp      -0.74
luaj      -0.71
ospons    -0.70
orsi      -0.69
soever    -0.69
raper     -0.68
backer    -0.67
Positive Logits
unbearable      1.30
heartbreaking   1.22
unbelievable    1.21
humiliating     1.21
horrible        1.21
disgusting      1.20
scary           1.19
embarrassing    1.18
surreal         1.17
terrifying      1.14
Activations Density: 0.175%