INDEX
Explanations
phrases related to reactions and responses in a variety of contexts
responses characterized by strong emotional reactions
Negative Logits
Token          Logit
Inher          -0.72
ritten         -0.70
ceilings       -0.65
inherited      -0.61
tenance        -0.60
Reincarnated   -0.59
properties     -0.59
Limit          -0.58
prized         -0.58
ledger         -0.58
Positive Logits
Token          Logit
incred         0.79
disbelief      0.76
affirmative    0.75
fury           0.72
hostility      0.72
kindness       0.71
Cancel         0.70
response       0.70
denial         0.68
rique          0.68
Activations Density: 0.152%