Explanations
words related to emotional responses or feedback
phrases related to responses or reactions to events or situations
Negative Logits
hold      -0.71
ramer     -0.70
ffe       -0.68
locked    -0.68
fusc      -0.65
enture    -0.64
orah      -0.63
inav      -0.63
mortg     -0.62
Sinai     -0.62
Positive Logits
reaction    1.24
reactions   1.22
ivation     1.09
ivated      1.05
Reaction    1.05
aries       1.02
ivating     0.95
naires      0.88
naire       0.85
responses   0.85
Activations Density 0.018%