INDEX
Explanations
phrases related to reactions or responses
words and phrases related to emotional responses or reactions
Negative Logits
heses     -0.77
oln       -0.76
utical    -0.72
uum       -0.72
ourage    -0.68
prints    -0.65
cipl      -0.64
arc       -0.64
gdala     -0.64
ln        -0.62
Positive Logits
hearing       1.08
witnessing    0.97
seeing        0.96
sudden        0.90
suggestion    0.86
discovering   0.85
losing        0.84
impending     0.81
perceived     0.81
recalling     0.80
Activations Density: 0.808%
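
The dashboard does not define the density figure. A common reading, assumed here and not confirmed by the source, is that it is the fraction of sampled token positions on which the feature's activation is nonzero. The sketch below illustrates that reading on made-up data; the function name `activation_density`, the `threshold` parameter, and the toy values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    The source dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data, invented for illustration: 1,000 token positions, 8 of them active.
toy = [0.0] * 992 + [1.5] * 8
print(f"{activation_density(toy):.3%}")
```

Under this reading, a density of 0.808% would mean the feature fires on roughly 8 of every 1,000 sampled tokens.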