Explanations
words related to negative emotions or situations, specifically those that evoke a sense of distress or discomfort
words associated with feelings of distress or discomfort
Negative Logits
Token         Logit
theless       -0.88
swick         -0.73
ton           -0.72
tes           -0.72
fter          -0.70
deen          -0.69
LOAD          -0.69
procedural    -0.68
glers         -0.67
FIELD         -0.67
Positive Logits
Token         Logit
anced         1.26
ribut         1.16
illery        1.07
ancing        1.05
ributes       1.04
enfranch      1.04
astrous       0.98
aste          0.97
illation      0.97
ressing       0.90
Activations Density 0.010%