Explanations
words related to positive emotions, especially joy
instances of the word "joy" and related expressions of happiness
Negative Logits
Token         Logit
pta           -0.67
iban          -0.62
deductible    -0.60
conflic       -0.60
underest      -0.60
stockp        -0.59
fragmented    -0.59
comma         -0.58
Rhod          -0.58
Breach        -0.58
Positive Logits
Token         Logit
sticks        1.47
fully         1.31
ously         1.28
ous           1.16
joy           1.14
iously        1.05
ride          1.01
stick         0.98
son           0.96
ilee          0.93
Activations Density 0.016%
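
The page does not define how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of the same size:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 16 of 100,000 tokens.
sample = [1.0] * 16 + [0.0] * 99_984
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.016% corresponds to roughly 16 active tokens per 100,000, which fits a sparse feature tied to a narrow set of words.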