INDEX
Explanations
negative words or phrases related to criticism or judgment
negative and positive descriptors related to health, behavior, and attributes
Negative Logits
hower    -0.78
hander   -0.75
ioxide   -0.75
jen      -0.75
jam      -0.74
hemer    -0.72
ften     -0.71
aniel    -0.71
former   -0.70
hirt     -0.70
Positive Logits
intentions    1.24
backgrounds   1.20
ambitions     1.16
aspirations   1.12
ties          1.11
qualities     1.11
interests     1.11
feelings      1.10
histories     1.08
motives       1.07
Activations Density 0.336%