Explanations
adjectives related to personality traits or behaviors
terms related to meticulousness and negative behaviors
Negative Logits
Annotations    -0.83
Bundy          -0.82
eer            -0.79
Lans           -0.75
icides         -0.75
olin           -0.75
Pell           -0.73
rams           -0.71
yrim           -0.71
rill           -0.70
Positive Logits
conscientious   0.90
bowel           0.73
ortment         0.71
urgical         0.69
dispos          0.69
cientious       0.68
object          0.66
byter           0.65
omal            0.64
ullivan         0.63
Activations Density: 0.019%