INDEX
Explanations
words related to personality traits and behaviors, specifically focusing on social interactions and physical characteristics
descriptors of social and personality traits
Negative Logits
ゴン     -0.76
ioxide   -0.75
UNCH     -0.75
antha    -0.74
ERO      -0.74
ptions   -0.74
AMS      -0.73
ISA      -0.72
WIND     -0.71
ACA      -0.70
Positive Logits
minded    1.21
spirited  0.92
ness      0.87
enough    0.86
paced     0.86
minded    0.78
nered     0.78
ly        0.77
sounding  0.76
hearted   0.76
Activations Density 0.297%