INDEX
Explanations
pronouns and descriptors related to physical appearance and personality traits
pronouns referring to male and female characters, indicating the presence and actions of individuals
Negative Logits
Untitled     -0.81
earch        -0.75
atlantic     -0.71
IRE          -0.71
committee    -0.68
HAM          -0.65
wrong        -0.65
amaz         -0.64
seen         -0.63
Horowitz     -0.63
POSITIVE LOGITS
possesses       1.11
excel           1.08
enjoys          1.03
earns           1.00
ooz             1.00
communicates    0.99
thri            0.99
emits           0.98
'll             0.95
rarely          0.95
Activations Density 0.395%