Explanations
references to a specific person or entity in a position of authority
references to the pronoun "she."
Negative Logits
Token        Logit
srfAttach    -0.71
erection     -0.63
INGTON       -0.61
undo         -0.60
Jr           -0.60
ENSE         -0.59
GEAR         -0.59
ustom        -0.59
VERTIS       -0.59
XIII         -0.59
Positive Logits
Token        Logit
pher         1.53
athed        1.40
pard         1.39
pherd        1.38
ffield       1.37
athing       1.34
ldon         1.30
ikh          1.20
lled         1.19
bang         1.16
Activations Density 0.079%