INDEX
Explanations
proper nouns, particularly female names
references to a specific female character or subject in the narrative
Negative Logits
INGTON      -0.75
Skydragon   -0.72
ornia       -0.66
SPONSORED   -0.66
atory       -0.65
vernment    -0.64
assing      -0.63
CCC         -0.62
church      -0.62
kefeller    -0.62
Positive Logits
pherd       1.50
pher        1.39
pard        1.29
ldon        1.25
ppard       1.18
ffield      1.13
athing      1.12
ikh         1.04
athed       0.98
lly         0.92
Activations Density: 0.070%