INDEX
Explanations
references to a specific person, "She."
repeated references to a female subject
Negative Logits
Skydragon   -0.71
atory       -0.67
kefeller    -0.67
INGTON      -0.65
ilateral    -0.64
ugu         -0.63
Jindal      -0.62
folios      -0.61
Jesse       -0.61
invading    -0.59
Positive Logits
pherd       1.34
pher        1.26
pard        1.20
ffield      1.10
ppard       1.09
athed       1.00
athing      1.00
'll         0.97
itage       0.96
Majesty     0.95
Activations Density 0.091%