INDEX
Explanations
references to individuals or groups within a narrative
references to individuals or groups
Negative Logits
Racine      -0.93
uſe         -0.86
cauſe       -0.83
purpoſe     -0.79
pleaſure    -0.79
cys         -0.77
ſuch        -0.77
leaſt       -0.75
Conci       -0.74
stiefel     -0.74
Positive Logits
him         1.36
them        1.29
Him         1.28
Him         1.26
Them        1.25
HIM         1.24
THEM        1.20
them        1.14
herself     1.13
Them        1.13
Activations Density 0.121%