INDEX
Explanations
people's names or pronouns referring to specific individuals
references to individuals or entities defined by the pronoun "whom."
Negative Logits
trap         -0.70
termin       -0.69
artifacts    -0.65
repre        -0.64
starting     -0.64
0100         -0.60
inhibitor    -0.59
hig          -0.59
pend         -0.59
pillar       -0.59
Positive Logits
soever       2.13
she          0.91
he           0.90
we           0.89
they         0.85
thou         0.80
critics      0.80
Vanity       0.77
you          0.74
Chomsky      0.72
Activations Density: 0.026%