INDEX
Explanations
References to the pronoun "it" and its variants (such as "its" and "itself"), indicating a focus on pronouns and their usage in context.
NEGATIVE LOGITS
Keller      -0.71
Reentrant   -0.71
ressional   -0.70
anhyd       -0.70
Browne      -0.70
twe         -0.69
demos       -0.69
Keller      -0.69
dermal      -0.68
Sopho       -0.68
POSITIVE LOGITS
it          1.19
its         1.09
It          1.05
itself      1.02
It          0.99
它          0.98
itself      0.94
Its         0.91
它          0.91
Its         0.89
Activations Density: 1.819%
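
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions on which the feature's activation is above zero. The page does not state that definition, and the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature
    fires at all; the dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 18 of which fire.
toy = [0.0] * 982 + [0.5] * 18
density = activation_density(toy)
print(f"{density:.3%}")
```

On this reading, a density of 1.819% would mean the feature fires on roughly 18 of every 1000 tokens sampled.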