INDEX
Explanations
occurrences of the pronoun "it" and related forms
Negative Logits
emme     -0.17
件       -0.15
ije      -0.15
iders    -0.14
316      -0.14
ersen    -0.14
hg       -0.14
ilar     -0.14
ipers    -0.14
vice     -0.13
Positive Logits
will       0.27
’ll        0.22
will       0.21
'll        0.21
WILL       0.21
time       0.18
promises   0.17
won        0.17
beh        0.17
vit        0.17
Activations Density 0.107%
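
The page does not say how the density figure is computed. A minimal sketch, assuming that density means the percentage of token positions at which the feature's activation exceeds a threshold, could look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of token positions where the
    feature fires, expressed as a percentage. The source page does not
    confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, 11 of which are active.
sample = [0.0] * 10_000
for i in range(11):
    sample[i * 900] = 1.5

print(f"{activation_density(sample):.3f}%")
```

Under this assumed definition, a value as low as the one reported here would mean the feature fires on roughly one token position in a thousand.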