Explanations
The speaker's first-person self-references (instances of "I" and its contracted forms).
Negative Logits
Token     Logit
32        -0.08
660       -0.07
66        -0.07
show      -0.07
494       -0.07
-out      -0.07
_net      -0.07
00        -0.07
over      -0.07
around    -0.07
Positive Logits
Token     Logit
I         0.26
I         0.19
"I        0.16
i         0.15
,I        0.15
“I        0.14
—I        0.14
-I        0.14
.I        0.14
(I        0.14
Activations Density 0.516%