INDEX
Explanations
instances of the pronoun "I"
references to the speaker or author within a context of providing instructions or sharing experiences
Negative Logits
Token       Logit
Jarrett     -0.63
ãĥĬ         -0.62
icist       -0.62
Mosul       -0.61
tnc         -0.60
minster     -0.59
marg        -0.59
Rodrig      -0.59
totality    -0.58
Pearson     -0.58
Positive Logits
Token       Logit
'm          1.43
've         1.23
suppose     1.08
'll         1.03
'd          1.02
EEE         1.00
am          0.98
WB          0.97
WI          0.96
nex         0.93
Activations Density 0.337%