Explanations
pronouns and reflexive pronouns referring to oneself or others
references to self and personal agency
Negative Logits
Alger        -0.68
Vine         -0.66
Clover       -0.64
Noon         -0.64
Newspaper    -0.62
Mub          -0.61
Syndicate    -0.61
Lawyers      -0.60
PBS          -0.59
RIS          -0.59
Positive Logits
selves       0.89
icz          0.68
worshipped   0.67
ãĥķ          0.67
selves       0.64
abl          0.64
çīĪ          0.63
ãĤĭ          0.63
anship       0.62
åĤ           0.62
Activations Density 0.038%
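As a rough illustration of what a density figure like this could mean, the sketch below assumes that activation density is the fraction of sampled tokens on which the feature fires at all (activation above zero). That definition, the function name `activation_density`, and the synthetic sample are assumptions for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    acts = list(activations)
    if not acts:
        return 0.0
    active = sum(1 for a in acts if a > threshold)
    return active / len(acts)


if __name__ == "__main__":
    # Synthetic sample: the feature fires on 38 of 100,000 tokens.
    sample = [0.0] * 99_962 + [1.5] * 38
    density = activation_density(sample)
    print(f"{density:.3%}")
```

Under this reading, a density of 0.038% corresponds to the feature firing on roughly 38 of every 100,000 tokens.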