INDEX
Explanations
personal pronouns followed by strong affirmations or beliefs
instances of the pronoun "I," indicating a strong focus on personal statements and opinions
Negative Logits
Token          Logit
Cutter         -0.66
Violet         -0.62
delta          -0.61
OG             -0.60
rules          -0.59
Sodium         -0.58
Temperature    -0.58
noses          -0.58
screen         -0.58
pans           -0.57
Positive Logits
Token          Logit
'm             1.23
am             1.04
nex            1.02
've            0.94
suppose        0.91
presume        0.89
rene           0.88
suspect        0.88
sth            0.87
AAF            0.84
Activations Density 0.321%
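
As a rough illustration of the density figure above, the sketch below assumes that activation density means the share of sampled tokens on which the feature has a nonzero activation. That definition, the function name `activation_density`, and the toy activation values are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of tokens on which the feature
    fires at all, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 1000 tokens, the feature fires on 4 of them.
toy_activations = [0.0] * 996 + [0.7, 1.2, 0.3, 2.1]
density = activation_density(toy_activations)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.321% would correspond to the feature firing on roughly 3 tokens out of every 1000 sampled.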