INDEX
Explanations
the pronoun "I"
first-person singular pronouns
Negative Logits
Delivery    -0.65
Looks       -0.64
Rolls       -0.64
Pric        -0.60
pending     -0.58
Fury        -0.57
TBD         -0.57
auga        -0.55
URR         -0.55
optics      -0.54
Positive Logits
deals       1.05
strive      1.05
'm          1.02
've         1.00
cringe      0.95
grew        0.93
empath      0.92
aspire      0.92
often       0.91
personally  0.90
Activations Density 0.240%