INDEX
Explanations
phrases prompting action or investigation
phrases that initiate a suggestion or invitation to do something
Negative Logits
Token      Logit
stadt      -0.64
mouth      -0.59
ads        -0.58
pockets    -0.58
coded      -0.58
assian     -0.57
ious       -0.57
invasive   -0.56
yr         -0.56
carc       -0.55
Positive Logits
Token      Logit
Let        3.16
Let        2.34
Lets       2.15
LET        1.73
let        1.70
let        1.40
Suppose    1.38
Forget     1.38
LET        1.32
Take       1.28
Activations Density: 0.012%
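
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions where the feature's activation is above zero; that definition, the function name `activation_density`, and the sample data are all illustrative assumptions, not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 25,000 token positions.
sample = [0.0] * 24_997 + [0.8, 1.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.012%
```

Under this reading, a density of 0.012% would mean the feature fires on roughly 12 of every 100,000 tokens, which fits a feature that responds to a narrow set of phrase openers.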