INDEX
Explanations
phrases related to intense emotions or situations
ellipses or pauses in text
Negative Logits
merit          -0.74
opath          -0.69
attest         -0.67
scapego        -0.64
wielded        -0.64
heaviest       -0.64
replacement    -0.63
complement     -0.63
depletion      -0.62
adulthood      -0.62
Positive Logits
BUT            1.09
yet            1.06
why            1.05
until          1.05
oops           1.02
except         0.97
yeah           0.97
unless         0.96
whatever       0.96
they           0.95
Activations Density: 0.044%