INDEX
Explanations
phrases that start with "Those."
the word "Those" at various activation intensities
Negative Logits
shapeshifter    -0.82
ILY             -0.81
ointment        -0.75
orate           -0.74
..."            -0.69
irth            -0.69
zo              -0.69
Drag            -0.67
½               -0.67
pless           -0.67
Positive Logits
wishing         0.89
pesky           0.82
kinds           0.77
interested      0.77
thoughts        0.76
voices          0.74
resear          0.70
interviewed     0.70
wanting         0.67
sorts           0.67
Activations Density: 0.057%