INDEX
Explanations
instances of the word "Beau" with varying activation strengths
the presence of the name "Megan" or variations of it in the text
Negative Logits
orative      -0.63
eyes         -0.63
appers       -0.63
Cavaliers    -0.61
Attribution  -0.61
APS          -0.60
flares       -0.60
selves       -0.59
GOODMAN      -0.59
Wallet       -0.59
Positive Logits
llah         1.22
gment        1.08
ction        1.06
qua          1.04
lette        0.98
fman         0.92
cel          0.88
lly          0.87
lla          0.87
cham         0.87
Activations Density 0.024%