INDEX
Explanations
the name "Don" with different emphasis scores
the word "don" in various contexts
Negative Logits
Flight         -0.66
LCS            -0.65
EStream        -0.65
ULT            -0.64
TIT            -0.63
Healing        -0.61
NIGHT          -0.61
Nightmare      -0.60
Instruction    -0.60
Claw           -0.60
Positive Logits
don            1.06
nell           1.05
nie            1.03
nel            1.01
etheless       0.99
neau           0.97
kie            0.97
ates           0.95
stant          0.94
ctory          0.94
Activations Density: 0.007%