Explanations
Summarization/core information
This neuron detects the “Yes”/“No” answer prompt, specifically the quoted options “Yes” and “No” in the question instruction.
Negative Logits
accidental     -0.07
regain         -0.07
Ult            -0.07
_usec          -0.07
extradition    -0.06
MED            -0.06
Loan           -0.06
imus           -0.06
evity          -0.06
效             -0.06
Positive Logits
[token not shown]    0.07
-return              0.06
объем                0.06
[token not shown]    0.06
(n                   0.06
>r                   0.06
comparer             0.06
poon                 0.06
Scalars              0.06
abilidad             0.06
Activations Density 0.011%