Explanations
Pronouns
This neuron activates on occurrences of the verb “ask,” i.e. it detects explicit requests.
Negative Logits
sn                -0.07
Robot             -0.06
_note             -0.06
sm                -0.06
rit               -0.06
XM                -0.06
mænd              -0.06
AllowAnonymous    -0.06
_kses             -0.06
Lyons             -0.06
Positive Logits
.readLine         0.07
asymmetric        0.06
obedient          0.06
↵↵↵↵↵↵↵↵↵         0.06
added             0.06
questa            0.06
":"+              0.06
'use              0.06
قر                0.06
_opcode           0.06
Activations Density 0.010%
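
The density figure above can be read as the fraction of sampled tokens on which the neuron fires. The sketch below illustrates that reading only. It assumes "fires" means an activation strictly above a threshold of zero, and the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default). The dashboard may use a
    different cutoff or a different sampling scheme.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, exactly one of which activates the neuron.
sample = [0.0] * 9999 + [1.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

On this made-up sample, one active token in 10,000 corresponds to a density of 0.010%, which is the order of sparsity reported above.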