Explanations
online forum/technical discussions
This neuron detects tokens that mark the assistant speaker/role or assistant-turn indicators in a chat transcript.
Negative Logits
Token        Logit
LinkId       -0.07
DWORD        -0.06
计           -0.06
Pussy        -0.06
CID          -0.06
ethnicity    -0.06
target       -0.06
HID          -0.06
ouse         -0.06
plen         -0.06
Positive Logits
Token        Logit
invaders     0.07
свое         0.07
ativ         0.07
така         0.07
sentient     0.07
δρο          0.06
自己的       0.06
нен          0.06
/grpc        0.06
luyện        0.06
Activations Density: 0.092%