Explanations
This neuron detects the discourse marker “Again,” when a response repeats or reinforces a point.
Negative Logits
Delayed      -0.07
Parsons      -0.06
Roboto       -0.06
감사         -0.06
áng          -0.06
postponed    -0.06
issuer       -0.06
Emblem       -0.06
OUR          -0.06
캐           -0.06
Positive Logits
_model       0.07
rotation     0.07
influences   0.07
.datasets    0.07
(cursor      0.06
固定         0.06
_SLAVE       0.06
materials    0.06
寒           0.06
grotes       0.06
Activations Density: 0.013%