Explanations
This neuron detects discourse markers that signal a change or reversal in state (e.g. “no longer,” “has changed,” “now”).
Negative Logits
Why          -0.07
eken         -0.07
heel         -0.07
radius       -0.06
grep         -0.06
Equivalent   -0.06
получения    -0.06
ih           -0.06
Retry        -0.06
recipro      -0.06
Positive Logits
dece         0.06
zajist       0.06
asmine       0.06
Львів        0.06
footsteps    0.06
pří          0.06
Odkazy       0.06
шлях         0.06
_fonts       0.06
ंप           0.06
Activations Density: 0.034%