Explanations
This neuron fires on the token "out", particularly in phrases where something is removed or moved out.
Negative Logits
موارد          -0.07
ände           -0.06
integrating    -0.06
uste           -0.06
meiner         -0.06
_LOW           -0.06
over           -0.06
memcpy         -0.06
uve            -0.06
��             -0.06
Positive Logits
out            0.10
出             0.09
Out            0.09
出             0.08
нима           0.07
jar            0.07
.out           0.07
ΑΛ             0.07
iful           0.07
μά             0.07
Activations Density: 0.026%
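To make the density figure concrete, here is a minimal sketch of one way such a number could be computed. It assumes that "activations density" means the fraction of token positions on which the neuron's activation exceeds a threshold; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only so that the result matches the reported 0.026%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    This definition is not stated in the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample, not data from this page:
# 26 active positions out of 100,000 tokens.
sample = [0.0] * 99_974 + [0.5] * 26

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.026% would mean the neuron is active on roughly 26 of every 100,000 tokens, which fits a neuron that responds to one specific token.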