Explanations
The neuron detects occurrences of the “if” keyword (i.e. conditional statements).
Negative Logits
うち  -0.08
nět  -0.07
EntryPoint  -0.07
kullanarak  -0.07
besteht  -0.06
lanma  -0.06
verir  -0.06
.Tele  -0.06
ниці  -0.06
�  -0.06
Positive Logits
HUGE  0.07
ublished  0.07
AVAILABLE  0.07
Ui  0.06
TOM  0.06
_SIDE  0.06
prosecution  0.06
announcement  0.06
IF  0.06
Saw  0.06
Activations Density 0.020%