Explanations
This neuron detects the “Cause:” label (i.e. the token indicating the start of a cause field).
Negative Logits
스트            -0.07
========        -0.07
Connect         -0.07
.lst            -0.06
ěž              -0.06
lines           -0.06
write           -0.06
rape            -0.06
unavoidable     -0.06
press           -0.06
Positive Logits
��              0.07
UIButton        0.07
HasColumnName   0.07
kak             0.06
obic            0.06
Mojo            0.06
Joe             0.06
ิร              0.06
�               0.06
ून              0.06
Activations Density 0.011%