Explanations
The neuron responds when the text describes how things are arranged or positioned (e.g. “arrangement,” “order,” “neighboring,” “placement”).
Negative Logits
Token     Logit
sah       -0.07
.dim      -0.07
ne        -0.07
พน        -0.06
_when     -0.06
Nick      -0.06
pod       -0.06
plug      -0.06
rom       -0.06
ิธ        -0.06
Positive Logits
Token       Logit
>Contact    0.07
یدا         0.06
새로운      0.06
대상        0.06
ителем      0.06
showToast   0.06
пня         0.06
一个人      0.06
içindeki    0.06
شرقی        0.06
Activations Density: 0.141%
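As a rough illustration of what an activation density figure like the one above measures, the sketch below assumes it is the fraction of token positions on which the neuron's activation exceeds a threshold. That definition, the function name activation_density, the threshold of zero, and the sample values are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is defined as active positions / total positions.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 10 token positions (not real data for this neuron).
sample = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumed definition, a density of 0.141% would mean the neuron is active on roughly 1 in every 700 token positions.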