INDEX
Explanations
The neuron almost never fires; it doesn't reliably detect any pattern in the text.
Negative Logits
Token        Logit
hlavu        -0.07
�            -0.06
razil        -0.06
sl           -0.06
Muham        -0.06
fv           -0.06
measurable   -0.06
ADATA        -0.06
dependent    -0.06
ms           -0.06
Positive Logits
Token        Logit
ές           0.07
_escape      0.07
_util        0.06
={},         0.06
واقعی        0.06
قتل          0.06
proceeding   0.06
女           0.06
ньо          0.06
ülke         0.06
Activations Density: 0.023%
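
One way to read the density figure is as the fraction of token positions on which the neuron's activation is above zero. The page does not state how it computes the number, so the sketch below is an assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition, not taken from the page: density = fired / total.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 token positions, 23 of which activate.
sample = [0.0] * 99_977 + [0.5] * 23
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.023%
```

Under this reading, a density of 0.023% corresponds to roughly 23 activating tokens per 100,000, which is consistent with an explanation that finds no reliable pattern.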