Explanations
The neuron activates on short English prepositions (like “in,” “at,” “on”) that introduce locative or temporal phrases.
Negative Logits
риф             -0.07
TURN            -0.06
WithData        -0.06
shint           -0.06
tup             -0.06
�               -0.06
RestController  -0.06
رس              -0.06
мол             -0.06
_FORWARD        -0.06
Positive Logits
sonrası         0.07
[+              0.07
.pre            0.06
_poly           0.06
Stone           0.06
.Cos            0.06
ráž             0.06
varias          0.06
duvar           0.06
Occup           0.06
Activations Density: 0.595%
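
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the fraction of token positions on which the neuron's activation exceeds a threshold. This is an assumption about the metric, not a definition confirmed by this page. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 6 of which fire.
sample = [0.0] * 994 + [0.3, 1.2, 0.8, 2.1, 0.5, 0.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.595% would mean the neuron fires on roughly 6 of every 1,000 tokens, which fits a neuron that responds only to a narrow class of words such as short prepositions.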