INDEX
Explanations
The neuron primarily responds to occurrences of the verb “do.”
Negative Logits
Needless     -0.07
left         -0.07
Establish    -0.06
需要          -0.06
meg          -0.06
अर           -0.06
_Internal    -0.06
nurs         -0.06
*)&          -0.06
_Vert        -0.06
Positive Logits
LOSS         0.07
deline       0.07
Fischer      0.06
Servlet      0.06
orque        0.06
alleles      0.06
opt          0.06
hành         0.06
oloj         0.06
.sqlite      0.06
Activations Density: 0.002%