Explanations
The neuron detects Spanish imperative‐mood verbs used to give instructions (e.g., “Utiliza,” “Aplica,” “Usa”).
Negative Logits
_idx        -0.07
platform    -0.07
religions   -0.06
Class       -0.06
paradigm    -0.06
FU          -0.06
McGregor    -0.06
explains    -0.06
.push       -0.06
mayacak     -0.06
Positive Logits
Bew         0.06
ความค       0.06
投注        0.06
、この      0.06
Ậ           0.06
谱          0.06
.Ultra      0.06
Lib         0.06
บล          0.06
Applies     0.06
Activations Density 0.011%