INDEX
Explanations
This neuron primarily detects occurrences of the question word “Why.”
Negative Logits
payload             -0.07
(news               -0.07
angle               -0.06
кого                -0.06
payload             -0.06
foreach             -0.06
simulation          -0.06
acoes               -0.06
.documentElement    -0.06
помощи              -0.06
Positive Logits
ibling    0.07
defy      0.07
vests     0.07
vio       0.06
علت       0.06
irrig     0.06
Factor    0.06
čila      0.06
Kop       0.06
sebeb     0.06
Activations Density: 0.027%
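
The density figure above can be read as the share of sampled tokens on which the neuron fires. The following is a minimal sketch, assuming density means the percentage of token positions whose activation exceeds zero. The page does not state the exact definition or threshold, and the function names `activation_density` and `tokens_per_activation` are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of active token positions. The
    dashboard may use a different threshold or sampling scheme.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


def tokens_per_activation(density_percent: float) -> float:
    """Convert a density percentage into the average number of tokens per firing."""
    if density_percent <= 0:
        raise ValueError("density_percent must be positive")
    return 100.0 / density_percent


if __name__ == "__main__":
    # Toy example: one active position out of four.
    print(activation_density([0.0, 0.0, 0.0, 1.5]))
    # Average gap implied by the reported 0.027% density.
    print(round(tokens_per_activation(0.027)))
```

Under this reading, a density of 0.027% corresponds to the neuron firing on roughly one token in every 3,700.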