Explanations
The neuron primarily activates on words related to children or child‐focused topics.
Negative Logits
bled        -0.07
brigade     -0.07
ith         -0.06
'b          -0.06
"]))        -0.06
AYER        -0.06
zione       -0.06
REW         -0.06
ël          -0.06
.BLL        -0.06
Positive Logits
лючается    0.07
blue        0.07
-alert      0.06
Ultra       0.06
FetchType   0.06
populate    0.06
vů          0.06
ヒ          0.06
utan        0.06
Proto       0.06
Activations Density: 0.022%