Explanations
The neuron activates on mentions of caregivers or trusted adult figures (e.g., parents, teachers) offering support.
Negative Logits
prove      -0.07
uffle      -0.07
(round     -0.07
Greeks     -0.06
/cmd       -0.06
-double    -0.06
XII        -0.06
qué        -0.06
lust       -0.06
득         -0.06
Positive Logits
ا          0.06
289        0.06
utura      0.06
único      0.06
esk        0.06
miştir     0.06
dr         0.06
gider      0.06
expressing 0.06
Juan       0.06
Activations Density 0.022%