Explanations
responsible
The neuron activates on words that signal duty or function, such as “providing” and “responsible.”
Negative Logits
Token        Logit
Mold         -0.07
escalated    -0.07
flyers       -0.07
line         -0.06
elevate      -0.06
26           -0.06
ngừng        -0.06
Atlas        -0.06
5            -0.06
угл          -0.06
Positive Logits
Token            Logit
responsible      0.13
Responsible      0.10
负责             0.09
responsable      0.09
RESPONS          0.08
responsibility   0.08
resultado        0.08
左               0.08
bos              0.08
responsibly      0.08
Activations Density 0.020%
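
The page does not say how the density figure is computed. A common reading is the fraction of token positions on which the neuron fires at all. The sketch below assumes that definition and a firing threshold of zero. The function name, the threshold and the sample data are all hypothetical, chosen only to show how a value like 0.020% could arise.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions that fire at all.
    The dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 2 of which activate.
sample = [0.0] * 9_998 + [0.7, 1.3]
density = activation_density(sample)

# Format the same way the page does: a percentage with three decimals.
formatted = f"{density * 100:.3f}%"
print(formatted)
```

Under this reading, a density of 0.020% corresponds to roughly 2 firing positions per 10,000 tokens, which fits a neuron tuned to a narrow family of words.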