Explanations
Health problems
This neuron responds to mentions of extreme body‐weight figures and their associated severe health or mobility consequences.
Negative Logits
쟁  -0.07
Teach  -0.07
Size  -0.07
anken  -0.06
�  -0.06
sad  -0.06
ivid  -0.06
EQUAL  -0.06
gains  -0.06
vượt  -0.06
Positive Logits
eleg  0.07
Р  0.06
菲  0.06
unreliable  0.06
daddy  0.06
'"+  0.06
pří  0.06
renamed  0.06
Burma  0.06
guards  0.05
Activations Density: 0.006%