Explanations
The neuron is primarily activated by mentions of “liver” (and closely related terms like “hepatic”) in the text.
Negative Logits
endcode    -0.07
الشي    -0.07
clas    -0.07
ţi    -0.07
xấu    -0.07
iễ    -0.07
melody    -0.06
भग    -0.06
Done    -0.06
Thus    -0.06
Positive Logits
liver    0.13
Liver    0.12
Liver    0.09
-INF    0.07
рес    0.07
Liverpool    0.07
logger    0.07
Living    0.07
刘    0.07
Liverpool    0.07
Activations Density 0.010%
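An activation density of 0.010% is usually read as the fraction of token positions on which the neuron fires at all, which here would be roughly 1 in 10,000. The exact definition and threshold used by this page are not stated, so the sketch below assumes "fires" means an activation strictly above zero; `activation_density` is a hypothetical helper and the sample data is synthetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    The threshold of 0.0 is an assumption; a dashboard may use another cutoff.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample: 1 active position out of 10,000 tokens.
sample = [0.0] * 9999 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low is consistent with a sparse neuron that responds to a narrow set of tokens rather than to general text.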