INDEX
Explanations
benefit gain
The neuron activates on terms expressing advantage or improvement (e.g. “benefit,” “gain”).
Negative Logits
Token       Logit
دخ          -0.07
-desc       -0.06
_pass       -0.06
path        -0.06
Roberts     -0.06
Hector      -0.06
_TEMP       -0.06
destabil    -0.06
thế         -0.06
-neutral    -0.06
Positive Logits
Token         Logit
ruthless      0.07
setResult     0.06
oranı         0.06
(mouse        0.06
-lnd          0.06
(Sub          0.06
приготовить   0.06
эконом        0.06
cần           0.06
đem           0.06
Activations Density: 0.025%
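
To put the density figure in concrete terms: 0.025% corresponds to roughly 1 active token in every 4,000. The sketch below shows one way such a figure could be computed. It assumes that "density" means the fraction of tokens whose activation is above zero, which this page does not confirm. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens with activation > threshold) / (all tokens).
    The dashboard's exact definition is not stated on this page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the dashboard): 1 active token out of 4,000.
sample = [0.0] * 3999 + [1.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the neuron is silent on almost all tokens, so the token lists above describe rare events rather than typical behaviour.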