Explanations
death and killing
The neuron activates on words describing killing or lethal methods (e.g. kill, killing, killed, death, painless, humane).
Negative Logits
Token         Logit
NA            -0.07
앙            -0.06
желуд         -0.06
ி             -0.06
Psych         -0.06
_GAP          -0.06
Matchers      -0.06
UIT           -0.06
sympathetic   -0.06
ESİ           -0.06
Positive Logits
Token         Logit
travelling    0.06
maritime      0.06
Updating      0.06
OLUM          0.06
sources       0.06
nbr           0.06
≈             0.06
Catholics     0.06
大家          0.06
installed     0.06
Activations Density 0.021%
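
The page does not define how the density figure is computed. A minimal sketch, assuming it is the share of sampled token positions on which the neuron's activation is above zero; the function name, the threshold parameter, and the sample data are all hypothetical and only illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, only 2 of them active.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the style of the figure above.
print(f"{density:.3%}")
```

Under this reading, a density of 0.021% would correspond to roughly 2 active positions per 10,000 sampled tokens, meaning the neuron fires very rarely.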