Explanations
separate
The neuron activates on words that denote political sovereignty or status, such as “independent,” “separate,” “nation,” “country,” and “state.”
Negative Logits
incidence    -0.06
sunset       -0.06
tank         -0.06
weit         -0.06
下的         -0.06
počtu        -0.06
Purs         -0.06
Bring        -0.06
angement     -0.06
disse        -0.06
Positive Logits
stratej      0.07
哲           0.07
대해서       0.07
živ          0.07
그래서       0.07
Installer    0.06
pdev         0.06
sidelined    0.06
ném          0.06
Lum          0.06
Activations Density: 0.017%