INDEX
Explanations
The neuron primarily activates on the sentence-initial third-person male pronoun “He.”
Negative Logits
Where       -0.07
线          -0.06
referral    -0.06
دیگر        -0.06
карти       -0.06
λιο         -0.06
enary       -0.06
samot       -0.06
(Create     -0.06
büny        -0.06
Positive Logits
wrestler        0.07
kiss            0.07
Eag             0.06
[@              0.06
//{↵            0.06
Headquarters    0.06
Marketing       0.06
년에는          0.06
участ           0.06
θμ              0.06
Activations Density 0.032%
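
The density figure is usually read as the fraction of sampled token positions on which the neuron fires at all. The sketch below works under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which activate the neuron.
sample = [0.0] * 9997 + [1.2, 0.4, 2.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low is consistent with a neuron that responds to one narrow pattern, such as a single sentence-initial token, rather than to a broad class of inputs.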