INDEX
Explanations
pronouns
words related to historical military events and personal relationships.
This neuron activates on the third‐person plural pronoun “they.”
Negative Logits
Locke        -0.07
доч          -0.06
indul        -0.06
rozen        -0.06
restaurant   -0.06
"We          -0.06
حف           -0.06
okens        -0.06
RTE          -0.06
Labs         -0.06
Positive Logits
carbon         0.07
semiclassical  0.07
patient        0.06
/host          0.06
فرهنگی         0.06
_Util          0.06
_Val           0.06
няют           0.06
dto            0.06
敗             0.06
Activations Density 0.125%