Explanations
specific nouns and activities
This neuron activates on individual tokens that belong to named entities or proper names (e.g. titles, character or product names, specialized jargon), so it effectively acts as a proper-noun detector.
Negative Logits
socalled          0.17
فريبي             0.15
groupBox          0.14
atthe             0.14
determinadas      0.14
cannot            0.14
'='               0.14
geheel            0.14
yattha            0.14
Elektrokhimiya    0.13
Positive Logits
ד                 0.23
ra                0.21
ן                 0.20
গতকাল             0.19
ด                 0.19
尔                0.19
ק                 0.18
நேற்று            0.18
0                 0.18
ע                 0.18
Activations Density: 0.703%