INDEX
Explanations
This neuron activates on topical content words: concrete nouns and domain-specific terms such as objects, places, activities, and technical or subject-specific vocabulary.
Negative Logits
Token       Logit
,''         -0.07
erv         -0.07
''          -0.06
ฤด          -0.06
.""         -0.06
(blank)     -0.06
.''         -0.06
��          -0.06
ufe         -0.06
ary         -0.06
Positive Logits
Token       Logit
itemId      0.08
AccessType  0.07
�           0.07
ốt          0.07
حرکت        0.07
ONG         0.07
Scott       0.07
ожет        0.07
.TableName  0.07
Jean        0.07
Activations Density: 48.711%