INDEX
Explanations
negative sentiment
The neuron activates strongly on anonymized name placeholders (tokens such as “NAME_1” and “NAME_2”); in other words, it detects these dummy name tokens.
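To make the kind of token described above concrete, here is a minimal sketch that finds such placeholders in a string. The exact pattern (the literal prefix NAME_ followed by digits) is an assumption generalized from the two examples given, and the function name find_placeholders is hypothetical. It illustrates the token shape only, not how the neuron itself computes anything.

```python
import re

# Assumed placeholder shape: "NAME_" followed by one or more digits,
# generalized from the examples "NAME_1" and "NAME_2".
PLACEHOLDER = re.compile(r"\bNAME_\d+\b")


def find_placeholders(text: str) -> list[str]:
    """Return every anonymized name placeholder found in text, in order."""
    return PLACEHOLDER.findall(text)


if __name__ == "__main__":
    print(find_placeholders("NAME_1 told NAME_2 that the meeting moved."))
```

The word boundaries keep the pattern from matching inside longer identifiers such as USERNAME_1.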
NEGATIVE LOGITS
Token      Logit
arus       -0.07
threw      -0.07
حو         -0.07
JW         -0.07
beck       -0.06
holes      -0.06
,result    -0.06
ΣΤ         -0.06
xford      -0.06
data       -0.06
POSITIVE LOGITS
Token        Logit
TagName      0.07
Tib          0.07
/:           0.06
Rename       0.06
EIF          0.06
ivr          0.06
_buff        0.06
clientId     0.06
.testng      0.06
_published   0.06
Activations Density: 0.042%
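As a rough illustration of how a density figure like this could be computed, the sketch below assumes that density means the percentage of token positions where the neuron's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions for illustration; the page itself does not state how the number was derived.

```python
def activation_density(activations):
    """Percentage of token positions with an activation above zero.

    Assumed definition; the source does not specify the threshold used.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


if __name__ == "__main__":
    # Hypothetical data: 10,000 token positions, 4 of them active.
    acts = [0.0] * 9996 + [1.3, 0.2, 2.7, 0.9]
    print(f"{activation_density(acts):.3f}%")
```

Under this reading, a density of 0.042% would correspond to roughly 4 active positions per 10,000 tokens.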