Explanations
This neuron activates on the phrase “on behalf of,” flagging expressions of representation or sponsorship.
Negative Logits
Token         Logit
Taş           -0.06
ارش           -0.06
гориз         -0.06
urger         -0.06
ایط           -0.06
/i            -0.06
IDisposable   -0.06
gắn           -0.06
-Dec          -0.06
"There        -0.06
Positive Logits
Token             Logit
behalf            0.15
.vo               0.07
ego               0.07
ля                0.07
heav              0.07
Campaign          0.07
To                0.07
번                0.07
代                0.07
representations   0.07
Activations Density: 0.006%
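
As a rough illustration of how a density figure like this could be computed, the sketch below assumes that density means the percentage of sampled tokens on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the dashboard does not state how its value is derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron fires at all. The dashboard's actual definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the neuron fires on 6 tokens out of 100,000.
sample = [0.0] * 99_994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.006% would correspond to roughly 6 active tokens per 100,000 sampled, which is consistent with a neuron tied to a single specific phrase.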