Explanations
The neuron activates on the word “dummy” (in any context or casing).
Negative Logits
Token        Logit
Catherine    -0.07
mediation    -0.07
_report      -0.07
Network      -0.07
karış        -0.06
process      -0.06
arriving     -0.06
infection    -0.06
play         -0.06
感到         -0.06
Positive Logits
Token        Logit
dummy        0.11
dummy        0.09
(dummy       0.08
Dummy        0.08
.stub        0.08
Null         0.07
Dummy        0.07
DMIN         0.07
stub         0.07
_dummy       0.07
Activations Density: 0.003%
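
To give a sense of scale for the density figure, the sketch below converts the percentage into an expected count of activating tokens. It assumes that "Activations Density" means the fraction of tokens on which the neuron has a nonzero activation; the page itself does not define the term. The function name and the one-million-token corpus size are hypothetical, chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, corpus_tokens: int) -> int:
    """Expected number of tokens on which the neuron fires.

    Assumes density_percent is the percentage of tokens with a nonzero
    activation (an assumption; the dashboard does not define it).
    """
    fraction = density_percent / 100.0  # 0.003% -> 0.00003
    return round(fraction * corpus_tokens)


# Hypothetical corpus of one million tokens.
print(expected_active_tokens(0.003, 1_000_000))  # prints 30
```

Under that reading, a density of 0.003% corresponds to roughly 3 activating tokens per 100,000, which would be consistent with a neuron that responds to a single specific word.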