Explanations
strangers
This neuron fires strongly on mentions of “strangers” (and related contexts of encountering or interacting with unknown people).
Negative Logits
Token        Logit
opcion       -0.07
corrupt      -0.07
Thái         -0.07
एव           -0.07
centro       -0.06
Christmas    -0.06
Gay          -0.06
amo          -0.06
陳           -0.06
Houses       -0.06
Positive Logits
Token               Logit
DDR                 0.07
assertInstanceOf    0.06
descon              0.06
stranger            0.06
abling              0.06
omore               0.06
strangers           0.06
.By                 0.06
_STYLE              0.06
(bodyParser         0.06
Activations Density 0.019%
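
The density figure above can be read with a small sketch. It assumes that "density" means the percentage of sampled tokens on which the neuron's activation is above zero; the page itself does not define the term, so this is an assumption. The function name activation_density, the threshold parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density = share of sampled tokens where the neuron fires.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the neuron fires on 19 of 100,000 tokens.
sample = [0.0] * 99_981 + [1.5] * 19
print(f"{activation_density(sample):.3f}%")  # prints "0.019%"
```

Under that reading, a density of 0.019% corresponds to roughly 19 active tokens per 100,000 sampled, which would make this a sparse neuron.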