INDEX
Explanations
The neuron specifically activates on mentions of “player” and related forms (e.g. “players,” “multiplayer”).
Negative Logits
Token           Logit
############    -0.07
/cm             -0.07
、『             -0.06
inserts         -0.06
gregator        -0.06
_definitions    -0.06
write           -0.06
resources       -0.06
Bookmark        -0.06
discounts       -0.06
Positive Logits
Token           Logit
mqtt            0.08
Players         0.08
.Player         0.07
players         0.07
♠               0.07
Liam            0.07
玩家             0.07
players         0.06
شیر             0.06
иг              0.06
Activations Density 0.014%
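
As a rough illustration of what a density figure like this means, the sketch below treats activation density as the fraction of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, the threshold, and the sample data are all assumptions made for illustration; the dashboard's exact computation is not stated here.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard may compute it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 14 of which activate the neuron.
sample = [0.0] * 99_986 + [0.5] * 14

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.014% corresponds to roughly 14 activating tokens per 100,000, which fits a neuron that responds only to a narrow family of tokens such as "player".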