Explanations
This neuron detects mentions of the game name “Battleship,” including its subword fragments (e.g. “battles” + “hip”).
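To illustrate how a single word can end up as fragments such as “battles” + “hip”, here is a minimal sketch of greedy longest-match splitting. It is a toy illustration only: the function `greedy_split` and the vocabulary `TOY_VOCAB` are hypothetical, and the model's real tokenizer and vocabulary are not given in this page.

```python
def greedy_split(word, vocab):
    """Split `word` into the longest matching vocabulary pieces, left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until one is in the vocabulary.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry matches: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


# Hypothetical toy vocabulary (an assumption, not the model's real one).
TOY_VOCAB = {"battles", "battle", "hip", "ship"}

fragments = greedy_split("battleship", TOY_VOCAB)
print(fragments)
```

Because “battles” is a longer prefix than “battle”, the greedy split picks it first and leaves “hip” as the remainder, which is the fragment pair the explanation mentions.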
Negative Logits
/feed       -0.06
�           -0.06
wife        -0.06
Rams        -0.06
904         -0.06
attends     -0.06
�           -0.06
TableView   -0.06
dungeon     -0.06
.Pe         -0.06
Positive Logits
ินการ              0.07
.invoke            0.07
communicating      0.07
forControlEvents   0.07
での               0.06
پای                0.06
vieux              0.06
continental        0.06
�                  0.06
ahir               0.06
Activations Density 0.033%
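
As a rough guide to reading the density figure, the sketch below assumes that activation density means the fraction of tokens on which the neuron's activation exceeds a threshold (zero here). That definition, the function `activation_density`, and the synthetic `sample` data are assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`."""
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Synthetic data (an assumption): 100,000 tokens, 33 of which fire.
sample = [0.0] * 100_000
for idx in range(33):
    sample[idx * 1000] = 1.5

density = activation_density(sample)
density_percent = density * 100
print(f"{density_percent:.3f}%")
```

Under this reading, a density of 0.033% corresponds to the neuron firing on roughly 33 out of every 100,000 tokens.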