Explanations
The neuron fires on occurrences of “cheese” (in particular the “ese” suffix), effectively spotting when the text mentions cheese.
Negative Logits
タ              -0.07
_allocator      -0.07
manifested      -0.07
outline         -0.07
ategor          -0.06
Sector          -0.06
(rect           -0.06
cks             -0.06
Fight           -0.06
Muk             -0.06
Positive Logits
cheese                  0.12
Cheese                  0.10
cheeses                 0.10
chees                   0.08
ÜNİVERSİTESİ            0.08
verde                   0.07
rese                    0.07
cheesy                  0.07
最高                    0.07
standardUserDefaults    0.07
Activations Density: 0.004%
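
The density figure can be read as the share of token positions on which the neuron is active. The sketch below is illustrative only and assumes that reading: the function name `activation_density`, the zero threshold, and the toy activation list are hypothetical and are not taken from the page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means active positions divided by all positions.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 100,000 token positions, 4 of them active.
acts = [0.0] * 100_000
for position in (10, 2_000, 50_000, 99_999):
    acts[position] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.004% corresponds to roughly 4 active tokens per 100,000, which would be consistent with a neuron that responds to one narrow topic.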