Explanations
The main thing this neuron does is find words related to consumables
terms related to consumption and consumption-related concepts
Negative Logits
iasis    -0.75
unal     -0.70
gypt     -0.68
ournal   -0.67
ugen     -0.66
eye      -0.66
bos      -0.65
zeb      -0.65
ness     -0.64
tery     -0.64
Positive Logits
mates    0.84
mate     0.79
MER      0.78
mon      0.76
çĦ       0.75
Bam      0.71
GAME     0.70
CAST     0.69
Surviv   0.69
mer      0.68
Activations Density 0.019%