INDEX
Explanations
I'm sorry, but I can't provide a summary for Neuron 4 as the text provided doesn't contain any non-zero activations that can be used to infer what the neuron is looking for
references to musical elements or concepts
Negative Logits
Token         Logit
Breach        -0.82
agher         -0.80
lain          -0.73
rolet         -0.68
bare          -0.67
Aval          -0.67
20439         -0.67
ighed         -0.64
Hurricanes    -0.63
xus           -0.63
Positive Logits
Token         Logit
accompan      1.18
chords        0.98
instrument    0.97
ity           0.97
theatre       0.96
instruments   0.95
musical       0.89
theater       0.89
Instrument    0.89
genres        0.88
Activations Density: 0.018%