INDEX
Explanations
Neuron 4 does not appear to respond to anything specific in the provided text: every activation value is zero. No behavior or pattern can therefore be determined from the given activations.
Negative Logits
rer       -0.73
Travels   -0.65
LER       -0.65
Stellar   -0.64
DAM       -0.62
inka      -0.61
Conrad    -0.60
Downing   -0.59
erd       -0.58
Canal     -0.58
Positive Logits
avorite    0.73
omed       0.69
ury        0.64
peg        0.64
uro        0.63
deen       0.63
Fors       0.63
ournal     0.62
assets     0.62
ennis      0.60
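Both logit lists appear to be ordered by magnitude, strongest first. The following is a minimal sketch of how such a ranking could be produced from a token-to-logit mapping. The function name `top_logit_tokens` and the dictionary input are hypothetical, and how the page itself derives the values is not stated here; the sample values are copied from the lists above.

```python
def top_logit_tokens(logit_effects, k=10):
    """Return (most_negative, most_positive) lists of (token, value) pairs.

    logit_effects: dict mapping a token string to its logit value.
    This is an assumed input shape, not the dashboard's actual data model.
    """
    # Sort ascending by value, so the most negative tokens come first.
    ranked = sorted(logit_effects.items(), key=lambda pair: pair[1])
    most_negative = ranked[:k]
    # Reverse the ordering to get the most positive tokens first.
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive


# A few values taken from the lists on this page.
sample = {"rer": -0.73, "Travels": -0.65, "avorite": 0.73, "omed": 0.69}
negative, positive = top_logit_tokens(sample, k=2)
```

With `k=2`, `negative` holds `rer` then `Travels`, and `positive` holds `avorite` then `omed`, matching the order shown on the page.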
Activations Density 0.000%
No Known Activations
This feature has no known activations.
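An activation density of 0.000% is consistent with the explanation that every activation value is zero. As a rough sketch, assuming density means the fraction of sampled tokens on which the feature's activation exceeds a threshold (this definition and the names `activation_density` and `is_dead` are assumptions, not taken from the page), the figure could be computed as follows.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activation values strictly above the threshold.

    activations: list of floats, one per sampled token (assumed input shape).
    Returns 0.0 for an empty list.
    """
    if not activations:
        return 0.0
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


def is_dead(activations, threshold=0.0):
    """True when the feature never fires on the sampled tokens."""
    return activation_density(activations, threshold) == 0.0


# A feature that is zero everywhere, as described for this neuron.
all_zero = [0.0] * 5
density_label = f"{activation_density(all_zero):.3%}"
```

Under this assumed definition, a feature that never fires is reported with a density of zero, which is also why no top-activating examples can be listed for it.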