Explanations
This neuron activates on adverbs ending in “-ly.”
Negative Logits
seek          -0.07
Saturday      -0.07
Vanderbilt    -0.07
Modules       -0.07
.st           -0.07
098           -0.06
057           -0.06
ignores       -0.06
-length       -0.06
pageSize      -0.06
Positive Logits
eg                    0.07
(no visible token)    0.07
tão                   0.07
но                    0.06
too                   0.06
How                   0.06
Too                   0.06
HOW                   0.06
acr                   0.06
niž                   0.06
Activations Density 0.036%
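
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that activation density means the fraction of token positions on which the neuron's activation exceeds a threshold (zero here); the page does not state its exact definition. The function name `activation_density` and the toy data are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up for illustration): 100,000 token positions, 36 of which fire.
toy_activations = [0.0] * 99_964 + [1.0] * 36
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.036%
```

Under this reading, a density of 0.036% means the neuron fires on roughly 36 out of every 100,000 tokens, which marks it as a sparse unit.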