Explanations
The neuron never activates—it does not respond to any token.
Negative Logits
inaug     -0.07
onus      -0.07
;width    -0.07
us        -0.06
Chile     -0.06
Chat      -0.06
카        -0.06
Low       -0.06
Cole      -0.06
Jose      -0.06
Positive Logits
Benchmark         0.06
internal          0.06
Special           0.06
flexible          0.06
INSTANCE          0.06
ViewController    0.06
Arrange           0.06
gradual           0.06
ITOR              0.06
glBind            0.06
Activations Density 0.292%