Explanations
Data representation
This neuron does not respond to any tokens (all activations are zero).
Negative Logits
Token        Logit
λλην         -0.06
िलत          -0.06
suger        -0.06
메           -0.06
Yak          -0.06
builder      -0.06
日期         -0.06
Davies       -0.06
afterEach    -0.06
curso        -0.06
Positive Logits
Token        Logit
.How         0.07
existing     0.07
primeira     0.07
November     0.07
Activation   0.07
ギ           0.07
_dims        0.06
disp         0.06
-width       0.06
faces        0.06
Activations Density: 0.015%