INDEX
Explanations
This neuron is essentially “dead”: it activates on almost no tokens (activation density 0.015%).
Negative Logits
وی  -0.07
дальней  -0.06
작  -0.06
timestamp  -0.06
Mos  -0.06
Lau  -0.06
astronomy  -0.06
生物  -0.06
Port  -0.06
議  -0.06
Positive Logits
guardar  0.07
>-->↵  0.07
θεια  0.06
neod  0.06
avez  0.06
.slides  0.06
↵  0.06
Oct  0.06
.conn  0.06
ulf  0.06
Activations Density 0.015%
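
As a rough illustration of how a density figure like the one above could be read, the sketch below treats activation density as the fraction of sampled tokens on which the neuron's activation exceeds a threshold. This definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page does not state how its figure is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration only; the source does not
    specify how its density value is calculated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 20,000 sampled tokens.
sample = [0.0] * 19997 + [0.4, 1.2, 0.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.015%"
```

Under this reading, a density of 0.015% means the neuron fires on roughly 3 of every 20,000 tokens, which fits the description of it as essentially dead.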