INDEX
Explanations
The neuron does not activate on any of the input tokens; it never fires.
NEGATIVE LOGITS
warts      -0.07
nictví     -0.07
ΑΣ         -0.07
školy      -0.07
.tags      -0.07
aspers     -0.07
節         -0.06
알         -0.06
ULATOR     -0.06
isempty    -0.06
POSITIVE LOGITS
_su          0.06
conhe        0.06
^            0.06
elerinin     0.06
Statement    0.06
Tcp          0.05
//(          0.05
advertisers  0.05
जम           0.05
Ade          0.05
Activations Density 0.024%