INDEX
Explanations
displays
The neuron fires on document‐formatting markers and metadata tokens (e.g. the various <|…|> tags and header/footer identifiers).
Negative Logits
Zuk
-0.07
icích
-0.06
nodded
-0.06
-alist
-0.06
-0.06
Wednesday
-0.06
Aristotle
-0.06
adě
-0.06
yield
-0.06
achter
-0.06
Positive Logits
.='
0.06
Δεν
0.06
Devlet
0.06
ρό
0.06
◎
0.06
={()=>
0.06
่ก
0.06
ILogger
0.06
čas
0.06
inars
0.06
Activations Density 0.062%
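The density figure above can be read as the share of tokens on which the neuron is active. The sketch below shows one way such a number might be computed. It assumes that "active" means an activation strictly above a threshold of zero; the function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density is the fraction of sampled tokens with a
    non-zero (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 tokens, 6 of which activate the neuron.
sample = [0.0] * 9994 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0]
density = activation_density(sample)
print(f"{density:.3f}%")
```

A density this low would be consistent with a neuron that responds only to rare tokens, such as the formatting markers named in the explanation.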