Explanations
This neuron responds to numeric date markers (years or century designations) in the text.
Negative Logits
saves      -0.07
bien       -0.07
-document  -0.06
เฟ         -0.06
,O         -0.06
saved      -0.06
persuade   -0.06
٬          -0.06
從         -0.06
impass     -0.06
Positive Logits
ange       0.06
전용       0.06
제작       0.06
:+         0.06
eyebrows   0.06
разработ   0.06
zel        0.06
"""        0.06
.latest    0.06
0.06
Activations Density 0.023%