INDEX
Explanations
The neuron activates strongly on discourse-structuring and emphasis markers: words that introduce, enumerate, or highlight points (e.g. “One,” “Another,” “most,” “primary,” “also,” “important”).
Negative Logits
Token         Logit
dikke         -0.07
OMAP          -0.06
smile         -0.06
Це            -0.06
Seat          -0.06
sleeves       -0.06
plata         -0.06
ゃ            -0.06
ρυ            -0.06
validators    -0.06
Positive Logits
Token         Logit
deser         0.07
asurement     0.06
resize        0.06
overwrite     0.06
.today        0.06
(hours        0.06
ेज            0.06
.pres         0.06
-pr           0.06
studio        0.06
Activations Density 2.776%
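The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of sampled token positions on which the neuron's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Count positions where the neuron is considered active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 token positions, 28 of them active.
sample = [0.0] * 972 + [1.5] * 28
density = activation_density(sample)
print(f"{density:.3%}")  # prints 2.800%
```

Under this reading, a density of 2.776% would mean the neuron is active on roughly 28 of every 1,000 sampled tokens.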