INDEX
Explanations
The neuron flags authorial, first-person commentary, especially phrases like “I thought,” “I decided,” or similar personal reflections.
Negative Logits
erra    -0.07
adolu    -0.07
배송    -0.06
Yi    -0.06
الو    -0.06
ano    -0.06
Terra    -0.06
[↵↵    -0.06
RESSED    -0.06
ANO    -0.06
Positive Logits
(MediaType    0.06
Kart    0.06
ражд    0.06
.UI    0.06
,或    0.06
bul    0.06
Hra    0.06
Sprite    0.06
.US    0.06
Search    0.05
Activations Density 0.019%