Explanations
internet forum posts
The neuron selectively activates on the definite article “the.”
Negative Logits
诺          -0.06
ornado      -0.06
的小        -0.06
\Post       -0.06
mus         -0.05
ไลน         -0.05
trưởng      -0.05
positor     -0.05
κον         -0.05
navCtrl     -0.05
Positive Logits
libre       0.07
stackpath   0.07
Sponsored   0.07
Çağ         0.07
resembles   0.06
-olds       0.06
,value      0.06
anth        0.06
.ACT        0.06
ac          0.06
Activations Density: 0.009%