Explanations
This neuron activates on French-language tokens.
Negative Logits
Token     Logit
chicks    -0.07
eson      -0.07
dd        -0.07
.De       -0.07
ческих    -0.07
.Pop      -0.07
Blob      -0.07
Care      -0.06
Dodd      -0.06
Ta        -0.06
Positive Logits
Token            Logit
sebeb            0.07
ść               0.07
ammunition       0.07
vaping           0.06
challeng         0.06
.setObjectName   0.06
freopen          0.06
UseProgram       0.06
republik         0.06
μβ               0.06
Activations Density 0.045%
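
An activation density figure like the one above is commonly read as the share of sampled tokens on which the neuron fires at all. The page does not state its definition, so the sketch below is only an illustration under that assumption. The function name, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens with a
    non-zero activation. This definition is not given in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 9 firing tokens out of 20,000 sampled tokens.
sample = [0.0] * 19_991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.045%
```

Under this reading, a density of 0.045% corresponds to roughly 4 or 5 active tokens per 10,000 sampled, which is sparse firing.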