INDEX
Explanations
multiple languages
The neuron primarily detects subword tokens that contain French diacritical marks (accented characters).
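As a rough illustration of what "contains a French diacritical mark" could mean at the token level, the sketch below checks each token against a hand-picked set of accented letters. The character set, the function name `has_french_diacritic`, and the sample tokens "été" and "garçon" are assumptions made for this example; only "ész", "rua", and "Resource" come from the lists on this page. This is not the method used to generate the explanation.

```python
import unicodedata

# Assumption: an illustrative set of accented letters used in French,
# chosen for this sketch and not taken from the dashboard.
FRENCH_ACCENTED = set("àâçéèêëîïôùûüÿ")


def has_french_diacritic(token: str) -> bool:
    """Return True if the token contains at least one letter from FRENCH_ACCENTED."""
    # NFC folds "e" + combining acute accent into the single character "é",
    # so decomposed input is matched as well.
    normalized = unicodedata.normalize("NFC", token).lower()
    return any(ch in FRENCH_ACCENTED for ch in normalized)


# "été" and "garçon" are hypothetical sample tokens.
tokens = ["ész", "rua", "Resource", "été", "garçon"]
flagged = [t for t in tokens if has_french_diacritic(t)]
```

The check is purely character-based: "ész" is flagged because it contains "é", even though that letter is shared with other languages, so a match here says nothing about which language a token belongs to.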
Negative Logits
Token          Logit
dosp           -0.07
SHOW           -0.06
Pam            -0.06
Kush           -0.06
Status         -0.06
spiders        -0.06
ंतर            -0.06
foul           -0.06
unkt           -0.06
Interesting    -0.06
Positive Logits
Token          Logit
mainwindow     0.07
Supported      0.07
rua            0.06
Textures       0.06
ész            0.06
.nz            0.06
Resource       0.06
.empty         0.06
axs            0.06
deceased       0.06
Activations Density 0.006%