Explanations
The neuron fires primarily on French subword tokens, especially those containing accented letters.
Negative Logits
kennenlernen    -0.08
deep            -0.07
_ma             -0.07
द               -0.06
CEPTION         -0.06
desper          -0.06
gran            -0.06
Border          -0.06
iết             -0.06
>All            -0.06
Positive Logits
.group              0.07
WillAppear          0.06
NW                  0.06
(source             0.06
Habitat             0.06
.Ge                 0.06
.Render             0.06
RenderingContext    0.06
_tpl                0.06
NSS                 0.06
Activations Density 0.047%
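
The page does not say how the density figure is computed. A minimal sketch, assuming that "activations density" means the fraction of token positions on which the neuron's activation exceeds zero; the function name, the threshold parameter, and the toy data are all hypothetical and not taken from the source:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the neuron fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 10,000 token positions, 5 of which fire.
toy = [0.0] * 9995 + [0.3, 1.2, 0.8, 2.1, 0.5]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.047% would mean the neuron is active on roughly 47 of every 100,000 tokens, which fits a feature tied to a narrow class of tokens.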