Explanations
This neuron selectively activates on occurrences of the preposition “in.”
Negative Logits
ابد  -0.07
j  -0.06
Friday  -0.06
.travel  -0.06
nesia  -0.06
tougher  -0.06
Plenty  -0.06
_QUESTION  -0.06
řit  -0.06
�  -0.06
Positive Logits
faithful  0.07
_timeline  0.07
Ultimate  0.07
공  0.06
bức  0.06
accuse  0.06
uted  0.06
行政  0.06
aled  0.06
ordial  0.06
Activation Density: 0.078%
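
As a rough illustration of how a density figure like this could be computed, the sketch below treats activation density as the fraction of tokens on which the neuron's activation exceeds a threshold. That definition, the function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration; the page itself does not state how its number was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# tokens that fire) / (# tokens seen).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 tokens, 8 of which activate the neuron.
acts = [0.0] * 9992 + [1.3, 0.4, 2.1, 0.9, 0.2, 1.7, 0.6, 3.0]
print(f"{activation_density(acts):.3%}")  # prints 0.080%
```

Under this reading, a density of 0.078% would mean the neuron fires on fewer than one token in a thousand, which is consistent with a feature tied to a single word.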