INDEX
Explanations
The neuron activates strongly on occurrences of the standalone token “leg”.
Negative Logits
SourceType     -0.07
320            -0.07
County         -0.07
Thu            -0.07
antioxidant    -0.07
696            -0.07
shut           -0.07
Huang          -0.06
Wu             -0.06
UUID           -0.06
Positive Logits
leg            0.14
legs           0.13
Leg            0.10
Legs           0.10
knees          0.09
목             0.08
Lego           0.08
LEGO           0.08
Leg            0.08
leggings       0.08
Activations Density 0.011%
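
The density figure can be read as the share of tokens on which the neuron fires at all. The page does not say how it is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation; the source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 11 active tokens out of 100,000 in total.
sample = [0.0] * 99_989 + [1.5] * 11
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.011% would correspond to roughly 11 firing tokens per 100,000.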