INDEX
Explanations
Identifiers containing the digit 2 or letters
The neuron selectively activates on the placeholder token referring to the second character (NAME_2).
Negative Logits
Token           Logit
pain            -0.07
Martins         -0.06
món             -0.06
Ant             -0.06
analyse         -0.06
Iss             -0.06
polarization    -0.06
风              -0.06
-awesome        -0.06
_Un             -0.06
Positive Logits
Token           Logit
.V              0.07
월              0.07
532             0.07
work            0.07
."<             0.07
.ov             0.06
ležit           0.06
qx              0.06
κρι             0.06
result          0.06
Activation Density: 0.044%
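
For a sense of scale, here is a minimal sketch of how a density figure like this could be computed. The source does not define the metric, so this assumes it is the fraction of token positions where the neuron's activation exceeds a threshold. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    neuron fires at all; the source does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, of which 4 are active.
sample = [0.0] * 9996 + [1.2, 0.8, 2.5, 0.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.044% would mean the neuron is active on roughly 4 or 5 of every 10,000 tokens, which fits an explanation tied to a single narrow placeholder token.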