INDEX
Explanations
The neuron activates on mentions of “Cup” (and its language‐specific variants) in the names of cup competitions.
Negative Logits
ineTransform  -0.07
aboard        -0.07
bereits       -0.07
aversal       -0.06
�             -0.06
.svg          -0.06
livest        -0.06
부            -0.06
Hollande      -0.06
الخاصة        -0.06
Positive Logits
پي            0.07
праці         0.07
-thinking     0.06
のお          0.06
cider         0.06
recurse       0.06
uggested      0.06
_geometry     0.06
common        0.06
Troll         0.06
Activations Density 0.006%