Explanations
The neuron activates on phrases reporting sports match outcomes, particularly words like “win” or “victory” that appear alongside the numeric score.
Negative Logits
Token         Logit
Huang         -0.07
rampage       -0.06
balance       -0.06
ыва           -0.06
_ACCEPT       -0.06
bz            -0.06
/arm          -0.06
Sharon        -0.06
Atoms         -0.06
CHA           -0.06
Positive Logits
Token         Logit
حی            0.07
−             0.07
줄            0.06
.Horizontal   0.06
clickable     0.06
گذ            0.06
$model        0.06
Leipzig       0.06
واست          0.06
qty           0.06
Activations Density 0.017%
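
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions where the neuron's activation is above zero; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" counts a position as active when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the page): 17 active positions out of 100,000 tokens.
toy_activations = [1.5] * 17 + [0.0] * 99_983
density = activation_density(toy_activations)
density_label = f"{density:.3%}"
print(density_label)
```

On this toy input the neuron fires on 17 of 100,000 positions, which is the same order of sparsity as the figure reported above.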