Explanations
cheering, support
This neuron detects phrases expressing personal support or cheering (e.g., “root for my favorite players,” “cheering on …”).
Negative Logits
trailer    -0.07
))))       -0.07
find       -0.06
mnoha      -0.06
睡         -0.06
/Error     -0.06
kde        -0.06
hroz       -0.06
"/"        -0.06
))),       -0.06
Positive Logits
stood          0.07
إذ             0.07
onward         0.07
extinct        0.07
Separated      0.06
puty           0.06
.out           0.06
consolidated   0.06
dete           0.06
کت             0.06
Activations Density 0.004%
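As a rough illustration of what a density figure like this could mean, the sketch below computes the percentage of token positions on which a neuron fires. This is an assumption about the metric, not the dashboard's documented method: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero activation.
    The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the neuron fires on 4 of 100,000 token positions.
acts = [0.0] * 99_996 + [1.2, 0.4, 2.5, 0.9]
print(f"{activation_density(acts):.3f}%")
```

Under this reading, a density of 0.004% would correspond to roughly 4 active tokens per 100,000, which fits a neuron tied to a narrow set of phrases.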