Explanations
choreography
The neuron activates on mentions of choreography or choreographic roles (e.g. “choreographer,” “choreographing,” and related dance-creation terms).
Negative Logits
riends    -0.07
reduction    -0.06
Samar    -0.06
семей    -0.06
rewind    -0.06
padx    -0.06
ödem    -0.06
.LAZY    -0.06
BITTE    -0.06
맥    -0.06
Positive Logits
chore    0.07
HG    0.06
télé    0.06
Homework    0.06
oub    0.06
shallow    0.06
chores    0.06
_emb    0.06
Chore    0.06
.zeros    0.06
Activations Density 0.001%
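
To put the density figure in concrete terms, here is a minimal sketch that assumes density means the fraction of tokens on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the sample data are illustrative assumptions, not taken from the dashboard. Under that reading, 0.001% corresponds to about 1 firing token per 100,000.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: the dashboard's density is a "fraction of tokens that fire";
    the actual definition used by the tool may differ.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: one firing token among 100,000 tokens.
sample = [0.0] * 99_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low suggests the neuron fires on a very narrow set of contexts, which would be consistent with the specific "choreography" label above.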