Explanations
This neuron reliably activates on words related to “respond” (e.g. “respond,” “response”) in the context of coordinating or reacting to events.
Negative Logits
Token        Logit
usando       -0.06
Quest        -0.06
vf           -0.06
Found        -0.06
.separator   -0.06
pneumonia    -0.06
employers    -0.06
aberr        -0.06
ensuite      -0.06
fon          -0.06
Positive Logits
Token        Logit
exaggerated  0.09
Fam          0.07
Msp          0.07
IRECT        0.07
jam          0.06
cooperate    0.06
Int          0.06
_REV         0.06
Fortnite     0.06
cent         0.06
Activation Density: 0.014%
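
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the neuron's activation exceeds a threshold (zero here), might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard; the sample is synthetic and chosen only so the result matches the order of magnitude shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic illustration only: 100,000 token positions, 14 of them active.
sample = [0.0] * 99_986 + [1.0] * 14
density = activation_density(sample)
density_label = f"{density:.3%}"  # percent formatting with three decimals
print(density_label)
```

Under this reading, a density this low means the neuron is silent on almost all tokens, which would be consistent with a narrowly tuned feature such as the "respond" pattern described in the explanation.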