INDEX
Explanations
individual
This neuron activates on the word “individual” (as in “individual recognition,” “individual titles,” etc.), marking mentions of individual awards or honors.
Negative Logits
blue                    -0.06
emojis                  -0.06
trolling                -0.06
893                     -0.06
нь                      -0.06
Amsterdam               -0.06
Gün                     -0.06
"`                      -0.06
ópez                    -0.06
RouteServiceProvider    -0.06
Positive Logits
評                      0.07
establishments          0.07
))));↵                  0.07
_zone                   0.07
""),                    0.07
authDomain              0.06
legis                   0.06
kein                    0.06
adherence               0.06
-fix                    0.06
Activations Density: 0.003%