Explanations
user accounts and logins
This neuron activates on mentions of the login process, especially tokens like “login” or “login page” in a user-authentication context.
Negative Logits
Reduce     -0.08
worst      -0.07
(site      -0.07
’est       -0.07
社         -0.07
(circle    -0.06
_walk      -0.06
Wander     -0.06
체         -0.06
яз         -0.06
Positive Logits
}'",                      0.06
businessmen               0.06
/$',                      0.06
[token blank in source]   0.06
贵                        0.06
Identifier                0.06
biliyor                   0.06
.annotations              0.06
'}';↵                     0.06
έργ                       0.06
Activations Density 0.015%