INDEX
Explanations
This neuron never activates on any tokens, so it isn’t detecting any particular pattern.
Negative Logits
خدم
-0.07
striped
-0.06
LP
-0.06
-policy
-0.06
Nez
-0.06
_teams
-0.06
rates
-0.06
tor
-0.06
sight
-0.06
turned
-0.06
Positive Logits
اورزی
0.08
ै?
0.07
иде
0.06
brain
0.06
UserRepository
0.06
francais
0.06
civilizations
0.06
[]={
0.06
制度
0.06
震
0.06
Activations Density 0.009%