INDEX
Explanations
prepositions
This neuron primarily activates on the token “In” when it appears at the beginning of a sentence.
Negative Logits
salida       -0.07
charming     -0.07
정부         -0.07
майбут       -0.06
До           -0.06
зд           -0.06
dismissed    -0.06
Let          -0.06
*,           -0.06
Ч            -0.06
Positive Logits
UEL          0.07
ortadan      0.06
_RES         0.06
Short        0.06
restricting  0.06
orig         0.06
Cupertino    0.06
IMARY        0.06
asyarak      0.06
meio         0.06
Activations Density 0.085%
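
The density figure can be read as the share of token positions on which the neuron fires at all. The sketch below illustrates that reading; it assumes density means the fraction of positions with activation strictly above zero, which this page does not state. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce the 0.085% figure.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" counts a position as active when its activation
    is strictly greater than threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 17 active positions out of 20,000 tokens.
sample = [0.0] * 19_983 + [1.5] * 17
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.085%
```

At this density the neuron is silent on roughly 99.9% of tokens, which fits an explanation tied to a single token in a single position.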