Explanations
This neuron activates on tokens (across different languages) that correspond to the adverb “yesterday.”
Negative Logits
exclus         -0.06
стороны        -0.06
FilePath       -0.06
unnecessary    -0.06
brand          -0.06
には           -0.06
count          -0.06
Rectangle      -0.06
lot            -0.06
downloads      -0.06
Positive Logits
΄              0.07
,False         0.07
ób             0.07
defaultstate   0.07
"><            0.07
.Gray          0.07
/z             0.06
�              0.06
нять           0.06
scope          0.06
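The page does not say how the logit values in the two lists above are computed. A common convention for dashboards of this kind is to project the neuron's output direction through the model's unembedding vectors and list the tokens with the most negative and most positive results. The sketch below illustrates that convention only, as an assumption; the names `logit_effects`, `top_and_bottom`, `direction`, and `unembed`, and the toy numbers, are hypothetical and are not taken from this page.

```python
def logit_effects(direction, unembed):
    """Dot product of the neuron's output direction with each token's
    unembedding vector (assumed convention, not confirmed by the page)."""
    return {
        token: sum(d * u for d, u in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return (k most negative, k most positive) as (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1])
    negative = ranked[:k]
    positive = ranked[::-1][:k]
    return negative, positive


# Toy, made-up data: a 2-dimensional direction and three tokens.
direction = [1.0, -1.0]
unembed = {
    "a": [0.5, 0.0],
    "b": [0.0, 0.5],
    "c": [0.25, 0.25],
}

effects = logit_effects(direction, unembed)
negative, positive = top_and_bottom(effects, k=1)
print("negative:", negative)
print("positive:", positive)
```

With values as small and as uniform as the ones listed here (all within about 0.07 of zero), the ranking separates tokens only weakly under this reading, so the token lists say little about what the neuron promotes or suppresses.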
Activations Density 0.051%
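The density figure can be read as the share of sampled token positions on which the neuron is active. The sketch below assumes that reading, with "active" meaning an activation above a threshold of zero; the function name `activation_density`, the threshold, and the toy data are hypothetical, and the page does not state the exact definition it uses.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy, made-up data: 10,000 token positions, 5 of them active.
toy_activations = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.0]

density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.051% corresponds to roughly 5 active positions per 10,000 tokens, which fits a neuron that responds to one specific word.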