Explanations
This neuron activates on words and short phrases signaling that the reader has not yet done or seen something: negations such as “haven’t” and “not,” and temporal adverbs such as “yet” and “already.”
Negative Logits
Token            Logit
DAY              -0.07
divisor          -0.07
Shoot            -0.06
Pass             -0.06
欧               -0.06
dress            -0.06
Self             -0.06
когда            -0.06
subscriptions    -0.06
Hard             -0.06
Positive Logits
Token            Logit
utilized         0.07
:↵               0.07
ourced           0.07
phận             0.06
đ                0.06
lenmiş           0.06
xAxis            0.06
všech            0.06
duğ              0.06
゙                0.06
Activations Density 0.010%
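As a rough illustration of how to read the density figure, the sketch below assumes that activation density is the fraction of tokens on which the neuron's activation exceeds a threshold. That definition, the function name `activation_density`, and the `threshold` parameter are assumptions made for this example; the page does not state how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard may
    compute its density figure differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the neuron fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [1.3]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.010%
```

Under this assumed definition, a density of 0.010% corresponds to the neuron firing on about one token in every 10,000.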