Explanations
negative
This neuron fires on the word “negative” when it appears as part of the hyphenated adjective “non-negative.”
Negative Logits
Token         Logit
imported      -0.07
Burada        -0.06
ordes         -0.06
Reducer       -0.06
_backup       -0.06
AllWindows    -0.06
/menu         -0.06
ीदव           -0.06
Imported      -0.05
наличии       -0.05
Positive Logits
Token         Logit
(todo         0.07
OTOS          0.07
regions       0.07
lac           0.07
ending        0.07
าค            0.07
Craig         0.06
multiple      0.06
ends          0.06
取            0.06
Activations Density 0.002%
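
To put the density figure in perspective, here is a minimal sketch that assumes "activations density" means the fraction of token positions at which the neuron's activation is above zero. That definition is an assumption, as are the function name `activation_density` and the sample data, which are hypothetical and not taken from this page. Under that reading, 0.002% corresponds to roughly 2 active positions per 100,000 tokens.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    neuron is active; the page itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2 active positions out of 100,000 tokens.
acts = [0.0] * 100_000
acts[10] = 1.3
acts[42_000] = 0.8

density = activation_density(acts)
print(f"{density:.3%}")  # 0.002%
```

A density this low is consistent with a neuron that responds to one narrow pattern, such as a single word in a single compound.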