INDEX
Explanations
presence
This neuron activates on the phrase “presence of.”
Negative Logits
excel           -0.08
unlucky         -0.07
stol            -0.07
ominated        -0.07
_label          -0.07
Exercise        -0.07
Experimental    -0.07
accelerated     -0.07
Split           -0.07
_EXIT           -0.07
Positive Logits
presence        0.14
presence        0.11
Presence        0.10
друж            0.07
absence         0.07
Pen             0.07
ence            0.07
نگهد            0.07
moisture        0.07
entar           0.07
Activations Density 0.015%
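
The density figure above can be read as the share of tokens on which the neuron fires at all. The page does not define the term, so the following is only a minimal sketch, assuming density means the percentage of tokens with non-zero activation. The helper name `activation_density` is hypothetical and the sample data is made up for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with activation above zero;
    the source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up illustrative data: 3 active tokens out of 20,000.
sample = [0.0] * 19_997 + [0.8, 1.2, 0.3]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.015% would correspond to roughly 3 active tokens in every 20,000.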