Explanations
The neuron primarily responds to words expressing loss or departure (e.g. “leave,” “lost,” “moving,” “split off”).
Negative Logits
зобов               -0.07
DependencyProperty  -0.07
motto               -0.07
faç                 -0.07
нового              -0.06
пан                 -0.06
Zukunft             -0.06
tela                -0.06
ак                  -0.06
or                  -0.06
Positive Logits
treat               0.07
Scale               0.07
np                  0.07
συνέ                0.07
slightly            0.07
isspace             0.06
Hollow              0.06
thern               0.06
overs               0.06
rey                 0.06
Activations Density 0.040%
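
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the neuron's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate what a value of 0.040% would mean.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the neuron fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which activate the neuron.
sample = [0.0] * 9996 + [1.3, 0.8, 2.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.040%
```

Under that assumption, a density of 0.040% corresponds to roughly 4 active positions per 10,000 tokens, which would make this a sparsely firing neuron.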