INDEX
Explanations
dependent
The neuron fires on tokens related to “depend”: words or morphemes that express dependence (e.g. depend, dependent, dependency).
Negative Logits
Token      Logit
Carlton    -0.07
uzz        -0.07
RA         -0.07
Roll       -0.07
Color      -0.07
vua        -0.07
�          -0.07
TX         -0.07
sco        -0.06
rfl        -0.06
Positive Logits
Token        Logit
dependent    0.13
-dependent   0.10
Dep          0.09
Dep          0.09
pend         0.08
PEND         0.08
.dep         0.08
depend       0.08
dependence   0.08
Depend       0.08
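For readers unfamiliar with how lists like these are typically produced, the following is a minimal sketch, assuming the logit values come from projecting the neuron's output direction onto each token's unembedding vector. The page itself does not state the method, so this is an assumption. The function names, the toy vocabulary, and the two-dimensional weights are all hypothetical and are not the real model's values.

```python
def logit_effects(neuron_out, unembed, vocab):
    """Return (token, score) pairs sorted from most negative to most positive.

    neuron_out: list of floats, the neuron's output direction (length d_model)
    unembed:    list of rows, one row of length d_model per vocabulary token
    vocab:      list of token strings aligned with the rows of unembed
    """
    # Score for each token = dot product of the neuron direction with that token's row.
    scores = [sum(w * u for w, u in zip(neuron_out, row)) for row in unembed]
    return sorted(zip(vocab, scores), key=lambda pair: pair[1])


def top_and_bottom(effects, k):
    """Split sorted effects into the k most negative and the k most positive."""
    negative = effects[:k]
    positive = effects[-k:][::-1]  # reverse so the largest score comes first
    return negative, positive


# Toy data (made up for illustration only).
vocab = ["depend", "Carlton", "pend", "Roll"]
unembed = [[1.0, 0.0], [-1.0, 0.0], [0.5, 0.0], [-0.5, 0.0]]
neuron_out = [0.25, 0.0]

effects = logit_effects(neuron_out, unembed, vocab)
negative, positive = top_and_bottom(effects, 2)
```

In this toy setup, `negative` holds the tokens the neuron pushes down and `positive` the tokens it pushes up, mirroring the two lists shown on the page.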
Activations Density 0.019%
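As a rough illustration of what a density figure like this could mean, here is a minimal sketch that assumes density is the fraction of sampled token positions on which the neuron's activation exceeds a threshold. The page does not define the metric, so both the definition and the default threshold of zero are assumptions, and `activation_density` is a hypothetical helper name.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation is strictly above `threshold`.

    The threshold of 0.0 is an assumption; the dashboard's real cutoff is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 10,000 token positions, of which only 2 are active.
acts = [0.0] * 10000
acts[3] = 0.7
acts[42] = 1.2

density = activation_density(acts)
percent = f"{density:.3%}"  # formatted as a percentage with three decimals
```

A value this small indicates a sparse neuron: it is silent on almost every token and fires only on a narrow family such as the “depend” tokens listed above.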