Explanations
This neuron responds to occurrences of the substring “in”, activating strongly on the standalone word “in” and on tokens beginning with “in” (e.g. “input”, “internal”).
Negative Logits
meis                   -0.08
.ut                    -0.07
.orientation           -0.06
อด                     -0.06
.LinearLayoutManager   -0.06
بینی                   -0.06
ags                    -0.06
.acc                   -0.06
_WIN                   -0.06
ΩΤ                     -0.06
Positive Logits
γορ          0.07
;'↵          0.07
že           0.06
external     0.06
ecute        0.06
freq         0.06
somewhat     0.06
Hair         0.06
','%         0.06
disgusting   0.06
Activations Density 0.019%
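
As a rough illustration of how a density figure like this could be computed, the sketch below assumes that density means the fraction of sampled tokens on which the neuron's activation exceeds a threshold (zero here). That definition, the function name `activation_density`, and the sample data are assumptions made for illustration, not details taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: the dashboard may compute density differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which only 2 activate the neuron.
sample = [0.0] * 9998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.019% would mean the neuron fires on roughly 19 of every 100,000 sampled tokens.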