Explanations
This neuron detects occurrences of the verb phrase “stick to” (as in “stick to X”).
Negative Logits
Token     Logit
ample     -0.07
فراهم     -0.07
Age       -0.07
атів      -0.07
RF        -0.07
Early     -0.07
altung    -0.06
World     -0.06
ORLD      -0.06
NRF       -0.06
Positive Logits
Token     Logit
stick     0.16
Stick     0.14
sticks    0.12
stick     0.11
Stick     0.11
stuck     0.10
sticking  0.10
pick      0.10
Sticky    0.08
poking    0.08
Activations Density: 0.009%
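
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of tokens on which the neuron's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, 9 of which activate the neuron.
example = [0.0] * 99_991 + [1.5] * 9
density = activation_density(example)
print(f"{density:.3%}")  # prints 0.009%
```

Under that assumption, a density of 0.009% corresponds to roughly 9 active tokens per 100,000, which fits a neuron that responds to a single narrow phrase.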