INDEX
Explanations
Telephone
The neuron fires specifically on instances of the word “Telephone,” typically where the word marks a telephone-related heading or label in the text.
Negative Logits
sites      -0.07
seaborn    -0.07
AI         -0.06
SD         -0.06
RG         -0.06
users      -0.06
Faction    -0.06
HEADER     -0.06
хто        -0.06
mode       -0.06
Positive Logits
telephone    0.09
telephone    0.09
automobile   0.08
="../../     0.08
Bicycle      0.07
even         0.07
oce          0.07
elev         0.07
bicycles     0.07
orth         0.07
Activations Density: 0.011%