Explanations
Abbreviation
This neuron activates on occurrences of the substring “PT” (as in “GPT”).
Negative Logits
Token         Logit
ilerine       -0.07
Olson         -0.07
Han           -0.07
mer           -0.07
_AUTH         -0.06
าห            -0.06
766           -0.06
Adoles        -0.06
.INVISIBLE    -0.06
bara          -0.06
Positive Logits
Token         Logit
upt           0.07
�             0.07
mektedir      0.07
judul         0.07
Canterbury    0.06
".");↵        0.06
mpz           0.06
patrol        0.06
ATP           0.06
<p            0.06
Activations Density: 0.009%
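To put the density figure in perspective, the short sketch below converts it into an expected count of activating tokens. It assumes that "density" means the fraction of sampled tokens on which the neuron has a non-zero activation; the page itself does not define the term, and the function name `expected_activations` is hypothetical.

```python
def expected_activations(density_percent: float, num_tokens: int) -> float:
    """Expected number of activating tokens for a density given in percent.

    Assumption: density is the share of tokens with non-zero activation.
    """
    return density_percent / 100.0 * num_tokens


density_percent = 0.009  # density value reported on this page

# Show the expected count for a few illustrative corpus sizes.
for n in (10_000, 100_000, 1_000_000):
    count = expected_activations(density_percent, n)
    print(f"{n:>9,} tokens -> about {count:.1f} activations")
```

Under that assumption, the neuron fires on roughly 9 of every 100,000 tokens, which is consistent with a narrow feature such as a specific substring.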