INDEX
Explanations
This neuron activates on a variety of characters and symbols without a clear pattern
special characters or unique symbols
Negative Logits
undai       -1.01
mathemat    -0.87
sembly      -0.85
eatures     -0.83
rongh       -0.82
glomer      -0.82
ebus        -0.78
chio        -0.77
showc       -0.76
uggest      -0.75
Positive Logits
ãģ®         1.01
ã쮿         1.00
åº          0.99
ÙĨ          0.96
Ùĩ          0.94
çİĭ         0.94
ä¹ĭ         0.93
åŃIJ        0.93
é           0.93
æŃ¦         0.93
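The positive-logit tokens above look like mojibake, but several of them are consistent with the printable stand-in characters that a GPT-2-style byte-level BPE tokenizer uses for raw bytes. The sketch below assumes that tokenizer scheme (the page does not name the model) and rebuilds the byte-to-character table so a token string can be mapped back to UTF-8 text. The names `bytes_to_unicode`, `UNICODE_TO_BYTE`, and `decode_token` are illustrative, not part of this page.

```python
def bytes_to_unicode():
    """Byte -> printable character table, following the GPT-2 byte-level scheme.

    Assumption: the tokens on this page come from a tokenizer using this scheme.
    """
    # Bytes that already display cleanly keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    chars = printable[:]
    n = 0
    for b in range(256):
        if b not in printable:
            # Remaining bytes are shifted to code points 256, 257, ...
            printable.append(b)
            chars.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(printable, chars)}


# Inverse table: stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a byte-level token string back to text.

    Incomplete UTF-8 sequences (tokens that hold only part of a character)
    decode to U+FFFD instead of raising an error.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")
```

Under that assumption, `decode_token("ãģ®")` gives the Japanese particle の, `decode_token("ÙĨ")` gives the Arabic letter ن, and `decode_token("çİĭ")` gives 王, while single-character entries such as `é` are lone lead bytes of a multi-byte character. This suggests the feature promotes non-Latin script tokens, though rows that were damaged in extraction may not decode cleanly.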
Activations Density 0.044%
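As a rough illustration of what a density figure like this could mean, the sketch below assumes it is the fraction of sampled token positions on which the neuron's activation is above zero. That definition, the `threshold` parameter, and the function name `activation_density` are assumptions for illustration; the page itself does not say how the figure is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the neuron fires.
    """
    if not activations:
        raise ValueError("activations must contain at least one value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)
```

Under this reading, a density of 0.044% corresponds to about 44 active positions per 100,000 tokens, which would make this a sparsely firing neuron.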