INDEX
Explanations
numerical values
The neuron selectively activates on numeric tokens (digits and numbers).
Negative Logits
yasal       -0.07
.bit        -0.07
용          -0.07
sotto       -0.06
Horror      -0.06
-mode       -0.06
robbery     -0.06
ora         -0.06
_TWO        -0.06
sneakers    -0.06
Positive Logits
(elem             0.06
라                0.06
prev              0.06
sign              0.06
BufferedReader    0.06
_GEN              0.06
,name             0.06
ль                0.06
amount            0.06
Ell               0.05
Activations Density 0.066%