INDEX
Explanations
The neuron activates on numeric literal tokens (especially floating-point and hex numbers).
Negative Logits
.books      -0.07
Cher        -0.07
AG          -0.06
ipro        -0.06
Span        -0.06
/)          -0.06
Alaska      -0.06
ुण          -0.06
knockout    -0.06
polite      -0.06
Positive Logits
모든         0.07
hotline     0.06
اهم         0.06
vip         0.06
ebiliriz    0.06
ще          0.06
.lastName   0.06
_no         0.06
incl        0.06
.TypeOf     0.06
Activations Density: 0.031%