Explanations
elements
The neuron activates on floating‐point numeric literals (decimal numbers) in the text.
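One rough way to sanity-check this explanation is to flag decimal literals in a text sample and compare them with the tokens where the neuron fires. The sketch below assumes that a "floating-point numeric literal" means digits, a decimal point, more digits, and an optional exponent. That definition and the helper name `find_decimal_literals` are illustrative assumptions, not the neuron's actual activation rule.

```python
import re

# Assumed definition of a decimal literal: digits, '.', digits, optional exponent.
# The lookarounds skip identifiers such as "v1.5" and dotted versions such as "1.2.3".
DECIMAL_LITERAL = re.compile(r"(?<![\w.])\d+\.\d+(?:[eE][+-]?\d+)?(?!\w|\.\d)")


def find_decimal_literals(text: str) -> list[str]:
    """Return every substring of `text` that matches the assumed pattern."""
    return DECIMAL_LITERAL.findall(text)


if __name__ == "__main__":
    print(find_decimal_literals("x = 3.14 and y = 42, z = -0.07"))
```

A leading sign is not treated as part of the literal here, so `-0.07` is reported as `0.07`.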
Negative Logits
’nde       -0.07
Base       -0.07
Account    -0.07
(nav       -0.06
اوت        -0.06
debit      -0.06
overlap    -0.06
ip         -0.06
cellar     -0.06
.activity  -0.06
Positive Logits
iership       0.06
ücretsiz      0.06
compuls       0.06
irresistible  0.06
ethical       0.06
useParams     0.06
Royals        0.06
Salir         0.06
[…            0.06
Dorm          0.06
Activations Density 0.066%