INDEX
Explanations
Scientific publications/patents
The neuron activates on numeric tokens corresponding to publication years in reference citations.
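As a rough illustration of the kind of token this explanation refers to, the sketch below pulls year-like numbers out of a citation string. It is only a heuristic stand-in, not the neuron's actual computation: the regular expression, the 1800 to 2099 range, the `year_tokens` helper, and the sample citation are all assumptions made for the example.

```python
import re

# Hypothetical heuristic: a standalone four-digit year between 1800 and 2099.
YEAR_RE = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")

def year_tokens(citation: str) -> list[str]:
    """Return the year-like numeric tokens found in a citation string."""
    return YEAR_RE.findall(citation)

# Invented citation, used only to exercise the helper.
example = "Smith, J. et al. (2015). Deep nets. J. Mach. Learn. 12(3), 45-67."
print(year_tokens(example))
```

Volume, issue, and page numbers are skipped because they do not match the four-digit year pattern.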
Negative Logits
電  -0.07
ارد  -0.07
comed  -0.07
ortion  -0.07
foll  -0.06
jourd  -0.06
icates  -0.06
loro  -0.06
pip  -0.06
알  -0.06
Positive Logits
topLevel  0.06
nextState  0.06
wrapping  0.06
Hä  0.06
Shell  0.06
getInstance  0.06
refute  0.06
Giant  0.06
(Unknown  0.06
siguiente  0.06
Activations Density: 0.005%
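To put the density figure in context, here is a minimal sketch that assumes density means the fraction of tokens on which the neuron's activation is above zero. That definition, the `activation_density` helper, and the toy activation list are assumptions for illustration, not taken from the page.

```python
def activation_density(activations: list[float]) -> float:
    """Fraction of tokens whose activation is above zero (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)

# Toy data: one active token out of 20,000.
acts = [0.0] * 19_999 + [1.3]
print(f"{activation_density(acts):.3%}")
```

Under this reading, a density of 0.005% corresponds to the neuron firing on about 1 token in 20,000.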