Explanations
List of numbers
The neuron activates on tokens that are floating‐point numbers (i.e. numeric strings containing a decimal point and digits after it).
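As a rough illustration of the pattern this explanation describes, the following sketch checks whether a token string looks like a floating-point number (digits, a decimal point, and at least one digit after it). The function name `looks_like_float_token` and the choice to strip surrounding whitespace are assumptions made for illustration. This is not how the neuron itself computes its activation.

```python
import re

# Hypothetical helper: matches strings such as "3.14" or ".5".
# Requires a decimal point followed by at least one digit, per the explanation.
_FLOAT_TOKEN = re.compile(r"[0-9]*\.[0-9]+")


def looks_like_float_token(token: str) -> bool:
    """Return True if the token is a numeric string with a decimal point
    and digits after it. Surrounding whitespace is ignored (an assumption,
    since tokenizers often attach a leading space to tokens)."""
    return _FLOAT_TOKEN.fullmatch(token.strip()) is not None


if __name__ == "__main__":
    for tok in ["3.14", " 0.07", "42", "1.", "Spain"]:
        print(repr(tok), looks_like_float_token(tok))
```

Under this sketch, integers such as "42" and strings with a trailing point such as "1." are rejected, because the explanation requires digits after the decimal point.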
Negative Logits
Token          Logit
Spain          -0.07
intervene      -0.06
ían            -0.06
imagination    -0.06
stru           -0.06
miss           -0.05
sağ            -0.05
Dud            -0.05
ajust          -0.05
Swansea        -0.05
Positive Logits
Token          Logit
fillType       0.07
answering      0.07
환경           0.07
Razor          0.07
_web           0.07
Keeper         0.07
volunteer      0.07
met            0.07
>V             0.07
Reception      0.06
Activations Density 0.037%