INDEX
Explanations
approximation
This neuron fires on mentions of mathematical approximations or parameter‐estimation terms (e.g. “approximation,” numerical coefficients, and related technical modifiers).
Negative Logits
len            -0.07
dokon          -0.07
hỗn            -0.06
outreach       -0.06
Fan            -0.06
Henrik         -0.06
зі             -0.06
.ButterKnife   -0.06
SimpleName     -0.05
Ix             -0.05
Positive Logits
disappointment  0.07
Launching       0.07
Choices         0.07
rule            0.07
过来            0.07
Tribunal        0.07
GMC             0.06
DECLARE         0.06
.escape         0.06
akıl            0.06
Activations Density: 0.009%