Explanations
The neuron activates on decimal-formatted numeric values (fractions or percentages) in the text.
Negative Logits
arus      -0.08
:item     -0.07
스포츠    -0.07
CLUDE     -0.06
EDT       -0.06
isia      -0.06
\Queue    -0.06
omidou    -0.06
critics   -0.06
_("       -0.06
Positive Logits
infrastructure   0.06
obedience        0.06
discreet         0.06
mux              0.06
Darling          0.06
подс             0.06
natal            0.05
.Cell            0.05
申请             0.05
.ndim            0.05
Activations Density 0.004%
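
To put the density figure in concrete terms, the sketch below converts the reported percentage into an expected count of activating tokens. It assumes that "Activations Density" means the fraction of sampled tokens on which the neuron has a nonzero activation; the function name `expected_active_tokens` is hypothetical and used only for illustration. Under that assumption, 0.004% corresponds to roughly 40 activating tokens per million.

```python
# Assumption: "Activations Density" is the fraction of sampled tokens
# on which the neuron has a nonzero activation.
DENSITY_PERCENT = 0.004  # value reported above


def expected_active_tokens(n_tokens: int, density_pct: float = DENSITY_PERCENT) -> float:
    """Expected number of tokens with nonzero activation among n_tokens."""
    return n_tokens * density_pct / 100.0


if __name__ == "__main__":
    # Expected activating tokens in a sample of one million tokens.
    print(expected_active_tokens(1_000_000))
```

A density this low suggests a sparse neuron, which fits an explanation tied to a narrow class of tokens such as decimal-formatted numbers.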