Explanations
Ratings and scores
The neuron activates on numeric score or rating values (especially decimals) in the text.
Negative Logits
ﻮ            -0.07
Tobacco      -0.07
Egg          -0.06
ander        -0.06
IndexOf      -0.06
laundering   -0.06
Piano        -0.06
conscience   -0.06
_subtitle    -0.06
ieten        -0.06
Positive Logits
син          0.07
crowds       0.06
utiliser     0.06
synonyms     0.06
jl           0.06
{x           0.06
-mini        0.06
WHEN         0.06
calcular     0.06
metaphor     0.06
Activations Density: 0.016%