INDEX
Explanations
ytypical
The neuron selectively activates on the sequence “typical” (as in “atypical” or “typical”) in the text.
Negative Logits
)("          -0.07
fishes       -0.06
ût           -0.06
nov          -0.06
suggest      -0.06
jego         -0.06
rebbe        -0.06
необхідно    -0.06
<const       -0.06
Evaluate     -0.06
Positive Logits
091          0.07
ek           0.07
КА           0.07
"<?          0.06
الجن         0.06
елі          0.06
کمی          0.06
victory      0.06
hen          0.06
یک           0.06
Activations Density 0.002%