Negative Logits
�         -0.09
担当      -0.09
вод       -0.09
stil      -0.08
          -0.08
insists   -0.08
_PICK     -0.08
�         -0.08
واضحة     -0.08
UTERS     -0.07
Positive Logits
scaling   0.09
_final    0.09
Scaling   0.09
}/        0.08
_n        0.08
_scaled   0.08
GF        0.08
_scal     0.08
formula   0.08
Scaling   0.08
Activations Density: 0.030%