INDEX
NEGATIVE LOGITS
有很多      -0.84
💄          -0.83
meu         -0.82
поэтому     -0.81
很多人      -0.80
mening      -0.80
👗          -0.80
rhino       -0.79
Hebrews     -0.79
svoje       -0.79
POSITIVE LOGITS
does        1.45
did         1.10
it          1.04
DOES        0.95
cedo        0.93
Does        0.88
leski       0.81
!$          0.80
sobr        0.79
ضور         0.78
Activations Density 0.071%