INDEX
Negative Logits
Heads       -0.07
말          -0.07
ゃ          -0.06
appearance  -0.06
blems       -0.06
Ze          -0.06
Estimated   -0.06
ضاء         -0.06
anchored    -0.06
Uran        -0.06
Positive Logits
xr          0.07
154         0.06
enburg      0.06
"+          0.06
utenberg    0.06
vx          0.06
일본        0.06
164         0.06
<dd         0.06
Symfony     0.06
Activations Density 0.030%