INDEX
NEGATIVE LOGITS
saddle      -0.07
↵ ↵         -0.06
혁          -0.06
mús         -0.06
Thánh       -0.06
iele        -0.05
.pub        -0.05
ει          -0.05
materiál    -0.05
rss         -0.05
POSITIVE LOGITS
applying    0.08
SEX         0.07
acting      0.07
Equal       0.07
remedy      0.07
disputed    0.07
ladık       0.07
combined    0.07
ROLE        0.06
زندگی       0.06
Activations Density 0.030%