NEGATIVE LOGITS
逻              -0.88
готов           -0.85
AppCompat       -0.82
𖡼              -0.81
فش              -0.80
Wilt            -0.79
made            -0.79
exploración     -0.77
пище            -0.77
ticion          -0.77
POSITIVE LOGITS
ille            0.90
貼り            0.87
regal           0.85
MPH             0.81
unearthed       0.81
coworkers       0.80
obe             0.80
onbe            0.79
Historically    0.77
neutr           0.77
Activations Density: 0.061%