Negative Logits
gewiesen    -1.02
zeigt       -0.90
殍          -0.89
czeniu      -0.86
salão       -0.85
forse       -0.84
糁          -0.84
atrician    -0.83
thèmes      -0.83
بسم         -0.83
Positive Logits
but             1.38
behavior        1.38
shaped          1.36
circumstance    1.32
twist           1.27
behaviour       1.21
ोग              1.20
phrasing        1.20
happenings      1.20
circumstances   1.20
Activations Density 0.041%