Negative Logits
Token      Logit
poly       -0.09
Gay        -0.08
poly       -0.08
יך         -0.08
_poly      -0.08
Poly       -0.08
(poly      -0.08
각         -0.07
parents    -0.07
াফ         -0.07
Positive Logits
Token      Logit
irratti    0.08
berger     0.08
ele        0.08
Eg         0.08
Me         0.08
ági        0.08
่าว        0.08
ترین       0.07
Ele        0.07
위한       0.07
Activations Density: 0.371%