NEGATIVE LOGITS
orris      -0.28
enny       -0.28
ental      -0.27
臟         -0.27
antics     -0.26
산         -0.25
ensis      -0.25
sop        -0.25
Veter      -0.25
ĥģ         -0.24
POSITIVE LOGITS
works      0.29
Works      0.29
他们是     0.28
Tud        0.26
rods       0.26
Works      0.26
Tw         0.25
政权       0.25
执政       0.25
permanent  0.24
Activations Density 0.005%