INDEX
NEGATIVE LOGITS
眼泪
-0.30
ipse
-0.29
ancy
-0.28
aji
-0.28
hip
-0.27
徂
-0.27
Jess
-0.27
ographs
-0.26
Burl
-0.26
lege
-0.26
POSITIVE LOGITS
值得注意
0.27
的重要
0.27
another
0.27
Fact
0.27
notable
0.26
较高
0.26
值得一提
0.26
滤
0.25
有利
0.25
fact
0.25
Activations Density 0.011%