INDEX
NEGATIVE LOGITS
跪
-0.27
一部
-0.25
idenav
-0.25
Enumer
-0.25
'&#
-0.24
чь
-0.24
ipi
-0.24
AccessType
-0.24
Gale
-0.23
edin
-0.23
POSITIVE LOGITS
认为
0.39
都认为
0.36
believe
0.34
believes
0.33
因此
0.32
marvin
0.31
therefore
0.30
thinks
0.29
hopes
0.27
said
0.27
Activations Density 0.001%