INDEX
NEGATIVE LOGITS
拔
-0.27
ummer
-0.27
部门
-0.26
Visualization
-0.26
的日子里
-0.25
Protected
-0.25
unate
-0.25
blat
-0.24
分局
-0.24
cow
-0.24
POSITIVE LOGITS
谈论
0.28
idi
0.27
referring
0.27
车牌
0.27
uf
0.26
chemistry
0.26
deutsch
0.26
介绍
0.26
chemistry
0.25
简历
0.25
Activations Density 0.006%