INDEX
NEGATIVE LOGITS
parts       -0.07
works       -0.07
olders      -0.07
년에        -0.06
began       -0.06
sed         -0.06
black       -0.06
limb        -0.06
bilder      -0.06
多          -0.06
POSITIVE LOGITS
guarantee   0.11
guarantees  0.08
GHz         0.08
Assurance   0.07
garant      0.07
Gam         0.07
Guarantee   0.07
ataka       0.07
guaranteed  0.07
hatır       0.06
Activations Density 0.021%