NEGATIVE LOGITS
Anything      0.41
planilla      0.39
proš          0.39
Anything      0.38
Shame         0.38
车            0.38
anything      0.36
})}{          0.36
Denn          0.36
╋             0.36

POSITIVE LOGITS
hom           1.96
Hom           1.73
Hom           1.63
hom           1.55
homo          1.30
HOM           1.28
homogen       1.23
homogeneous   1.22
homog         1.20
homogenous    1.17
Activations Density 0.010%