Negative Logits
ઢી    0.38
zo    0.38
ης    0.38
AAS    0.38
IGO    0.37
=    0.37
APA    0.37
Alice    0.37
preserved    0.36
zza    0.36
Positive Logits
font    0.42
font    0.39
ഞാന്    0.39
elab    0.38
Font    0.37
ужа    0.37
数の    0.37
stilling    0.37
교육    0.36
Sick    0.36
Activations Density: 0.001%