NEGATIVE LOGITS
phys    0.42
physiological    0.40
outings    0.38
esent    0.38
যেন    0.38
stiffer    0.37
ports    0.37
endgame    0.36
phys    0.36
sizes    0.36
POSITIVE LOGITS
实行    0.41
ʿ    0.39
ѝ    0.34
Belediye    0.33
Donald    0.33
Hat    0.32
νε    0.32
బోర్    0.32
樾    0.32
அவருடைய    0.32
Activations Density 0.005%