INDEX
Negative Logits
              0.46
becomes       0.36
irsi          0.34
ูด            0.34
dotycz        0.34
Thoughts      0.34
NOP           0.34
zov           0.33
VERE          0.33
OS            0.32
Positive Logits
tingham       0.70
withstanding  0.68
necessarily   0.67
liking        0.58
obstante      0.57
sure          0.57
HING          0.55
ional         0.53
Not           0.53
eworthy       0.52
Activations Density 0.033%