Negative Logits
Token           Logit
↵               0.34
While           0.33
↵↵              0.32
However         0.30
Also            0.28
It              0.27
also            0.27
even            0.27
When            0.26
By              0.25
Positive Logits
Token           Logit
anarchy         0.30
isomorphisms    0.29
aberrations     0.28
drunkenness     0.26
oppress         0.26
dajj            0.26
architectures   0.25
prejudices      0.25
perturbations   0.25
hypoth          0.25
Activations Density 0.253%