NEGATIVE LOGITS
Token            Logit
irresponsible    -0.11
Mismatch         -0.10
394              -0.09
Inspiration      -0.09
depreci          -0.09
ines             -0.09
_tid             -0.09
gent             -0.09
åħ´              -0.09
instinct         -0.08

POSITIVE LOGITS
Token            Logit
hub               0.30
Pride             0.26
pride             0.26
hub               0.23
Hub               0.22
ambition          0.21
Hub               0.20
eg                0.19
ego               0.19
cov               0.19

Activations Density 0.103%