Negative Logits
affair        -0.07
ware          -0.07
Em            -0.06
Categories    -0.06
bild          -0.06
thicker       -0.06
assuming      -0.06
phen          -0.06
повтор        -0.06
relieve       -0.06
Positive Logits
URNS          0.06
_DR           0.06
yc            0.06
clang         0.06
value         0.06
_LS           0.06
">            0.06
ression       0.06
Purdue        0.06
0.06
Activations Density: 0.016%