NEGATIVE LOGITS
Token          Logit
.dev           -0.16
central        -0.15
ula            -0.15
as             -0.15
Guar           -0.14
ne             -0.14
past           -0.14
-commercial    -0.14
se             -0.14
sed            -0.14

POSITIVE LOGITS
Token          Logit
eview          0.17
ltra           0.16
imoto          0.15
wu             0.15
Siz            0.15
vla            0.15
idir           0.15
thinkable      0.15
alink          0.15
.bz            0.15
Activations Density: 0.008%