INDEX
Negative Logits
omba       -0.17
olut       -0.15
èĥ         -0.15
amak       -0.14
rada       -0.14
ssh        -0.14
arith      -0.13
*(*        -0.13
inition    -0.13
logy       -0.13

Positive Logits
wr         0.16
ÏĢοÏĦε     0.15
deal       0.15
nof        0.15
Wunused    0.14
rays       0.14
uste       0.14
Deal       0.14
rop        0.14
plain      0.14
Activations Density 0.005%