INDEX
NEGATIVE LOGITS
addir      -0.19
ukkan      -0.18
entially   -0.16
lify       -0.15
å©·        -0.14
indo       -0.14
bench      -0.14
ãĤ»        -0.14
antz       -0.14
rior       -0.13
POSITIVE LOGITS
Äı         0.19
surname    0.17
369        0.17
esty       0.15
è»         0.14
Stout      0.14
osoph      0.14
oose       0.14
cred       0.14
ews        0.14
Activations Density 0.005%