NEGATIVE LOGITS
Token       Logit
ergic       -0.14
yny         -0.14
erman       -0.14
atte        -0.14
cky         -0.14
ÑĩиÑģ        -0.14
för         -0.14
ágenes      -0.13
erule       -0.13
drive       -0.13
POSITIVE LOGITS
Token       Logit
ible        0.26
ulous       0.25
ibility     0.24
ibly        0.24
itor        0.23
ence        0.21
encia       0.20
ITOR        0.20
ibilit      0.20
encial      0.19
Activations Density 0.006%