INDEX
NEGATIVE LOGITS
Token       Logit
born        -0.18
-American   -0.17
ritz        -0.16
nde         -0.16
aper        -0.16
ate         -0.15
istica      -0.15
tery        -0.15
æģ          -0.15
mente       -0.15
POSITIVE LOGITS
Token       Logit
anness      0.19
Latina      0.18
latina      0.17
Birleşik    0.17
antal       0.16
الی         0.15
eus         0.15
iges        0.15
eturn       0.15
alf         0.15
Activations Density 0.036%