INDEX
Explanations
proper nouns related to notable individuals and institutions
Negative Logits
genders     -0.16
zee         -0.15
oras        -0.15
ê³¼ìłķ      -0.14
gender      -0.14
гÑĢомад     -0.14
éģĬ         -0.14
ifes        -0.14
gross       -0.13
cris        -0.13
Positive Logits
(G          0.19
EDIA        0.17
=G          0.16
(GL         0.16
/G          0.15
g           0.15
erli        0.15
.GL         0.14
Gl          0.14
جب          0.14
Activations Density 0.222%