Explanations
proper nouns, particularly names of people and places
Negative Logits
ovna      -0.15
#aa       -0.15
ãģ¤ãģ¶    -0.14
%C        -0.14
ofday     -0.13
êtes      -0.13
ازÙĩ    -0.13
Ä         -0.12
/or       -0.12
Ìģt       -0.12
Positive Logits
himself       0.23
’s            0.18
's            0.18
же            0.16
-san          0.16
çļĦéĹ®é¢ĺ     0.15
çļĦä¸Ģ个      0.15
—who          0.15
stesso        0.14
Jr            0.14
Activations Density: 0.219%