INDEX
Explanations
proper nouns, particularly names and surnames
Negative Logits
opt       -0.14
BILE      -0.14
.tm       -0.14
каз       -0.14
çłģ       -0.14
aker      -0.13
ÙħعÙĦ     -0.13
bak       -0.13
stra      -0.13
flater    -0.13
Positive Logits
alim      0.17
atz       0.15
avez      0.15
cta       0.15
illaume   0.14
wright    0.14
廳        0.14
bih       0.13
thin      0.13
KEEP      0.13
Activations Density: 0.519%