Explanations
names and proper nouns, particularly those related to individuals and locations
Negative Logits
.yy      -0.15
kus      -0.15
ortal    -0.14
ereg     -0.14
agara    -0.14
ạn       -0.14
eries    -0.14
ery      -0.14
æ²Ī      -0.13
YW       -0.13
Positive Logits
ampo     0.17
cla      0.15
одо      0.15
444      0.14
disp     0.14
awn      0.14
пок      0.14
gew      0.13
(blank)  0.13
жив      0.13
Activations Density: 0.348%