Explanations
proper nouns, particularly names of people and organizations
Negative Logits
atif      -0.17
vvm       -0.15
oard      -0.15
ÑĢол      -0.15
_$_       -0.14
inode     -0.14
scenery   -0.14
iges      -0.14
INTR      -0.14
iller     -0.13
Positive Logits
zer       0.19
cz        0.18
opoulos   0.18
ows       0.17
ovich     0.17
owski     0.17
kin       0.15
avec      0.15
ãĤ´ãĥª    0.15
owitz     0.14
Activations Density: 0.192%