Explanations
proper nouns
proper nouns that are likely associated with names or identifiers
Negative Logits
condoms         -0.69
Carnage         -0.68
Sno             -0.68
foreseeable     -0.67
Hades           -0.66
civilian        -0.65
socks           -0.64
Pegasus         -0.64
mounts          -0.63
サーティワン    -0.63
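Token strings in lists like this one often come from a byte-level BPE vocabulary, where non-ASCII text is stored as remapped byte characters; for example, `ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³` is the stored form of `サーティワン`. The sketch below shows one way to recover the readable text. It assumes the vocabulary uses the GPT-2 byte-to-character mapping, which the page itself does not state; `decode_token` is a hypothetical helper name, not part of any library.

```python
def bytes_to_unicode():
    # Assumption: GPT-2-style mapping. Printable bytes map to themselves;
    # every other byte is assigned a code point starting at 256.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token: str) -> str:
    # Hypothetical helper: invert the mapping, rebuild the raw bytes,
    # then decode them as UTF-8.
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³"))  # prints サーティワン
```

Plain ASCII tokens such as `socks` pass through unchanged, so the helper can be applied to every row without special-casing.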
Positive Logits
ultz            1.03
rick            0.96
arnaev          0.94
ucci            0.94
undy            0.94
anmar           0.94
alin            0.93
orsi            0.91
essel           0.91
iggs            0.91
Activations Density: 0.108%