Explanations
names of individuals
the article "the" and specific names
Negative Logits
McKenzie      -0.66
Reloaded      -0.65
Domain        -0.64
Enterprise    -0.63
Fraser        -0.63
warm          -0.62
DERR          -0.61
Wonderland    -0.60
Gutenberg     -0.60
proxies       -0.59
Positive Logits
onym          1.00
ocrat         0.90
cery          0.88
onymous       0.87
phony         0.87
phy           0.86
odic          0.84
opol          0.84
cer           0.83
swer          0.82
Activations Density 0.097%