Explanations
Names of individuals, especially political figures
Negative Logits
lain       -1.03
Reviewed   -1.02
crawl      -1.01
endant     -0.97
pmwiki     -0.96
mble       -0.94
igslist    -0.94
places     -0.94
ioned      -0.91
anwhile    -0.89
Positive Logits
Orwell     1.25
Lucas      1.13
Soros      1.09
Wallace    1.07
Zimmerman  1.06
Clo        1.06
STON       1.05
RR         1.05
Thor       1.01
Bush       1.00
Activations Density 0.773%