INDEX
Explanations
nationalities or religious affiliations
terms associated with political and ethnic identities
Negative Logits
ravings      -1.01
ories        -1.00
isites       -0.98
atories      -0.97
onies        -0.97
boards       -0.96
ologies      -0.94
bows         -0.94
encies       -0.94
brids        -0.92
Positive Logits
politician   1.28
thinker      1.28
journalist   1.20
who          1.19
citizen      1.18
traveler     1.17
businessman  1.16
legislator   1.15
diplomat     1.13
woman        1.12
Activations Density: 0.253%