INDEX
Explanations
references to political regions and their characteristics
Negative Logits
cis            -0.17
Clement        -0.15
imuth          -0.15
arkan          -0.15
OMB            -0.14
cus            -0.14
Catch          -0.14
Citizenship    -0.14
Cum            -0.14
apo            -0.14
Positive Logits
capital         0.77
Capital         0.64
cap             0.61
capital         0.60
Capital         0.56
CAPITAL         0.55
capit           0.54
-capital        0.53
.cap            0.52
_cap            0.51
Activations Density 0.116%
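
As a rough illustration of how a density figure like the one above might be read, the sketch below assumes that density means the share of sampled tokens on which the feature activates above zero. That definition, the function name activation_density, the threshold parameter, and the toy data are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumed definition: density = active tokens / total tokens * 100.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: the feature fires on 2 of 8 tokens.
toy = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(toy):.3f}%")
```

Under this reading, a density of 0.116% would correspond to the feature firing on roughly one token in every 860 sampled.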