Explanations
words related to authoritarianism and human rights
terms associated with humanitarian and authoritarian themes
Negative Logits
VEN          -0.77
UX           -0.77
enery        -0.70
Alloy        -0.67
URES         -0.67
TD           -0.66
Yelp         -0.66
touch        -0.64
BOOK         -0.63
FORMATION    -0.63
Positive Logits
ians         1.17
itar         0.98
iak          0.92
ters         0.90
iano         0.89
inals        0.89
tarian       0.87
ilateral     0.87
iac          0.86
ienne        0.85
Activations Density: 0.012%