INDEX
Explanations
terms related to political topics and discussions
Negative Logits
ような    -0.68
ные    -0.66
ulatory    -0.65
чные    -0.64
готови    -0.64
illary    -0.64
chial    -0.64
amerikanischer    -0.63
ptian    -0.62
itake    -0.62
Positive Logits
Protestantism    1.23
Judaism    1.19
Semitism    1.18
Buddhism    1.17
Communism    1.12
Catholicism    1.10
Feminism    1.09
dentistry    1.09
liberalism    1.09
communism    1.08
Activations Density: 1.389%