INDEX
Explanations
references to historical political events and regimes
Negative Logits
ahoma      -0.15
Copenhagen -0.15
783        -0.15
morgan     -0.15
鼻         -0.14
TSA        -0.14
VERTISE    -0.14
imuth      -0.13
enza       -0.13
Cunning    -0.13
Positive Logits
fasc       0.38
Fasc       0.36
Muss       0.32
fascism    0.29
fascist    0.29
193        0.26
Franco     0.24
Fal        0.24
Fal        0.24
Hitler     0.23
Activations Density 0.015%