INDEX
Explanations
references to Nazi Germany and its leaders
Negative Logits
Token            Logit
IsContent        -0.69
الإنجليزية        -0.66
beginnetje       -0.66
kke              -0.65
ništ             -0.63
RenderAtEndOf    -0.63
发表于            -0.63
resourceCulture  -0.62
oznam            -0.61
input            -0.60
Positive Logits
Token            Logit
Hitler           1.09
Nazi             1.05
Hitler           1.00
Nazi             0.94
nazi             0.90
fascist          0.86
Cæsar            0.84
Nazis            0.82
nazis            0.78
fascism          0.77
Activations Density: 0.027%