INDEX
Explanations
mentions of the word "Nazi" or related terms
references to the Nazi regime and its associated concepts
Negative Logits
Token           Logit
Dub             -0.77
20439           -0.77
tis             -0.74
cially          -0.73
olulu           -0.72
WHERE           -0.71
changes         -0.69
Interstitial    -0.69
Asia            -0.68
area            -0.68
Positive Logits
Token           Logit
Hitler          1.15
ocaust          1.11
chwitz          1.04
Germany         0.99
Holocaust       0.98
Adolf           0.96
Reich           0.96
salute          0.94
Nazi            0.93
extermination   0.93
Activations Density 0.047%