INDEX
Explanations
mentions of the Nazi regime and related terms
references to the Nazi regime and related historical context
Negative Logits
pole          -0.81
pring         -0.77
tis           -0.77
area          -0.76
20439         -0.76
Dub           -0.76
changes       -0.73
notes         -0.72
Interstitial  -0.71
Asia          -0.69
Positive Logits
Hitler         1.04
ocaust         1.01
chwitz         0.97
Holocaust      0.92
Germany        0.89
salute         0.87
extermination  0.87
Nazi           0.86
Adolf          0.85
wald           0.85
Activations Density: 0.082%