    Explanations

    mentions of the Nazi regime and related terms

    references to the Nazi regime and related historical context

    Negative Logits
    Token             Logit
    "pole"            -0.81
    "pring"           -0.77
    "tis"             -0.77
    "area"            -0.76
    "20439"           -0.76
    "Dub"             -0.76
    "changes"         -0.73
    "notes"           -0.72
    "Interstitial"    -0.71
    "Asia"            -0.69
    Positive Logits
    Token               Logit
    " Hitler"           1.04
    "ocaust"            1.01
    "chwitz"            0.97
    " Holocaust"        0.92
    " Germany"          0.89
    " salute"           0.87
    " extermination"    0.87
    " Nazi"             0.86
    " Adolf"            0.85
    "wald"              0.85
    Activation Density: 0.082%

    No Known Activations