INDEX
    Explanations

    references to Nazi Germany and its historical actions

    Negative Logits
    Token                Logit
    "Hentet"             -0.39
    "tagext"             -0.37
    "Liver"              -0.36
    "flux"               -0.36
    "addPreferredGap"    -0.36
    " nucle"             -0.35
    "úgó"                -0.35
    " australiano"       -0.35
    "IBS"                -0.34
    " adelant"           -0.34
    Positive Logits
    Token                Logit
    " Hitler"            1.05
    "Hitler"             0.93
    " Nazi"              0.88
    " NSD"               0.83
    " Mussolini"         0.82
    " Fas"               0.77
    " fascist"           0.77
    " SS"                0.76
    " Reich"             0.75
    " Führer"            0.74
    Activation Density: 0.444%

    No Known Activations