INDEX
    Explanations

    mentions of the term "Nazi" and its variations related to extremist ideologies

    Negative Logits
    andon                  -0.16
    quil                   -0.16
    aiser                  -0.14
    InternalServerError    -0.14
    adge                   -0.14
    .brand                 -0.14
    nia                    -0.14
    itzer                  -0.14
    ovah                   -0.13
    365                    -0.13
    Positive Logits
    areth                   0.36
    ional                   0.26
    ionale                  0.25
    ionales                 0.20
    urally                  0.18
    arius                   0.17
    DAQ                     0.17
    ario                    0.16
    daq                     0.16
    arov                    0.16
    Activation Density: 0.006%

    No Known Activations