INDEX
    Explanations

    terms related to racial and religious extremism

    Negative Logits
    Token          Logit
    "akis"         -0.18
    "rrha"         -0.17
    " nackt"       -0.16
    "ffect"        -0.15
    "vale"         -0.15
    "icamente"     -0.15
    "gay"          -0.14
    "ilde"         -0.14
    " Gay"         -0.14
    " lạ"          -0.14
    Positive Logits
    Token          Logit
    " plac"        0.16
    " Ves"         0.15
    " Borders"     0.15
    "·"            0.15
    "å¹"           0.15
    " Caul"        0.14
    " host"        0.14
    "ç©į"          0.14
    "aus"          0.14
    "/tos"         0.14
    Activation Density: 0.185%

    No Known Activations