INDEX
    Explanations

    references to political and social issues

    Negative Logits
     antic     -0.15
    ìĦŃ        -0.15
    avra       -0.15
    iques      -0.15
    IQUE       -0.15
    arkan      -0.14
    taboola    -0.14
    rál        -0.14
    fare       -0.14
    ãĥ£        -0.14
    Positive Logits
    sono       0.18
    lum        0.15
    hower      0.15
    zew        0.14
    ÙĬÙĩ       0.14
    iez        0.13
    azole      0.13
    OnClick    0.13
    uD         0.13
    ware       0.13
    Activation Density: 0.159%

    No Known Activations