INDEX
    Explanations

    references to political groups or collectives

    Negative Logits
    obby          -0.16
    reff          -0.14
    asters        -0.14
    ACHI          -0.14
    amin          -0.14
    ifar          -0.14
    aju           -0.13
    chai          -0.13
    pak           -0.13
     unchanged    -0.13
    Positive Logits
    ombine        0.17
    ToFit         0.17
    ovsky         0.16
    غر            0.15
    OfSize        0.15
    ropoda        0.15
    illin         0.15
    insky         0.15
    edl           0.15
    ä¸            0.15
    Act Density 0.001%

    No Known Activations