INDEX
    Explanations

    terms related to societal issues and injustice

    Negative Logits
    til             -0.18
    enco            -0.17
    akh             -0.16
    piler           -0.16
     Král           -0.16
     pip            -0.15
    ÏĦÏīν           -0.15
    ATER            -0.15
    ARGER           -0.15
    ffe             -0.15
    Positive Logits
    ideo            0.15
    ood             0.15
    ennon           0.15
    upa             0.14
    .VisualBasic    0.14
    quit            0.14
    ëį¸             0.14
     favorite       0.14
    ena             0.13
     continuously   0.13
    Activation Density: 0.002%

    No Known Activations