INDEX
    Explanations

    instances of characters taking a stand or defending their beliefs

    Negative Logits
    " separately"    -0.15
    "onth"           -0.15
    "ÏĦαν"           -0.14
    "iw"             -0.14
    "ildo"           -0.14
    "vek"            -0.14
    "CTX"            -0.14
    " separate"      -0.14
    "ullo"           -0.14
    "onom"           -0.13
    Positive Logits
    " against"       0.17
    "against"        0.15
    "ilos"           0.15
    "ammer"          0.15
    " пози"          0.15
    "erez"           0.15
    " sahip"         0.15
    " <!--["         0.14
    "ç«Ļ"            0.14
    " statements"    0.14
    Activation Density: 0.033%

    No Known Activations