    Explanations

    phrases expressing obligation or necessity

    Negative Logits
    olem            -0.16
    mada            -0.15
    oras            -0.15
    elage           -0.15
    successfully    -0.14
    uhe             -0.14
    окол            -0.14
    eyim            -0.14
    phia            -0.14
    gree            -0.14

    Positive Logits
     ashamed        0.20
     avoided        0.20
     kept           0.18
     given          0.16
     approached     0.16
     careful        0.16
     examined       0.15
     warning        0.15
     viewed         0.15
     carefully      0.15

    Activation Density: 0.116%

    No Known Activations